Google Sheets, Data Mining, and more
2021-04-17 01:55:20.085893+02 by Dan Lyke 0 comments
So I realize that not all of you are ResearchBuzz News subscribers or fans of Tara Calishain, but holy crap, if you're not following along with some of the coolness she's doing with mining Wikipedia using Google Sheets, you're missing out. It's kinda mind-blowing, both to see Sheets as a platform like that, and to see the ideas she's got for extracting interesting things out of Wikipedia (and other sources).
There are some descriptions and screen caps in her Patreon feed, but the real amazeballs is when we get the s00p3r s3krit URLs to the sheets in progress, and get to both play with them and watch other people playing with them.
And think about what might be possible on a more flexible platform.
For instance, she's taking a search term and looking for Wikipedia activity by date, and then feeding that back into Google News searches. So you know those times when you're like "oh, those assholes again", but searches on those assholes are all about *this* time? This is a way of deriving a timeline of the previous time(s).
But it's kinda limited in scope by spreadsheet data types, so it's good for the past few months, but not the things that happen a decade or so apart.
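To make the "more flexible platform" idea a little more concrete, here's a rough Python sketch of that pipeline as I understand it: take per-day activity counts for a term, pick out the days that spike, and turn each spike into a date-bounded news search. This is not how Tara's sheets work internally. The function names, the "3x the median" spike rule, and the sample numbers are all made up for illustration, and the `tbm=nws` parameter plus the `after:`/`before:` operators are my assumptions about how Google's search URLs behave. Where the daily counts come from (Wikipedia pageviews, edit counts, whatever) is left out entirely.

```python
from datetime import date, timedelta
from statistics import median
from urllib.parse import urlencode


def spike_dates(daily_counts, factor=3.0):
    """Return the dates whose count is at least `factor` times the median.

    `daily_counts` maps datetime.date -> int. The median-based threshold is
    an arbitrary choice for this sketch, not anything from the original sheets.
    """
    baseline = max(median(daily_counts.values()), 1)
    return sorted(d for d, n in daily_counts.items() if n >= factor * baseline)


def group_windows(dates, gap_days=2):
    """Merge spike dates that sit close together into (start, end) windows."""
    windows = []
    for d in sorted(dates):
        if windows and (d - windows[-1][1]).days <= gap_days:
            windows[-1] = (windows[-1][0], d)  # extend the current window
        else:
            windows.append((d, d))  # start a new window
    return windows


def news_search_url(term, start, end):
    """Build a date-bounded news search URL (assumed URL format, see above).

    The window is padded by a day on each side because I'm not sure whether
    the date operators are inclusive.
    """
    query = f"{term} after:{start - timedelta(days=1)} before:{end + timedelta(days=1)}"
    return "https://www.google.com/search?" + urlencode({"q": query, "tbm": "nws"})


# Hypothetical sample data: a flat baseline with two bursts of activity.
sample_counts = {date(2020, 3, 1) + timedelta(days=i): 100 for i in range(25)}
sample_counts[date(2020, 3, 10)] = 900
sample_counts[date(2020, 3, 11)] = 700
sample_counts[date(2020, 3, 20)] = 500

sample_windows = group_windows(spike_dates(sample_counts))
sample_urls = [news_search_url("example term", s, e) for s, e in sample_windows]
```

Since the dates here are real `date` objects rather than spreadsheet cells, nothing stops the input from spanning a decade or more, as long as whatever supplies the counts goes back that far.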
Anyway, anyone needing a data scientist to explore this sort of stuff should hire her. Google should hire her to make Google News results better.