An example of Google’s massive mounds of data – the Google Ngram

Data mining (looking for useful or interesting trends in massive amounts of data) is becoming more accessible to individuals as companies such as Google and Amazon bring huge databases online. A good example is the Google Ngram Viewer (https://books.google.com/ngrams). It is a front-end search tool that queries Google's corpus of digitized books going back several hundred years. You can search for individual words and phrases (e.g. When did the term "robot" come into common use? According to the Ngram Viewer, it was in the mid-1920s). However, the tool also supports more complex queries. This article discusses some of them: http://www.theatlantic.com/technology/archive/2012/10/bigger-better-google-ngrams-brace-yourself-for-the-power-of-grammar/263487/ One interesting query in the article uses basic math to compare how often (as percentages) the term "United States" was used as a singular noun versus a plural noun. In the 1800s, there was a significant shift following the Civil War, after which the United States was more commonly referred to as a single nation than as a collection of individual states. As is typical with this type of "big data", the possibilities are endless.
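If you want to try the singular-versus-plural comparison yourself, here is a rough Python sketch. It builds an Ngram Viewer link for two phrases and shows the percentage arithmetic behind the comparison. The phrases "The United States is" and "The United States are" are my own guess at a reasonable pair, not necessarily the exact query from the article. The URL parameters (content, year_start, year_end, smoothing) are the ones that show up in the address bar when you run a search, so treat them as an assumption rather than a documented API. The function names and the frequencies in the last lines are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed graph endpoint, taken from the address bar of an Ngram Viewer search.
NGRAM_GRAPH_URL = "https://books.google.com/ngrams/graph"


def ngram_url(phrases, year_start=1800, year_end=2000, smoothing=3):
    """Build a link that plots the given phrases side by side."""
    params = {
        "content": ",".join(phrases),  # phrases are comma-separated
        "year_start": year_start,
        "year_end": year_end,
        "smoothing": smoothing,
    }
    return f"{NGRAM_GRAPH_URL}?{urlencode(params)}"


def singular_share(singular_freq, plural_freq):
    """Percentage of uses that treat the term as singular."""
    total = singular_freq + plural_freq
    if total == 0:
        return 0.0  # avoid dividing by zero when neither phrase appears
    return 100.0 * singular_freq / total


url = ngram_url(["The United States is", "The United States are"])
print(url)

# Made-up frequencies, only to show the arithmetic.
print(singular_share(3, 1))
```

Paste the printed link into a browser to see the two curves. The percentage calculation is the same idea as doing the math inside the query, just done after the fact on whatever numbers you read off the chart.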
