An example of Google’s massive mounds of data – the Google Ngram Viewer
Data mining (looking for useful or interesting trends in
massive amounts of data) is becoming more accessible to individuals as huge
databases are being brought online by companies such as Google and Amazon. A
good example is the Google Ngram Viewer (https://books.google.com/ngrams).
It is a front-end search engine that queries Google’s database of public domain
books going back several hundred years. You can search for individual words and
phrases (e.g. When did the term robot come into common use? According to the
Ngram Viewer, it was in the mid-1920s). However, the tool also supports more
complex queries. This article discusses some of them: http://www.theatlantic.com/technology/archive/2012/10/bigger-better-google-ngrams-brace-yourself-for-the-power-of-grammar/263487/
One interesting query in the article uses basic arithmetic to compare how
often (as a percentage) the term “United States” was used as a singular noun
versus a plural noun. The results show a significant shift in the 1800s,
following the Civil War: the United States became more commonly referred to as
a single nation rather than a collection of individual states. As is typical
with this type of “big data”, the possibilities are endless.
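
To make the arithmetic idea concrete, here is a minimal Python sketch that builds a link for a singular-versus-plural comparison. The two query strings are illustrative and are not copied from the article. The `/ngrams/graph` path and the parameter names (`content`, `year_start`, `year_end`, `smoothing`) are assumptions based on the links the Ngram Viewer generates in the browser, and `ngram_url` is a hypothetical helper, so treat this as a starting point rather than a documented API.

```python
from urllib.parse import urlencode

# Assumed base URL, taken from links the Ngram Viewer produces in the browser.
BASE_URL = "https://books.google.com/ngrams/graph"


def ngram_url(queries, year_start=1800, year_end=2000, smoothing=3):
    """Build an Ngram Viewer link for a list of comma-separated queries.

    Hypothetical helper: the parameter names below are assumptions,
    not a documented API.
    """
    params = {
        "content": ",".join(queries),  # the viewer separates queries with commas
        "year_start": year_start,
        "year_end": year_end,
        "smoothing": smoothing,
    }
    # urlencode percent-escapes the arithmetic operators (+, /, parentheses).
    return BASE_URL + "?" + urlencode(params)


# Illustrative queries: the share of "is" versus "are" after "The United States".
total = "(The United States is + The United States are)"
singular = "The United States is / " + total
plural = "The United States are / " + total

url = ngram_url([singular, plural])
print(url)
```

Pasting the printed link into a browser should plot the two ratios as separate lines, which makes it easy to see where the singular usage overtakes the plural.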