To be honest, data has always been one of the words that confuses me, especially when writing papers. Its meaning can change depending on the context of the sentence, and it’s also one of those sneaky words that is plural but doesn’t outright appear to be. So, it’s fair to say that data hasn’t been one of my favorite words to use. However, after four lab courses at Colby, all in the natural sciences, I have become more accustomed to working with raw data.
Professor Aaron Hanlon’s lecture on Revolutions in Big Data initially seemed boring to me, as I was not very keen on the subject, so I was surprised to find myself intrigued and fascinated by his presentation. Hanlon’s lecture looked at the evolution of data in several different ways, including its meaning, interpretation, and frequency of use.
According to Hanlon, the first recorded use of the word data was in the early 17th century, in the phrase “a heap of data” describing the word of God. This use of the word in a religious or spiritual context makes it sound as if the word is synonymous with truth, but it is not. Data isn’t fact; it is the first step in formulating ideas, and it builds toward fact and truth. Data has come a long way since then, and has expanded to mean multiple things.
An example Hanlon used was Hooke’s book Micrographia, which showed small insects and organic material, such as fleas and leaves, blown up in drawings to reveal very fine detail. This in itself was a small revolution: this new form of data changed the way people thought, as they’d never been able to see creatures in such detail before.
The lens and context in which data is presented are also very important. One of the main concerns Hanlon expressed is that in this day and age, when data is abundant and constantly changing, it is easy to misconstrue the meaning of data if you don’t have the context. For example, imagine looking at a medical chart for a patient that shows concerning vital signs. If a doctor were to look at this chart without any previous knowledge of the patient, they could easily think that the patient was in a declining state of health. However, what if the patient’s vital signs had been significantly worse an hour ago and they were actually showing signs of improvement? This shows the danger of taking raw data at face value without understanding the context of the situation.
One of the most interesting parts of the lecture, though, was when he showed us how the frequency of words varied over the years using the Google Ngram Viewer. Not only was I amazed to learn about new software that I could play with, but I was also surprised to see how much fluctuation there was in the use of the words fact, truth, and data. Around the 1850s, the use of the word fact increased and the use of the word truth decreased. This suggests that as the meanings of words change, their popularity changes, but also that authors were becoming less concerned with feeling and more concerned with fact. However, the word data eventually surpassed both of these words in usage, suggesting that data is more all-encompassing and is the building block of both fact and truth.