Professor Aaron Hanlon’s talk, “Revolutions in Data, Big and Little,” made me consider the digital aspects of future history. Professor Hanlon talked through the history of data, from the genesis of the word through the scientific revolution into modern times. The more traditional forms of data, such as Robert Hooke’s drawing of the flea, were translations of observations into visual media that others could study without firsthand experience. Today historians can look back on the pen and paper records that past data collectors have left. However, the bulk of the data being produced today, including the n-grams that Professor Hanlon showcased, is digital. With such a plethora of data available, how will future historians be able to easily categorize the information of this age?

In the past, data moved more slowly and methodically between people. The printing press greatly sped up the spread of information because it no longer took months or even years to copy a single book. While undoubtedly a lot of the information produced before the digital age has been lost, even cataloguing what remains is quite a task. However, there is a finite amount of information to be processed.

Now, the digital world contains vastly more information than was ever written down in the past. How do we visualize such massive amounts of data? Graphs? Categories? What categories? Who decides? Can a person even comprehend the amount of data the world has produced? What about artificial intelligence?

The question is not only how future historians will process our data, but also how we will process it today. People who use data are often very specialized. Is a new profession developing of people who work with big data across disciplines? In order to comprehend such massive amounts of data, perhaps some new technology is necessary. I know practically nothing about computers or programming, so I cannot wrap my head around such a project. However, I think the results of an organizing scheme would be interesting.

I also wonder if the definition of what constitutes data will change in the digital age. Does the word even encompass such large amounts of information? I suppose the term ‘big data’ has modified the word to redefine its associations.

With the massive amounts of data being produced today, inevitably some things will be lost in the quagmire and others will be given more attention than they deserve. We are beginning to see a trend of people paying attention only to the data that supports their assumptions, especially in politics. How can anyone be sure what the truth is if there is always the possibility of some other data floating out in the ether? Who should have control of what information is promoted? This slightly rambling post presents just a few of the challenges that come from dealing with big data. I guess some of these questions will be answered as more people begin to study big data and the idea of big data itself becomes a larger part of public consciousness.