Thursday, May 31, 2012

Big data 2.0

One of the hottest fields in the software industry today is big data. It seems that everywhere you go you hear about it.

In the past, what kept big data from becoming a reality was low-capacity storage (or the price of storing large quantities of data). Once this obstacle was removed and Moore's law kept on ticking, extremely powerful machines came within everyone's reach, and hardware was no longer an issue.

Once hardware stopped being the bottleneck, it was time for software to stop being one. This was addressed by designing and implementing better and stronger analysis algorithms (think of the fields of machine learning and data mining), as well as by finding and inventing new visualization mechanisms. These solutions aim to take data and extract information from it, either by analytically detecting a signal in the data or by rendering the data in ways that make it easy for the mind to grasp and perform its own analysis.

The problem with these solutions is that they are at their best when there is a definite answer to a well-defined question. Still, more often than not, that answer is relatively flat and abstracts away the complexity of both the answer and the question.

There is a need for a new mechanism, something that would tell the story of the solution and deliver an idea that captures the complexity of both the question and the answer. This is not something found in today's computer science: it is not a combination of CS and math (to yield better algorithms) or of CS and cognitive psychology / neuroscience (the forebears of the information visualization and HCI fields).

My guess is that it would be a new field, some sort of hybrid of CS and philosophy. I specifically think that philosophy is the field from which answers would come, since one of the key aspects of philosophy is the ability to convey an idea, a real idea, not just tell a story (this is the focus of literature) or express a feeling.

When this field is established, it will be big data 2.0.