Big Data is like fresh fish…
26/11/2012
One of the new buzzwords is Big Data. According to IBM, every day we create 2.5 quintillion (10^18) bytes of data. Huge quantities of data elements are collected from a wide variety of sources. Well-known sources are the traditional transactional databases of ERP, SCM and CRM systems. Current enterprise systems create and store huge amounts of data: transactional data in the first place, but also metadata, audit data, security data, backup data and so on. Because the prices of storage and processing keep falling, database volumes keep growing. To steer our businesses efficiently and effectively, we constantly demand supplementary data. However, this data is estimated at only 20% of what could be available to the business. That is why we speak about Big Data.
More and more data also comes from sources ‘outside’ the company, such as tweets, email messages, Facebook notifications, pictures, blogs, portfolios, websites and videos. These data elements sneak into our businesses. Big Data is, however, not as well structured as data stored in relational databases. Twitter generates more than 7 TB of data every day and Facebook more than 10 TB. These volumes are equivalent to 7,000 and 10,000 movies of 100 minutes respectively. Every day! And these amounts keep increasing. But what do we do with Big Data? Or better: what can or should we do with Big Data? Some quick answers. Can we do something with Big Data? Yes. What can we do with Big Data? It depends. Should we do something with Big Data? Likely yes.
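The movie comparison above can be checked with a little arithmetic. The sketch below works out the size per movie that the figures imply. It assumes decimal units (1 TB = 1000 GB), which the post does not state, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the "movies per day" comparison.
# Assumption (not stated in the post): 1 TB = 1000 GB (decimal units).
TB_IN_GB = 1000


def implied_movie_size_gb(daily_volume_tb: float, movies: int) -> float:
    """Return the size per movie, in GB, implied by a volume and a movie count."""
    return daily_volume_tb * TB_IN_GB / movies


twitter_gb = implied_movie_size_gb(7, 7000)
facebook_gb = implied_movie_size_gb(10, 10000)
print(twitter_gb, facebook_gb)  # 1.0 1.0
```

Under that assumption, both figures correspond to roughly 1 GB per 100-minute movie.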
First of all, we have to keep in mind that Big Data comes in all sorts of formats (examples: xml, doc, txt, bmp, json, html, mpeg4, …), which makes a unified overview and the processing of it very challenging. Most social media platforms offer all kinds of APIs to retrieve data from their huge databases. However, most companies do not have the right tools to deal with such a broad variety of formats and cannot extract comprehensive information or knowledge from it. The capacity to process Big Data is also not always present in organizations.
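To make the format problem concrete, here is a minimal sketch of bringing three of the formats mentioned above (json, xml, txt) into one common record. The `author`/`text` schema, the `normalize` function and the sample payloads are all hypothetical. Real platform APIs return far richer and messier structures.

```python
import json
import xml.etree.ElementTree as ET


def normalize(payload: str, fmt: str) -> dict:
    """Map a raw payload to a common {'author', 'text'} record (hypothetical schema)."""
    if fmt == "json":
        data = json.loads(payload)
        return {"author": data.get("author", ""), "text": data.get("text", "")}
    if fmt == "xml":
        root = ET.fromstring(payload)
        return {
            "author": root.findtext("author", default=""),
            "text": root.findtext("text", default=""),
        }
    if fmt == "txt":
        # Plain text carries no structure, so the author is unknown.
        return {"author": "", "text": payload.strip()}
    raise ValueError(f"unsupported format: {fmt}")


# Made-up sample payloads, one per format.
samples = [
    ('{"author": "ann", "text": "Great service"}', "json"),
    ("<post><author>bob</author><text>Slow delivery</text></post>", "xml"),
    ("  Anonymous note about the brand\n", "txt"),
]
records = [normalize(payload, fmt) for payload, fmt in samples]
for record in records:
    print(record)
```

Even this toy version needs a separate branch per format, which hints at why a broad variety of formats quickly becomes a tooling problem.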
Second, Big Data is like fresh fish. You have to catch it and cook it immediately, because fresh fish spoils very quickly. So does Big Data. What is the point of retrieving last week's tweet streams if you want to take countermeasures against a viral wave of bad tweets targeting your brand name? Big Data grows at high velocity, but the opportunity to harvest it depends on the nature of the data and on the kind of information or knowledge we want to obtain. This can be a problem. Do we know what we want? Do we see what kind of information can be derived from Big Data? The answer is, sadly, no, not always. Dealing with Big Data mainly means harvesting data in motion, throwing away the mud and keeping the raw diamonds. This takes time and effort and, above all, how do we know that we have found a diamond? Businesses do not always have the time to wait for a raw diamond to rise from the mud.
Still, the potential is there. The outcome of the recent presidential elections in the US was predicted by a professor using a statistical and historical model. The model works entirely with Big Data that is available to everyone. However, it was not known in advance whether the model would have predictive power or not. So we certainly can do something with Big Data, and we should at least give it a try to see if we can retrieve knowledge from it. Big Data is not a gambling game, but it requires the right tools to sift away the mud and keep the diamonds. These tools are slowly finding their way to the market.
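The fresh fish idea, keeping only data that is still recent enough to act on, can be sketched as a small filter over a message stream. This is an illustration under assumptions, not a real monitoring tool: the message shape (`text`, `created_at`), the brand name "AcmeCo" and the one-hour window are all invented for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Iterator


def fresh_mentions(messages: Iterable[dict], brand: str, now: datetime,
                   max_age: timedelta) -> Iterator[dict]:
    """Yield only messages that mention `brand` and are at most `max_age` old."""
    needle = brand.lower()
    for message in messages:
        is_fresh = now - message["created_at"] <= max_age
        if is_fresh and needle in message["text"].lower():
            yield message


# Made-up stream: one fresh mention, one stale mention, one unrelated message.
now = datetime(2012, 11, 26, 12, 0, tzinfo=timezone.utc)
stream = [
    {"text": "AcmeCo support is terrible", "created_at": now - timedelta(minutes=5)},
    {"text": "AcmeCo ruined my day", "created_at": now - timedelta(days=7)},
    {"text": "Nice weather today", "created_at": now - timedelta(minutes=1)},
]
recent = list(fresh_mentions(stream, "acmeco", now, timedelta(hours=1)))
print([m["text"] for m in recent])  # ['AcmeCo support is terrible']
```

The week-old tweet is dropped even though it mentions the brand: by the time it is read, the moment to react has passed. Deciding which of the remaining messages is a diamond is the hard part, and the filter does not address it.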