Big Data is a very hot topic. Everybody talks about it.
As with all those tech buzzwords in the media, I hear about Big Data a lot, but I never understood exactly what it is.
After reading through some good articles, I have summarized some key points that I feel are helpful for understanding the basics of Big Data.
Big Data is about four Vs:
Volume: the sheer amount of data being generated and stored is very large.
Velocity: the data is generated and processed in real time.
Variety: the data comes in different types and is mostly unstructured (e.g. photos, videos).
Veracity: the data is messy. Its quality is low and its accuracy is hard to control.
The picture below from IBM is a very good illustration of the 4 Vs.
Big Data technology has gone through three generations:
Batch processing: represented by MapReduce
Real-time processing: represented by Storm
Hybrid: represented by MillWheel
The picture below shows the history of Big Data processing technologies.
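To make the difference between the first two generations more concrete, here is a minimal sketch in plain Python. It is only a toy that runs in a single process, not the actual API of MapReduce or Storm, and all function names (map_phase, reduce_phase, batch_word_count, streaming_word_count) are made up for illustration. The batch version needs the whole dataset before it starts, while the streaming version updates its answer as each record arrives.

```python
from collections import Counter, defaultdict


def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def reduce_phase(pairs):
    """Shuffle and reduce: group the pairs by word, then sum each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}


def batch_word_count(records):
    """Batch style: the full dataset is available before processing starts."""
    return reduce_phase(map_phase(records))


def streaming_word_count(stream):
    """Streaming style: update running counts and yield a snapshot per record."""
    counts = Counter()
    for record in stream:
        counts.update(record.lower().split())
        yield dict(counts)


# Hypothetical sample data, chosen only for this illustration.
records = ["big data is big", "data is messy"]

batch_result = batch_word_count(records)
stream_snapshots = list(streaming_word_count(iter(records)))

print(batch_result)          # one answer, after all the data has been read
print(stream_snapshots[0])   # an answer already after the first record
print(stream_snapshots[-1])  # the latest answer, same totals as the batch run
```

Both styles end up with the same totals. The difference is when an answer becomes available. A hybrid system, as I understand it, tries to offer both behaviors in one framework.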
Does Big Data really have value?
In his LinkedIn post, Bernard Marr mentioned another V of Big Data, which is Value.
A golden example of how Big Data brings value is Target. Target used Big Data to predict whether a customer was pregnant based on the lotion and certain types of vitamins she bought. Target then sent coupons for baby items to these customers, including a girl who had not told her family that she was pregnant. Another example is that Wall Street has used Big Data from Twitter to predict investors’ emotions and make trading decisions.
I believe there is always some value in data, no matter whether it is big or small. But the two important questions are: 1) Can you justify the value in a cost-benefit analysis? 2) Is your organization agile enough to take advantage of the value?
Hope this is helpful. To learn more, check out the Big Data Guru. Feel free to share your thoughts!