A new year has arrived, and I thought I should open with my favourite tech topic of 2015. Even though Big Data has been around for at least three years, I believe this will be the year when people realise its power. This article is how I would explain what it means to somebody with no technical background. Big Data is a term that has unfortunately been overused lately, but it is, rightly, the biggest trend in technology we’ll experience this decade.
Data is around us every day and has helped businesses work more efficiently for decades. We gather data in order to predict, for example, at what times of day the employees in a factory are most productive. Correct analysis of the data improves strategy, which increases productivity and ultimately leads to financial gains.
However, the most difficult part is analysing this data. For example, Amazon knows that golf equipment is related to sunscreen. This way, when a user is searching for golf equipment, they’ll receive a recommendation to order sunscreen, too. The computer wouldn’t know these were related if large sets of historical data had not shown that a significant number of users order these products together or within a very short period of time.
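To make the idea concrete, here is a minimal sketch of how a computer could spot that two products belong together simply by counting how often they appear in the same order. The order history, the product names and the functions `co_purchase_counts` and `recommend` are all made up for illustration; this is not how Amazon actually does it, only the simplest version of the principle.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order history: each set is one customer's basket.
orders = [
    {"golf clubs", "sunscreen", "golf balls"},
    {"golf clubs", "sunscreen"},
    {"golf clubs", "golf balls"},
    {"sunscreen", "beach towel"},
    {"golf clubs", "sunscreen", "cap"},
]

def co_purchase_counts(orders):
    """Count how often each pair of products appears in the same order."""
    counts = Counter()
    for basket in orders:
        # Sorting gives every pair a single, consistent (a, b) ordering.
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts

def recommend(product, orders, min_count=2):
    """Products bought with `product` at least `min_count` times, most frequent first."""
    scores = Counter()
    for (a, b), n in co_purchase_counts(orders).items():
        if a == product:
            scores[b] += n
        elif b == product:
            scores[a] += n
    return [item for item, n in scores.most_common() if n >= min_count]

print(recommend("golf clubs", orders))  # → ['sunscreen', 'golf balls']
```

With only five orders the result means very little. The point of Big Data is that the same counting, run over millions of orders, surfaces pairings nobody would have thought to look for.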
As Jack Cutts states in his article “Ghost in the machine: The Predictive Power of Big Data Analytics”, “correlation does not equal causation”. For example, I drink water every day and I also sleep every day, so the two are strongly correlated, but that does not mean that drinking water is the cause of my sleep. This is where the power of computers comes in: they can read, analyse and make predictions from this data at a scale that helps work out which correlations point to a real cause and which are mere coincidence.
A big threat to Big Data at the moment is the issue of privacy. People are afraid of something collecting data about them, as they don’t trust that the data will stay anonymous. It is very important for people to understand the benefits and to be confident that all data about them is stored anonymously. A lot of companies have made mistakes and have been too intrusive in users’ personal lives, proving to know a bit more than they were supposed to.
The medical industry has been trying for years to store patients’ data electronically. Doing so would make it possible, for instance, to analyse millions of medical records and predict disease. The machine doesn’t have to know the identity of any patient in order to read the data and determine a connection between a blood measurement and the incidence of blood-related conditions, such as leukaemia or diabetes. Identities are completely irrelevant. Data such as age or location might be the only relevant factors, and even then the machine won’t know to whom they belong. It’s pure statistics enhanced by computing power.
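As a rough illustration of “pure statistics”, the sketch below compares how often a diagnosis occurs among patients with and without an elevated blood measurement. The records are invented, the field layout and the function `incidence_by_marker` are assumptions made for this example, and no name or patient ID appears anywhere in the data.

```python
from collections import defaultdict

# Hypothetical, made-up records with no names or patient IDs:
# (age_band, elevated_blood_marker, diagnosed)
records = [
    ("40-59", True, True),
    ("40-59", True, False),
    ("40-59", False, False),
    ("60+", True, True),
    ("60+", True, True),
    ("60+", False, False),
    ("60+", False, True),
    ("18-39", False, False),
]

def incidence_by_marker(records):
    """Share of diagnosed patients among those with and without the elevated marker."""
    totals = defaultdict(int)
    diagnosed = defaultdict(int)
    for _age_band, marker, is_diagnosed in records:
        totals[marker] += 1
        if is_diagnosed:
            diagnosed[marker] += 1
    return {marker: diagnosed[marker] / totals[marker] for marker in totals}

rates = incidence_by_marker(records)
print(rates[True], rates[False])  # → 0.75 0.25
```

In this toy data set, three out of four patients with the elevated measurement were diagnosed, against one out of four without it. The calculation never needs to know who any of them are.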
To sum up, I would encourage everyone who is a bit sceptical about institutions collecting their data to be more open towards it, as the vast majority of that data will only be used to predict behaviours and help with prevention. We voluntarily share much more personal data over the Internet on social media websites like Facebook, which are more likely to threaten one’s safety. TV viewing has been monitored for years to help set advertising rates and establish the demographic profile of viewers. The power of Big Data is limited only by barriers such as bandwidth, computing power or people’s imagination.