Marius Niculae

Authors: Viktor Mayer-Schönberger and Kenneth Cukier

Big Data is an easy-to-read book, even for those who are not familiar with the subject or with its specific terminology, such as predictive analytics, quants, causality, correlations, algorithms, datafication, digitalization or exabytes.

According to Viktor Mayer-Schönberger and Kenneth Cukier, “There is no rigorous definition of big data. Initially the idea was that the volume of information had grown so large that the quantity being examined no longer fit into the memory that computers use for processing, so engineers needed to revamp the tools they used for analyzing it all (…). One way to think about the issue today — and the way we do in the book — is this: big data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value, in ways that change markets, organizations, the relationship between citizens and governments, and more.” (Chapter 1, Now, Letting the data speak).

However, to cope with the difficulty of defining big data in a traditional way, the authors offer many practical examples of big data usage, covering fields such as health care, transportation, aviation, brokerage, online commerce, retail, social networking and the automotive industry. Consequently, readers can relate to these empirical examples and are encouraged to build their own definition of big data and, at the same time, to evaluate the impact of its usage.

While reading the book, I noticed one important aspect that we have to keep in mind when discussing the big data phenomenon: big data allows us to evolve from a world based on causality to one driven by correlations. Hence, big data is about predictions; it doesn’t tell us why something happens, but what is going to happen. As a result, people are able to make multiple uses of the latent value of the information contained in big data and to change the very nature of their daily lives, businesses, markets, and society.

At this point, the authors connect a country's capacity to collect and use large amounts of data with its commitment to development and progress. From this perspective, data is a “building block for new goods and business models” (Chapter 6, Value). By using the information trapped inside data, we are able to change the essence of things happening around us. In this sense, more and more cutting-edge businesses are now emerging with unique ideas about ways to tap data to unlock new forms of value. Some of these innovative ideas and instruments also influence the way in which public administrations work, from new methods of fighting organized crime to efficient mechanisms for delivering public services. To support their arguments, the authors use the example of New York City Hall which, in 2009, decided to create a special position for a director of analytics in charge of unmasking the villains of the subprime mortgage scandal. The unit was so successful that Mayor Bloomberg decided to expand its scope to several other domains of activity.

Although the book speaks a lot about the innovative work in collecting and using large amounts of data done by IT giants such as Google, Amazon, IBM, Microsoft, Facebook, Twitter and LinkedIn, by leading universities such as Harvard, MIT and Oxford, and by various start-ups from Silicon Valley, and although it takes the reader back in time through captivating short stories about the contribution of early big data usage to the evolution of mankind, it refrains from giving straightforward examples of the misuse of big data.

Instead, the authors recognize that the age of big data will require new rules to safeguard the sanctity of the individual and that simple changes to the existing ones will not be sufficient to “temper big data’s dark side” (Chapter 9). A first challenge for us will be to learn how to deal with the shift from privacy to probability and its built-in lack of transparency. At a personal level, we will have difficulty understanding why something happens, but we will have the advantage of knowing what is going to happen. Such an approach can strongly conflict with the way in which our cognitive system functions and with our constant need to find causal relationships for the things affecting our lives. Nevertheless, the moment we start using a smartphone or a credit card, create an account on a social network or browse the internet, we need to understand that we become subject to data collection and, in this way, direct contributors to a new world dominated by correlations and predictions.

This new paradigm, in which people, organizations or institutions are able to access more information and offer fewer explanations, requires that big data users become more accountable for their actions. A first step will be to focus public debate not on the ways in which companies or governments collect gargantuan amounts of data, but on the ways in which they use the information extracted from their newfound treasure. Further steps should consider principles such as openness, certification or disprovability, which can counteract any predisposition towards endowing the data with more meaning and importance than it deserves and help avoid a possible dictatorship of data. In the end, since we can never have perfect information, our predictions are inherently fallible.

On the other hand, we now have new tools to cut back pollution by identifying the best routes to deliver a product by sea, air or land, to prevent the spread of deadly viruses, to fight organized crime, to improve public services and to reduce language barriers and poverty, especially when the intrinsic value of big data is used “with a generous degree of humility … and humanity” (Chapter 10, 197).

However, since we do not yet have the means to capture all the information out there, big data is a resource and a tool that doesn’t offer ultimate answers, only good-enough ones.

by Marius Niculae & Roxana Damian


