Published daily by the Lowy Institute

The transformative potential of big data

But collecting ever more information might just end up posing ever more difficult questions.

There are copious volumes of data you create yourself — and it’s praised as the world’s most valuable resource for everything from shopping to surveillance (Photo: Chris Ratcliffe via Getty)
Published 24 Jun 2019

We intuitively know that the volume of digital information is vast, yet the numbers are still staggering. In 2013, 90% of all the data in the world had been generated in the preceding two years. Forecasts suggest that by 2020, there will be as many bits in the digital universe as there are stars in the physical universe, and that the world’s digital information will double in size every two years.

This exponential growth in data combined with increased computational power and storage capacity has enabled advanced analysis and is driving a new kind of social change.

Big data analytics have been presented as the panacea for information overload and big data itself as no less than transformative and world changing. In their 2014 book, Viktor Mayer-Schönberger and Kenneth Cukier emphatically declared that big data “will revolutionise the way we live, work and think”. Commentators have praised big data as the new oil of the 21st century, the world’s most valuable resource and the foundation of all of the megatrends that are happening today, from social to mobile to the cloud to gaming.

The ubiquitous mobile phone. Madrid (Photo: Jorge Sanz via Getty)

That big data and associated analytics have become a ubiquitous feature of commercial enterprise over the last 20 years is only too evident. We are now beginning to grasp the implications of mass data collection – and analysis – in every facet of our lives. Big data (and associated analytics) is a technological phenomenon with unprecedented social impact.

While many Western nations grapple with a reconceptualisation of privacy and how to protect the individual, there are also many necessary applications to law enforcement, public safety and delivery of services. If we look to other countries, we see these trends taken to extremes.


In China, the social credit system is a state-run trial that, when fully operational, will act as a personal scorecard for each of the nation’s 1.4 billion citizens. The system is fuelled by vast networks of CCTV cameras (equipped with facial recognition, body scanning and geo-tracking) and continuous flows of data. This data comes from smartphones, companies (such as Alibaba shopping records), services such as financial records, and government sources such as education history, medical records and state security assessments, all of which will be fed into individual scores – although exactly how those scores are calculated remains unclear.

At this point, China’s approach appears largely focused on domestic social control, applied to the stated aim of the social credit system: to promote trust, reward the trustworthy and make life hard for the discredited. However, this also appears to include punishments such as mandatory monitoring software on phones, regular biometric checkpoints and extensive visible street surveillance in north-western China, where many Uighurs reside.

China is an extreme example where, at least from the outside, the delineation between state, military and commercial enterprise is opaque.

However, Australians are not immune from the big data revolution. By 2020, it’s estimated that 1.7 megabytes of data will be created every second for every person on earth – including you.

This data largely starts with the smartphone but certainly doesn’t stop there. There are copious volumes of data you create yourself, such as financial transactions, health and phone records, email and data communications, online searches, mapping use, entertainment use (e.g. Netflix and Spotify), fitness trackers, taxi and Uber trips, travel bookings and purchases. If you use social media platforms such as Facebook, Instagram, LinkedIn, YouTube and Twitter, you are also generating data about yourself there.

In addition to the self-generated data, there are also large volumes of data created and collected by machines, which may capture you in moments of daily life that are not easy to opt out of. This ranges across government and commercial enterprise: CCTV footage, toll road tags and number plate recognition points, public transport use, and swipe cards used to access workplaces, apartments and hotels. This information is collected and stored – usually in anonymised form – but can be correlated with other data to (inadvertently or purposefully) identify you. Additionally, services often require information and applications to be submitted online, such as for jobs, real estate rentals, and gas, electricity and internet connections.

It’s often not clear where the data you have created, or that is collected about you, is used and stored, or what consent you have given for its further use. It’s hard to know to whom you are giving information, and how that information is regulated and protected. Informed consent is challenging given long and confusing terms of service that few people read, as the ongoing issues with Facebook privacy and data sharing demonstrate.

So many data points. Barcelona (Photo: Miquel Llop via Getty)

Understanding and consent are made more complex because the same information can be protected differently depending on how it is collected or transmitted. For example, a text message can be sent over the traditional cellular network, which in Australia is heavily regulated. But users could send the same message through WhatsApp (where encryption covers the content but not always the metadata) or an almost never-ending variety of platforms such as Signal, Wire, Twitter, LinkedIn, Facebook, Instagram and Viber, each with completely different privacy considerations and terms of service allowing third-party use.


Smartphone metadata alone reveals a wealth of information, both at the individual level and when analysed in aggregate. But some argue that the data economy is really built on our personal behavioural data: our clicks (and lack of clicks) are monetised, sold, re-sold and analysed by participants in that economy. In 2017, Alec Ross reported in The Industries of the Future that “private companies now collect and sell 75,000 individual data points about the average American consumer”.

We are not only identified by our name, address and birthdate, but also by our behaviour. One of the appeals of big data for commercial enterprise is the ability to identify consumers at a granular, often individual level. That granularity, combined with the unique trajectories of our individual movements, desires, relationships and activities, means that individuals can be relatively easily identified. The idea that big data can be anonymised is being proven false as researchers and companies show just how easy it is to identify individuals within even huge, supposedly anonymised data sets.

In 2013, researchers found that human mobility data (geolocated phone records) is surprisingly unique, and that even coarse data provides little anonymity. Using anonymised mobile phone data of 1.5 million individuals, researchers found that four spatio-temporal points are enough to uniquely identify 95% of the individuals. In 2015, MIT researchers found that four data points — such as dates and locations of purchases — are enough to identify 90% of the people in a data set recording three months of credit-card transactions by 1.1 million users. These findings represent fundamental constraints on individual privacy in an era of big data, and have important implications for the design of frameworks and institutions dedicated to protecting the privacy of individuals.
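The mechanics behind these findings are easy to demonstrate. The following is a rough sketch — using randomly generated synthetic traces with made-up parameters, not the researchers’ actual data or method — of how quickly a handful of (location, hour) points narrows a crowd of a thousand people down to one:

```python
# Toy illustration of re-identification from a few spatio-temporal
# points, on synthetic data. All parameters are invented for the sketch.
import random

random.seed(42)

N_USERS = 1000          # people in the "anonymised" data set
TRACE_LEN = 40          # observations per person
N_CELLS = 200           # spatial resolution (e.g. phone towers)
N_HOURS = 24 * 7        # one week at hourly resolution

# Each person's trace is a set of (cell, hour) observations.
traces = [
    {(random.randrange(N_CELLS), random.randrange(N_HOURS))
     for _ in range(TRACE_LEN)}
    for _ in range(N_USERS)
]

def unique_fraction(k, trials=200):
    """Fraction of sampled people uniquely pinned down by k points
    drawn from their own trace (as an outside observer might know
    where someone was at a few particular times)."""
    unique = 0
    for _ in range(trials):
        target = random.choice(traces)
        known_points = random.sample(sorted(target), k)
        matches = sum(1 for t in traces
                      if all(p in t for p in known_points))
        unique += (matches == 1)
    return unique / trials

for k in (1, 2, 4):
    print(f"{k} known point(s): "
          f"{unique_fraction(k):.0%} uniquely identified")
```

Even in this crude model, one known point rarely identifies anyone, but four almost always do — the same qualitative pattern the mobility and credit-card studies reported.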

The watching cameras. Beijing (Photo: Giulia Marchi via Getty)

Globally, governments and regulatory agencies are scrambling to catch up with the pace of change. The speed of digitisation has left legislative reform slow to start and struggling to regulate commercial data holdings, sales and use – especially when those sit in foreign jurisdictions. Notably, the European Union’s General Data Protection Regulation sought to reshape the handling of EU citizens’ data and protect EU citizens from privacy and data breaches. EU regulators have made it clear that they are deeply uneasy about the way the data broker industry has been operating.

In Australia, the establishment and ongoing work of agencies such as the Office of the Australian Information Commissioner, Data61 and the Digital Transformation Agency highlight the need to balance harnessing the power of big data against providing adequate protection for Australians. What is clear is that the complex dynamics of big data and its application are confronting our existing ethics, laws, values and social norms.

It’s difficult to avoid data creation and collection about you, even for a short period of time. As we move through each day, we leave digital footprints as data is constantly created by our movements and collected about our activity. Some of this data is individualised and personalised, and some of it is collected in huge volumes and supposedly anonymised. Our most intimate details can be inferred from the data – and even from the absence of data.

For many of us, the big data era presents more questions than answers and more challenges than solutions. It is clear though that living in a data saturated world is transforming the society we live in. Interestingly, there is a lot of debate about whether and how we should adopt these technologies, rather than awareness that we largely already have. 
