Companies thinking about their ESG goals may be tempted to conclude that good data ethics are a straightforward “win-win”. But when it comes to aligning data protection principles with reducing carbon footprint, things are not always that simple.
Although we talk airily of data existing “in the cloud”, and being in a “virtual world”, data is a physical resource like any other, with real-world demands for space and energy. As a result, there is increasing focus on the huge resource costs involved in data access and storage.
So what are the environmental costs of data protection compliance?
Data’s dirty secret
No-one doubts that storing data in the cloud takes a lot more energy than saving it to a local hard drive: perhaps (according to Stanford alumni) as much as a million times more. Although use of renewable energy in data centres complicates the equation somewhat, the principle is undeniable.
Estimates currently put emissions from cloud computing at between 2% and 4% of the global total, and growing. Everyone has different figures (and it’s complicated to measure) but data centres could account for up to 30% of electricity usage in some countries by 2030.
Or, to put it another way: research from Veritas suggests that storing one petabyte of unoptimised backup data in the cloud for one year could emit 3.5 metric tons of CO2 – just about what you would emit driving a family car from pole to pole.
On top of that, real space as well as virtual space has been enthusiastically occupied by the “data hyperscalers” (think Big Tech). There are now over 700 hyperscale data centres in the world, with at least 300 more in planning. They come at a price: sites have intensive electricity, water and cooling requirements, and competition for these resources between data centres and bricks-and-mortar sites is already increasing in parts of South-East England.
Don’t be a data guzzler
In the UK and Europe, data protection is founded on the principles of the GDPR. Several of these principles depend on the “use less” model. Ensuring you have a good reason to use personal data and only using what you need – in data protection-speak, having a lawful basis, purpose limitation and data minimisation – can only help to reduce overall data use, and therefore carbon footprint.
On the other side of this is the data SUV: the large language model. These models consume raw data at an insatiable rate. And at the same time as they are being challenged by data protection regulators for unlawful use of personal data in their development and operation, the environmental costs of these models are also being revealed as eye-watering. So there may be more than one good reason to think very carefully before using them.
Avoiding the data landfill site
The data protection principle of storage limitation is designed to reduce the risk of leaving piles of redundant personal data hanging about, increasing the risks of an unpleasant leak.
Again, better data security and energy saving can go hand in hand: by not cc-ing emails unnecessarily, you’ll not only reduce the risk of unauthorised processing but also shrink your carbon footprint. Not emailing huge unredacted file attachments around the company will do the same.
Even being privacy-aware by blocking tracking pixels can have a significant energy-reducing effect. So far so good.
GDPR’s hidden costs
Delving deeper into the data protection goody bag, there are some harder to digest morsels.
Take the data accuracy principle, for example. The more often an organisation needs to access and correct data, the faster and more flexible (or “hotter”, as the data people say) the retrieval options it needs. These are predictably more expensive, both financially and environmentally.
And there is the ever-present spectre of the data subject rights request. According to a recent Tech UK report “42% of IT departments [are] avoiding removing data for fears of customer backlash”. So that’s more data which is likely to end up on the data landfill site.
We may be paying an unacknowledged price for the privilege of other data rights. We are only just beginning to understand the huge energy costs of consent management infrastructure, which puts another layer of expensive resource on top of existing data ecosystems.
And the GDPR principles of integrity and confidentiality mean that data may have to be stored safely, and often with multiple backups, over long periods of time, especially in particular sectors such as healthcare. This means that data centres are built with massive overspill capacity and back-up systems (often reliant on diesel generators) to assure their operations.
The perils of international transfer
Exporting data abroad can be problematic from a data protection point of view because of concerns about interference by foreign governments, graphically illustrated by the recent $1.2bn fine of Meta for unlawful transfer to the US. But are we also exporting environmental problems – should we expect other countries to divert precious water resources and highly educated minds into looking after our expired data? Are we effectively sending our dirty plastic abroad?
As a result of the Meta fine, there’s lots of talk at the moment about data localisation within Europe. The required cost-benefit analysis will have to include energy needs and ethical dilemmas, alongside data protection considerations.
Outside the strictly personal data sphere, everyone is thinking about potential data monetisation and the possibilities opened up by artificial intelligence. Again, this drives storage needs: if we don’t yet know what will be useful, we may be tempted to consume resources to keep that unstructured legacy data “just in case”.
We are also seeing exciting research into future data storage capabilities such as chemical storage on synthetic strands of DNA, which can be kept indefinitely at room temperature and enables copies to be made for free.
Data minimisation rules for non-personal data?
While this kind of capability is still in development, there is a good case for the “store less” principles of GDPR to be applied to non-personal, “industrial” data. The possibility of mandatory registration and reporting of energy usage for data centres is already on the horizon – and perhaps we will see organisations compelled to minimise all data for environmental reasons as well.
And (perhaps more controversially) we may see a pushback on data rights, if they prove too polluting to uphold.
Data Protection Officers are due to be replaced by Senior Responsible Individuals under expected UK data protection reforms. The name change may be apposite: the SRI of the future may have a great many data compliance responsibilities beyond mere “data protection”.