Scaleout Systems

We love AI tools that make us efficient…

Whether we realize it or not, we've all come to appreciate the personal efficiency derived from AI-powered tools. Many of us reclaim hours per day by…

being protected from spam emails and messages,
getting intelligent recommendations for what to view, listen to, buy, and choose next,
and using prediction tools that know what we want to say and write: next word, next emoji, next sentence.

The data age has brought us all these gifts, granting us time to engage in activities we cherish more than sifting through spam or typing out emails at a snail's pace.

…but we're growing more conscious of data privacy.

However, the data age has another side. Over time, we've realized that our data, the private information trails we leave, is being monitored and used.

In the early days of digitization, many of us were unconcerned about our personal data and its emerging but rapidly growing potential to describe us. Who we are, what we do, and what preferences we have can all be understood from the data trails we leave behind:

Our browsing history,
Our location history,
Our health records and patient journals,
The text and chat messages on our mobile phones,
Our pictures and video recordings,
Our bank account history,
And even our DNA code.

Our data trails have gradually become clearer and easier to track as we live ever closer to our mobiles, laptops, and other gadgets.

It's now evident that some of our data is being analyzed and that it constitutes a valuable asset. The tailored recommendations we get online and the ads we see reflect both our own history and the preferences of others who share our patterns.

While still valuing the AI tools that have been trained on private usage data, we’ve come to care about the privacy of our data and how it is being handled.

Regulatory bodies are taking notice too

It isn't just us, the individual gadget users, who are concerned about data privacy. After seeing industry after industry disrupted by AI-savvy companies, regulatory authorities have taken notice as well.

Data privacy laws have been formulated to protect the integrity of our person and identity, as well as to shield our health data from being exposed. Principles for handling private and sensitive data have been established.

These new regulations and guidelines, such as the GDPR, the California Consumer Privacy Act, and the AI Act, are in place to protect us, so that powerful and efficient tools can be developed in a way that also honors our data privacy.

Data minimization principles: Good hygiene for AI based on sensitive data

A new era with clearer directions on how to treat sensitive data is emerging. It brings a baseline of hygiene standards for handling sensitive data that will greatly benefit the end user.

One example is data minimization, a fundamental privacy principle found in privacy laws worldwide, including the GDPR. It states that only relevant, adequate, and necessary personal data should be collected, and that it should be kept for as short a time as possible.

Federated Learning is a cornerstone for data minimization principles

Federated learning, an approach for training machine learning models in distributed settings, aligns well with the GDPR's principle of data minimization. By avoiding unnecessary data duplication, federated learning supports the requirement that data be limited to what is necessary, and it can help ensure that the data is used solely for its intended purpose.

With federated learning, machine learning models are trained without the need to transfer private user data off the device. Instead of relocating the data, the model itself is transferred to be trained locally on the device.
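To make the idea concrete, here is a minimal sketch of one common way to do this, federated averaging, written in plain Python. It is an illustration under simplifying assumptions, not a description of any particular product: the function names, the tiny linear model, and the synthetic "private" datasets are all hypothetical. Each simulated device trains a copy of the global model on its own data, and only the resulting model parameters are sent back and averaged.

```python
import random

def local_update(weights, data, lr=0.05, epochs=5):
    """Train a copy of the global model (y = w*x + b) on one device's data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y      # prediction error on one local sample
            w -= lr * err * x          # gradient step for the slope
            b -= lr * err              # gradient step for the intercept
    return (w, b)

def federated_average(updates, sizes):
    """Combine device models, weighting each by its number of local samples."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def run_federated_training(client_datasets, rounds=50):
    """Repeat: send the model out, train locally, average the returned models."""
    global_model = (0.0, 0.0)
    sizes = [len(data) for data in client_datasets]
    for _ in range(rounds):
        # Only model parameters travel; the raw data never leaves each client.
        updates = [local_update(global_model, data) for data in client_datasets]
        global_model = federated_average(updates, sizes)
    return global_model

def make_client_data(rng, n):
    """Hypothetical private data that roughly follows y = 2x + 1."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0, 0.01)))
    return data

rng = random.Random(0)
clients = [make_client_data(rng, n) for n in (20, 30, 50)]
w, b = run_federated_training(clients)
print(f"learned w={w:.2f}, b={b:.2f}")
```

In this toy setup the server only ever sees pairs of model parameters, never the individual data points. Real deployments typically add further safeguards on top of this basic loop.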

In terms of performance, AI developers can expect a model trained with federated learning to perform comparably to a centrally trained model, with the added benefit of enhanced data privacy and security.

Read more about how federated learning enables data minimization in an AI context in this blog post: The copy problem in Machine Learning

Privacy-preserving machine learning based on data from mobile phones

Android was a pioneer in ensuring robust data privacy when next-word prediction was created for mobile phones. Its developers concluded that a user's private text messages should be treated as sensitive, and hence data minimization principles were applied. The model was initially, and continues to be, trained using federated learning.

A new era where efficiency and privacy coexist

With federated learning and other privacy-enhancing technologies (PETs), we will no longer have to risk the privacy of sensitive data in order to enjoy the benefits of ML tools, whether the ones we use today or those of the next generation.

Are you developing that next-generation ML tool or service? Or are you concerned about user privacy in an existing solution? Do not hesitate to get in touch with our experts and learn how to enhance the privacy of your efficiency booster.