One of the domains where the General Data Protection Regulation (GDPR) will leave its mark most prominently is the artificial intelligence industry. Data is the bread and butter of contemporary AI, and under previously lax regulations, tech companies had been helping themselves to users’ data without fearing the consequences.
That will change on May 25, when GDPR comes into effect. GDPR requires all companies that collect and handle the data of users in the European Union to be more transparent about their practices and more responsible for the security and privacy of their users. Failure to comply can result in a penalty of 20 million euros or four percent of the company’s global annual revenue, whichever is greater.
Naturally, a stricter set of regulations will challenge the current practices of AI companies, which rely heavily on user data for research and the improvement of their services. But this doesn’t necessarily mean that it will hamper artificial intelligence research and innovation. The industry will (have to) find ways to continue to develop new AI technologies while also remaining respectful of the privacy of users.
Ownership of data
“I give you free access to my online service, and in exchange you let me collect your data.” That’s a simplified version of the deal online services running artificial intelligence algorithms make with their users.
At first glance, it sounds reasonable. In general, users find more value in the few dollars they save than in the data they’re giving up. AI companies, on the other hand, can use that data to train their AI algorithms, create digital profiles of their users, predict their behavior and provide better services, and make billions of dollars from serving ads and selling that data to other parties. Google and Facebook both make huge profits from being able to predict their users’ preferences and serve them relevant ads.
Without legal oversight, tech companies had no obligation to reveal the full extent of the data they stored about users, and in their quest to hone their algorithms, they made decisions that came at the expense of those users.
But under GDPR, not only will they have to be transparent about all the data they collect, but they will also have to let users obtain that data or ask the company to delete it entirely.
Deleting data can prove to be a challenge for two reasons. First, AI companies love to keep user data, even after the users leave their platform, because it allows them to compare and predict the behavior patterns of other users. It would be easier for them to keep the data as is. Now, if they want to keep it for their AI purposes after a user requests erasure, they will have to go the extra step of anonymizing it.
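What that extra step might look like: below is a minimal sketch in Python, with entirely hypothetical field names, of stripping direct identifiers from a user record so that its behavioral aggregates can still feed a model. Real-world anonymization is considerably harder than this; GDPR itself distinguishes pseudonymization, which is reversible, from true anonymization.

```python
import hashlib
import os

# Hypothetical user record; the field names are illustrative, not from any real schema.
user_record = {
    "user_id": "u-48152",
    "email": "jane@example.com",
    "full_name": "Jane Doe",
    "country": "DE",
    "page_views": 1204,
    "avg_session_minutes": 7.3,
}

# Fields that directly identify the user and must go.
DIRECT_IDENTIFIERS = {"email", "full_name"}

def anonymize(record: dict, salt: bytes) -> dict:
    """Drop direct identifiers and replace the user ID with a salted hash,
    keeping only the behavioral aggregates useful for training models."""
    scrubbed = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hashlib.sha256(salt + record["user_id"].encode()).hexdigest()
    scrubbed["user_id"] = digest[:16]  # unlinkable once the salt is discarded
    return scrubbed

salt = os.urandom(16)  # throw this away afterward to prevent re-linking
print(anonymize(user_record, salt))
```

Even then, quasi-identifiers such as country combined with distinctive usage patterns can sometimes re-identify a person, which is exactly why erase-versus-anonymize is harder than it sounds.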
The second problem with deleting data is how to track all instances of a user’s data across a company’s backend. AI companies tend to use various tools and platforms to create and train their AI algorithms. Sometimes, they send the data to third party services that run those algorithms on the company’s behalf. They will need to adopt practices and implement tools that will allow them to keep track of their data as it moves and becomes duplicated across the company’s servers and elsewhere.
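One common way to manage this is a data inventory: a central registry that maps every place a copy of a user’s data lives, including third-party processors, to a handler that knows how to erase it there, so a single erasure request fans out everywhere. A minimal sketch, with made-up store names standing in for real databases and processors:

```python
from typing import Callable

# Registry mapping each data location to a function that erases
# one user's data there. All store names below are made up.
DELETION_HANDLERS: dict[str, Callable[[str], None]] = {}

def erases(store_name: str):
    """Decorator that registers a deletion handler for one data store."""
    def wrap(handler: Callable[[str], None]):
        DELETION_HANDLERS[store_name] = handler
        return handler
    return wrap

@erases("primary_db")
def delete_from_primary(user_id: str) -> None:
    print(f"DELETE FROM users WHERE id = {user_id!r}")  # stand-in for a real query

@erases("analytics_warehouse")
def delete_from_warehouse(user_id: str) -> None:
    print(f"purging {user_id} from warehouse partitions")

@erases("third_party_crm")
def delete_from_crm(user_id: str) -> None:
    print(f"requesting erasure of {user_id} via the processor's API")

def handle_erasure_request(user_id: str) -> None:
    """Fan one user's erasure request out to every registered copy."""
    for store, handler in DELETION_HANDLERS.items():
        handler(user_id)
        print(f"-> erased from {store}")

handle_erasure_request("u-48152")
```

The registry pattern matters more than the individual handlers: new stores and new third-party processors get added in one place, so an erasure request can never silently miss a copy of the data.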
The black box problem
According to the GDPR’s text, companies must notify users about “the existence of automated decision-making” and provide them with “meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject.” This means that if your company runs AI algorithms, you must tell your users when they’re subject to decisions made by those algorithms and explain the reasoning behind those decisions.
Compliance with the first part is not very difficult, but the second can be especially challenging. Sometimes companies don’t want to reveal the inner workings of their algorithms because they consider them closely held trade secrets.
And sometimes, they honestly can’t explain why their AI algorithms made a specific decision.
Deep learning, the main technology behind current AI products, makes decisions based on complicated patterns and correlations it finds in large datasets it examines. This is in contrast to classic software, in which human programmers define the rules of behavior.
The problem is that AI algorithms can’t articulate the reasoning behind their own behavior, and that behavior often becomes so complicated that even the humans who built them can’t reconstruct it. This is why deep learning algorithms and deep neural networks are sometimes referred to as black boxes.
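To make the contrast concrete, here is a sketch (using scikit-learn and synthetic data, with hypothetical feature names) of the kind of answer an interpretable model can give: a logistic regression’s coefficients state directly how much each input pushed a decision.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic "loan approval" data; the feature names are hypothetical.
FEATURES = ["income", "debt_ratio", "credit_years"]
X = rng.normal(size=(500, 3))
# A hidden rule, used only to generate labels for the demo.
y = (1.5 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] > 0).astype(int)

model = LogisticRegression().fit(X, y)

# An interpretable model can justify each individual decision:
applicant = X[:1]
contributions = model.coef_[0] * applicant[0]
for name, c in zip(FEATURES, contributions):
    sign = "+" if c >= 0 else "-"
    print(f"{name:>12}: {sign}{abs(c):.2f} toward approval")
print("decision:", "approved" if model.predict(applicant)[0] else "denied")
```

A deep neural network trained on the same data could make equally accurate predictions, but its decision path runs through thousands of nonlinear interactions with no comparable per-feature readout, and that gap is the black box problem in miniature.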
The black box problem becomes more acute as AI finds its way into critical domains such as health care, law, lending and education. People must be able to challenge the life-changing decisions that AI algorithms make about them, and without clear explanations, none of those innovations will find their way into the mainstream.
Will GDPR prevent AI innovations?
The new restrictions that GDPR will put on the data-hungry algorithms of AI companies will surely challenge their current modus operandi. No longer will they be able to collect and mine user data without users’ clear and explicit consent. No longer will they be able to test their algorithms on unsuspecting users.
But does it mean that GDPR will hamper AI innovation? Probably not. Previously, the opaque and unfair practices of AI companies have led users to lose their trust in the tech industry. GDPR will force tech companies to move toward more transparent solutions and adopt measures that provide their customers with the necessary assurances about how their data is used. An example is decentralized artificial intelligence, AI innovations that rely on transparency and sharing of knowledge instead of walled garden approaches. Another notable effort is the development of explainable artificial intelligence, AI algorithms that humans can understand and decompose.
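When a model is too opaque to read off directly, model-agnostic explainability techniques probe it from the outside instead. One of the simplest is permutation importance: shuffle one input at a time and measure how much the model’s accuracy suffers. A minimal sketch on synthetic data, again using scikit-learn as an illustration rather than a prescription:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)

# Synthetic data: only the first two features actually matter.
X = rng.normal(size=(600, 4))
y = (X[:, 0] + 2 * X[:, 1] > 0).astype(int)

# A random forest is not directly readable, so we probe it from outside.
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Shuffle each feature in turn and see how much predictive accuracy it costs.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature_{i}: accuracy drop {importance:.3f}")
```

Techniques like this don’t open the black box, but they give users and regulators a defensible account of which inputs actually drove a decision.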
Users, for their part, will no longer have to worry about what’s happening in the dark recesses of the servers of the companies they entrust with their data. Naturally, GDPR will not solve all our problems overnight, and there will still be actors who want to make shady use of user data. But with the penalties under the new rules (20 million euros or 4 percent of global annual revenue, whichever is higher), we can at least rest assured that GDPR raises the barrier high enough to discourage a large share of those scheming to abuse users’ trust and data.