Monday, December 11

Machine unlearning: With privacy concerns rising, can we teach AI chatbots to forget?

I have been writing on the internet for more than two decades. As a teenager, I left a trail of blogs and social media posts in my wake, ranging from the mundane to the embarrassing. More recently, as a journalist, I have published many stories about social media, privacy and artificial intelligence, among other things. So when ChatGPT told me that my output may have influenced its responses to other people’s prompts, I rushed to wipe my data from its memory.

As I quickly discovered, however, there is no delete button. AI-powered chatbots, which are trained on datasets including vast numbers of websites and online articles, never forget what they have learned.

That means the likes of ChatGPT are liable to divulge sensitive personal information, if it has appeared online, and that the companies behind these AIs will struggle to make good on “right-to-be-forgotten” regulations, which compel organisations to remove personal data on request. It also means we are powerless to stop hackers manipulating AI outputs by planting misinformation or malicious instructions in training data.

All of which explains why many computer scientists are scrambling to teach AIs to forget. While this is proving extremely difficult, “machine unlearning” solutions are beginning to emerge. And the work could prove vital beyond addressing concerns over privacy and misinformation. If we are serious about building AIs that learn and think like humans, we might need to engineer them to forget.

The new generation of AI-powered chatbots like ChatGPT and Google’s Bard, which produce text in response to our prompts, are underpinned by large language models (LLMs). These are trained …

