Businesses Are Distributing Employees’ Private Messages and Emails as AI Training Data

AI Industry Turns to Internal Data from Defunct Companies

Failing companies are beginning to recognize that their internal communications and operational data can be highly valuable in the emerging AI sector.

When Shana Johnson closed cielo24, the transcription and subtitling service she managed, she stumbled upon a surprising revenue source—what she termed the company’s “operational exhaust.” This collection of digital remnants accumulated over 13 years included Slack conversations, Jira IT ticket responses, email exchanges, and extensive Google Drive records showcasing daily employee activities.

Johnson collaborated with SimpleClosure, a startup focused on managing business closures, to navigate typical shutdown tasks like terminating payroll, filing tax returns, securing investor approvals, and completing IRS forms. However, SimpleClosure also enabled cielo24 to market its entire digital archive as training material for AI systems, resulting in substantial profits for the now-defunct company.

The cielo24 sale isn’t a one-off. It signals a shift in the competitive landscape for AI advancement. As AI firms exhaust available public internet content, they are increasingly drawn to alternative data sources that reflect genuine workplace interactions and decision-making.

According to Ilya Sutskever, a former chief scientist at OpenAI, AI companies could run out of all readily accessible public online content by the end of 2024. He emphasized that such publicly available data is insufficient for developing AI systems capable of managing real-world tasks. In contrast, in-depth records from dissolved companies provide precisely the kind of training resources that would enhance workplace AI agents.

Ali Ansari, whose company micro1 supplies AI research labs with a product known as Roots, highlighted the shifting demands in the field. Ansari noted, “Companies are starting to recognize the necessity of incorporating real-world noise to effectively evaluate their models.” Roots functions as a pseudo-holding company where AI agents simulate tasks such as managing financial services and coordinating complex schedules.

The rising appetite for workplace data has significantly benefited SimpleClosure. CEO Dori Yona described the overwhelming interest from AI companies as “insane,” a sign of a gold rush as they race to obtain real-world data.

To meet this demand, SimpleClosure is launching Asset Hub, a platform that enables closed companies to sell code repositories, Slack archives, emails, and similar materials. While parts of this platform are still being tested, SimpleClosure is working to ensure all personally identifiable information is stripped from internal records, although this is technically challenging. Over the last year, SimpleClosure has facilitated nearly 100 transactions for bankrupt firms, recouping over $1 million for their founders. Typical payouts range from $10,000 to $100,000 for each company.

Nevertheless, this approach raises significant privacy issues. Marc Rotenberg, founder of the Center for AI and Digital Policy, expressed concern over whether employers should sell internal communications to third parties, even when employees have signed agreements regarding intellectual property. Employees likely didn’t foresee their Slack messages being used for AI training. Rotenberg remarked, “I think the privacy concerns here are substantial. Employee privacy needs serious attention, particularly as reliance on internal communication platforms like Slack grows. This data isn’t just general; it’s personally identifiable.”

Rotenberg’s organization has reached out to the Senate Commerce Committee to raise alarms about personal data protection measures, urging scrutiny from the FTC regarding new AI business practices.

Meanwhile, debate is growing over the implications of this data trade, especially as more powerful AI systems reach the market. What was once private business communication is increasingly becoming raw material for AI’s relentless quest for data.
