$ cat mercor-breach-litellm-4tb.mdx

Mercor hacked - 4TB of AI training secrets for sale

Apr 14, 2026 · #mercor #litellm #cybersecurity #lapsus #training-data #ai #supply-chain #meta

One poisoned package in a public software repository. That’s all it took to steal training secrets from the biggest AI companies on the planet. A startup valued at $10 billion, personal data of 40,000+ people, and terabytes of source code - all up for auction on the dark web.

Mercor is a startup valued at $10 billion that provides training data for OpenAI, Anthropic, and Meta. You’d think a company sitting on that kind of information would have Fort Knox-level security. Turns out, one infected package in a public package registry was enough to bring it all down.

How it started - poisoned LiteLLM

LiteLLM is an open-source Python library with 97 million monthly downloads. Developers worldwide use it as a single interface for calling OpenAI, Anthropic, and dozens of other model providers from their apps. That popularity makes it a perfect target.
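
For a sense of why it’s everywhere: here’s a minimal sketch of a typical LiteLLM call, with an illustrative model name and prompt. One function fronts every provider behind an OpenAI-style interface.

```python
# A typical LiteLLM call: one unified, OpenAI-style interface in front
# of many model providers. Model name and prompt are illustrative.
from litellm import completion

response = completion(
    model="gpt-4o",  # LiteLLM routes the call to the matching provider
    messages=[{"role": "user", "content": "Summarize this log file."}],
)
print(response.choices[0].message.content)
```

A library imported this deep into application code runs with the application’s full environment - API keys included. That’s exactly what made it worth poisoning.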

A hacker group called TeamPCP compromised LiteLLM’s publishing pipeline and on March 27, 2026, pushed two malicious versions - 1.82.7 and 1.82.8 - to the public package registry (PyPI). Anyone who downloaded and installed them unknowingly let malicious software into their systems.

What did the code steal? Pretty much everything:

  • Cloud access keys and credentials (AWS, Google Cloud, Azure)
  • SSH keys
  • Kubernetes configs
  • Database credentials
  • Secrets from build systems

In short - keys to the entire kingdom.
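
If you want to check whether an environment pulled one of the bad builds, here’s a minimal sketch. The affected version numbers come from this article - cross-check them against whatever advisory you follow.

```python
# check_litellm.py - flag the LiteLLM releases named in this article.
from importlib.metadata import PackageNotFoundError, version

AFFECTED = {"1.82.7", "1.82.8"}  # versions reported as malicious

try:
    installed = version("litellm")
except PackageNotFoundError:
    print("litellm is not installed in this environment")
else:
    if installed in AFFECTED:
        print(f"WARNING: litellm {installed} is an affected release")
        print("Assume every credential this host could read is burned.")
    else:
        print(f"litellm {installed} is not on the affected list")
```

Detection is the easy half. If an affected version ever ran, the right response is rotating every secret on the list above, not just upgrading the package.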

Mercor - the big target

Through those poisoned LiteLLM versions, the hackers got into Mercor’s systems. And this isn’t some random startup. Mercor provides training data for the largest AI companies. That means their systems could contain information about how OpenAI, Anthropic, and Meta train their models.

Lapsus$ - the hacker group previously known for hitting Uber, Nvidia, and Microsoft - claimed responsibility. They say they’ve got 4 terabytes of data:

  • 939 GB of source code
  • 200+ GB of databases
  • 3 TB of video and verification materials

What might those databases contain? Data selection criteria. Labeling protocols. Training strategies. For competitors - and for nations like China - that’s a blueprint for building AI models.

40,000 people and Meta’s response

Personal data of over 40,000 people may have been exposed. That’s not just Mercor employees - it’s also people who provided training data and performed labeling tasks for AI companies.

Meta reacted fast - it froze all work with Mercor. No new contracts, no data transfers. Radio silence.

Meanwhile, Lapsus$ put the stolen data up for auction on dark web forums. Who’s buying? Competitors? Other hacker groups? Foreign governments? We don’t know. But the fact that AI giants’ training secrets could end up in the wrong hands should worry everyone.


My take

Here’s the problem the AI industry doesn’t want to talk about. Companies spend billions training models and build increasingly sophisticated security around their data centers - and one infected package in a public package registry bypasses all of it.

97 million monthly downloads. One poisoned version. And suddenly the training secrets of three of the world’s biggest AI companies are on the table.

This isn’t a question of whether attacks like this will happen again. It’s a question of how often. The AI software supply chain is only as strong as its weakest link. And right now, that link has 97 million monthly downloads and no mandatory security review before a new version goes live.
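
Partial defenses do exist today. A minimal sketch, assuming you keep a known-good sha256 for each pinned dependency (the EXPECTED value below is a placeholder, not a real hash): verify the downloaded artifact against that hash before installing, and a silently swapped release stops cold.

```python
# verify_artifact.py - check a downloaded package file against a pinned
# sha256 before installing. EXPECTED is a placeholder; take the real
# value from your lockfile or the project's release notes.
import hashlib
import sys

EXPECTED = "replace-with-the-pinned-sha256"  # hypothetical placeholder

def sha256_of(path: str) -> str:
    # Stream the file so large wheels don't need to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    actual = sha256_of(sys.argv[1])
    if actual != EXPECTED:
        sys.exit(f"HASH MISMATCH: got {actual} - do not install")
    print("hash matches the pinned value")
```

pip can enforce the same thing natively with hash-checking mode (`pip install --require-hashes -r requirements.txt`). It won’t stop an attacker who compromises a project before release, but it does stop exactly the attack described here: a registry quietly serving a new version nobody pinned.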

AITU #04 - full episode covering this week’s biggest AI news.

$ cd ../