Reddit drags Perplexity in a new lawsuit, accusing it of building up a $20 billion company off stolen data

By Jacob Shamsian

and Natalie Musumeci

Reddit is suing Perplexity and other AI firms, accusing them of bypassing digital guardrails to steal valuable data. Illustration by Avishek Das/SOPA Images/LightRocket via Getty Images

  • Reddit has accused Perplexity and other data scrapers of stealing valuable data.
  • A new lawsuit claims the firms bypassed its digital guardrails using Google.
  • Reddit said the companies have been selling its proprietary data to AI companies.

Reddit filed a lawsuit against Perplexity, along with several other data mining companies, accusing them of stealing the social media platform's valuable data.

The lawsuit, filed Wednesday in Manhattan federal court, said the companies illegally circumvented digital guardrails to obtain data used to train AI models.

Perplexity's AI tools used Reddit comments to generate answers for users, even after the company agreed not to scrape Reddit's data, the lawsuit said.

Reddit said it sent a cease-and-desist letter to Perplexity in May 2024 demanding it stop scraping Reddit data unless it made a deal with the social media company, as Google and OpenAI had done.

Perplexity said it "was not using Reddit content to train any AI models and that it would respect Reddit's robots.txt," according to the lawsuit. Perplexity's citations to Reddit increased "forty-fold after Reddit told it to stop," the lawsuit added.

"Rather than respect Reddit and its users' rights, what Perplexity has done in response is simply come up with increasingly devious schemes to circumvent Reddit's security systems and policies," the lawsuit says.

According to the lawsuit, Perplexity appears to have used third-party data scrapers to circumvent Reddit's digital guardrails by taking Reddit's content through Google's search engine results.

"In other words, Perplexity's business model is effectively to take Reddit's content from Google search results, feed them into a third party's LLM, and call it a new product," the lawsuit says. "While that business model has somehow translated into a $20 billion valuation, it has not resulted in a willingness to pay for what others (including Google) have."

Perplexity spokesperson Jesse Dwyer said the company "will always fight vigorously for users' rights to freely and fairly access public knowledge."

"Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest," Dwyer said.

The other defendants in the lawsuit — Oxylabs UAB, AWMProxy, and SerpApi — are firms that scrape the internet for data and then sell the data to other artificial intelligence companies, according to the lawsuit.

Reddit's lawsuit said Perplexity may have used at least one of those firms, and that the firms pulled the data from Google search results for Reddit webpages.

"In a very real sense, these Defendants are similar to would-be bank robbers, who, knowing they cannot get into the bank vault, break into the armored truck carrying the cash instead," Reddit's lawsuit alleges.

A Reddit spokesperson confirmed to Business Insider that the company has spent tens of millions of dollars on anti-scraping systems, which the lawsuit says these companies circumvented.

Representatives for SerpApi and Oxylabs did not immediately respond to Business Insider's request for comment. AWMProxy, identified in the lawsuit as a former Russian botnet, could not immediately be reached for comment.

In a statement to Business Insider, Reddit's chief legal officer Ben Lee said Oxylabs UAB, AWMProxy, and SerpApi were "textbook examples" of illegal scrapers.

"Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material," he said. "Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created."

This story is developing and will be updated.
