Miguel Afonso Caetano<p>"Eight years ago, SerpApi, a start-up in Austin, Texas, dived headlong into the byzantine world of using robots to “scrape” Google’s search algorithms, so it could collect information to help customers appear higher in search results.</p><p>Then OpenAI’s ChatGPT came along, kicking off an artificial intelligence revolution. As more tech companies began building A.I. chatbots to keep up, they needed large amounts of data to train their A.I. models — data that SerpApi had already gathered.</p><p>Practically overnight, a class of companies like SerpApi — known as “data scrapers” — found a new business selling data scraped from Google to companies looking to train their A.I. chatbots.</p><p>On Wednesday, the internet message board Reddit decided to fight the data scrapers. It filed a lawsuit in the U.S. District Court for the Southern District of New York claiming that four companies had illegally stolen its data by scraping Google search results in which Reddit content appeared.</p><p>Three of those companies — SerpApi; a Lithuanian start-up, Oxylabs; and a Russian company, AWMProxy — sold data to A.I. companies like OpenAI and Meta, according to the lawsuit. The fourth company, Perplexity, is a San Francisco start-up that makes an A.I. 
search engine.</p><p>Reddit said it was seeking a permanent injunction against the companies, as well as financial damages, and wanted to prohibit the use or sale of any previously scraped Reddit data."</p><p><a href="https://www.nytimes.com/2025/10/22/technology/reddit-data-scrapers-perplexity-theft.html" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">nytimes.com/2025/10/22/technol</span><span class="invisible">ogy/reddit-data-scrapers-perplexity-theft.html</span></a></p><p><a href="https://tldr.nettime.org/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://tldr.nettime.org/tags/GenerativeAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>GenerativeAI</span></a> <a href="https://tldr.nettime.org/tags/AISearch" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AISearch</span></a> <a href="https://tldr.nettime.org/tags/WebScraping" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>WebScraping</span></a> <a href="https://tldr.nettime.org/tags/Reddit" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Reddit</span></a> <a href="https://tldr.nettime.org/tags/DataScrapers" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataScrapers</span></a> <a href="https://tldr.nettime.org/tags/Chatbots" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Chatbots</span></a> <a href="https://tldr.nettime.org/tags/Perplexity" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Perplexity</span></a></p>