
One rebel’s malicious ‘tar pit’ trap is driving AI web-scrapers insane

Wednesday, January 29, 2025, 17:32, by PC World
Large language model AI companies have been aggressively scraping content off the web for years, and many of them are known for ignoring things like copyright or the robots.txt files that sites use to tell crawlers which pages to stay away from. One designer decided to send these web crawlers on a wild goose chase by trapping them in a never-ending spiral of nonsense that burns up their resources and produces useless results.
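
For readers who haven't dealt with robots.txt, here is a rough sketch of what a well-behaved crawler is supposed to do before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt contents, the bot name `ExampleAIBot`, and the `is_allowed` helper are made up for illustration; nothing here comes from a real site or crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; "ExampleAIBot" is an invented name.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("ExampleAIBot", "https://example.com/articles/1"))
    print(is_allowed("SomeOtherBot", "https://example.com/articles/1"))
```

The catch, and the reason this story exists, is that the check is voluntary: a scraper that never runs it simply fetches everything.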

Ars Technica spoke to the anonymous designer of Nepenthes, a piece of software that he fully admits is aggressive and malicious. The tool, named after a carnivorous pitcher plant, sends AI scrapers hunting after links that lead only to more links within the same website. The circular pattern of useless links repeats itself, sending the bot chasing after the same worthless data again and again until a human steps in to halt it.
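
To make the idea concrete, here is a minimal sketch of how a link maze like that could be generated. It is not Nepenthes's actual code, which this article doesn't show; the `/maze/` prefix, the function names, and the choice of hashing the path to derive child links are all assumptions made for illustration.

```python
import hashlib


def tarpit_links(path: str, count: int = 5, prefix: str = "/maze/") -> list[str]:
    """Derive `count` child links deterministically from `path`.

    Hashing the path means no state has to be stored: the same page always
    shows the same links, and every link points back into the maze.
    """
    links = []
    for i in range(count):
        digest = hashlib.sha256(f"{path}#{i}".encode("utf-8")).hexdigest()
        links.append(f"{prefix}{digest[:12]}")
    return links


def render_page(path: str) -> str:
    """Build a tiny HTML page whose only content is more maze links."""
    items = "\n".join(
        f'<li><a href="{href}">{href}</a></li>' for href in tarpit_links(path)
    )
    return f"<html><body><ul>\n{items}\n</ul></body></html>"


if __name__ == "__main__":
    print(render_page("/maze/start"))
```

A crawler that follows every link on such a page only ever finds more pages of the same kind, which is the whole trap.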

According to the designer, it’s possible for this pattern to repeat for “months” if it isn’t caught, wasting vast amounts of resources for an AI company. And, to be fair, it also wastes the resources of whatever service is hosting the website being crawled. The designer says that pretty much every AI crawler has fallen for his intentional “tar pit” trap, with one notable exception: OpenAI, the maker of ChatGPT.

The anonymous designer isn’t shy about his intent to harm both AI companies and models with this tool (which is probably why Ars didn’t publish his real name), and he admits that Nepenthes doesn’t provide any real benefit to whoever implements it. But in addition to forcing AI scrapers into an infinite loop, it can be used to feed them useless data to “poison” their AI models and make their results materially worse.

“WARNING,” says the Nepenthes download site in all caps. “THIS IS DELIBERATELY MALICIOUS SOFTWARE INTENDED TO CAUSE HARMFUL ACTIVITY. DO NOT DEPLOY IF YOU AREN’T FULLY COMFORTABLE WITH WHAT YOU ARE DOING.”

At least one other similar tool, called Iocaine, has emerged based on the same principles after Nepenthes first gained traction on social media. There’s a growing list of such tools, like Quixotic (which deliberately serves up fake content) and Poison the WeLLMs (a reverse proxy system that returns nonsense when it detects known AI bots scraping a site).
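
The "return nonsense to known bots" approach boils down to checking the request's User-Agent header before deciding what to serve. The sketch below is a loose illustration of that decision, not the code of any tool named above; the marker list, the word list, and the `respond` helper are assumptions, and a real deployment would sit in front of an actual web server.

```python
import random

# Illustrative substrings of crawler User-Agent strings; an assumption for
# this sketch, not the list any particular tool uses.
AI_BOT_MARKERS = ("gptbot", "ccbot", "claudebot")

WORDS = ["pitcher", "plant", "spike", "loop", "tar", "maze", "echo", "static"]


def looks_like_ai_bot(user_agent: str) -> bool:
    """Case-insensitive substring match against the marker list."""
    ua = user_agent.lower()
    return any(marker in ua for marker in AI_BOT_MARKERS)


def nonsense(seed: str, n_words: int = 20) -> str:
    """Generate repeatable filler text, seeded so a given path stays stable."""
    rng = random.Random(seed)
    return " ".join(rng.choice(WORDS) for _ in range(n_words))


def respond(user_agent: str, path: str, real_content: str) -> str:
    """Serve filler to suspected AI crawlers and the real page to everyone else."""
    if looks_like_ai_bot(user_agent):
        return nonsense(path)
    return real_content


if __name__ == "__main__":
    print(respond("Mozilla/5.0 (compatible; GPTBot/1.0)", "/post/1", "Real article"))
    print(respond("Mozilla/5.0 (X11; Linux x86_64) Firefox/134.0", "/post/1", "Real article"))
```

Matching on User-Agent only catches bots that identify themselves honestly, so this is easier to dodge than a maze that every visitor to a hidden path falls into.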

The full story on Ars Technica is worth a read for the technical breakdown, but I’m more interested in the motive. The designer says he made the tool out of frustration over what the web is becoming, filled up with more and more AI-generated content… which is, itself, often recycled from other AI-generated content. “I’m just fed up, and you know what? Let’s fight back, even if it’s not successful. Be indigestible. Grow spikes,” he says.

The creator of Nepenthes is hardly alone in this regard. Facebook users are complaining that the entire platform is overflowing with auto-generated “AI slop” while Facebook itself seems to be leaning into AI content. Google Search has gotten so full of AI and targeted advertising that its newly introduced and stripped-down “Web view” became a hit overnight. The inclusion of AI, often forced into products with no indication that anyone actually asked for it, is part of a broader trend being called “Enshittification,” in which a service becomes worse and less useful while charging more.

Web users can perhaps be forgiven a little schadenfreude at the news that OpenAI accused DeepSeek, a Chinese AI model allegedly trained at a tiny fraction of ChatGPT’s cost, of stealing its proprietary technology. Hey ChatGPT, play me a sad song on a tiny, AI-generated violin.
https://www.pcworld.com/article/2592071/one-rebels-malicious-tar-pit-trap-is-driving-ai-scrapers-ins...
