Wikipedia is under assault: rogue users keep posting AI generated nonsense

ForgottenFlux@lemmy.world · edited 1 month ago

narc0tic_bird@lemm.ee · 1 month ago

Best case is that the model used to generate this content was originally trained on data from Wikipedia, so it “just” generates a worse, hallucinated “variant” of the original information. Goes to show how stupid this idea is.

Imagine this in a loop: AI trained on Wikipedia that then alters content on Wikipedia, which in turn gets picked up by the next model being trained. It would just get worse and worse, similar to how re-encoding the same video over and over again yields continuously worse results.

8uurg@lemmy.world · 1 month ago

A very similar situation to that analysed in this paper that was recently published. The quality of what is generated degrades significantly.

Although they mostly investigate replacing the data with AI-generated data in each step, so I doubt the effect will be as pronounced in practice. Human writing will still be included, and even curation of AI-generated text by people can skew the distribution of the training data (as these editors’ review process inevitably would, since reasonable-looking text can slip through the cracks).
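
To make the difference between "replace everything each step" and "keep some human writing in the mix" concrete, here is a toy sketch. It is not the paper's setup: the "model" is just resampling with replacement from the previous corpus, and `next_generation`, `diversity_over_time` and `human_fraction` are names made up for this illustration.

```python
import random


def next_generation(corpus, human_pool, human_fraction, rng):
    """Build the next training corpus.

    Resampling the current corpus stands in for "model output" (an assumption
    of this toy); a fixed share of fresh human text is mixed back in.
    """
    n = len(corpus)
    n_human = int(n * human_fraction)
    synthetic = [rng.choice(corpus) for _ in range(n - n_human)]
    human = [rng.choice(human_pool) for _ in range(n_human)]
    return synthetic + human


def diversity_over_time(human_fraction, generations=30, size=200, seed=0):
    """Return the number of distinct items in the corpus after each generation."""
    rng = random.Random(seed)
    human_pool = list(range(size))  # `size` distinct "facts" written by humans
    corpus = list(human_pool)
    history = [len(set(corpus))]
    for _ in range(generations):
        corpus = next_generation(corpus, human_pool, human_fraction, rng)
        history.append(len(set(corpus)))
    return history


if __name__ == "__main__":
    print("no human data :", diversity_over_time(0.0))
    print("10% human data:", diversity_over_time(0.1))
```

With `human_fraction=0.0` the distinct count can only fall, because each generation draws solely from the previous one. Mixing human text back in lets lost items reappear, which is the reason to expect a milder effect in practice.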

Blaster M@lemmy.world · edited 24 days ago

AI model makers are very well aware of this, and there is a move from ingesting everything to curating datasets more aggressively. Many upstarts have no idea how critical data prep is, but everyone is learning, sometimes the hard way.

drunkpostdisaster@lemmy.world · 1 month ago

It’s over. We lost.

vext01@lemmy.sdf.org · 1 month ago

Slop!

e$tGyr#J2pqM8v@feddit.nl · edited 1 month ago

Sabotage Wikipedia, DDoS the Internet Archive. Makes you wonder if in the future we’re going to forget our past. Will actual history be obscured in a sea of alternative histories, all presented as the same thing and impossible to tell apart? Maybe we need to keep some books lying around in archives just to be sure.

TachyonTele@lemm.ee · edited 1 month ago

The digital dark age will be a real thing, absolutely.

Interesting idea on a sea of alternative histories. That might be a possible threat.
Someone else here called it “AI text apocalypse”. I like that term.

schizo@forum.uncomfortable.business · 1 month ago

Further proof that humanity neither deserves nor is capable of having nice things.

Who would set up an AI bot to shit all over the one remaining useful thing on the Internet, and why?

I’m sure the answer is either ‘for the lulz’ or ‘late-stage capitalism’, but still: historically humans aren’t usually burning down libraries on purpose.

poszod@lemmy.world · 1 month ago

State actors could be interested in doing that. Same with the Internet Archive attacks.

weeeeum@lemmy.world · 1 month ago

It’s because there’s no accountability for cybercrimes. If humans had always had a button to burn down libraries, I’m sure they would have pressed it. Instead they had to put themselves in harm’s way to do such things.

People do things cause they can, and fucking with Wikipedia is apparently simple.

Schmoo@slrpnk.net · 1 month ago

historically humans aren’t usually burning down libraries on purpose.

How on earth have you come to this conclusion?

sugar_in_your_tea@sh.itjust.works · 1 month ago

To be fair, it’s usually to effect cultural genocide. It’s not average people burning libraries, it’s usually some kind of authoritarian regime.

SacralPlexus@lemmy.world · edited 1 month ago

*looks around and gestures broadly in agreement*

rsuri@lemmy.world · 1 month ago

Yeah but the other thing about humanity is it’s mostly harmless. Edits can be reverted, articles can be locked. Wikipedia will be fine.

Petter1@lemm.ee · edited 1 month ago

Maybe a strange form of activism that is trying to poison new AI models 🤔

Which would not work, since all the tech giants have already archived the pre-AI internet.

schizo@forum.uncomfortable.business · 1 month ago

Ah, so the AI version of the Chewbacca defense.

I have to wonder if intentionally shitting on LLMs with plausible nonsense is effective.

Like, you watch for certain user agents and change what data you actually send the bot vs what a real human might see.
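
Roughly something like this, as a minimal sketch. The token list is illustrative only, and `looks_like_crawler` and `pick_body` are made-up helper names; since a User-Agent header is trivially spoofed, this would only catch bots that identify themselves.

```python
# Illustrative substrings only: crawlers can use other names or pose as a browser.
CRAWLER_TOKENS = ("gptbot", "ccbot")


def looks_like_crawler(user_agent: str) -> bool:
    """Case-insensitive substring match against the token list above."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def pick_body(user_agent: str, real_body: str, decoy_body: str) -> str:
    """Return the decoy for self-identified crawlers, the real page otherwise."""
    return decoy_body if looks_like_crawler(user_agent) else real_body


if __name__ == "__main__":
    print(pick_body("Mozilla/5.0 (compatible; GPTBot/1.0)", "real page", "plausible nonsense"))
    print(pick_body("Mozilla/5.0 (X11; Linux x86_64) Firefox/130.0", "real page", "plausible nonsense"))
```

Whether the decoy actually hurts a model is a separate question; this only shows the switching part.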

kibiz0r@midwest.social · 1 month ago

Unleashing generative AI on the world was basically the information equivalent of jumping headfirst into Kessler Syndrome.

khannie@lemmy.world · 1 month ago

For the uninitiated like me:

The Kessler syndrome (also called the Kessler effect,[1][2] collisional cascading, or ablation cascade), proposed by NASA scientists Donald J. Kessler and Burton G. Cour-Palais in 1978, is a scenario in which the density of objects in low Earth orbit (LEO) due to space pollution is numerous enough that collisions between objects could cause a cascade in which each collision generates space debris that increases the likelihood of further collisions.

Wikipedia link.

kibiz0r@midwest.social · 1 month ago

Good call, thank you.

Also: Referencing Wikipedia in this context is kinda funny.

khannie@lemmy.world · 1 month ago

I did think that. :) It’s just… So good. I hope it never enshittifies. God help us.

sbv@sh.itjust.works · 1 month ago

As for why this is happening, the cleanup crew thinks there are three primary reasons.

“[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,”

That last one. Ouch.

TimLovesTech (AuDHD)(he/him)@badatbeing.social · 1 month ago

“[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,”

I think the main reason people are misinformed about AI content is that, outside of tech people, most have no idea that AI will:

1. 100% make up answers to things it doesn’t know, because the data it ingested was either too small a sample or simply bad. And it will do this with the same robot confidence you get for any other answer.
2. Begin to “hallucinate” and give some wild outputs once it has been fed too much other AI-generated content, very similar to humans suffering from schizophrenia. And again these answers will be given as “fact” with the same robotic confidence.

Wiz@midwest.social · 1 month ago

And then #2 will be copied by other people and AIs, becoming seen as fact.

givesomefucks@lemmy.world · 1 month ago

The vast majority of people think they’re the good guys…

nutsack@lemmy.world · 1 month ago

why the fuck would anyone stick ai shit on wikipedia that doesn’t make any sense

NateNate60@lemmy.world · 1 month ago

“[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,” Lebleu said.