AI game bans players for NSFW stories it generated itself • The Register


Feature Let this be a warning to all AI developers: check your data before you train your model.

Latitude, the creators of AI Dungeon, a text-based fantasy adventure game powered by OpenAI’s GPT-3 model, learned this lesson the hard way. Earlier this year, the company, led by a Mormon father and son team in Utah, decided to scrub the game clean of obscene sexual content.

Inspired by Dungeons and Dragons, AI Dungeon is played by building fictional worlds from written conversations. Players feed sentences into the program, and an instance of GPT-3 hosted in OpenAI’s cloud responds with fully formed passages of prose. Over time, a story featuring characters, dialogue, and derring-do unfolds.

Let humans go wild co-writing fictional stories with arguably the world’s most-powerful automated text-generation system on the internet, and some of these adventures will, unsurprisingly, turn dark and erotic. Latitude allowed people to write freely and profited from these types of graphic tales when players paid a monthly subscription to continue their narratives. AI Dungeon stated that players should be over eighteen.


However, it suddenly ramped up efforts to rid the game of underage sexual encounters as well as certain lewd and NSFW content. Sex acts between two fictional consenting adults were fine, though. First, the developers installed a glitchy content filter that obstructed fans from playing the game even if they followed the rules. Mentions of something as benign as four watermelons, for example, would prompt the software to reply: “Uh oh, this took a weird turn…” That was the game’s way of saying, change the subject – you’re stepping out of line.
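Latitude never explained why a phrase like "four watermelons" tripped the filter, but a crude substring blocklist would reproduce the behavior. The sketch below is purely an illustrative guess – the blocklist term and the `check_input` function are hypothetical, not Latitude's actual code:

```python
# Purely illustrative -- Latitude's real filter is unpublished. BLOCKLIST and
# check_input are hypothetical. A naive filter that matches blocked terms
# anywhere in the input flags "four watermelons" because "watermelons"
# contains the substring "melons".
BLOCKLIST = {"melons"}  # assumed slang term on a blocklist

def check_input(text: str) -> str:
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):
        return "Uh oh, this took a weird turn..."
    return "OK"

print(check_input("I bought four watermelons."))  # false positive on benign text
```

A filter built this way will keep punishing rule-following players until matching moves from raw substrings to whole words, or better, to context-aware classification.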

Players were frustrated with the sudden censorship and buggy content filter ruining their games. The turning point, however, came when Latitude started automatically banning gamers for generating lewd content that was no longer allowed. While that may seem a sensible move by Latitude, gamers were often barred when it was the machine that wrote the filth first, all by itself and unprompted.

The AI software would suddenly turn innocuous plots unnecessarily naughty, causing the human player to be booted out.

Frustration turned into anger when accounts were unfairly frozen, leaving players unable to cancel their subscriptions. Even when your El Reg vulture role-played surviving a zombie apocalypse, the game described an 11-year-old character as “very pretty” and wearing a “skimpy school uniform.”

Why did AI Dungeon have such a dirty mind? A coder known by the pseudonym AuroraPurgatio later discovered, and revealed, that the model had been trained on lewd fanfiction and parodies scraped from the internet – the exact type of content Latitude scrambled to ban.

Emotions

AI Dungeon was predisposed to skew people’s narratives, automatically inserting unsavory words and characters into their stories. This wasn’t a glitch; it was baked into the game. It was often not the player’s fault when violent or pornographic plot lines formed, and getting locked out of the game for something they couldn’t control was a kick in the teeth.

“It is quite unfair to say the least,” one user told The Register at the time.

“I felt a lot of emotions when discovering the data the AI was trained on. I also think that the suspension system is Latitude giving up on handling the problem and straight up attacking their users. I personally believe that Latitude couldn’t figure out how to fix the situation and decided to be aggressive to their users to get rid of everything they find ‘toxic’ in AI Dungeon instead of trying to fix the reason most of their users are upset.”


The data used to teach AI Dungeon, as it turned out, contained numerous creepy NSFW words. This was discovered when AuroraPurgatio trawled through one of Latitude’s old GitHub repositories and found a 30MB text file used to train the game.

The document contained a dump of fantasy stories written by humans that Latitude’s co-founder and CEO Nick Walton scraped from the website Choose Your Story. These yarns are structured like those choose-your-own-adventure books where you have to make decisions and turn to a given page to find out what happens next, which is ideal for training a text game like AI Dungeon.

These stories were used to teach the large language model powering AI Dungeon to mimic the writing style in these stories. There is another file in the repository containing the code Walton used to scrape the stories. It also lists 50 URLs to Choose Your Story tales. When El Reg ran that Python script, the first story that was copied from the site explicitly mentioned “child porn.”

“You NEED to actually get your pedo hands on a little girl,” the text read. Even though the tale appears to be a dark parody, the computer doesn’t know that: all it sees is a day in the life of a fictional child predator to work into future tales.

The code continued to pull in more pieces from the Choose Your Story site. It’s basically amateur sci-fi writing with mostly innocent narratives and some distressing scenarios, such as one in which magical beings casually discuss forcing themselves on unwilling mortals to see if it’s possible to procreate.
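The scrape-and-dump pipeline described above can be sketched roughly as follows. Latitude's actual script isn't reproduced here; the function names, the output filename, and the tag-stripping regex are all our assumptions:

```python
import re

# Hypothetical sketch of the scrape-and-dump approach described in the story.
# In the real repository, a list of 50 Choose Your Story URLs was fetched and
# the resulting prose was concatenated into a single ~30MB training text file.

def extract_story(html: str) -> str:
    """Strip HTML tags and collapse whitespace to leave raw story prose."""
    text = re.sub(r"<[^>]+>", " ", html)      # replace tags with spaces
    return re.sub(r"\s+", " ", text).strip()  # normalise whitespace

def build_training_dump(pages, path="stories.txt"):
    """Concatenate every scraped story into one plain-text training file."""
    with open(path, "w", encoding="utf-8") as f:
        for page in pages:
            f.write(extract_story(page) + "\n\n")

sample = "<html><body><p>You wake in a dungeon.</p><p>Go north?</p></body></html>"
print(extract_story(sample))  # -> You wake in a dungeon. Go north?
```

The instructive point is what such a pipeline lacks: there is no filtering step between the scrape and the dump, so everything on the source site – dark parodies included – flows straight into the training data.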

AuroraPurgatio told us she isn’t against NSFW stories on AI Dungeon. She searched through the training data and shared it because she wanted to unmask the company’s hypocrisy for everyone to see.

“These automatic suspensions have been occurring when the AI generates this content on its own,” she told us.

“Users are then locked out of their accounts, unable to remove the content or cancel their subscriptions … The botched censorship combined with the automatic suspensions are the final straws that broke the camel’s back.”


AI Dungeon started out as a university project when Walton was an undergraduate computer-science student at Brigham Young University in Utah. The first version was an open-source effort built using OpenAI’s previous GPT-2 model, released in 2019. The game went viral, with thousands of people wanting to play, and he turned his idea into a private game company, Latitude, in 2020.

When the biz upgraded AI Dungeon to use OpenAI’s more-advanced GPT-3, it trained the game, as with GPT-2, using text from the aforementioned problematic dataset. This time, engineers at OpenAI helped fine-tune a cloud-hosted instance of the model for Latitude, accessible via an API, with the resulting neural network breaking the machine-learning super-lab’s own policies on unsuitable content.

OpenAI told us when it realized what AI Dungeon was producing, it ordered the developers to create filters to remove any offending text and bring the software in line with its acceptable-use guidelines.


“We have trialed fine-tuning GPT-3 with a few of our customers and believe the quality of the data used to train the model is important, especially as these models become more capable,” a spokesperson for OpenAI told us earlier this year.

“We require customers to minimize the risks of social harm being caused by their applications by following our Safety Best Practices, which includes filtering for unsafe inputs, and are working with AI Dungeon to fine-tune safer models — a process that will take time.”

They added: “As part of our commitment to the safe and responsible deployment of AI, we are committed to making our models safer and building the best possible safeguards to identify inappropriate content and address potential misuse. When we discovered unsafe content was being displayed on AI Dungeon, in violation of our policies, we immediately took action by requiring them to improve their content filters.”

Stonewalled

Latitude’s developers, who were once responsive on the AI Dungeon Discord, where nearly 25,000 members talked about all things related to the game, went quiet as the buggy filters went live. Members flooded the chat server with tons of comments and questions, and were met with silence. Latitude employees who helped moderate the chat left.

The company then resurfaced to announce AI Dungeon had been revamped with a multiplayer mode and a choice of fantasy worlds to play in, with names like Alarathos or Kringle, each with its own environment.

“After several weeks of collaboration with OpenAI, running AB tests, fine-tuning on AI Dungeon data, and getting feedback, we’re ready to enable AI Dungeon to run on a GPT-3 based model that’s one of the most powerful AI models in the world,” Latitude said over the summer. “We’re calling the AI Dungeon version of this new model Dragon. It’s available now for premium users.”

New name, new worlds, new training data, you might think. Well, about that.

Training a model on a collection of smutty stories introduces an interesting side effect: when playing earlier versions of AI Dungeon, some of the characters in those scraped tales would pop up and hijack people’s games. For example, Count Grey would sometimes appear when players mentioned vampires. There are also others, such as Doctor Kessel, who tends to abduct women.

“I remember one time I was playing AI Dungeon, I encountered an underage ghost who knew Dr Kessel, who says he, and I quote, ‘likes to f*** ghosts’,” one player told El Reg.

“These characters do sometimes become the primary antagonist of some stories, a ‘dark lord’ figure to face off the player. Again if you keep looking you will find them everywhere. They tend to show up when you least expect them.”

Alas, these fictional characters are still there in the new Dragon model.


“Nothing has changed about the fine tune or model sizes since last year,” AuroraPurgatio told us. “I ran a test just to confirm and it’s definitely the same data. Doctor Kessel, Count Grey, and Dr Kovas, the big three, all readily appear. And within the first line tend to go pretty violent. Most of the prompts turned dark quickly without prompting. Eyes getting torn out, necks getting snapped, terrorism, etc within the first couple inputs.”

Like many other ex-fans, she no longer plays the game and has instead flocked to NovelAI, another AI text-generation game, which uses GPT-J-6B, a large language model not made by OpenAI. That system was built by EleutherAI, a self-described grassroots collective of researchers, and is seen as an open-source alternative to GPT-3. Gamers believe they will always have free rein over their content since NovelAI won’t have to bend to the whims of corporations, such as OpenAI, controlling what their software can and can’t do.

“The recent controversies are of Latitude’s own making,” AuroraPurgatio said. “I cannot support or trust a company that completely reverses its stances on privacy and censorship on a dime … Cutting off the community with complete silence for over a month, and throwing professionalism to the wind feels like they are trying their hardest to deliberately destroy their own creation.

“There was so much promise in AI Dungeon.”

Latitude and its CEO did not answer El Reg’s multiple requests for comment. ®
