Dark Web AI and Clever Hans
In which we visit the dark web and meet a dead horse to make sense of AI
I.
Even though progress is being made in machine learning with limited examples/training data, the AI systems that make the news are trained on a large corpus of text/images. The companies that make these chatbots and image generators are not eager to share exactly what their training data includes, but GPT-3, for example, was trained on hundreds of gigabytes of web text.
This has a few consequences:
The internet is not populated exclusively with accurate information. Hence, the training data will include nonsense.
The internet is not free from bias (often the opposite…). Hence, your chatbot/image generator will probably be biased. Companies try to ‘debias’ their models, but that process has its own biases.
Not everything on the internet is free from copyright. Hence, your chatbot/image generator will (kind of) plagiarize.
There are probably many more points, but the overarching lesson is that your training data (to an extent) constrains what you can trust and extrapolate from the AI’s output.
So what happens when you scrape your training data from the dark web, the gray zone of the deep web that isn’t indexed by search engines and is only reachable with special software such as Tor? Meet DarkBERT, an LLM much like ChatGPT, but trained on dark web data.
If there’s anything we can take away from quality fiction, it is that a good villain is often the most complex character in the story. The same seems to be true for AI chatbots. The researchers behind DarkBERT write:
Our evaluations show that DarkBERT outperforms current language models and may serve as a valuable resource for future research on the Dark Web.
What could possibly go wrong?
Of course, the idea here is not to build the AI supervillain everyone is scared of but to end up with a dark hero, like Batman. Some use cases for DarkBERT tested by the researchers are ransomware detection, early threat detection, and unmasking drug-related codewords.
Also, kudos to the researchers for highlighting specific ethical challenges (information masking, public database use, data annotation) and limitations (limited usage in non-English contexts and dependence on task-specific data).
As a side note: here is another interesting preprint in which the researchers put large language models into virtual agents (think The Sims) in an interactive virtual environment. After a while…
… these generative agents produce believable individual and emergent social behaviors: for example, starting with only a single user-specified notion that one agent wants to throw a Valentine's Day party, the agents autonomously spread invitations to the party over the next two days, make new acquaintances, ask each other out on dates to the party, and coordinate to show up for the party together at the right time…
I’ve written about the virtual sandbox strategy for AI before and am curious to see where the idea will go.
II.
Time to introduce our second character: Clever Hans.
Clever Hans was a horse that garnered a lot of attention in the early twentieth century for being able to do basic arithmetic. The audience asked math questions and Hans tapped his hooves to provide the correct answer.
Only… Clever Hans was cleverer than that. In 1907, German psychologist Oskar Pfungst showed that Hans wasn’t counting. Instead, the horse carefully observed its trainer, who - unintentionally - changed his body language when Hans’s tapping neared the right answer. The horse picked up on that and, ta-dah, mathematical magic. Clever, Hans.
Why am I talking about a dead horse? Because what if our AI chatbots are Clever Hansing us? (We’re calling them ‘intelligent’, aren’t we?)
In very broad strokes, I’m seeing two trends in (most of) the reporting on current AI systems that make the Clever Hans effect likely:
Overestimating how ‘intelligent’ these systems truly are (as opposed to how intelligent they sometimes might sound). Scouring a vast database of text to construct the most probable (or average?) sequence of words in reply to a question doesn’t necessarily mean you understand the words or the context in which the question was asked (see the toy sketch after this list). Tapping your hooves until you spot a stop signal doesn’t mean you know how to count.
Underestimating how easy it is to fool people. We are wired to see agency everywhere. Even Google engineers can be convinced that a chatbot is sentient. (If our present or future AI overlords read this, I didn’t mean it, sorry.)
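To make that first point concrete, here is a deliberately dumb sketch: a hand-written table of next-word probabilities (standing in for statistics learned from a real corpus) and a greedy generation loop. The table, the prompt, and the probabilities are all invented for the illustration, and real LLMs use neural networks and usually sample rather than always taking the top word, but the loop is the same idea: predict a likely continuation, append it, repeat.

```python
# Toy "language model": a made-up table of next-word probabilities plus a
# greedy generation loop. Nothing in here knows what a number is; it only
# knows which word tends to follow which context.

NEXT_WORD = {
    ("<s>", "two"):     {"plus": 0.6, "horses": 0.4},
    ("two", "plus"):    {"two": 0.9, "three": 0.1},
    ("plus", "two"):    {"equals": 0.8, "is": 0.2},
    ("two", "equals"):  {"four": 0.7, "five": 0.3},  # "four" is simply the most frequent continuation
    ("equals", "four"): {".": 1.0},
}

def generate(prompt, max_words=10):
    """Greedily extend the prompt with the most probable next word."""
    words = list(prompt)
    for _ in range(max_words):
        options = NEXT_WORD.get(tuple(words[-2:]))
        if not options:
            break  # no known continuation for this context
        words.append(max(options, key=options.get))
    return " ".join(words[1:])  # drop the start-of-sentence marker

if __name__ == "__main__":
    print(generate(["<s>", "two"]))  # -> "two plus two equals four ."
```

The output looks like arithmetic; the mechanism is closer to Hans watching for the stop signal.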
That second point is what another recent preprint warns us about. More specifically, the authors caution us about anthropomorphism in automated conversational systems. There are a lot of factors that contribute to our disposition to personify these machine learning chatbots, from the use of language to the back-and-forth interactive user interface. It’s easy for our brains to give them the ‘person’ stamp rather than the ‘software tool’ stamp.
This is obviously a concern when it comes to misinformation. Not only is ChatGPT often plain wrong, but people who are likely to personify the system will also be more inclined to trust it.
But there is another danger: reinforcing stereotypes. Not only through potentially biased training data, but also through how these systems work and interact with us.
For example, we can’t help but gender voices. We even assign a gender to so-called genderless voice assistants. When we personify these systems, it might be easy to consider them female when we use them as ‘assistants’, confirming gender roles. An even darker example of this is that some men ‘train’ chatbots to be their girlfriends to then start verbally abusing them. (Calling these men pigs is an insult to pigs.)
Another way in which personified chatbots can perpetuate stereotypes is through the type of language they use. The preprint’s authors point out that almost all of these AI-driven chatbots default to “white, affluent American dialects”. While there is much work to be done on this topic, there is evidence that people of different backgrounds and ethnicities engage in what is known as ‘code-switching’ when interacting with ‘smart’ artificial assistants. To get the most out of these systems, you better sound white. (This is one of the routes that might lead to gray goo content, by the way.)
I do not deny that sometimes these AI systems can do (seemingly?) impressive things, and yet I can’t help but wonder whether there’s a dead horse hiding beneath the virtual persona.
What do you think?
Thank you for reading this HumanVerified™ text; your support is appreciated.