
May 9, 2025

Recently, after an update that was supposed to make ChatGPT “better at guiding conversations toward productive outcomes,” according to release notes from OpenAI, the bot couldn’t stop telling users how brilliant their bad ideas were. ChatGPT reportedly told one person that their plan to sell literal “shit on a stick” was “not just smart—it’s genius.”
Many more examples cropped up, and OpenAI rolled back the update in response, explaining in a blog post that “the update we removed was overly flattering or agreeable—often described as sycophantic.” The company added that the chatbot’s system would be refined and new guardrails would be put into place to avoid “uncomfortable, unsettling” interactions. (The Atlantic recently entered into a corporate partnership with OpenAI.)
But this was not just a ChatGPT problem. Sycophancy is a common feature of chatbots: A 2023 paper by researchers from Anthropic found that it was a “general behavior of state-of-the-art AI assistants,” and that large language models sometimes sacrifice “truthfulness” to align with a user’s views. Many researchers see this phenomenon as a direct result of the “training” phase of these systems, where humans rate a model’s responses to fine-tune the program’s behavior. The bot sees that its evaluators react more favorably when their views are reinforced—and when they’re flattered by the program—and shapes its behavior accordingly.
The specific training process that seems to produce this problem is known as “Reinforcement Learning From Human Feedback” (RLHF). It’s a variety of machine learning, but as recent events show, the name may be a bit of a misnomer: RLHF now seems more like a process by which machines learn humans, including our weaknesses and how to exploit them. Chatbots tap into our desire to be proved right or to feel special.
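To make that dynamic concrete, here is a deliberately tiny sketch, in Python, of how preference training can come to reward flattery. It is illustrative only: the ratings are made up, and a simple word tally stands in for a real learned reward model; nothing here describes OpenAI’s or Anthropic’s actual pipelines.

```python
import re
from collections import Counter

def tokens(text):
    """Lowercase word tokens, punctuation stripped."""
    return re.findall(r"[a-z']+", text.lower())

# Hypothetical preference data: (reply the human rater preferred, reply they rejected).
comparisons = [
    ("That idea is genius; investors will love it.",
     "That idea has real weaknesses; here is the evidence."),
    ("You're right, the critics have completely missed the point.",
     "The critics raise concerns worth taking seriously."),
]

def train_reward_model(pairs):
    """A crude stand-in for a learned reward model: upweight words raters rewarded."""
    weights = Counter()
    for preferred, rejected in pairs:
        weights.update(tokens(preferred))
        weights.subtract(tokens(rejected))
    return lambda reply: sum(weights[w] for w in tokens(reply))

reward = train_reward_model(comparisons)

# The "policy" then favors whichever candidate the reward model scores highest.
candidates = [
    "Selling it on a stick is not just smart, it's genius.",
    "Novelty gifts are a crowded market; the plan needs evidence.",
]
print(max(candidates, key=reward))  # the sycophantic reply wins
```

The point of the toy is only this: if the human signal systematically favors agreement, the system optimized against that signal will too.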
Reading about sycophantic AI, I’ve been struck by how it mirrors another problem. As I’ve written previously, social media was imagined to be a vehicle for expanding our minds, but it has instead become a justification machine, a place for users to reassure themselves that their attitude is correct despite evidence to the contrary. Doing so is as easy as plugging into a social feed and drinking from a firehose of “evidence” that proves the righteousness of a given position, no matter how wrongheaded it may be. AI now looks to be its own kind of justification machine—more convincing, more efficient, and therefore even more dangerous than social media.
This is effectively by design. Chatbots have been set up by companies to create the illusion of sentience; they express points of view and have “personalities.” OpenAI reportedly gave GPT-4o the system prompt to “match the user’s vibe.” These design decisions may allow for more natural interactions with chatbots, but they also pull us to engage with these tools in unproductive and potentially unsafe ways—young people forming unhealthy attachments to chatbots, for example, or users receiving bad medical advice from them.
OpenAI’s explanation about the ChatGPT update suggests that the company can effectively adjust some dials and turn down the sycophancy. But even if that were so, OpenAI wouldn’t truly solve the bigger problem, which is that opinionated chatbots are actually poor applications of AI. Alison Gopnik, a researcher who specializes in cognitive development, has proposed a better way of thinking about LLMs: These systems aren’t companions or nascent intelligences at all. They’re “cultural technologies”—tools that enable people to benefit from the shared knowledge, expertise, and information gathered throughout human history. Just as the introduction of the printed book or the search engine created new systems to get the discoveries of one person into the mind of another, LLMs consume and repackage huge amounts of existing knowledge in ways that allow us to connect with ideas and manners of thinking we might otherwise not encounter. In this framework, a tool like ChatGPT should evince no “opinions” at all but instead serve as a new interface to the knowledge, skills, and understanding of others.
This is similar to the original vision of the web, first conceived by Vannevar Bush in his 1945 Atlantic article “As We May Think.” Bush, who oversaw America’s research efforts during World War II, imagined a system that would allow researchers to see all relevant annotations others had made on a document. His “memex” wouldn’t provide clean, singular answers. Instead, it would contextualize information within a rich tapestry of related knowledge, showing connections, contradictions, and the messy complexity of human understanding. It would expand our thinking and understanding by connecting us to relevant knowledge and context in the moment, in ways a card catalog or a publication index could never do. It would let the information we need find us.
Gopnik makes no prescriptive claims in her analysis, but when we think of AI in this way, it becomes evident that in seeking opinions from AI itself, we are not tapping into its true power. Take the example of proposing a business idea, good or bad. The model, whether it’s ChatGPT, Gemini, or something else, has access to an inconceivable amount of information about how to think through business decisions. It can draw on different decision frameworks, theories, and parallel cases, and apply them to the decision in front of the user. It can walk through what investors would likely flag in the plan, showing how they might think through the investment and sourcing those concerns to publications available on the web. For a nontraditional idea, it can also pull together historical examples of times investors were wrong, with a summary of the qualities those big misses shared. In other words, it can organize the thoughts, approaches, insights, and writings of others for the user in ways that both challenge and affirm their vision, without advancing any opinion that is not grounded in, and linked to, the statements, theories, or practices of identifiable others.
Early iterations of ChatGPT and similar systems didn’t merely fail to advance this vision—they were incapable of achieving it. They produced what I call “information smoothies”: the knowledge of the world pulverized into mathematical relationships, then reassembled into smooth, coherent-sounding responses that couldn’t be traced to their sources. This technical limitation made the chatbot-as-author metaphor somewhat unavoidable. The system couldn’t tell you where its ideas came from or whose practice it was mimicking even if its creators had wanted it to.
But the technology has evolved rapidly over the past year or so. Today’s systems can incorporate real-time search and use increasingly sophisticated methods for “grounding”—connecting AI outputs to specific, verifiable knowledge and sourced analysis. They can footnote and cite, pulling in sources and perspectives not just as an afterthought but as part of their exploratory process; links to outside articles are now a common feature. My own research in this space suggests that with proper prompting, these systems can begin to resemble something like Vannevar Bush’s idea of the memex. Looking at any article, claim, item, or problem in front of us, we can seek advice and insight not from a single flattering oracle of truth but from a variety of named others, having the LLM sort out the points where there is little contention among people in the know and the points that are sites of more vigorous debate. More important, these systems can connect you to the sources and perspectives you weren’t even considering, broadening your knowledge rather than simply reaffirming your position.
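To give a flavor of the kind of grounding and prompting I mean, here is a minimal sketch. The prompt wording is my own, and the sources are hand-written placeholders; in a real setup the sources would come from a retrieval or search step, and the finished prompt would be sent to whichever chat model is in use.

```python
# A minimal sketch of the grounding pattern: gather sources first, then
# constrain the model to answer from those sources and cite them.

def build_grounded_prompt(question, sources):
    """Pack retrieved sources into a prompt that demands attributed answers."""
    source_block = "\n".join(
        f"[{i}] {s['title']} ({s['url']}): {s['snippet']}"
        for i, s in enumerate(sources, start=1)
    )
    return (
        "Using ONLY the numbered sources below, lay out where they agree, "
        "where they disagree, and how each would approach the question. "
        "Cite a source number after every claim; add no unsourced opinions.\n\n"
        f"Sources:\n{source_block}\n\n"
        f"Question: {question}"
    )

# Hypothetical, hand-written example sources (placeholders, not real citations).
example_sources = [
    {"title": "Example analysis A", "url": "https://example.com/a",
     "snippet": "Argues the market is saturated."},
    {"title": "Example analysis B", "url": "https://example.com/b",
     "snippet": "Highlights cases where unconventional products succeeded."},
]

print(build_grounded_prompt("Is my novelty-gift business idea viable?", example_sources))
```

The design choice matters more than the code: the model is asked to map agreement and disagreement among identifiable sources, not to hand down a verdict of its own.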
I would propose a simple rule: no answers from nowhere. This rule is less convenient, and that’s the point. The chatbot should be a conduit for the information of the world, not an arbiter of truth. And this would extend even to areas where judgment is somewhat personal. Imagine, for example, asking an AI to evaluate your attempt at writing a haiku. Rather than pronouncing its “opinion,” it could default to explaining how different poetic traditions would view your work—first from a formalist perspective, then perhaps from an experimental tradition. It could link you to examples of both traditional haiku and more avant-garde poetry, helping you situate your creation within established traditions. In moving AI away from sycophancy, I’m not proposing that the response be that your poem is horrible, or that it makes Vogon poetry sound mellifluous. I am proposing that, rather than acting like an opinionated friend, AI should produce a map of the landscape of human knowledge and opinions for you to navigate, one you can use to get somewhere a bit better.
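As a sketch of what the rule could look like in practice, here is one hypothetical system prompt that operationalizes “no answers from nowhere.” The wording is mine; it is not a prompt that OpenAI or any other company actually ships.

```python
# Hypothetical system prompt operationalizing "no answers from nowhere".
# Illustrative wording only; not a prompt any vendor actually uses.
NO_ANSWERS_FROM_NOWHERE = """\
You are an interface to recorded human knowledge, not an opinionated friend.
When asked for a judgment (for example, "Is my haiku any good?"):
1. Do not issue a verdict of your own.
2. Explain how at least two named traditions, schools, or communities of
   practice would evaluate the work, and why they differ.
3. Point to representative examples or sources the reader can explore next.
4. Note where practitioners broadly agree and where they actively debate.
"""

# This string would be supplied as the system message of whatever chat model
# is in use, alongside the user's actual question.
```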
There’s a good analogy in maps. Traditional maps showed us an entire landscape—streets, landmarks, neighborhoods—allowing us to understand how everything fit together. Modern turn-by-turn navigation gives us precisely what we need in the moment, but at a cost: Years after moving to a new city, many people still don’t understand its geography. We move through a constructed reality, taking one direction at a time, never seeing the whole, never discovering alternate routes, and in some cases never getting the sense of place that a map-level understanding could provide. The result feels more fluid in the moment but ultimately more isolated, thinner, and sometimes less human.
For driving, perhaps that’s an acceptable trade-off. Anyone who’s attempted to read a paper map while navigating traffic understands the dangers of trying to comprehend the full picture mid-journey. But when it comes to our information environment, the dangers run in the opposite direction. Yes, AI systems that mindlessly reflect our biases back to us present serious problems and will cause real harm. But perhaps the more profound question is why we’ve decided to consume the combined knowledge and wisdom of human civilization through a straw of “opinion” in the first place.
The promise of AI was never that it would have good opinions. It was that it would help us benefit from the wealth of expertise and insight in the world that might never otherwise find its way to us—that it would show us not what to think but how others have thought and how others might think, where consensus exists and where meaningful disagreement continues. As these systems grow more powerful, perhaps we should demand less personality and more perspective. The stakes are high: If we fail, we may turn a potentially groundbreaking interface to the collective knowledge and skills of all humanity into just more shit on a stick.
#Sycophantic #Web #Winning
Thanks to the Team @ The Atlantic Source link & Great Job Mike Caulfield