These are not the words you want to hear when it comes to human extinction, but I was hearing them: “Things are moving uncomfortably fast.” I was sitting in a conference room with Sam Bowman, a safety researcher at Anthropic. Worth $183 billion at the latest estimate, the AI firm has every incentive to speed things up, ship more products, and develop more advanced chatbots to stay competitive with the likes of OpenAI, Google, and the industry’s other giants. But Anthropic is at odds with itself, thinking deeply, even anxiously, about seemingly every decision.
Anthropic has positioned itself as the AI industry’s superego: the firm that speaks with the most authority about the big questions surrounding the technology, while rival companies develop advertisements and affiliate shopping links (a difference that Anthropic’s CEO, Dario Amodei, was eager to call out during an interview in Davos last week). On Monday, Amodei published a lengthy essay, “The Adolescence of Technology,” about the “civilizational concerns” posed by what he calls “powerful AI,” the very technology his firm is developing. The essay has a particular focus on democracy, national security, and the economy. “Given the horror we’re seeing in Minnesota, its emphasis on the importance of preserving democratic values and rights at home is particularly relevant,” Amodei posted on X, making him one of very few tech leaders to make a public statement against the Trump administration’s recent actions.
This rhetoric, of course, serves as good branding: a way for Anthropic to stand out in a competitive industry. But having spent a long time following the company and, recently, speaking with many of its employees and executives, including Amodei, I can say that Anthropic is at least consistent. It messages about the ethical issues surrounding AI constantly, and it appears unusually focused on user safety. Bowman’s job, for example, is to vet Anthropic’s products before they’re released into the world, making sure that they will not spew, say, white-supremacist talking points; push users into delusional crises; or generate nonconsensual porn.
So far, the effort seems to be working: Unlike other popular chatbots, including OpenAI’s ChatGPT and Elon Musk’s Grok, Anthropic’s bot, Claude, has not had any major public blowups despite being as advanced as, and by some measures more advanced than, the rest of the field. (That may be in part because its chatbot does not generate images and has a smaller user base than some rival products.) But although Anthropic has so far dodged the various scandals that have plagued other large language models, the company has not inspired much faith that such problems will be avoided forever. When I met Bowman last summer, the company had recently divulged that, in experimental settings, versions of Claude had demonstrated the ability to blackmail users and to assist users asking about making bioweapons.