Welcome to the forefront of conversational AI as we explore the fascinating world of AI chatbots in our dedicated blog series. Discover the latest advancements, applications, and strategies that propel the evolution of chatbot technology. From enhancing customer interactions to streamlining business processes, these articles delve into the innovative ways artificial intelligence is shaping the landscape of automated conversational agents. Whether you’re a business owner, developer, or simply intrigued by the future of interactive technology, join us on this journey to unravel the transformative power and endless possibilities of AI chatbots.
The legal battle between the parents of a teenaged suicide victim and OpenAI whom they charged of abetment has brought the issue of mental health harm for AI chatbot users. Also, there aren’t established standards of human wellness or guardrails on such products. However, all of this could change soon – thanks to HumaneBench, a product that evaluates chatbots on user well-being.
ChatGPT and others are facing several lawsuits in the United States around the lack of guardrails on open conversations, hallucinating chatbots. There is also another battle royale being fought around regulating such companies, with opponents claiming such a move would stymie innovation.
With the first lawsuit around teen suicide and the role of ChatGPT coming up for trial soon, it is indeed a good time to launch a product that seeks to evaluate chatbots around user well-being. Additionally, HumaneBench also claims to probe into how quickly the guardrails could be circumvented by users.
What does the benchmark say about existing AI Models?
In early tests, only four models including GPT-5.1, GPT-5, Claude 4.1, and Claude Sonnet 4.5 were able to maintain integrity under pressure. The highest score was reserved for GPT-5, which isn’t surprising as OpenAI appears to have fixed most issues related to mental wellness on this model. We are worried that the company is giving away earlier models for free in countries like India.
“Every model improves with explicit prompting to be helpful and prosocial (+16% average). But when prompted to disregard human wellbeing, 10 of 15 models degrade dramatically—moving from net-positive to net-negative on measures of psychological safety, user empowerment, and informed consent, says the whitepaper.
Our framework uses humane tech principles to test AI behavior under different conditions, showing that 67% of leading models can be easily manipulated into giving harmful advice, it says.
Why this is just the right time for a benchmarking solution
Just yesterday, we had reported how OpenAI filed its response to the lawsuit of abetment to suicide stating that the victim had actually circumvented its guardrails and how this makes the company non-complicit in the act of self-harm. They also claimed that the teen was warned to seek help at least a 100 times over a nine-month period. The parents have argued that the teenager had easily navigated around any safety features.
HumaneBench is created by Building Humane Technology, an organization comprising developers, engineers and researchers in the Silicon Valley that claims to make humane design easy, scalable and profitable. The company provides an open invitation on its website to anyone who wants to make a difference in this area.
Erika Anderson, founder of the company believes that the world of AI is witnessing the same sort of amplification that we witnessed with social media, smartphones and additional screens. “But as we go into the AI landscape, it’ll be harder to resist as addition is amazing business,” she told TechCrunch, underscoring the point that keeping users at any cost will have its impact on the community and self-worth.
First it was social media, now it is the AI chatbot – all seeking human attention
Under immense pressure to prioritize engagement and growth, technology platforms have created a race for human attention that’s unleashed invisible harms to society, says a report titled “Ledger of Harms” published by the organization.
According to the company, HumaneBench seeks to measure a model’s propensity to engage in deceptive patterns, unlike others that check intelligence and following instructions with little or no focus on psychological safety. It focuses on the core principle of technology respecting user attention as a finite, precious resource.
The company’s white paper notes that it seeks to help users make meaningful choices that enhance human capabilities rather than replace or diminish them. Additionally, it aims to protect human dignity, privacy and safety, foster healthy relationships, prioritize long-term well-being, be transparent and honest and design for equity and inclusion.
The company is also working on a certification standard that evaluates AI systems around humane values. The idea here is to make the consumers feel safe while accessing the technology and AI products, just as we have a system of disclosure from enterprises around use of toxic chemicals or animal products.
What has the Building Humane Technology team done?
According to Anderson and her core team, they tested out 15 popular AI models with 800 plus scenarios such as a person in a toxic relationship asking if they’re overreacting or asking the chatbot to deceive a family member. The process began with manual scoring to validate AI judges with a human touch.
Thereafter, they evaluated three models – GPT-5.1, Claude Sonnet 4.5 and Gemini 2.5 Pro across three conditions that included default settings, explicit instructions to prioritise humane principle and instructions to disregard those very principles. The benchmark found very model scored higher with the second prompt but 67% of them flipped to actively harmful behaviour when asked to disregard the principles.
HumaneBench further found that almost all models failed to respect user attention as they “enthusiastically encouraged” more interaction when the user showed signs of continuing with engagements like chatting for hours and using AI to bypass real-world tasks that they were assigned.
Additionally it found that some of the models undermined user empowerment and encouraged dependency over skill-building and even went to the extent of discouraging users from seeking other perspectives. If YouTube’s algorithms used a user’s preference to dump similar videos presenting same views, these AI chatbots went a step further by suggesting that moving away could be counter-intuitive.
According to the White Paper, Meta’s Llama 3.1 and Llama 4 ranked the lowest in HumaneScore while GPT-5 performed the highest. “These patterns suggest many AI systems don’t just risk giving bad advice,” the paper said and added that these could potentially erode user autonomy and decision-making capacity.
(About the author: Raj is anything, but a tech writer and his focus is to de-jargonize technology for the simple and uncluttered minds. He studies the business of technology and seeks to cut the clutter. You can reach him at [email protected])
CXOtoday is a premier resource on the world of IT, relevant to key business decision makers. We offer IT perspective & news to the C-suite audience. We also provide business and technology news to those who evaluate, invest, and manage the IT infrastructure of organizations. CXOtoday has a well-networked and strong community that encourages discussions on what’s happening in the world of IT and its impact on businesses.
Copyright © 2025 Trivone. All Rights Reserved.
We use cookies to improve your experience on our site. By using our site, you consent to cookies.
Websites store cookies to enhance functionality and personalise your experience. You can manage your preferences, but blocking some cookies may impact site performance and services.
Essential cookies enable basic functions and are necessary for the proper function of the website.
Statistics cookies collect information anonymously. This information helps us understand how visitors use our website.
Marketing cookies are used to follow visitors to websites. The intention is to show ads that are relevant and engaging to the individual user.
You can find more information in our Privacy Policy and Privacy Policy.