New AI Tool Beats OpenAI by Teaching Codebots How to ‘See’ – BriefGlance

Causal Dynamics Lab’s Cielara Code is outperforming giants by fixing a core flaw: AI agents can write code, but they can’t find where to put it.
SAN FRANCISCO, CA – May 05, 2026 – In a significant challenge to the titans of artificial intelligence, San Francisco-based startup Causal Dynamics Lab (CDL) has unveiled a new product that outperforms coding models from OpenAI and Anthropic on several key benchmarks. The product, Cielara Code, addresses a critical and costly flaw in the current generation of AI coding assistants: their inability to understand the context of the software they are modifying.
While AI coding tools can generate code at an astonishing rate, this speed has come at a price. The 2025 DORA report, a respected industry benchmark for software development performance, noted a 7.2% drop in deployment stability directly attributable to the use of AI coding tools. This growing issue, which AWS CTO Werner Vogels has termed "dynamic verification debt," highlights a dangerous gap between an AI's ability to write code and its ability to understand the consequences of that code in a live production environment.
CDL's Cielara Code doesn't aim to replace models like OpenAI’s Codex or Anthropic’s Claude Code, but to act as an essential safety layer on top of them. New benchmark results suggest this approach is not only effective but superior in tackling one of the hardest parts of software development.
According to research from Causal Dynamics Lab, the primary bottleneck for today's AI coding agents isn't a lack of programming skill, but a profound lack of awareness. After instrumenting thousands of AI-driven coding sessions, CDL found that a staggering 56.8% of an agent's actions were spent simply reading files, with another 24.2% dedicated to basic search commands like grep. Fewer than 1% of an agent's actions involved the task it was built for: actually editing code.
The data reveals that AI agents are effectively fumbling in the dark, treating complex software architecture as flat text. They struggle to see how files connect, how functions call one another, or how a change in one place might cause a cascade of failures elsewhere. This problem is particularly acute in large, complex codebases; CDL's study found that when a fix required changes to more than six files, an agent's ability to recall the necessary information plummeted while the compute power wasted on failed attempts quadrupled.
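To make the "flat text" point concrete, here is a minimal sketch of the kind of question a text search cannot answer directly: which callers are affected, transitively, when one function changes. The call graph and function names below are invented for illustration and are not taken from CDL's study or product.

```python
from collections import deque

# Hypothetical call graph: each key calls the functions listed as its values.
CALLS = {
    "api.handle_order": ["billing.charge", "inventory.reserve"],
    "billing.charge": ["billing.apply_tax"],
    "inventory.reserve": [],
    "billing.apply_tax": [],
    "reports.monthly": ["billing.apply_tax"],
}

def reverse_edges(calls):
    """Invert the graph so each function maps to its direct callers."""
    callers = {name: set() for name in calls}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    return callers

def impacted_by(changed, calls):
    """Return every function that directly or transitively calls `changed`."""
    callers = reverse_edges(calls)
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen

print(sorted(impacted_by("billing.apply_tax", CALLS)))
# prints ['api.handle_order', 'billing.charge', 'reports.monthly']
```

A grep for `apply_tax` would surface the two direct callers, but `api.handle_order` never mentions that name and only shows up once the graph is traversed.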
"Every coding agent out there today uses grep, which is like a surgeon operating without imaging," said Hasibul Haque, CEO at Causal Dynamics Lab and former head of platform engineering at Uber. "We created Cielara Code to help agents see better: it provides a clear understanding of the working environment, making the reasons behind each change clear and verifiable."
The problem is not just theoretical. A well-documented GitHub issue (#42796) for Claude Code illustrates it at scale, showing how current agents fail to grasp the interconnected nature of modern software, leading to flawed or incomplete solutions.
Cielara Code’s breakthrough lies in its novel approach to representing software. Instead of letting an AI agent wander through a directory of files, it first builds a comprehensive map of the entire system. This is achieved through a proprietary "Production World Model," represented as a 6-layer causal graph.
This graph is more than just a dependency tree. It encodes deep, contextual information about the software, including what the code does, why it was created, who owns it, its operational constraints, where it runs, and its runtime behavior. When a failure occurs, the system can trace the problem back not just to the line of code that broke, but to the developer who approved it and the business reason for the change.
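The Production World Model is proprietary and the article does not describe its schema, so the following is only a rough sketch of the general idea: nodes that carry several kinds of context, and a trace that walks from a failing component to its dependencies while reporting who owns each one and why it exists. The layer names, the `Node` class, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Layer names are illustrative guesses based on the six kinds of context
# listed above; they are not CDL's actual layer names.
LAYERS = ("behavior", "intent", "ownership", "constraints", "deployment", "runtime")

@dataclass
class Node:
    name: str
    layers: dict                      # layer name -> free-text annotation
    depends_on: list = field(default_factory=list)

def trace(graph, failing, wanted=("ownership", "intent")):
    """Walk depth-first from a failing node through its dependencies,
    returning one (name, *annotations) tuple per node visited."""
    unknown = set(wanted) - set(LAYERS)
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    report, seen, stack = [], set(), [failing]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        node = graph[name]
        report.append((name,) + tuple(node.layers.get(layer, "unknown") for layer in wanted))
        # Reverse so dependencies are visited in their listed order.
        stack.extend(reversed(node.depends_on))
    return report

# Hypothetical two-node system.
GRAPH = {
    "checkout": Node("checkout", {"ownership": "payments-team", "intent": "take orders"}, ["tax"]),
    "tax": Node("tax", {"ownership": "finance-eng", "intent": "new VAT rule"}),
}

for row in trace(GRAPH, "checkout"):
    print(row)
```

Keeping every layer on the same node is the simplest possible layout; a real system would more plausibly store layers as separate node and edge types so each can be queried on its own.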
This contextual map allows Cielara Code to guide AI agents with unprecedented precision. Across three independent benchmarks, it achieved an overall code localization accuracy of 0.774, surpassing both Claude Code (Opus-4.6) at 0.738 and OpenAI Codex (GPT-5.4) at 0.707. On MULocBench, a test suite of over 1,000 issues, Cielara Code reduced task time while cutting compute costs by 30 to 40 percent.
The engine powering this is REASONARA, a graph-structured causal memory layer. It can store an immense context of over 125 million tokens but intelligently retrieves only the thousand or so tokens relevant to a specific task. This represents a context-lookup reduction of up to 98% compared to brute-force methods, enabling faster and more efficient analysis.
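How REASONARA selects tokens is not described, so the sketch below shows only one generic way a graph-structured memory could return a small, relevant slice of a large store: start from the nodes a task mentions, expand to neighbors breadth-first, and stop adding nodes once a token budget is reached. The graph, token counts, budget, and function name are all hypothetical.

```python
from collections import deque

def retrieve(graph, token_counts, seeds, budget):
    """Expand breadth-first from `seeds`, keeping nodes while they fit in `budget`.

    A node that does not fit is skipped and its neighbors are not expanded.
    Returns the selected node names and the number of tokens used.
    """
    selected, used = [], 0
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        cost = token_counts[node]
        if used + cost > budget:
            continue
        selected.append(node)
        used += cost
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return selected, used

# Hypothetical memory: edges link related components, counts are tokens per node.
GRAPH = {"checkout": ["cart", "payments"], "payments": ["fraud"], "cart": [],
         "fraud": [], "search": ["index"], "index": []}
TOKENS = {"checkout": 400, "cart": 300, "payments": 250,
          "fraud": 500, "search": 900, "index": 700}

nodes, used = retrieve(GRAPH, TOKENS, ["checkout"], budget=1000)
total = sum(TOKENS.values())
print(nodes, used, f"{1 - used / total:.0%} fewer tokens than reading everything")
```

In this toy store the unrelated `search` and `index` nodes are never touched, which is where the saving comes from. The percentage it prints depends entirely on the made-up numbers and says nothing about the figures quoted above.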
The market has been quick to respond. Cielara Code is already in use by 11 Fortune 100 and over 40 Fortune 500 companies, which see it as a critical piece of validation infrastructure for their increasingly automated development pipelines.
For enterprise leaders, the tool addresses a growing anxiety. "Board members and auditors expect more proactive risk management," said the CISO of one of the largest law firms in the United States, who is a Cielara Code customer. "Leaders now want proof that security can anticipate risks caused by fast-moving AI and automation, instead of just reacting after incidents."
This sentiment is echoed by other industry leaders who see the technology as a necessary evolution. Phillip Miller, Vice President and Global CISO at H&R Block, described CDL's technology as a "generational leap towards the original promise of AI." He added, "Enterprises need solutions to problems they cannot solve with people alone… When I wrote Hacking Success, I described a world where AI needs strong, directive policy (not rules / guardrails) to be safe and effective. Enterprises now have an option to leverage Cielara's models to oversee deployments of AI agents, models, and their supporting infrastructure."
Causal Dynamics Lab, founded by a team of Uber platform veterans and AI researchers from Microsoft Research and Emory University, views its current products as just the first step. The underlying Production World Model serves as a foundation for a much broader vision.
"AI has already changed how people find information," noted Matt Fisher, former Co-Founder and CTO of Daydream. "The next step is to change how people make decisions by exploring possibilities, comparing options, and understanding the outcomes before making a choice. That shift towards exploring outcomes is what CDL is focusing on."
The company's roadmap involves expanding its simulation capabilities to predict the full impact of changes not just in code, but across infrastructure, policy, and operations. The ultimate goal is to create a permanent, enterprise-wide reasoning layer that any AI agent can consult before making a change, ensuring that the speed of AI development is finally matched by a commensurate level of safety and understanding.
© 2026 BriefGlance. All rights reserved. | Far beyond the business news.