PromptQL Partners with UC Berkeley to Develop New Data Agent Benchmark for Reliability of Enterprise AI Agents
PromptQL, a platform for reliable AI, today announced a strategic research collaboration with the University of California, Berkeley to develop the first comprehensive data agent benchmark for enterprise reliability, designed specifically to evaluate general-purpose AI data agents in enterprise environments.
A recent McKinsey study revealed that 78% of organizations use AI in at least one business function. However, more than 80% say their organization hasn’t seen a tangible impact on enterprise-level Earnings Before Interest and Taxes (EBIT). The partnership – led by Aditya Parameswaran, Professor and Co-Director of UC Berkeley’s EPIC Data Lab, along with his students – addresses this fundamental challenge that organizations face when deploying AI systems in business-critical environments.
While existing agentic data benchmarks like GAIA, Spider, and FRAMES test specific AI tasks, they overlook the complexity, reliability demands, and messy, siloed data that define real business environments. The forthcoming data agent benchmark aims to offer a solution by creating a framework that reflects real-world complexities.
“Our customer conversations reveal a clear pattern—they’re ready to move from proof-of-concepts to production AI, yet they lack the evaluation tools to make confident deployment decisions,” said Tanmai Gopal, CEO of PromptQL. “The data agent benchmark changes that by using representative datasets from our work in telecom, healthcare, finance, retail, and anti-money laundering to reflect the real complexity of enterprise AI.”
UC Berkeley’s EPIC Data Lab brings data-systems research expertise to this collaboration. Professor Parameswaran is a leading authority on the use of AI for next-generation usable data analysis tools and has received numerous prestigious awards. His research group has created widely adopted data tools with tens of millions of downloads.
“Current benchmarks suffer from what I call the ‘1% problem’—they’re built for tech giants and ignore the 99% of organizations grappling with real-world data complexity,” Parameswaran said. “The data agent benchmark marks a shift toward evaluating AI based on the reliability, transparency, and practical value enterprises actually need. This collaboration bridges academic rigor with the production insights PromptQL brings from real deployments.”