New framework and set of capabilities for benchmarking and red teaming AI evaluation
DataRobot, the enterprise AI platform leader, today announced the integration of LLM evaluation measures aligned with a new initiative from the Singapore government agency Infocomm Media Development Authority (IMDA). The “Project Moonshot” initiative, unveiled at the Singapore Asia Tech x Summit, offers new capabilities that help AI practitioners and system owners manage LLM deployment risks by providing a common framework for benchmarking and red teaming evaluation.
“At DataRobot, our focus is addressing the confidence gap and helping organizations scale responsible use of generative AI,” said Jay Schuren, Chief Customer Officer, DataRobot. “We’re excited to announce that our latest product release incorporates Project Moonshot’s testing toolkit and its benchmarking and evaluation tests. The result is that LLM evaluations are more accessible and help scale the responsible use of generative AI, enabling practitioners to turn on and configure guard models to change the behavior and responses of LLMs.”
Project Moonshot delivers three core capabilities for AI practitioners and system owners:
- Automated evaluation tools for generative AI solutions that easily integrate into CI/CD pipelines.
- A benchmark repository allowing teams to run evaluations relevant to their applications by curating the right benchmarks.
- A one-stop tool for AI red teaming, from jailbreaks to customized attacks.
“The development of Project Moonshot, one of the world’s first open-source tools to bring red teaming, benchmarking and baseline testing together in an easy-to-use platform, would not have been possible without the contribution of partners such as DataRobot,” said Dr Ong Chen Hui, Chair of the Governing Committee at AI Verify Foundation. “Project Moonshot will provide developers with an intuitive toolkit to test their LLM applications. This new toolkit signals Singapore’s continued commitment to advance the global open-source efforts toward addressing generative AI safety concerns.”
“We are proud to support our portfolio company, DataRobot, on its growth journey in Southeast Asia,” said Paul Ng, Chief Executive Officer, EDBI. “The company has not only expanded its footprint in Singapore but has also fostered collaborations that benefit the local innovation ecosystem, such as this partnership with IMDA. Project Moonshot provides local enterprises with the right tools to deploy generative AI technologies confidently. As a strategic investor, we are committed to creating value for our portfolio companies while enhancing Singapore’s innovation capabilities.”
DataRobot is an IMDA-accredited company and a member of the AI Verify Foundation, which launched the world’s first AI Governance Testing Framework and Toolkit in 2022.
The post DataRobot joins IMDA to Make LLM Evaluation More Accessible to AI Builders first appeared on AI-Tech Park.