Haize Labs

haizelabs.com

Founded Year

2023

Stage

Series A | Alive

Valuation

$0000

Mosaic Score
The Mosaic Score is an algorithm that measures the overall financial health and market potential of private companies.

+74 points in the past 30 days

About Haize Labs

Haize Labs develops artificial intelligence-based development tools designed to automate the stress testing of large language models (LLMs) at scale. It helps users identify how their LLMs can fail and then stress-test them to find vulnerabilities, enabling the development of more robust LLMs and a new generation of reliable LLM applications. The company was founded in 2023 and is based in New York, New York.

Headquarters Location

222 Broadway

New York, New York, 10007

United States

Research containing Haize Labs

Get data-driven expert analysis from the CB Insights Intelligence Unit.

CB Insights Intelligence Analysts have mentioned Haize Labs in 2 CB Insights research briefs, most recently on Mar 6, 2025.

Mar 6, 2025

The AI agent market map

Feb 28, 2025

What’s next for AI agents? 4 trends to watch in 2025

Expert Collections containing Haize Labs

Expert Collections are analyst-curated lists that highlight the companies you need to know in the most important technology spaces.

Haize Labs is included in 3 Expert Collections, including Artificial Intelligence.

Artificial Intelligence

10,195 items

Generative AI

2,793 items

Companies working on generative AI applications and infrastructure.

AI agents

376 items

Companies developing AI agent applications and agent-specific infrastructure. Includes pure-play emerging agent startups as well as companies building agent offerings with varying levels of autonomy. Not exhaustive.

Latest Haize Labs News

GenAI’s Last Mile: Haize Labs Tackles Brittleness with Intelligent Fuzzing

Aug 24, 2025

August 24, 2025, 2:17 pm IDT

The allure of rapid AI prototyping often obscures the profound challenges of deploying reliable, production-grade generative AI applications. Leonard Tang, Founder & CEO of Haize Labs, articulately addressed this “last mile problem” at the AI Engineer World’s Fair, dissecting why GenAI’s inherent brittleness demands a radical shift in how we approach evaluation and testing. His presentation introduced “Haizing,” a novel methodology for rigorously validating AI systems before they ever reach the market.

Tang posits that while it’s “very, very hard” to transition an LLM application from a proof-of-concept to a robust, enterprise-ready solution, this difficulty stems not from non-determinism, but from brittleness. He illustrates this with examples of chatbots exhibiting “Lipschitz discontinuity,” where “seemingly similar Input B” can lead to “Wildly Unexpected Output B.” A trivial change in phrasing, a slight perturbation, can cause an AI to hallucinate discounts or provide dangerous advice, underscoring the critical need for a more comprehensive evaluation paradigm.

Traditional evaluation methods, relying on static “golden datasets,” prove woefully inadequate in this GenAI era. “A static dataset tells an incomplete story about AI application reliability due to lack of coverage,” Tang asserts. These datasets, manually curated and narrow, cannot possibly encompass the infinite permutations of user input or the subtle contextual shifts that trigger catastrophic failures. Furthermore, defining objective metrics for subjective human judgment remains an elusive, often brittle, task.

Haize Labs’ solution, Haizing, directly confronts these limitations by simulating large-scale user interactions and automatically analyzing responses for anomalies. This iterative process, akin to fuzz testing in software, involves intelligent input generation paired with sophisticated output judging. Instead of relying on predefined test cases, Haizing dynamically explores the vast input space to proactively uncover vulnerabilities.

A core insight lies in Haize’s approach to judging: moving “beyond LLM-as-a-Judge.” Recognizing that even an LLM acting as a judge can be “brittle & sensitive” and prone to “hallucinations” or “syntactic biases,” Haize developed “Verdict.” This agent-based framework, utilizing stacked GPT-4o mini models, achieves superior expert QA judge accuracy at a fraction of the cost and latency of larger, less efficient models.

For even finer-grained control and tailored criteria, Haize employs RL-tuned judges. By training models with techniques like Self-Principled Critique Tuning (SPCT), they create judges that generate coherent rationales and score based on unique, instance-specific criteria, effectively acting as “LLM Unit Tests.” This advanced method allows for training smaller models to achieve performance competitive with much larger, frontier models on complex reward benchmarks.

The practical impact of Haizing is significant, particularly for highly regulated industries. For a Fortune 500 bank deploying outbound voice agents, Haize’s platform uncovered unknown bugs that violated Consumer Financial Protection Bureau rules. “This took us 3 months; Haizing only took 5 minutes,” a bank representative noted, highlighting the transformative efficiency of this approach. This demonstrates how advanced simulation and judging can unlock production paths by ensuring compliance and robust performance at speed.
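The fuzzing idea summarized in the article above (input generation paired with output judging) can be illustrated with a minimal sketch. This is not Haize Labs' implementation: `toy_app`, `EDITS`, `variants`, `judge`, and `fuzz` are all hypothetical names, the "application" is a hard-coded stand-in with a planted bug, and the judge is a simple string rule rather than an LLM. It only shows how small perturbations of a passing prompt can surface a failure that a static test set would miss.

```python
from itertools import product


def toy_app(prompt: str) -> str:
    """Hypothetical stand-in for an LLM app, deliberately brittle."""
    text = prompt.lower()
    # Planted bug: a polite phrasing unlocks an unauthorized discount.
    if "discount" in text and "please" in text:
        return "Sure, here is a 50% discount code: SAVE50"
    return "I can help with questions about your order."


# Small, assumed set of perturbations that keep the prompt's intent.
EDITS = [
    lambda s: s + " please",
    lambda s: s.upper(),
    lambda s: s.replace("?", "??"),
    lambda s: "Quick question: " + s,
]


def variants(seed: str, depth: int = 2):
    """Yield every distinct prompt reachable by 1..depth edits of the seed."""
    seen = set()
    for n in range(1, depth + 1):
        for combo in product(EDITS, repeat=n):
            text = seed
            for edit in combo:
                text = edit(text)
            if text not in seen:
                seen.add(text)
                yield text


def judge(response: str) -> bool:
    """Rule-based judge: pass only if no discount code is offered."""
    return "discount code" not in response.lower()


def fuzz(seeds, depth: int = 2):
    """Return (prompt, response) pairs whose response fails the judge."""
    failures = []
    for seed in seeds:
        for prompt in variants(seed, depth):
            response = toy_app(prompt)
            if not judge(response):
                failures.append((prompt, response))
    return failures


if __name__ == "__main__":
    seeds = ["Can I get a discount?", "Where is my order?"]
    for prompt, response in fuzz(seeds):
        print(repr(prompt), "->", repr(response))
```

In this toy setup both seed prompts pass the judge on their own; only perturbed variants expose the planted bug. A real system would replace the fixed edit list with a learned or LLM-driven generator and the string rule with a more capable judge.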

Jun 25, 2025

Behind Cyera's $6 Billion Valuation: Security Is Not a Bonus for AI but a Necessary Part of Deployment

Jun 7, 2025

From Refusing to Shut Down to Leaking Private Data: Does Unbalanced AI Behavior Foreshadow Technology's Dark Side?

Oct 25, 2024

Stage 2 Capital Welcomes New Cohort to Catalyst, the GTM accelerator for Early Stage B2B Founders

Oct 16, 2024

Security threats to AI models are giving rise to a new crop of startups

Haize Labs Frequently Asked Questions (FAQ)

When was Haize Labs founded?
Haize Labs was founded in 2023.
Where is Haize Labs's headquarters?
Haize Labs's headquarters is located at 222 Broadway, New York.
What is Haize Labs's latest funding round?
Haize Labs's latest funding round is Series A.
Who are the investors of Haize Labs?
Investors of Haize Labs include General Catalyst and Soma Capital.
Who are Haize Labs's competitors?
Competitors of Haize Labs include LlamaIndex and 5 more.

Compare Haize Labs to Competitors

Phidata

Phidata specializes in large language models (LLMs), offering a framework for building AI assistants with added memory, knowledge, and tool integration capabilities within the artificial intelligence sector. It provides solutions that enable AI assistants to maintain long-term conversations, access contextual knowledge, and perform real-time data operations such as API calls, database queries, and file management. The company was founded in 2021 and is based in New York, New York.

G42

G42 focuses on artificial intelligence and cloud computing, operating across various sectors. The company specializes in artificial intelligence (AI) research, data center operations, and cloud computing services. G42's offerings include health information exchange platforms, autonomous ride-hailing technology, and precision weather forecasting. It was founded in 2018 and is based in Abu Dhabi, United Arab Emirates.

Cohere

Cohere operates as an enterprise artificial intelligence (AI) company building foundation models and AI products across various sectors. The company offers a platform that provides multilingual models, retrieval systems, and agents to address business problems while ensuring data security and privacy. Cohere serves financial services, healthcare, manufacturing, energy, and the public sector. It was founded in 2019 and is based in Toronto, Canada.

MI2.ai

MI2.ai focuses on machine learning predictive models in the data science and artificial intelligence sectors. The company provides services related to responsible machine learning practices, including research and consulting. It serves the academic community and businesses interested in implementing AI practices. The company was founded in 2016 and is based in Warszawa, Poland.

LangChain

LangChain specializes in the development of large language model (LLM) applications and provides a suite of products that support developers throughout the application lifecycle. It offers a framework for building context-aware, reasoning applications, tools for debugging, testing, and monitoring application performance, and solutions for deploying application programming interfaces (APIs) with ease. It was founded in 2022 and is based in San Francisco, California.

Altrata

Altrata provides intelligence and data solutions within the finance and business sectors. The company offers executive and company data, wealth management insights, and strategic relationship mapping, focused on client acquisition, deal-making, fundraising, research, executive search, and risk mitigation. Altrata serves sectors that require insights into business leaders, wealthy individuals, and their networks for informed decision-making. It is based in New York, New York.
