Venture Bytes #132: Building the Case for Vertical AI

Building the Case for Vertical AI

One of the most interesting debates in AI today is whether there is still an application layer left to build. As frontier models continue to improve, many founders are asking a difficult question. If companies like OpenAI and Anthropic are becoming increasingly capable, will they eventually absorb most of the value in the software stack?

In a recent essay, "Avoiding Death on the Yellow Brick Road," a16z partner Joe Schmidt argues that the answer is no. His central thesis is that while foundation model companies will dominate horizontal workflows that improve with raw model capability, a vast opportunity remains in more complex, domain-specific applications. The argument is compelling. Better models do not automatically solve messy workflows, regulatory requirements, or industry-specific processes.

But the more interesting distinction may not be between the "Yellow Brick Road" and the "rest of Oz." It may be between intelligence and execution.

Intelligence is rapidly becoming abundant. But businesses do not buy intelligence in the abstract. They buy execution in the form of claims processed, invoices reconciled, policies underwritten, and contracts reviewed. McKinsey estimated that generative AI could unlock between $2.6 trillion and $4.4 trillion in annual economic value across industries. Across the 63 use cases McKinsey examined, less than a third of the potential impact is directly attributable to the model itself. The rest depends on workflow redesign.

The gap between what a model knows and what an enterprise needs accomplished is far larger than many observers appreciate. That gap consists of workflows, approvals, integrations, compliance requirements, exception handling, and institutional knowledge accumulated over decades.

This distinction matters because systems of execution compound in ways models alone do not. Every transaction creates feedback loops. Every human correction becomes training data. Every exception reveals another edge case. Over time, these systems accumulate process knowledge that exists nowhere on the public internet and cannot simply be recreated by releasing a better model.

Viewed through this lens, the future of enterprise AI looks less like a battle between foundation models and application companies and more like a division of labor. Foundation model providers will compete to supply increasingly commoditized intelligence. Application companies will compete to orchestrate that intelligence inside the real-world systems where work actually gets done.

While much of the discussion around AI has focused on obvious targets such as healthcare, legal services, and financial services, some of the most compelling opportunities may lie in industries that receive far less attention. Property management is one such area. On the surface, the industry hardly looks like fertile ground for a venture-scale software company. The market is fragmented, workflows are highly manual, and property managers rely on a patchwork of legacy systems, leasing agents, maintenance teams, contractors, and call centers.

That complexity, however, is exactly what makes the opportunity attractive. A general-purpose AI assistant may be able to answer questions or draft emails, but running the operations of an apartment portfolio requires much more than intelligence. It requires integrations with property management systems, approvals, audit trails, escalation paths, and thousands of edge cases accumulated over years of operating experience.

This helps explain the emergence of companies like EliseAI. Based in New York, the company began automating leasing conversations for apartment operators. But by embedding itself deeper into daily operations, EliseAI gradually expanded into maintenance, collections, renewals, and resident engagement across the tenant lifecycle.

In 2025, the company reached roughly $100 million in ARR and serves approximately one in every eight apartments in the US. Valued at $2.2 billion in its latest Series E round, the company generates roughly $670,000 of revenue per employee, more than double the efficiency achieved by many leading software companies.

Sales provides another illustration of how value in AI increasingly accrues to systems of execution rather than intelligence itself. At first glance, sales appears to be one of the functions most vulnerable to foundation model companies. Prospecting for leads, writing emails, and answering objections seem like tasks that increasingly capable models should eventually absorb. But real-world sales workflows are considerably more complex. Success depends not only on intelligence, but on an accumulation of judgment acquired through thousands of interactions.

Based in New York, Clay is an exciting start-up to watch in this space. The company automates sales prospecting, lead enrichment, and go-to-market workflows for revenue teams. Clay achieved $100M ARR by 2025, representing a 20x increase from $5M in 2023 and 200x growth from $500K in 2022. This trajectory supported a valuation jump from $500M to $3.1B between mid-2024 and mid-2025.

Insurance provides another illustration of why abundant intelligence does not necessarily commoditize applications. Insurers operate within a web of regulations, legacy systems, approval processes, and carrier-specific rules. More importantly, much of the industry's knowledge exists not in databases, but in the judgment accumulated by experienced underwriters over decades.

Founded by former Uber and NVIDIA engineers, FurtherAI is an emerging start-up applying AI to insurance operations. The company helps automate underwriting and policy workflows, but its larger opportunity lies in capturing the institutional knowledge embedded within those processes. Every exception, override, and claim outcome creates a feedback loop. Over time, FurtherAI is not merely teaching machines how to process policies but teaching them how insurers think about risk.

AI Math Models Hold the Key to Mission Critical AI Accuracy

Over the past three years, the AI industry has been primarily focused on advancing model capabilities. Each new generation has delivered stronger performance in coding, workflow automation, and increasingly complex tasks. Yet despite these advances, hallucinations remain a fundamental challenge.

A peer-reviewed study published in late 2025 concluded that hallucinations are "structurally inevitable under existing LLM architectures." Even the most advanced models in 2026 continue to exhibit measurable error rates: 0.7% on summarization tasks, 10-20% on more demanding benchmarks, 18.7% on legal queries, and 15.6% on medical questions. The economic impact is significant. Between 2023 and 2025, the industry spent an estimated $12.8 billion on hallucination detection and mitigation solutions. In 2024 alone, AI hallucinations contributed to an estimated $67.4 billion in global business losses.

AI math models represent a fundamentally different approach. Unlike traditional LLMs, they do not rely on probabilistic text generation alone. Every step in the reasoning process must satisfy strict verification rules, enabling formal validation of intermediate and final outputs. As a result, AI math models eliminate hallucinations within domains where correctness can be mathematically verified. The stakes vary by setting. A hallucination in a casual chatbot conversation is largely harmless. However, in domains such as semiconductor design, drug discovery, and defense applications, hallucinations can have serious consequences. As organizations deploy AI into increasingly high-stakes environments, the industry’s bottleneck is gradually shifting from generating answers to verifying whether those answers are correct.

As AI evolves from general-purpose intelligence to specialist intelligence, accuracy and reliability are becoming as important as raw capability. This shift helps explain why a new category of startups, including Axiom Math and Harmonic, has attracted multi-billion-dollar valuations despite being pre-revenue.

AI math models are also solving one of the key bottlenecks in scientific discovery. Throughout history, some of humanity’s most important technological breakthroughs began as abstract mathematics. For instance, number theory ultimately enabled internet cryptography. Yet mathematical progress has always been constrained by the scarcity of exceptional mathematicians. Breakthroughs often emerge from a handful of extraordinary individuals, making scientific progress difficult to scale. The central thesis behind startups like Axiom Math and Harmonic is that mathematical reasoning itself can be industrialized. Rather than relying exclusively on human intuition, AI systems could generate, test, verify, and prove mathematical ideas at machine speed, potentially accelerating scientific discoveries across numerous industries.

We believe that AI math startups have created a defensible position for themselves, as specialized companies continue to outperform much larger incumbents in highly technical markets. Foundation model companies optimize for breadth. Their systems must simultaneously excel at conversational AI, coding assistance, enterprise productivity, search, agents, and consumer applications. AI math startups, by contrast, optimize for a completely different objective. Their focus is not generating plausible answers; it is generating provably correct answers. While the distinction appears subtle on the surface, it creates entirely different technical requirements, research priorities, evaluation benchmarks, and product architectures. Just as cybersecurity evolved into a standalone category despite every software company caring about security, verification will emerge as its own category despite every AI company caring about accuracy.

The implications extend far beyond mathematics itself. Any industry where mistakes are costly could potentially benefit from formal verification systems. Software developers could mathematically verify critical code before deployment. Semiconductor companies could validate increasingly complex chip designs. Drug discovery platforms could improve confidence in scientific simulations. Aerospace and defense organizations could reduce risk in mission-critical systems. Robotics, quantum computing, and advanced manufacturing all represent potential beneficiaries as well. Viewed through this lens, the opportunity is not really about mathematics. It is about creating infrastructure for industries where being wrong is expensive.

Several independent trends have converged to make the rise of AI math models possible. First, modern AI infrastructure has finally become capable enough to engage meaningfully with advanced reasoning tasks. Second, the formal mathematics ecosystem has matured significantly over the past decade. Platforms such as Lean, Coq, and Isabelle have evolved from niche academic tools into scalable verification environments. Finally, the rapid adoption of AI across enterprises has created an urgent need for verification.
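
To make the idea of machine-checked reasoning concrete, here is a minimal sketch in Lean 4, one of the platforms named above. It is an illustration only and is not drawn from Axiom Math, Harmonic, or any other company discussed here. The theorem name add_comm_sketch is hypothetical, and the example assumes only Lean's core library.

```lean
-- Lean 4, core library only (no Mathlib). Names are illustrative.

-- A general statement: addition of natural numbers is commutative.
-- The proof term `Nat.add_comm a b` is checked by Lean's kernel.
theorem add_comm_sketch (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete statement, verified by computation.
example : 2 + 2 = 4 := rfl

-- A false statement cannot be certified. Uncommenting the next line
-- makes the file fail to check, because no proof of it exists:
-- example : 2 + 2 = 5 := rfl
```

The point of the sketch is the asymmetry it shows: a plausible-sounding but wrong claim does not produce a wrong answer, it produces no accepted proof at all.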

Axiom Math and Harmonic stand out among AI math startups. Axiom Math, based in California, commanded a valuation markup of 433% in its latest funding round. It also achieved a score of 12 out of 12 in the William Lowell Putnam Mathematical Competition, only the sixth perfect score in 98 years and more than 150,000 attempts. Marquee investors in the startup include Menlo Ventures, B Capital, Greycroft, and Toyota Ventures, among others. Harmonic, also based in California, commanded a valuation markup of 66% in its latest funding round. Aristotle, Harmonic’s math AI model, significantly outperforms O4, GPT 4.1, Sonnet 3.7, and Gemini 2.5 Flash on the VERINA benchmark, resolving 96.8% of problems while incumbents were stuck between 3% and 22%. Marquee investors in the startup include Kleiner Perkins, NVIDIA, and Sequoia Capital, among others.


