Building an AI copilot from 0 to $1B

Design

Product Management

AI SaaS

Voice AI

Company

Eve - Series B unicorn backed by a16z

Role

Product & Design Lead

From initial insight to discovery all the way through design, launch, iteration, and GTM enablement.

Team

Tech Leads - Jack Burns, Kenny Williams

Product Engineers - Joel Wang, Jimmy Phan, Shiva

We also worked with partners in marketing, sales, customer success, and infrastructure, as well as our fearless CPO, Matt Noe.

Impact

Shipped a strategically critical, company-level bet from 0 to 1.
50% voluntary adoption of AI agents over human staff.
Zero missed calls for beta firms, previously the #1 source of revenue loss.

THE PROBLEM

The Scale Paradox

Plaintiff law is a high-stakes, high-volume business. Firms spend millions on marketing, but their intake process - the source of all their cases - remains fundamentally unscalable.

The Trade-off

Every firm faces the same brutal choice:

Outsource: Call centers answer fast but can't distinguish a $10M catastrophic injury from a fender bender. High-value cases slip through.
In-house: More control, but management overhead, QA burden, and churn eat into margins.
The Black Box: Partners can't calculate ROAS because settlements take 1–3 years. They optimize for "call volume" instead of "case value."

The Data

64% of prospective clients who didn't hire a lawyer said the firm never responded. (Clio Legal Trends Report)
75% of callers prefer a guaranteed callback over waiting on hold. 54% hang up within 8 minutes. (Nextiva)
Even high-performing firms audit only ~5% of intake calls. The rest is operational guesswork.

The result: millions in settlement fees lost annually to missed calls, poor qualification, and slow follow-up.

PRODUCT STRATEGY

The Labor Replacement Bet

We saw an opportunity to replace the labor layer agentically - not just augment it.

Competitive Landscape

The market had two types of players, and neither solved the real problem:

Legacy intake platforms bolting on AI features ad hoc. Bloated UX, not AI-native. Strong data gravity and workflow lock-in kept them entrenched.
Point solutions tackling small pieces of intake without a path from intake to settlement.

The gap: no one offered a single system that captures calls, qualifies leads, coaches staff, and provides operational visibility, all without hiring more humans.

Our Bet

We hypothesized that voice models and LLMs would soon be capable of handling law firm intake end-to-end better than human staff on average at a fraction of the cost. They're quirky now, but already outperform humans on:

Consistency - Never has a bad day, never forgets the script
Scalability - Capacity scales up or down with demand
Knowledge - Knows far more about what makes a good case
Objection handling - Trained on thousands of successful conversations

This was contrarian. Competitors were building "AI-assisted" tools for humans. We were building AI that replaces humans.

SEQUENCING

How did we get there?

I considered many different approaches to building out our vision for a unified, AI-native intake engine. I settled on this build order so we could prove our core hypotheses quickly, deliver our core value sooner, and tackle edge cases later.

1. Voice AI

Our entire product hinged on the strategic bet that a voice AI agent could replace human intake staff within 6 to 12 months.

2. Intake Intelligence

Intake managers need observability, value analysis, and insight regardless of whether humans or AIs are capturing new business for them.

3. Live AI Call Coaching

There will always be high-value cases that need human attention. That's why we decided to invest in live AI call coaching: to help human staff navigate the complexities of connecting with, assessing, and closing their most valuable clients.

DESIGN

Crafting an AI-native insight engine

I owned the full product loop, from understanding what intake specialists actually do, to writing requirements, to designing the product, to evaluating whether the AI was good enough to ship.

Designing the Voice Agent

Voice AI sample (audio)

I talked to dozens of intake specialists, managers, and law firm partners. I took intake calls myself to understand the real workflow: the pacing, the empathy required, the edge cases that trip up even experienced staff.

I iterated on the voice agent prompt directly with clients, tuning tone, pacing, and question flow based on what callers actually needed. Then I evaluated the agent through simulated testing and hours of calling it myself, hunting for edge cases: confused callers, ambiguous injuries, cases that didn't fit neat categories.

The key factors that matter on an intake call:

Demonstrating patience and empathy
Gathering critical information
Value-added communication
Asking good follow-up questions
Managing expectations

Actionable, High-Value Insights

For our dashboard, I focused on the questions that kept coming up in my discovery interviews:

What's customer service quality like at scale?
Are our staff doing a good job with the case types we typically handle?
Are we handling the right volume of leads?
What is the quality of our funnel from a lead value perspective?

Every chart answers a specific operational question. No vanity metrics, no dashboards for the sake of dashboards.

Progressive Disclosure

Early versions of our lead overview showed everything up front: a summary, a full rationale for the lead score, structured data about the lead, and calls. Users drowned.

They couldn't quickly ascertain whether the lead was valuable and worth attending to. So I redesigned the lead review flow around progressive disclosure:

Level 1 - A high-level summary with a color-coded lead score. Attorneys trust "green" leads instantly.
Level 2 - In-depth rationale for the score: liability factors, damages, case strengths.
Level 3 - A full transcript and audio playback for deep auditing.

The Scoring UX

Firms traditionally use 1–100 scoring. But intake staff can't parse that granularity in real time. They need to know: is this a good case or not?

I kept the 1–100 score since partners expect it, but color-coded the ranges: high (green), medium (yellow), and low (red). Staff get instant triage. Partners retain the familiarity they need.
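A minimal sketch of how this banding could work. The cutoffs (70 and 40) and the names `ScoreBadge` and `badge_for` are hypothetical, chosen only for illustration; the real thresholds are not stated here.

```python
from dataclasses import dataclass

# Hypothetical cutoffs, for illustration only.
HIGH_MIN = 70
MEDIUM_MIN = 40


@dataclass(frozen=True)
class ScoreBadge:
    score: int  # the familiar 1-100 number partners expect
    band: str   # "high", "medium", or "low"
    color: str  # "green", "yellow", or "red"


def badge_for(score: int) -> ScoreBadge:
    """Keep the raw score, but attach a band and color for instant triage."""
    if not 1 <= score <= 100:
        raise ValueError(f"score must be between 1 and 100, got {score}")
    if score >= HIGH_MIN:
        return ScoreBadge(score, "high", "green")
    if score >= MEDIUM_MIN:
        return ScoreBadge(score, "medium", "yellow")
    return ScoreBadge(score, "low", "red")
```

The badge carries both values, so the UI can lead with the color and still show the number to anyone who wants it.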

Managing Real-Time Cognitive Load

Our live call coaching functionality had two major goals: surface critical insights live on calls, and do so without distracting or overloading intake staff.

Stack metaphor: Insights (green flags, red flags, objection responses, follow-up questions) slide in one at a time, only when fully baked. Staff are never confronted by a wall of text or a barrage of UI elements.
Typing animation: Form fields animate as they auto-fill, drawing the eye only when action is needed. Staff make micro-corrections without breaking their flow.

The goal: hold a natural conversation. Let the AI work in the background. Intervene only when necessary.
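The "one at a time, only when fully baked" rule can be sketched as a small queue. This is an illustrative model, not the production implementation; `Insight`, `InsightStack`, and the `complete` flag are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Insight:
    kind: str               # e.g. "green_flag", "red_flag", "follow_up"
    text: str
    complete: bool = False  # True once the insight is fully generated


class InsightStack:
    """Holds pending insights and releases them one at a time."""

    def __init__(self) -> None:
        self._pending: List[Insight] = []

    def add(self, insight: Insight) -> None:
        self._pending.append(insight)

    def next_to_show(self) -> Optional[Insight]:
        """Return the oldest complete insight, or None if nothing is ready."""
        ready = next((i for i in self._pending if i.complete), None)
        if ready is not None:
            self._pending.remove(ready)
        return ready
```

Half-generated insights stay in the queue until they are complete, so the UI only ever has one finished card to slide in per call to `next_to_show`.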

METHODOLOGY

How I work

Problem Identification

I start with discovery interviews to build a rich mental model of my customers and their domain. This gives me the ability to find high-value feature opportunities, prioritize effectively, and develop an informed vision for what to build.

Designing in Code

During this project, I shifted my workflow from Figma to Cursor. The result is vastly greater speed, much higher-fidelity prototyping, and clearer validation. Design polish is about your standards and your taste, not your tools. Every image in this portfolio is from a code-based prototype, not from Figma.

Live Iteration

Because I prototype in code, I can deploy fully interactive, data-backed, stateful prototypes to customers for a level of realistic testing that is impossible in traditional design tools. I sometimes even refactor the prototype live during feedback sessions. By the end of a feedback call, customers can see the updates they requested and interact with them.

Launch & Lifecycle

I deploy features to high-signal beta customers to test them in the wild. Post-launch, we use usage data and qualitative conversations to make the hard call: iterate on the solution, leave it as-is, or sunset the feature if it doesn't prove its value.

TURNING A PROFIT

Unit Economics & Pricing

Creating a great product isn't just about pixels - it's about margins. Voice AI is expensive. High-volume LLM tokens add up fast. I needed to ensure the product was profitable per unit before we launched.

The Analysis

I dug into Snowflake to model unit economics across our beta cohort. The data revealed clear patterns in call behavior, token consumption, and volume distribution that shaped our pricing strategy.

Chart: Average call duration, human call vs. AI call (AI calls 33% faster)

Chart: Cost per lead for lead intelligence

Chart: Voice AI per-minute cost

Chart: Monthly infrastructure costs by firm size (Voice AI, Lead Intelligence)

Chart: Pricing at target margins, monthly price by firm size (small, medium, large)

Pricing Strategy

Pricing had to feel fair to small firms (~50 leads/month) while scaling profitably with large ones (500+ leads/month). I worked with Product Marketing and the CRO to inform "Per Minute" + "Per Lead" pricing that covered infrastructure costs with healthy margins across all segments.

We considered outcome-based pricing, but until we fully own the workflow end to end and our voice AI agent can sign leads, usage-based pricing is a simple, transparent, and easy-to-implement starting point.
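As a rough sketch of the arithmetic behind "Per Minute" + "Per Lead" pricing at a target margin: every number in the usage below is a made-up placeholder, and the function names are hypothetical. The only formula assumed is the standard gross-margin relationship, price = cost / (1 - margin).

```python
def monthly_cost(
    leads: int,
    minutes_per_lead: float,
    cost_per_minute: float,
    cost_per_lead: float,
) -> float:
    """Infrastructure cost for one firm in one month.

    Voice AI cost scales with minutes; lead intelligence cost scales with leads.
    """
    voice_cost = leads * minutes_per_lead * cost_per_minute
    intelligence_cost = leads * cost_per_lead
    return voice_cost + intelligence_cost


def price_at_margin(cost: float, target_margin: float) -> float:
    """Price needed so that (price - cost) / price equals target_margin."""
    if not 0 <= target_margin < 1:
        raise ValueError("target_margin must be in [0, 1)")
    return cost / (1 - target_margin)


# Placeholder inputs only: a small firm (~50 leads) and a large one (500 leads).
small = price_at_margin(monthly_cost(50, 10, 0.10, 1.00), target_margin=0.5)
large = price_at_margin(monthly_cost(500, 10, 0.10, 1.00), target_margin=0.5)
```

Because both cost terms scale with usage, the same per-minute and per-lead rates hold the margin constant from the smallest firm to the largest.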

Preventing Bill Shock

The drawback of usage-based pricing is anxiety. "How much am I spending?" becomes a constant question. I designed our usage dashboard to answer this proactively:

Transparency - Not just "minutes used" but what they got for it: leads captured, matters created, conversion rates.
Value - Show the ROI: "You spent $X on AI minutes → captured Y leads → Z became matters."
Invoice clarity - Worked with Finance to understand co-term vs. non-co-term contracts and designed how billing appears in the UI.
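The ROI line above is straightforward to derive from three numbers. A minimal sketch, assuming a hypothetical `usage_summary` helper and adding a conversion rate that the dashboard may or may not show in this form:

```python
def usage_summary(spend_usd: float, leads: int, matters: int) -> str:
    """Turn raw usage numbers into the value statement shown to the firm."""
    # Guard against division by zero for firms with no leads yet.
    conversion = matters / leads if leads else 0.0
    return (
        f"You spent ${spend_usd:,.0f} on AI minutes "
        f"→ captured {leads} leads "
        f"→ {matters} became matters ({conversion:.0%} conversion)."
    )
```

Pairing spend with outcomes in a single sentence answers "How much am I spending?" and "Was it worth it?" at the same time.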

Usage Dashboard showing lead creation, calls, and minutes over time

GTM ENABLEMENT

Designing for the Sale

A product only matters if you can sell it.

I partnered with sales, engineering, and PMM to operationalize the GTM strategy:

Pilot Process Strategy: Co-designed the sales pilot workflow alongside sales engineers, product engineers, and sales leadership to align technical capabilities with sales goals.
Robust Enablement Assets: Created a comprehensive library of materials, including video walkthroughs, interactive Arcade click-through demos, and knowledge base articles.
Demo Infrastructure: Established a sandbox environment with realistic data for daily sales calls and helped think through bespoke, high-impact demos for marketing events.
Training & Support: Executed "train the trainer" workflows with Product Marketing, held recurring office hours for CS/Sales, and fielded many, many direct field questions to unblock deals.

OUTCOMES

Validated Labor Replacement

We launched the beta program with several high-profile firms. The data validated our hypothesis: AI could replace, not just assist, human intake.

50% Voluntary Adoption

When given the choice between "Wait for a human" or "Talk to our AI assistant," half of callers chose the AI, both during business hours and after hours.

Zero Missed Calls

Beta firms reduced after-hours missed calls to near zero. Missed calls were previously their #1 source of revenue loss.

$200K+ ARR in Month One

Closed hundreds of thousands in revenue within weeks of launch.

The Competitive Edge

"The fact that we're so far ahead on intake is why we're closing all these $200K–$300K deals. Control the source of cases, you win the market."

- Matt Noe, Chief Product Officer, Eve

Customer Validation

"Are you afraid of a voice AI picking up the phone, or are you afraid of losing a $10M case? Somebody has just had a massive accident. Weigh the trade-offs. I do think Eve will be answering all of our calls very, very soon."

- Mike Morse, Founder & CEO, Mike Morse Law Firm

Operational Impact

"I can keep this open on my desk and pop in and out. My old process: find the phone number, copy it, go into Zoom contact center logs, change the date range, search, click to recording, download... this fixes that whole problem. If Audrey's call scores drop 12% this week, I just click in."

- Dan, Glassman Law Firm

REFLECTION

What I Learned

Realistic Data is Critical

Early code-based prototypes used clean conversations, expected lengths, and typical case types. When we plugged in real legal conversations, layouts broke.

The power of code-based prototyping is that you can back your designs with real data dynamically. Use that to stress-test the UI in ways traditional tools never can. I now generate adversarial test data before finalizing any layout.
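One simple way to produce adversarial data is to take a clean record and distort each field in turn. The sketch below is illustrative only; `stress_variants` and the field names are hypothetical, and real test data would also cover messy transcripts and unusual case types.

```python
from typing import Dict, List

Lead = Dict[str, str]


def stress_variants(clean: Lead) -> List[Lead]:
    """Return copies of a clean record with one field distorted at a time."""
    variants: List[Lead] = []
    for field, value in clean.items():
        variants.append({**clean, field: ""})          # missing value
        variants.append({**clean, field: value * 50})  # far longer than expected
        variants.append({**clean, field: "x" * 300})   # one unbreakable token
    return variants


clean_lead = {
    "caller": "Jane Doe",
    "summary": "Rear-end collision, minor injuries.",
}
cases = stress_variants(clean_lead)
```

Rendering every variant in the prototype quickly shows which layouts truncate, overflow, or collapse.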

Balancing the Sizzle

There are features that provide basic utility and improve quality of life. And there are features that sizzle on a sales demo.

Great products balance both. You need a product that is both desirable and usable. Testing and iteration helped us find the balance.

Owning the Last Mile

Traditionally, my design QA process involved producing Notion docs with annotated screenshots explaining design deviations.

Now I just open an MR with all my design polish items baked in. It saves an incredible amount of time and has significantly leveled up the polish of our product.

Prompt Designing

As time went on, I became much more involved with prompt engineering on this product.

I would prototype the prompts as much as the UI, since they dictate so much of the user experience.

PART II