How Voice AI Cut Our Speed-to-Lead from Minutes to Seconds

Leads do not wait for business hours. We built an outbound AI voice agent that dials new submissions within seconds, qualifies prospects with BANT (budget, authority, need, timeline), and books meetings even at 2:11 AM. The result: speed-to-lead dropped from minutes to seconds without adding headcount.

This post walks through the architecture, the guardrails that keep it compliant, and the operational lessons we learned while deploying it across multiple tenants.

Architecture overview

Lead detection (Edge Function)

- Evaluates new or updated leads against business rules and cadence windows.

- Enqueues a message per eligible lead into a queue with company-scoped context.
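
The eligibility check and enqueue step could look roughly like this. It is a minimal sketch: the `Lead` and `QueueMessage` shapes, the consent flag, and the `cadenceMinutes` rule are assumptions for illustration, not our production schema.

```typescript
// Hypothetical shapes; field names are illustrative assumptions.
interface Lead {
  id: string;
  companyId: string;
  phone: string | null;
  consentGiven: boolean;
  lastAttemptAt: Date | null;
}

interface QueueMessage {
  leadId: string;
  companyId: string; // company-scoped context travels with every message
  enqueuedAt: string;
}

// A lead is eligible when it has a phone number, has consented, and the
// cadence window since the last attempt has elapsed.
function isEligible(lead: Lead, now: Date, cadenceMinutes: number): boolean {
  if (!lead.phone || !lead.consentGiven) return false;
  if (lead.lastAttemptAt === null) return true;
  const elapsedMs = now.getTime() - lead.lastAttemptAt.getTime();
  return elapsedMs >= cadenceMinutes * 60 * 1000;
}

// Builds one queue message per eligible lead.
function toQueueMessages(leads: Lead[], now: Date, cadenceMinutes: number): QueueMessage[] {
  return leads
    .filter((lead) => isEligible(lead, now, cadenceMinutes))
    .map((lead) => ({
      leadId: lead.id,
      companyId: lead.companyId,
      enqueuedAt: now.toISOString(),
    }));
}
```

Keeping the rule in a pure function makes it easy to unit test against edge cases such as a retry that arrives inside the cadence window.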

Dial orchestration (Edge Function)

- Reads one message per invocation to avoid timeouts and thundering herds.

- Atomically claims the company-scoped call record and validates tenant context.

- Builds a TwiML webhook URL and asks Twilio to originate the call from the tenant’s subaccount number.
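
A rough sketch of the claim-then-dial step, under stated assumptions: the in-memory `Map` stands in for the database, the `/twiml` path and the query parameter names are hypothetical, and the status values are illustrative. `To`, `From`, and `Url` are the parameter names Twilio's Calls resource accepts; the request itself is not sent here.

```typescript
type CallStatus = "pending" | "claimed";

interface CallRecord {
  id: string;
  companyId: string;
  status: CallStatus;
}

// In-memory stand-in for the database. In a real store the claim would be a
// single conditional UPDATE (for example "... WHERE status = 'pending'") so
// two workers cannot both win the same record.
function claimCall(
  store: Map<string, CallRecord>,
  callId: string,
  companyId: string,
): CallRecord | null {
  const record = store.get(callId);
  if (!record) return null;
  if (record.companyId !== companyId) return null; // tenant mismatch
  if (record.status !== "pending") return null; // someone else already claimed it
  record.status = "claimed";
  return record;
}

// Builds the TwiML webhook URL that Twilio requests once the call connects.
function buildTwimlUrl(baseUrl: string, record: CallRecord): string {
  const query =
    `call_id=${encodeURIComponent(record.id)}` +
    `&company_id=${encodeURIComponent(record.companyId)}`;
  return `${baseUrl.replace(/\/+$/, "")}/twiml?${query}`;
}

// Form parameters for the outbound call request.
function buildCallParams(to: string, from: string, twimlUrl: string): Record<string, string> {
  return { To: to, From: from, Url: twimlUrl };
}
```

Because the claim returns `null` on a second attempt, a duplicate queue delivery falls through without placing a second call.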

TwiML + media stream (Edge Function)

- Twilio requests TwiML; we return a <Connect> verb wrapping a <Stream> noun that points to a WebSocket endpoint hosted by the same Edge Function.

- The WebSocket handler bridges audio to ElevenLabs and runs an OpenAI-based dialogue policy with BANT guardrails.
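
For illustration, the TwiML response could be generated like this. The function names, the `call_id` parameter, and the example WebSocket URL are assumptions. The sketch passes the call ID through a `<Parameter>` child because, as far as we know, query strings on the stream URL are not forwarded to the WebSocket.

```typescript
// Escapes the characters that are unsafe inside an XML attribute value.
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Returns a TwiML document that connects the call audio to a WebSocket.
function buildConnectStreamTwiml(wsUrl: string, callId: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXmlAttr(wsUrl)}">`,
    `      <Parameter name="call_id" value="${escapeXmlAttr(callId)}" />`,
    "    </Stream>",
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

Calling `buildConnectStreamTwiml("wss://example.test/stream", "call_1")` yields a document the Edge Function can return with a `text/xml` content type.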

Status callbacks and recovery (Edge Function)

- Twilio status callbacks update the call lifecycle; a recovery worker backfills or transitions incomplete calls and handles audio post-processing when needed.
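
One way to sketch the lifecycle handling: map Twilio's `CallStatus` values onto a smaller internal lifecycle and refuse to move backward, since callbacks can arrive late or out of order. The internal `Lifecycle` names, the rank table, and the staleness rule are assumptions for illustration.

```typescript
// CallStatus values Twilio sends in status callbacks.
type TwilioCallStatus =
  | "queued" | "initiated" | "ringing" | "in-progress"
  | "completed" | "busy" | "failed" | "no-answer" | "canceled";

// Hypothetical internal lifecycle.
type Lifecycle = "dialing" | "connected" | "finished" | "unreachable";

const LIFECYCLE_BY_STATUS: Record<TwilioCallStatus, Lifecycle> = {
  queued: "dialing",
  initiated: "dialing",
  ringing: "dialing",
  "in-progress": "connected",
  completed: "finished",
  busy: "unreachable",
  failed: "unreachable",
  "no-answer": "unreachable",
  canceled: "unreachable",
};

// Rank 2 marks terminal states.
const RANK: Record<Lifecycle, number> = { dialing: 0, connected: 1, finished: 2, unreachable: 2 };

// Applies a callback without ever regressing or leaving a terminal state.
function applyStatus(current: Lifecycle, incoming: TwilioCallStatus): Lifecycle {
  if (RANK[current] === 2) return current;
  const next = LIFECYCLE_BY_STATUS[incoming];
  return RANK[next] >= RANK[current] ? next : current;
}

interface TrackedCall {
  id: string;
  lifecycle: Lifecycle;
  updatedAt: Date;
}

// Recovery pass: non-terminal calls that have not changed for maxAgeMs.
function findStaleCalls(calls: TrackedCall[], now: Date, maxAgeMs: number): string[] {
  return calls
    .filter((call) => RANK[call.lifecycle] < 2)
    .filter((call) => now.getTime() - call.updatedAt.getTime() > maxAgeMs)
    .map((call) => call.id);
}
```

The recovery worker can then look up each stale call's current state and feed it through the same `applyStatus` function.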

Persistence and analytics

- All call state is stored in company-scoped records.

- Dashboards read company-scoped metrics for conversion rate and speed-to-first-contact.

Form submission → Eligibility check (enqueue) → Queue → Dial orchestration → TwiML Connect → ElevenLabs stream + OpenAI → BANT capture → Calendar booking → CRM sync
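
The BANT capture step in the flow above could be tracked with a small structure like the following. This is a sketch only: the question order and the rule that booking waits for all four answers are assumptions, not a description of our dialogue policy.

```typescript
type BantField = "budget" | "authority" | "need" | "timeline";

// Assumed question order; a real policy may adapt it to the conversation.
const BANT_ORDER: BantField[] = ["budget", "authority", "need", "timeline"];

type BantAnswers = Partial<Record<BantField, string>>;

// Returns the first field still missing an answer, or null when all are captured.
function nextBantField(answers: BantAnswers): BantField | null {
  for (const field of BANT_ORDER) {
    const value = answers[field];
    if (value === undefined || value.trim() === "") return field;
  }
  return null;
}

// Assumed gate: offer calendar slots only once every field has an answer.
function readyToBook(answers: BantAnswers): boolean {
  return nextBantField(answers) === null;
}
```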

Guardrails that matter

- Consent + local time windows: every call checks TCPA-safe hours per company time zone before the agent dials.
- Company validation on every write: all RPCs require company_id context before mutating data.
- Storage isolation: recordings land under company_id/agent_calls/... and signed URLs expire quickly.
- Retry semantics: queues store idempotency keys so retries cannot double-call the same lead.
- Webhook authenticity: Twilio signature validation can be enforced per environment; subaccount auth tokens are preferred when available.
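
Two of these guardrails are easy to sketch: the calling-hours check and the idempotency key. The 8:00 to 21:00 default window, the key format, and the function names are assumptions; the time zone is whichever IANA zone the tenant's policy specifies.

```typescript
// Returns the hour (0-23) that `instant` falls in for an IANA time zone.
function localHour(instant: Date, timeZone: string): number {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "2-digit",
    hourCycle: "h23",
  }).formatToParts(instant);
  const hourPart = parts.find((part) => part.type === "hour");
  if (!hourPart) throw new Error("hour missing from formatted parts");
  return parseInt(hourPart.value, 10);
}

// True when the local hour is inside [startHour, endHour).
function isWithinCallingWindow(
  instant: Date,
  timeZone: string,
  startHour: number = 8,
  endHour: number = 21,
): boolean {
  const hour = localHour(instant, timeZone);
  return hour >= startHour && hour < endHour;
}

// Assumed key format: one key per company, lead, and attempt number.
function idempotencyKey(companyId: string, leadId: string, attempt: number): string {
  return `${companyId}:${leadId}:${attempt}`;
}

// Returns true the first time a key is seen and false on any retry.
function markOnce(seen: Set<string>, key: string): boolean {
  if (seen.has(key)) return false;
  seen.add(key);
  return true;
}
```

A lead that arrives at 2:11 AM local time fails the window check, so it would wait in the queue until the window opens rather than being dialed immediately.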

Handling concurrency spikes

Nightly marketing campaigns produced sudden call bursts. Our fix:

- Rate limits per company with graceful degradation (queue instead of reject).
- Backpressure signals so the voice agent can ask the person on the line to hold for a few seconds if slots are full.
- Dedicated monitoring on queue depth and booking latency.
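
The queue-instead-of-reject idea can be sketched as a per-company limiter. The class name, the in-memory maps, and the fixed `maxConcurrent` value are assumptions; a deployed version would need shared state across function invocations.

```typescript
class CompanyCallLimiter {
  private readonly maxConcurrent: number;
  private readonly active = new Map<string, number>();
  private readonly waiting = new Map<string, string[]>();

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  // Starts the call if the company has a free slot; otherwise queues it.
  request(companyId: string, callId: string): "started" | "queued" {
    const inFlight = this.active.get(companyId) ?? 0;
    if (inFlight < this.maxConcurrent) {
      this.active.set(companyId, inFlight + 1);
      return "started";
    }
    const queue = this.waiting.get(companyId) ?? [];
    queue.push(callId);
    this.waiting.set(companyId, queue);
    return "queued";
  }

  // Called when a call ends. If another call is waiting, the freed slot is
  // handed to it directly and its ID is returned; otherwise the slot is released.
  release(companyId: string): string | null {
    const queue = this.waiting.get(companyId) ?? [];
    const next = queue.shift();
    if (next !== undefined) return next;
    const inFlight = this.active.get(companyId) ?? 0;
    this.active.set(companyId, Math.max(0, inFlight - 1));
    return null;
  }

  // Queue depth is the number worth exporting to monitoring.
  queueDepth(companyId: string): number {
    return (this.waiting.get(companyId) ?? []).length;
  }
}
```

Because limits are keyed by company, one tenant's nightly campaign cannot consume another tenant's slots.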

Reducing no-shows

Voice AI is only valuable if meetings happen. We tuned two levers:

- Offer three time slots based on historical conversion data.
- Send confirmation + reminders via email/SMS (through company-approved channels) that reiterate the scheduled time.

No-show rates fell once we paired the right slot suggestions with timely confirmation nudges.
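
As a rough illustration of the slot suggestion, the sketch below ranks candidate slots by a historical rate and returns the top three. The `SlotStats` shape and the use of `converted / offered` as the ranking metric are assumptions; the post does not specify which conversion signal we use.

```typescript
// Hypothetical per-slot history.
interface SlotStats {
  slotStart: string; // ISO timestamp of the candidate slot
  offered: number; // times a comparable slot was offered
  converted: number; // times it led to the desired outcome
}

// Ranks slots by converted/offered and returns the best `count` start times.
function suggestSlots(stats: SlotStats[], count: number = 3): string[] {
  return stats
    .filter((slot) => slot.offered > 0) // no history means no rate to rank by
    .map((slot) => ({ slotStart: slot.slotStart, rate: slot.converted / slot.offered }))
    .sort((a, b) => {
      if (b.rate !== a.rate) return b.rate - a.rate;
      return a.slotStart < b.slotStart ? -1 : 1; // earlier slot wins ties
    })
    .slice(0, count)
    .map((slot) => slot.slotStart);
}
```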

Multi-tenant hygiene

- All Supabase queries include .eq("company_id", companyId).
- Edge Functions (call orchestration + calendar syncing) validate company membership before touching external APIs.
- Audit trails record which AI agent booked each meeting, which helps with debugging and compliance.
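
One way to make the company filter hard to forget is to route every query through a small helper. This is a sketch under assumptions: `Filterable` is a minimal structural type defined here, `scopeToCompany` and `assertCompanyMember` are hypothetical names, and whether a given supabase-js builder satisfies this exact type is not verified.

```typescript
// Minimal structural type for any query builder that exposes eq().
interface Filterable<T> {
  eq(column: string, value: string): T;
}

// Applies the tenant filter and rejects an empty company ID outright.
function scopeToCompany<T extends Filterable<T>>(query: T, companyId: string): T {
  if (companyId.trim() === "") throw new Error("company_id is required");
  return query.eq("company_id", companyId);
}

// Membership check to run before calling any external API on a tenant's behalf.
// `memberships` maps a company ID to the set of user IDs that belong to it.
function assertCompanyMember(
  memberships: Map<string, Set<string>>,
  userId: string,
  companyId: string,
): void {
  const members = memberships.get(companyId);
  if (!members || !members.has(userId)) {
    throw new Error(`user ${userId} is not a member of company ${companyId}`);
  }
}
```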

Results we track

- Speed-to-first-contact: seconds
- Booking completion rate: up and to the right
- Voice agent NPS: trending positive as scripts improve
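
For reference, speed-to-first-contact can be computed as the median gap between lead creation and the first call attempt. The `ContactTiming` shape and the choice of median over mean are assumptions for this sketch.

```typescript
// Hypothetical row shape: one entry per lead.
interface ContactTiming {
  leadCreatedAt: string; // ISO timestamp
  firstCallStartedAt: string | null; // null when no call has been placed yet
}

// Median seconds from lead creation to first call, ignoring uncontacted leads.
function medianSpeedToContactSeconds(rows: ContactTiming[]): number | null {
  const seconds: number[] = [];
  for (const row of rows) {
    if (row.firstCallStartedAt === null) continue;
    seconds.push((Date.parse(row.firstCallStartedAt) - Date.parse(row.leadCreatedAt)) / 1000);
  }
  if (seconds.length === 0) return null;
  seconds.sort((a, b) => a - b);
  const mid = Math.floor(seconds.length / 2);
  return seconds.length % 2 === 1 ? seconds[mid] : (seconds[mid - 1] + seconds[mid]) / 2;
}
```

A median keeps one slow outlier, such as a lead held until the calling window opens, from dominating the dashboard number.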

Want the redacted diagram + ops checklist?

Reply “voice” on our LinkedIn post or email hello@invoicifyai.com. I’ll send the architecture diagram, queue retry policy, and our compliance checklist.