
How we built business calling on Telnyx (and why we didn't roll our own SBC)

Marcus Williams · 7 min read · Engineering

Why Hey Quad runs on Telnyx instead of bare metal - call control, SIP trunks, multi-tenant routing, and the trade-offs of building voice in 2026.

Every "we built a phone system" blog post eventually arrives at the same question: did you roll your own SBC, or did you build on top of someone else's network? We didn't roll our own. This is the post explaining why that was the right call, and what it actually meant in practice.

The choice you're making

When you build voice, the spectrum looks roughly like this:

Three ways to build business voice — and the trade-offs of each.
| Approach | What you operate | When it makes sense |
| --- | --- | --- |
| Bare-metal SBC | SIP routing, RTP, peering, fraud controls, failover | You have a voice team and >100M minutes/year |
| Cloud SBC + carrier PSTN | Asterisk or FreeSWITCH, BYOC trunks, cloud networking | You want control without metal |
| CPaaS (Telnyx/Twilio/Plivo) | Their API, their network, your product workflow | You're building a product, not a phone company |

The mistake we see startups make is believing that running their own bare-metal SBC makes them special. It doesn't. Customers don't care which kernel handled their RTP. They care whether the call connects, sounds clean, and routes to the right person. The CPaaS path gets you to all three faster.

Why Telnyx specifically

We tried the obvious three. The decision came down to four things that matter when you're multi-tenant from day one:

  1. Programmable Voice that's actually programmable. Telnyx's Call Control gives us a TCP/WebSocket event stream and per-call commands we drive from a Node worker. Our flow engine is essentially a state machine reacting to those events. No proprietary scripting language to learn, no vendor-flavored XML.
  2. Real-time billing data. Every call leg returns a CDR with carrier costs broken out. We mark those up per-tenant and bill in our own UI without reconciling against a bill that arrives 30 days later. Twilio buries this.
  3. Number portability without drama. LOAs go through their portal, port completions are reported back to us by webhook, and rejected ports come back with a rejection code we can map to plain English for the customer.
  4. Wholesale-friendly pricing. When we resell, our agency partners pay close to wholesale carrier rates rather than a "communications tax." That's the difference between a 70% partner margin and a 35% one.
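
To make the per-tenant markup in point 2 concrete, here is a minimal sketch. The CDR shape, the field names, and the markup table are assumptions made for illustration; they are not Telnyx's CDR schema or our actual billing code. Costs are held as integer micro-dollars so repeated markups don't accumulate floating-point drift.

```typescript
// Hypothetical CDR shape: carrier cost in micro-dollars (1e-6 USD).
interface CallLegCdr {
  tenantId: string;
  carrierCostMicros: number;
}

// Per-tenant markup in basis points (10000 bps = 100%). Values are illustrative.
function billedMicros(
  cdr: CallLegCdr,
  markupBps: Map<string, number>,
  defaultBps: number = 0
): number {
  const bps = markupBps.get(cdr.tenantId) ?? defaultBps;
  // Carrier cost plus markup, rounded to a whole micro-dollar.
  return Math.round((cdr.carrierCostMicros * (10000 + bps)) / 10000);
}

const exampleMarkups = new Map<string, number>([["tenant-a", 4000]]); // 40% markup
```

With these example values, a leg that cost 5000 micro-dollars ($0.005) for tenant-a bills at 7000 micro-dollars, and a tenant with no entry falls back to the default markup.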

What we actually run

The runtime stack is easier to understand as two planes:

Media on Telnyx, control flow on us.

Media plane: Browser softphone, desk phone, or PSTN endpoint connects through Telnyx. The audio path stays off our servers.

Control plane: Telnyx emits call-control events, our worker turns those events into flow-engine transitions, and Postgres stores tenant state, flow definitions, and billing facts.

The worker is the interesting part. Every call event - call.initiated, call.answered, call.hangup, call.dtmf.received - lands on a queue and a state-machine processor decides what happens next based on the flow definition for that tenant. The flow editor in the UI compiles down to that same definition.
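
A stripped-down sketch of that processor might look like the following. The four event names come from the paragraph above; the event fields, the state names, the command labels, and the flow-definition shape are all hypothetical, chosen only to show the idea of a per-tenant transition table.

```typescript
type CallEventType =
  | "call.initiated"
  | "call.answered"
  | "call.hangup"
  | "call.dtmf.received";

// Hypothetical, simplified event shape (not the provider's payload format).
interface CallEvent {
  type: CallEventType;
  callControlId: string;
  tenantId: string;
  digit?: string; // only meaningful for call.dtmf.received
}

interface Transition {
  next: string;     // state to move to
  command?: string; // illustrative label for the command the worker would issue
}

// state -> trigger -> transition
type FlowDefinition = Record<string, Record<string, Transition>>;

// DTMF events are keyed by digit so a menu can branch on the key pressed.
function triggerFor(event: CallEvent): string {
  return event.type === "call.dtmf.received" && event.digit !== undefined
    ? `${event.type}:${event.digit}`
    : event.type;
}

class FlowProcessor {
  private readonly flows: Map<string, FlowDefinition>;
  private readonly states = new Map<string, string>(); // callControlId -> state

  constructor(flows: Map<string, FlowDefinition>) {
    this.flows = flows;
  }

  stateOf(callControlId: string): string | undefined {
    return this.states.get(callControlId);
  }

  // Returns the command to issue for this event, or undefined if none applies.
  handle(event: CallEvent): string | undefined {
    if (event.type === "call.hangup") {
      this.states.delete(event.callControlId); // call is over, drop its state
      return undefined;
    }
    const flow = this.flows.get(event.tenantId);
    if (!flow) return undefined;
    const state = this.states.get(event.callControlId) ?? "idle";
    const transition = flow[state]?.[triggerFor(event)];
    if (!transition) return undefined; // event not relevant in this state
    this.states.set(event.callControlId, transition.next);
    return transition.command;
  }
}

// Example flow: answer, play a menu, transfer on "1".
const exampleFlow: FlowDefinition = {
  idle: { "call.initiated": { next: "ringing", command: "answer" } },
  ringing: { "call.answered": { next: "menu", command: "play_menu" } },
  menu: { "call.dtmf.received:1": { next: "transferring", command: "transfer:sales" } },
};
```

Because the flow is plain data, the same definition can be produced by a visual editor and stored per tenant; the processor only ever looks up the current state and the incoming trigger.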

We deliberately do not keep call media on our infrastructure. RTP flows between Telnyx's network and the endpoint. We're a control plane, not a data plane. That means:

  • No SBC to operate, no DDoS surface for SIP scanners.
  • No PCI-style scope expansion when calls get recorded - recordings are Telnyx-stored, fetched on demand by our worker.
  • We can add a region tomorrow by changing a config, not by deploying bare metal.

The hard parts the docs don't tell you

Three things ate more time than we expected:

Three production problems the docs don't warn you about.

Idempotency on call events. Telnyx will redeliver a call.answered event if the worker doesn't ACK in time. The naive thing - "respond to every event" - leads to double-processed states. We key every state transition by call_control_id + sequence. Cheap, prevents the worst class of bug.
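
As a rough illustration of that keying, the sketch below rejects any (call, sequence) pair it has already seen. The event shape and the `sequence` field are assumptions standing in for whatever ordering value you key on, and the store is in-memory for brevity; with more than one worker the same check would need a shared store, for example a unique constraint in Postgres.

```typescript
// Hypothetical event shape; `sequence` is an assumed per-call ordering value.
interface SequencedEvent {
  callControlId: string;
  sequence: number;
  type: string;
}

class TransitionDeduper {
  // callControlId -> sequence numbers already processed for that call
  private readonly seen = new Map<string, Set<number>>();

  // True the first time a (call, sequence) pair is claimed, false on redelivery.
  claim(event: SequencedEvent): boolean {
    let sequences = this.seen.get(event.callControlId);
    if (!sequences) {
      sequences = new Set<number>();
      this.seen.set(event.callControlId, sequences);
    }
    if (sequences.has(event.sequence)) return false;
    sequences.add(event.sequence);
    return true;
  }

  // Call once the call has ended so the map does not grow without bound.
  forget(callControlId: string): void {
    this.seen.delete(callControlId);
  }
}

// Runs `apply` only for events that have not been processed before.
function processOnce(
  deduper: TransitionDeduper,
  event: SequencedEvent,
  apply: (event: SequencedEvent) => void
): boolean {
  if (!deduper.claim(event)) return false; // duplicate delivery, skip
  apply(event);
  return true;
}
```

A redelivered call.answered with the same sequence then becomes a no-op instead of a second state transition.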

Multi-tenant DID hygiene. When you have 200 tenants and 8,000 numbers, "this number rang, who does it belong to?" needs to be O(1) on the hot path. We cache the DID-to-tenant map in the worker process, hydrate on event, and reload it on number.assigned webhooks. Doing this in Postgres on the hot path was our first 99th-percentile latency issue.
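
The shape of that cache is simple. This is a sketch rather than our production code: the class name and the row format are hypothetical, and it assumes numbers are stored as consistently formatted strings (for example E.164) so lookups need no normalization.

```typescript
type TenantId = string;

class DidTenantCache {
  private map = new Map<string, TenantId>();

  // Swap in a freshly built map in one assignment, so lookups never observe
  // a half-populated cache. Intended to be called at startup and again
  // whenever a number-assignment webhook arrives.
  reload(rows: Array<[string, TenantId]>): void {
    this.map = new Map(rows);
  }

  // Hot-path lookup: a single in-memory Map.get, no database round trip.
  tenantFor(did: string): TenantId | undefined {
    return this.map.get(did);
  }

  get size(): number {
    return this.map.size;
  }
}
```

At 8,000 numbers the whole map is tiny, so rebuilding it on every assignment change is cheaper than reasoning about partial updates.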

10DLC throughput. SMS deliverability is a function of campaign trust score, not raw API capacity. A new campaign starts at low throughput and ramps as carriers see clean traffic. We surface throughput limits and historic delivery rates in the dashboard so customers don't blame the platform when really they need to age their campaign.

What rolling our own would have cost

We did the math. To replicate the Telnyx pieces we use - global PSTN peering, 10DLC submission, LRN dipping, carrier escalation contacts - the floor is roughly:

What rolling your own voice stack costs at minimum.
  • 1 senior voice engineer ($300k+ fully loaded)
  • ~$8-15k/month in transit + colo for two regions
  • 6-9 months before parity with what we'd ship in week one on Telnyx
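
Annualizing the first two lines gives a rough lower bound. The sketch below treats "$300k+" as exactly $300k and ignores the 6-9 month delay, so the real figure would be higher.

```typescript
// Figures taken from the list above; "$300k+" is treated as a floor.
const engineerPerYear = 300000;
const infraPerMonth = { low: 8000, high: 15000 };

function annualFloor(infraMonthly: number): number {
  return engineerPerYear + infraMonthly * 12;
}

const annualLow = annualFloor(infraPerMonth.low);   // 300000 + 96000  = 396000
const annualHigh = annualFloor(infraPerMonth.high); // 300000 + 180000 = 480000
```

That puts the floor at roughly $396k-480k per year before a single call is carried.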

For a customer base under tens of millions of monthly minutes, that math never closes. Above that, you re-evaluate. We're nowhere near that inflection.

What we'd tell another team

If you're building a voice product and you don't have a voice team, the honest answer is: pick a CPaaS, build on top, treat the carrier as a boring dependency, and spend your engineering budget on the product your customers actually see - the softphone, the flow editor, the analytics, the white-label experience.

The hard problems in business telephony in 2026 aren't on the wire. They're in the UX.
