PTT Dispatch Console

A commercial push-to-talk dispatch console, rebuilt from a proof of concept into software an operator runs all shift.

Role: Sole developer (commercial product)
Year: 2026
Domain: Desktop app · C++ / Qt · Real-time audio · Push-to-talk (PoC) · VoIP · Mission-critical comms

Dispatch console: live list of users and groups, map of field users, group and one-to-one PTT, and messaging

Problem

The starting point was a proof of concept. It could log in, join a group and make a call. That is enough to prove the idea and nowhere near enough to put in front of a dispatcher.

A control-room operator depends on this software for a full shift, sometimes at the exact moment something has gone wrong in the field. The gap between a demo that makes a call and a tool an operator trusts is the entire project: voice that holds up when field users are on cellular, floor control that is never ambiguous about who is speaking, an SOS that reaches the operator and puts the person in distress on a map, and recovery that is invisible when the connection drops and comes back.

On top of that, the console talks to the PTT backend over a custom in-house protocol rather than a standard off-the-shelf stack, so every layer from the wire up had to be built and owned.

Constraints

Real-time, half-duplex voice. One person holds the floor at a time, and the interface has to make that obvious and never let two talkers believe they both have it.
The network is not friendly: field users are on cellular, so the voice path has to survive packet loss and reconnects without dropping the session.
It speaks a custom in-house protocol to the PTT backend, not SIP or a standard PTT stack, so the product owns every layer from the wire up.
Runs a full shift on control-room Windows PCs and has to recover cleanly from network drops and from audio devices being unplugged mid-call.
Commercial software with real users, shipped as a white-label product, so the same codebase has to build into more than one branded application.
An operator has to read the whole situation at a glance and talk in a single action, with little or no training.

Architecture

C++20 with Qt 6 Widgets on Windows. A dense, dockable desktop console built on the Qt Advanced Docking System, so every panel (users and groups, map, PTT, messaging, event log, replay, geofencing) can be floated, tabbed and rearranged, and the layout is saved and restored between sessions.
Core networking and Opus/AMR voice sending and decoding run off the GUI thread, on a dedicated worker thread driving a Boost.Asio io_context. They reach the UI only through queued Qt signals and slots, so a network hiccup never freezes the console.
A dedicated protocol layer owns the wire: a custom binary length-prefixed TLV protocol (POC-Lite) carrying both signaling and voice over one TCP connection, optionally wrapped in TLS 1.2+ with certificate pinning, plus a heartbeat and automatic reconnect.
Within the POC-Lite stream, signaling and voice media are multiplexed as command items. A separate WebSocket channel carries live location, talk-burst replay and geofence updates, and HTTP JSON-RPC handles login and group control.
Real-time voice pipeline: capture and playback through Qt Multimedia (QAudioSource and QAudioSink at 8 kHz mono), with two selectable codecs, AMR-NB by default and Opus (libopus 1.5.2) as an alternative, plus software gain control and live input and output level meters.
Model/View (QAbstractTableModel and filter proxies) drives the live list of users and groups, presence, group members, event log and call history, so hundreds of users update efficiently without rebuilding the UI.
Embedded live map of field-user locations rendered with Leaflet inside a Qt WebEngine view, bridged to C++ over QWebChannel, with operator-drawn geofences, location history and movement replay.
Local persistence: SQLite for the event log, geofence crossing history and per-conversation chat history, and QSettings for operator and application configuration.
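
As an illustration of the length-prefixed TLV framing described in the list above, here is a minimal sketch of a stream reader that reassembles frames from arbitrarily chunked TCP reads. The byte layout is an assumption made for the example (a 4-byte big-endian frame length, then items of 1-byte type, 2-byte big-endian length and value); the real POC-Lite layout is not documented here, and the names FrameReader, TlvItem and kMaxFrame are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical layout, NOT the real POC-Lite wire format:
//   frame   := u32 big-endian payload length, then payload
//   payload := repeated items of { u8 type, u16 big-endian length, value }
struct TlvItem {
    std::uint8_t type;
    std::vector<std::uint8_t> value;
};

class FrameReader {
public:
    // Append bytes exactly as they arrive from the socket, in any chunking.
    void feed(const std::uint8_t* data, std::size_t size) {
        assert(data != nullptr || size == 0);
        buffer_.insert(buffer_.end(), data, data + size);
    }

    // Returns the items of the next complete frame, or nullopt when more
    // bytes are needed. A malformed frame sets failed() and stops parsing.
    std::optional<std::vector<TlvItem>> next() {
        if (failed_ || buffer_.size() < 4) return std::nullopt;
        const std::size_t len = (std::size_t{buffer_[0]} << 24) |
                                (std::size_t{buffer_[1]} << 16) |
                                (std::size_t{buffer_[2]} << 8) |
                                 std::size_t{buffer_[3]};
        if (len > kMaxFrame) { failed_ = true; return std::nullopt; }
        const std::size_t end = 4 + len;
        if (buffer_.size() < end) return std::nullopt;  // frame still incomplete

        std::vector<TlvItem> items;
        std::size_t pos = 4;
        while (pos < end) {
            if (end - pos < 3) { failed_ = true; return std::nullopt; }
            const std::uint8_t type = buffer_[pos];
            const std::size_t vlen = (std::size_t{buffer_[pos + 1]} << 8) |
                                      std::size_t{buffer_[pos + 2]};
            pos += 3;
            if (end - pos < vlen) { failed_ = true; return std::nullopt; }
            const auto first = buffer_.begin() + static_cast<std::ptrdiff_t>(pos);
            const auto last = first + static_cast<std::ptrdiff_t>(vlen);
            items.push_back(TlvItem{type, std::vector<std::uint8_t>(first, last)});
            pos += vlen;
        }
        // Drop the consumed frame; any bytes of the next frame stay buffered.
        buffer_.erase(buffer_.begin(),
                      buffer_.begin() + static_cast<std::ptrdiff_t>(end));
        return items;
    }

    bool failed() const { return failed_; }

private:
    static constexpr std::size_t kMaxFrame = 64 * 1024;  // assumed sanity cap
    std::vector<std::uint8_t> buffer_;
    bool failed_ = false;
};
```

A caller would feed() whatever each socket read returns and loop on next() until it yields nullopt. Because signaling and voice arrive as items in the same frames, a dispatcher above this layer can route each item by its type.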

Outcome

A proof of concept turned into commercial software running in real control rooms.
From one screen a dispatcher can see every user and group, talk to a person or a whole group, follow field users on a map, message them, handle SOS and emergency alerts, and replay any recorded call.
Voice holds up over real cellular networks, floor control stays correct under load, and the session recovers on its own after a drop.
Owned end to end: rebuilt from scratch, with every function added by hand, from login and presence through mapping, geofencing, recording and replay.
Ships as a white-label product. One codebase builds into multiple branded dispatchers, and a built-in auto-updater keeps deployed clients current.

Stack

C++20, Qt 6 (Widgets, Multimedia, WebEngine, WebSockets, Sql) · AMR-NB and Opus (libopus 1.5.2) voice codecs · Custom binary POC-Lite TLV protocol over TCP/TLS, plus WebSocket and HTTP JSON-RPC · Boost.Asio for off-GUI-thread networking and audio · Leaflet map embedded via Qt WebEngine and bridged with QWebChannel · SQLite (event log, geofence crossings, chat history), QSettings · Qt Advanced Docking System, white-label Windows desktop builds

From proof of concept to product

The proof of concept did the one thing that proves the idea: log in, join a group, make a call. That is the easy eighty percent. The twenty percent left over is the entire reason the product exists. What the operator sees the instant they press to talk, and whether the floor was actually granted. How the console behaves when the connection dies and comes back. What happens when someone unplugs the headset mid-call. What an operator needs the moment an SOS comes in. None of that shows up in a demo, and all of it decides whether a dispatcher trusts the tool.

So I rebuilt it from scratch and grew it function by function into a commercial product: presence and a live list of users and groups, group, one-to-one and dispatcher broadcast calling, a live map of field users with geofencing and movement replay, text and media messaging, call alerts, SOS and emergency handling, full call recording with searchable replay, and the reconnection logic that makes it dependable for a full shift. The interesting work was never the feature list. It was making each feature hold up on a real network in a real control room.

Getting real-time voice and floor control right

Push-to-talk is half-duplex: one person holds the floor, everyone else listens. That sounds simple and is the part most worth getting exactly right. The console has to show, with no ambiguity, who holds the floor right now. When the operator presses to talk, the interface updates instantly: the button depresses, the indicator switches to “You are talking”, a start tone plays, and the microphone begins capturing immediately. The client sends the speech-start request and starts recording in parallel rather than sitting silent waiting for a grant. If the server denies the floor, an error tone plays and the talk-burst stops.

I should be precise about how floor control actually works, because the honest version is the credible one. It is not a local state-machine engine. The client sends speech-start on press and speech-end on release, and reacts to the server’s denial codes (floor taken, user busy, time exceeded, and so on). Arbitration lives on the server, and on the receive side an audio-source manager locks onto one talker per burst and discards competitors, so the operator never hears two people at once.
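
The receive-side rule (lock onto one talker per burst, discard competitors) can be sketched in a few lines. This is an illustrative stand-in rather than the console's actual audio-source manager: the class name TalkerLock, the 32-bit user id and the example function are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical sketch of "one talker per burst" on the receive side.
class TalkerLock {
public:
    // Called for every incoming voice packet.
    // Returns true if the packet should be decoded and played.
    bool accept(std::uint32_t userId) {
        if (!current_) current_ = userId;  // first talker of the burst takes the lock
        return *current_ == userId;        // anyone else is discarded
    }

    // Called when a burst ends; only the locked talker can release the lock.
    void endBurst(std::uint32_t userId) {
        if (current_ && *current_ == userId) current_.reset();
    }

    std::optional<std::uint32_t> current() const { return current_; }

private:
    std::optional<std::uint32_t> current_;
};

// Small walk-through of the intended behaviour.
inline void talkerLockExample() {
    TalkerLock lock;
    assert(lock.accept(7));    // user 7 starts a burst and is heard
    assert(!lock.accept(9));   // user 9 overlaps and is dropped
    lock.endBurst(7);          // user 7 releases
    assert(lock.accept(9));    // now user 9 can be heard
}
```

The lock only decides what the operator hears. Who is actually allowed to transmit is still decided by the server, as described above.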

The voice path is the other hard half. Field users are on cellular, so the path has to survive loss and reconnects. Qt Multimedia handles capture and playback at 8 kHz mono; on top of it sit two selectable codecs, AMR-NB narrowband by default and Opus as an alternative, with software gain control and live input and output level meters. I want to be straight about what the app does and does not do here: there is no jitter buffer, no acoustic echo cancellation and no noise suppression layered on by the application. Incoming packets are decoded on arrival and drained to the system audio sink through a short prebuffer. Resilience comes from the codecs, the prebuffer and self-recovering audio devices that reopen when a sink goes idle or a headset is unplugged. The core voice and protocol work runs off the GUI thread on a Boost.Asio worker, because the one thing a dispatch console cannot do is stutter while someone is trying to talk.
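
To make the "short prebuffer" concrete, here is a minimal sketch of one way such a buffer can behave: decoded samples are queued on arrival, playback is held back until a threshold is reached, and an underrun re-arms the threshold. The class name, the threshold value and the re-arm-on-underrun policy are assumptions for illustration, not a description of the shipped code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical prebuffer for 16-bit mono PCM. At 8 kHz, 160 samples = 20 ms.
class Prebuffer {
public:
    explicit Prebuffer(std::size_t thresholdSamples) : threshold_(thresholdSamples) {
        assert(thresholdSamples > 0);
    }

    // Decoder side: append the PCM produced by each decoded packet.
    void push(const std::vector<std::int16_t>& pcm) {
        queue_.insert(queue_.end(), pcm.begin(), pcm.end());
        if (queue_.size() >= threshold_) primed_ = true;
    }

    // Sink side: fetch up to maxSamples. Returns nothing while still priming.
    std::vector<std::int16_t> pull(std::size_t maxSamples) {
        if (!primed_) return {};
        const auto n = static_cast<std::ptrdiff_t>(std::min(maxSamples, queue_.size()));
        std::vector<std::int16_t> out(queue_.begin(), queue_.begin() + n);
        queue_.erase(queue_.begin(), queue_.begin() + n);
        if (queue_.empty()) primed_ = false;  // underrun: refill before playing again
        return out;
    }

    std::size_t buffered() const { return queue_.size(); }

private:
    std::size_t threshold_;
    std::deque<std::int16_t> queue_;
    bool primed_ = false;
};
```

With a threshold of 320 samples (40 ms at 8 kHz), the first 20 ms packet is held, the second releases playback, and a gap long enough to drain the queue makes the buffer wait for 40 ms of audio again. This absorbs small arrival-time variation without reordering or concealing packets, which is the job a real jitter buffer would do.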

More than voice: map, geofences and emergencies

A dispatcher needs the whole picture, not just audio. The console embeds a real interactive map (Leaflet running inside a Qt WebEngine view, bridged to C++ over QWebChannel) where every field radio appears as a status-coloured marker that moves as new GPS fixes arrive. Operators can switch between OpenStreetMap, satellite imagery and Google tiles, replay any user’s past movements as an animated track with play, pause and speed controls, and draw circular or polygonal geofences per talk group. Geofence crossings are logged to a local database, surfaced as Windows toast notifications, and exportable to CSV.
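
The geofence check behind a crossing event can be sketched with two standard containment tests: haversine distance for circular fences and ray casting for polygonal ones. This is a generic illustration under stated assumptions (a spherical Earth of radius 6371 km, polygons small enough to treat latitude and longitude as planar, hypothetical names throughout), not the console's own geometry code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct LatLon { double lat; double lon; };  // degrees

// Great-circle distance in metres (haversine formula, spherical Earth).
inline double distanceMetres(LatLon a, LatLon b) {
    constexpr double kEarthRadius = 6371000.0;
    constexpr double kToRad = 3.14159265358979323846 / 180.0;
    const double sLat = std::sin((b.lat - a.lat) * kToRad / 2.0);
    const double sLon = std::sin((b.lon - a.lon) * kToRad / 2.0);
    const double h = sLat * sLat +
                     std::cos(a.lat * kToRad) * std::cos(b.lat * kToRad) * sLon * sLon;
    return 2.0 * kEarthRadius * std::asin(std::sqrt(h));
}

inline bool insideCircle(LatLon p, LatLon centre, double radiusMetres) {
    assert(radiusMetres >= 0.0);
    return distanceMetres(p, centre) <= radiusMetres;
}

// Ray casting with lat/lon treated as planar coordinates. Adequate for small
// fences away from the poles and the antimeridian.
inline bool insidePolygon(LatLon p, const std::vector<LatLon>& poly) {
    if (poly.size() < 3) return false;
    bool inside = false;
    for (std::size_t i = 0, j = poly.size() - 1; i < poly.size(); j = i++) {
        const bool straddles = (poly[i].lat > p.lat) != (poly[j].lat > p.lat);
        if (!straddles) continue;
        // Longitude at which edge j->i crosses the point's latitude.
        const double lonAtLat = poly[j].lon + (p.lat - poly[j].lat) *
                                (poly[i].lon - poly[j].lon) / (poly[i].lat - poly[j].lat);
        if (p.lon < lonAtLat) inside = !inside;
    }
    return inside;
}

enum class Crossing { None, Entered, Exited };

// A crossing is a change of containment between two consecutive GPS fixes.
inline Crossing classify(bool wasInside, bool isInside) {
    if (wasInside == isInside) return Crossing::None;
    return isInside ? Crossing::Entered : Crossing::Exited;
}
```

Each new fix for a user would be tested against the fences of that user's talk group, and a non-None result is what would be logged and raised as a notification.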

When something goes wrong, the emergency path takes over: incoming SOS, man-down and group-emergency alerts raise an always-on-top window that sounds an alarm, shows who is in distress and pins their reported coordinates on an embedded map, with acknowledge and mute controls. Alongside that there is one-to-one and group messaging with image, video and audio attachments, call alerts that can carry a recorded voice note, and a central event log that records logins, talk-bursts, messages, presence changes, geofence crossings and emergencies for audit and replay.

Built around a custom protocol, shipped as a product

The console talks to the PTT backend over a custom in-house protocol rather than a standard stack. I kept that wire behind a single protocol layer: a compact binary, length-prefixed TLV format (POC-Lite) that carries both signaling and voice over one TCP connection, optionally wrapped in TLS 1.2+ with the server certificate pinned to a known fingerprint. Authentication uses a hashed login, with token-based re-authentication for the stream. A heartbeat watches the link and the client reconnects on its own after a drop. The rest of the application never touches raw bytes; it reacts to signals instead: a user came online, the floor was granted, a talk-burst was recorded.
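
The heartbeat-and-reconnect behaviour can be sketched as a small watchdog that is told when traffic arrives and asked whether the link has gone quiet. The timeout values, the exponential backoff between reconnect attempts and the name LinkWatchdog are assumptions for the example; the text above only states that a heartbeat exists and that the client reconnects by itself.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using Ms = std::chrono::milliseconds;

// Hypothetical link watchdog. Time is passed in explicitly so the logic can
// be driven by any timer (and tested without sleeping).
class LinkWatchdog {
public:
    LinkWatchdog(Ms timeout, Ms baseDelay, Ms maxDelay)
        : timeout_(timeout), baseDelay_(baseDelay), maxDelay_(maxDelay) {
        assert(baseDelay > Ms{0} && maxDelay >= baseDelay);
    }

    // Call whenever any frame arrives, including a heartbeat reply.
    void onTraffic(Ms now) {
        lastSeen_ = now;
        failures_ = 0;  // a live link resets the backoff
    }

    // True once the link has been silent for longer than the timeout.
    bool isDead(Ms now) const { return now - lastSeen_ > timeout_; }

    // Delay before the next reconnect attempt: doubles per consecutive
    // failure and is capped at maxDelay.
    Ms nextReconnectDelay() {
        Ms delay = baseDelay_;
        for (int i = 0; i < failures_ && delay < maxDelay_; ++i) delay *= 2;
        ++failures_;
        return std::min(delay, maxDelay_);
    }

private:
    Ms timeout_;
    Ms baseDelay_;
    Ms maxDelay_;
    Ms lastSeen_{0};
    int failures_ = 0;
};
```

A periodic timer would call isDead() with the current time; on true, the protocol layer closes the socket, waits nextReconnectDelay() and tries again, while the rest of the application only sees "disconnected" and "reconnected" signals.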

That boundary is what let me keep adding features quickly. And because it is a real product, it ships as a white-label build: the same codebase compiles into multiple branded dispatchers, each with its own name, icon and update server, kept current by a built-in auto-updater that checks for a new version on launch, downloads it and hands off to a helper to install. The interesting work, start to finish, was turning a demo that makes a call into something an operator can lean on for a full shift.