How does business VoIP work?
VoIP calls may sound indistinguishable from those made on a landline or mobile phone, but the underlying processes are entirely different, and the capabilities far superior. This guide provides a clear, step-by-step explanation of how VoIP technology works and why it is so effective in the business environment.
Step-by-step business VoIP call:
- Turning voice into digital data
- Navigating the local area network
- Navigating wide area networks
- Call reception at destination
- VoIP call management
Understanding how VoIP works
VoIP (Voice-over-Internet-Protocol) is a digital telephony technology that is rapidly replacing traditional phone calls, both landline and mobile.
In fact, approximately half of UK calls already rely on VoIP, and apps like WhatsApp, Microsoft Teams, and Zoom have been using VoIP tech from the start.
VoIP technology works by converting your voice into small packets of digital data and sending them over the internet. Your phone, app, or headset captures speech, converts it into data, and transmits it to the recipient’s device, where it’s reassembled back into audio in real time.
This all happens in milliseconds, making VoIP calls indistinguishable from traditional calls. VoIP is also fully compatible with traditional landlines, mobiles, and other VoIP platforms, delivering a seamless calling experience across the board.
VoIP requires little internet bandwidth, but call quality can rapidly degrade if the connection is unstable or has high latency. For this and other reasons, continued improvements to the UK’s broadband infrastructure are key to full nationwide adoption.
How VoIP works
VoIP calls may seem simple, but behind the scenes, they depend on a series of complex, sensitive operations over the internet. This is particularly true in enterprise deployments, where maintaining both high call quality and uptime is critical.
Here is a detailed, step-by-step breakdown of what happens during a VoIP call, tracing a voice packet from caller to recipient:
1. Turning voice into digital data
The first task of VoIP is to turn voice sounds into digital data. This can be summarised in four steps:
- Voice capture: When a VoIP device (e.g. desk phone, headset, or softphone app) captures speech, the microphone converts the sound waves into analogue electrical signals.
- Voice digitisation: These signals are sampled thousands of times per second and digitised into binary code (1s and 0s) within the device or app.
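- Compression and optimisation: Next, a codec (coder/decoder) such as G.711, G.729, or Opus encodes the audio. Codecs such as G.729 and Opus compress the data heavily to reduce bandwidth while preserving voice quality, whereas G.711 applies only light compression in exchange for simpler processing.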
- Packetisation: The digital audio stream is then split into small, specialised audio packets, typically around 20 milliseconds of voice data per packet. Each packet carries header information for routing and reassembly: a sequence number, timestamp, and codec identifier (payload type), as defined by RTP (Real-Time Transport Protocol), plus the source and destination addresses and ports carried by the underlying IP and UDP headers.
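The packetisation step above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: it assumes G.711 at its standard 8 kHz sampling rate with one byte per sample and 20 ms packets, and the function names and the SSRC stream identifier are made up for the example. Real devices also start the sequence number and timestamp at random values, which is skipped here for readability.

```python
import struct

SAMPLE_RATE_HZ = 8000      # G.711 samples speech 8,000 times per second
BYTES_PER_SAMPLE = 1       # G.711 encodes each sample as one byte
PACKET_MS = 20             # typical packetisation interval

SAMPLES_PER_PACKET = SAMPLE_RATE_HZ * PACKET_MS // 1000   # 160 samples
PAYLOAD_BYTES = SAMPLES_PER_PACKET * BYTES_PER_SAMPLE     # 160 bytes


def build_rtp_packet(payload: bytes, seq: int, timestamp: int,
                     ssrc: int, payload_type: int = 0) -> bytes:
    """Prefix an audio payload with a fixed 12-byte RTP header.

    payload_type 0 is the static RTP payload type for G.711 mu-law (PCMU).
    """
    first_byte = 2 << 6                 # version 2, no padding/extension/CSRC
    second_byte = payload_type & 0x7F   # marker bit left at 0
    header = struct.pack("!BBHII", first_byte, second_byte,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


def packetise(encoded_audio: bytes, ssrc: int = 0x1234ABCD) -> list[bytes]:
    """Split an encoded G.711 stream into 20 ms RTP packets."""
    packets = []
    for offset in range(0, len(encoded_audio), PAYLOAD_BYTES):
        seq = offset // PAYLOAD_BYTES
        timestamp = seq * SAMPLES_PER_PACKET   # RTP timestamps count samples
        chunk = encoded_audio[offset:offset + PAYLOAD_BYTES]
        packets.append(build_rtp_packet(chunk, seq, timestamp, ssrc))
    return packets


one_second = bytes(SAMPLE_RATE_HZ)      # placeholder for one second of audio
packets = packetise(one_second)         # 50 packets of 172 bytes each
```

One second of speech becomes 50 packets, each holding 160 bytes of audio behind a 12-byte RTP header.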
2. Navigating the local area network
VoIP packets leaving the calling device must first pass through the local area network (LAN) before reaching the public internet.
In small business VoIP setups, this journey is usually short and straightforward: packets travel via WiFi directly to the router or firewall.
In enterprise environments, where traffic volumes are higher and often sensitive, voice data is typically given its own ‘lane’ and must traverse several points before exiting to the wider web.
Here is what that journey looks like:
Local VoIP traffic transmission
Audio packets traversing the LAN are transmitted either over WiFi (wireless) or Ethernet cables (wired).
Ethernet is the more stable option, delivering consistent low latency and minimal interference. This makes it ideal for high-demand environments with fixed desk phones, such as call centres and large offices.
WiFi is more convenient but more susceptible to congestion and signal fluctuations. Mesh networks improve reliability by distributing the signal across multiple nodes, reducing VoIP packet dropouts and roaming delays.
Newer standards like WiFi 6 and 6E are further narrowing the gap with better traffic management and lower latency for wireless VoIP, though Ethernet remains more dependable.
Navigating local switches and VLANs
In larger office networks, VoIP packets are often routed through managed network switches that support VLANs (Virtual LANs). VLANs segment voice traffic from general data traffic, effectively creating a dedicated lane for VoIP.
This separation reduces congestion, ensures Quality of Service (QoS) policies can be applied consistently, and improves security by isolating sensitive voice streams from wider network activity.
It is considered a best practice in enterprise environments where both voice performance and VoIP security are critical.
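On the wire, VLAN membership is usually expressed as a 4-byte IEEE 802.1Q tag added to each Ethernet frame. The sketch below shows how that tag is laid out. The VLAN ID of 100 is an arbitrary example, and priority 5 is used because it is a value commonly assigned to voice; actual values depend on how the switches are configured.

```python
import struct

TPID_8021Q = 0x8100   # EtherType value that marks a frame as VLAN-tagged


def build_vlan_tag(vlan_id: int, priority: int = 5) -> bytes:
    """Return the 4-byte 802.1Q tag for a given VLAN ID and priority.

    The tag control field packs a 3-bit priority, a 1-bit drop-eligible
    flag (left at 0 here) and a 12-bit VLAN ID.
    """
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 and 7")
    tci = (priority << 13) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_vlan_tag(tag: bytes) -> tuple[int, int]:
    """Return (vlan_id, priority) from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci & 0x0FFF, tci >> 13


voice_tag = build_vlan_tag(vlan_id=100)   # hypothetical voice VLAN
print(voice_tag.hex())                    # 8100a064
print(parse_vlan_tag(voice_tag))          # (100, 5)
```

A switch reads the VLAN ID to keep voice frames in their own segment, and can use the priority bits to decide which frames to forward first.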
Receiving QoS settings at the router
Once VoIP packets reach the business broadband router, QoS settings (configured manually or via policy) give them priority over less time-sensitive traffic.
This ensures that large file transfers, software updates, or video streaming don’t impact call quality, which is particularly important in shared networks. Advanced routers may also support Dynamic QoS, which adjusts priorities in real time based on current traffic conditions.
Proper QoS implementation is one of the most effective ways to maintain clear, interruption-free VoIP calls, especially in offices with mixed workloads.
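The effect of prioritisation can be illustrated with a toy strict-priority queue. This is only a sketch of the general idea: the traffic classes, their ranking, and the packet names are invented for the example, and real routers use more elaborate schedulers that also stop low-priority traffic from being starved.

```python
import heapq
from itertools import count

# Lower number = served first. The class names and values are illustrative.
PRIORITY = {"voice": 0, "video": 1, "bulk": 2}


class PriorityScheduler:
    """Strict-priority queue: always sends the most urgent packet first."""

    def __init__(self) -> None:
        self._heap = []
        self._arrival = count()   # tie-breaker keeps FIFO order per class

    def enqueue(self, traffic_class: str, packet: str) -> None:
        heapq.heappush(
            self._heap, (PRIORITY[traffic_class], next(self._arrival), packet)
        )

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


scheduler = PriorityScheduler()
scheduler.enqueue("bulk", "backup-chunk-1")
scheduler.enqueue("bulk", "backup-chunk-2")
scheduler.enqueue("voice", "rtp-seq-41")
scheduler.enqueue("video", "video-frame-7")
scheduler.enqueue("voice", "rtp-seq-42")

send_order = [scheduler.dequeue() for _ in range(len(scheduler))]
print(send_order)
# ['rtp-seq-41', 'rtp-seq-42', 'video-frame-7', 'backup-chunk-1', 'backup-chunk-2']
```

Although the backup traffic arrived first, both voice packets leave the queue ahead of it.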
Passing local firewalls
Enterprise firewalls inspect both incoming and outgoing traffic, and will often process VoIP packets quickly without introducing significant latency.
However, misconfigured firewalls can disrupt VoIP traffic. Symptoms include one-way audio, dropped calls, or failed registrations caused by packets being blocked or misrouted.
VoIP traffic is often kept off VPN tunnels where possible, as these can add latency. Instead, businesses typically use Session Border Controllers (SBCs).
An SBC acts as a specialised firewall for VoIP, securing call signalling and media streams, enforcing policies, and protecting the network against threats such as denial-of-service (DoS) attacks or toll fraud.
Entering the internet (Exiting the LAN)
VoIP packets leave the local network via a broadband or Ethernet connection. The type and quality of this connection have a significant impact on VoIP performance.
High stability, reliability, and low latency are more important than raw bandwidth (speed). While each VoIP call uses only around 100 kbps, symmetrical speeds (equal upload/download) help prevent issues like choppy audio or dropped calls, especially during congestion.
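The figure of around 100 kbps per call can be sanity-checked with some simple arithmetic. The sketch below assumes 20 ms packets and counts the IPv4, UDP, and RTP headers that wrap each chunk of audio; the function name is made up for the example, and link-layer framing (such as Ethernet) adds a little more on top.

```python
def call_bandwidth_kbps(codec_kbps: float, packet_ms: int = 20,
                        overhead_bytes: int = 40) -> float:
    """One-way bandwidth of a single call, including packet headers.

    overhead_bytes defaults to IPv4 (20) + UDP (8) + RTP (12) = 40 bytes.
    Link-layer framing such as Ethernet adds more and is not counted here.
    """
    packets_per_second = 1000 / packet_ms
    payload_bytes = codec_kbps * 1000 / 8 / packets_per_second
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second / 1000


g711 = call_bandwidth_kbps(64)   # G.711 encodes audio at 64 kbps
g729 = call_bandwidth_kbps(8)    # G.729 encodes audio at 8 kbps
print(g711, g729)                # 80.0 24.0
```

Under these assumptions a G.711 call needs 80 kbps in each direction before link-layer framing, which is consistent with the rough 100 kbps planning figure.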
For more details on the role of connectivity in VoIP quality, see our section on what makes VoIP work well for businesses.
3. Navigating wide area networks
When VoIP packets leave the local network, they are routed across the internet to reach the recipient device. The path depends on the type of call and the recipient’s location.
Here’s how VoIP packets move in typical VoIP scenarios:
Standard VoIP call to an external recipient
For standard internet-based calls, business VoIP providers route the call directly to the recipient over the public internet. This direct path, bypassing the VoIP server, is chosen to minimise latency.
Providers still orchestrate the call from the cloud using SIP signalling, but this takes place independently of the point-to-point transfer of voice data.
Internal VoIP call between business sites
Internal calls between employees within the same business are usually routed across private business wide area network (WAN) links such as MPLS, point-to-point leased lines or dark fibre. These dedicated routes ensure high call quality and security.
Voice packets of employees working remotely need to be routed over public internet links, but can still be optimised and secured through business SD-WAN solutions that find and route voice data over the optimal path in real time.
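How an SD-WAN product chooses the "optimal" path is vendor-specific. Purely as an illustration of the idea, the sketch below scores each available link on measured latency, jitter, and packet loss, then picks the lowest score. The path names, the measurements, and the weights are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class PathMetrics:
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float


def voice_score(path: PathMetrics) -> float:
    """Lower is better. The weights are arbitrary, chosen for illustration."""
    return path.latency_ms + 2 * path.jitter_ms + 50 * path.loss_pct


def pick_path(paths: list[PathMetrics]) -> PathMetrics:
    """Return the path with the lowest voice score."""
    return min(paths, key=voice_score)


paths = [
    PathMetrics("broadband-a", latency_ms=18, jitter_ms=4, loss_pct=0.1),
    PathMetrics("4g-backup", latency_ms=45, jitter_ms=15, loss_pct=0.5),
    PathMetrics("broadband-b", latency_ms=12, jitter_ms=1, loss_pct=0.0),
]
best = pick_path(paths)
print(best.name)   # broadband-b
```

Re-running the selection as fresh measurements arrive is what lets voice traffic move to a healthier link when the current one degrades.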
In highly regulated sectors such as finance and healthcare, this approach is often required to comply with cybersecurity regulations for protecting sensitive call data. On-premise VoIP servers typically orchestrate these setups.
Group VoIP calls
VoIP is also used for calls with multiple participants, such as conference calls, live presentations with video, or collaborative sessions on shared platforms.
In these cases, VoIP data needs to be routed through a centralised VoIP server (either cloud-based or on-premises) to coordinate communication between all parties.
This centralised routing involves more steps, so performance depends on optimised device-to-cloud connections. Employees connecting from the office generally enjoy better quality than remote participants, as their streams travel through dedicated private Ethernet links to the VoIP servers and back.
4. Call reception at destination
Once the VoIP server has set up the call, the voice packets reach the recipient’s endpoint, whether that’s another VoIP desk phone, a softphone, or a mobile app.
If the call is to a landline or mobile, it will already have been converted into the appropriate format by a VoIP–PSTN gateway before reaching the handset.
Packet reassembly and playback
At VoIP-enabled devices, incoming data packets are collected, ordered, and smoothed by a jitter buffer.
The codec then decodes the digital stream back into audio, which is played instantly through the speaker or headset, making the process indistinguishable from a traditional phone call.
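A jitter buffer can be pictured as a small holding area that releases packets in sequence-number order. The sketch below is a simplified model: the class name and the depth of three packets are assumptions for the example, and real jitter buffers work from timestamps, adapt their depth to network conditions, and handle sequence-number wrap-around.

```python
import heapq


class JitterBuffer:
    """Holds a few packets so they can be played out in sequence order."""

    def __init__(self, depth: int = 3) -> None:
        self.depth = depth   # packets to hold back before releasing any
        self._heap = []      # min-heap ordered by sequence number

    def push(self, seq: int, payload: bytes) -> None:
        heapq.heappush(self._heap, (seq, payload))

    def pop_ready(self) -> list[tuple[int, bytes]]:
        """Release the oldest packets while more than `depth` are held."""
        ready = []
        while len(self._heap) > self.depth:
            ready.append(heapq.heappop(self._heap))
        return ready

    def flush(self) -> list[tuple[int, bytes]]:
        """Release everything that is left, in order (end of call)."""
        ready = []
        while self._heap:
            ready.append(heapq.heappop(self._heap))
        return ready


buffer = JitterBuffer(depth=3)
played = []
for seq in [1, 3, 2, 5, 4, 6]:            # packets arrive out of order
    buffer.push(seq, b"audio-%d" % seq)
    played += buffer.pop_ready()
played += buffer.flush()

order = [seq for seq, _ in played]
print(order)   # [1, 2, 3, 4, 5, 6]
```

The trade-off is that every packet held back adds delay, so a deeper buffer smooths more jitter at the cost of higher latency.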
Handling packet loss and delays
Modern VoIP systems use techniques such as jitter buffers and packet loss concealment (PLC) to keep conversations flowing naturally, even when packets are delayed or lost.
These methods fill in gaps or smooth timing variations so the user hears continuous speech rather than broken audio.
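One of the simplest forms of packet loss concealment is to replay the previous frame at reduced volume whenever a frame is missing. The sketch below shows that approach on lists of raw audio samples. It is a deliberately basic stand-in: the function name and frame size are assumptions, and the concealment built into modern codecs is considerably more sophisticated.

```python
SAMPLES_PER_FRAME = 160   # 20 ms of audio at 8 kHz (assumed frame size)


def conceal_losses(received: dict[int, list[int]], first_seq: int,
                   last_seq: int) -> list[list[int]]:
    """Rebuild a continuous stream of frames, patching any that are missing.

    A missing frame is replaced by the previous output frame at half
    volume, so repeated losses fade towards silence.
    """
    previous = [0] * SAMPLES_PER_FRAME    # start from silence
    output = []
    for seq in range(first_seq, last_seq + 1):
        frame = received.get(seq)
        if frame is None:
            # Packet lost or too late to play: fade out the last frame
            frame = [sample // 2 for sample in previous]
        output.append(frame)
        previous = frame
    return output


received = {
    1: [100] * SAMPLES_PER_FRAME,
    2: [200] * SAMPLES_PER_FRAME,
    4: [400] * SAMPLES_PER_FRAME,         # frame 3 never arrived
}
stream = conceal_losses(received, first_seq=1, last_seq=4)
print(stream[2][:3])   # [100, 100, 100]
```

The listener hears a brief, quieter repeat of the preceding sound instead of a gap.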
5. VoIP call management
VoIP servers (sometimes known as PBX systems) orchestrate the entire journey of the data packets from caller to recipient and back, and host advanced VoIP call management and security features.
Here’s more detail about their roles:
Call setup and control (SIP)
VoIP systems use SIP (Session Initiation Protocol) to handle the instructions that make the call possible. SIP signalling:
- Verifies caller identity
- Sets up and ends the call
- Negotiates which codec to use (e.g. G.711, Opus)
- Coordinates features like call hold, transfer, or forwarding
These instructions are sent to the VoIP devices to ensure the data packets are sent according to the call requirements.
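Codec negotiation is the easiest of these steps to show in code. In practice the codec lists travel inside the SIP messages as a session description offer and answer; the sketch below covers only the selection logic, under the simplifying assumption that the first mutually supported codec in the caller’s preference order wins. The function name and the example codec lists are hypothetical.

```python
from typing import Optional


def negotiate_codec(offered: list[str], supported: set[str]) -> Optional[str]:
    """Return the first codec in the caller's offer that the callee supports.

    `offered` is in the caller's order of preference. Returns None when the
    two sides have no codec in common.
    """
    for codec in offered:
        if codec in supported:
            return codec
    return None


# PCMU and PCMA are the two variants of G.711 (mu-law and A-law)
caller_offer = ["opus", "PCMU", "G729"]
callee_supports = {"PCMU", "PCMA"}

chosen = negotiate_codec(caller_offer, callee_supports)
print(chosen)   # PCMU
```

Here the caller would prefer Opus, but the call falls back to G.711 because that is the best option both devices share.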
Advanced VoIP functionality
VoIP servers also host key functionality that is only available with business internet calling. This includes VoIP features such as:
- Interactive Voice Response (IVR) systems
- Call queuing, forwarding and recording
- VoIP analytics and VoIP integrations
- Hosting of group calls, conference calls, and video presentations
- Hosting Session Border Controllers (SBCs) for secure calls and network compatibility
- A VoIP-PSTN/ISDN gateway, for calls to traditional landlines or mobiles
How VoIP calls connect to landlines and mobiles
Traditional landline and mobile calls run on networks separate from the internet, and therefore from VoIP packets.
Although they often share the same medium (e.g. copper wires for landlines, the airwaves for 4G/5G), they use different standards and protocols.
To enable interoperability between VoIP and landline or mobile calls, VoIP servers use a VoIP gateway. These specialised components translate between the two in real time.
Take, for instance, the packet in the diagram below. Once the VoIP server instructs the call to begin via SIP, VoIP data packets travel to the VoIP server, where they are converted at the gateway into mobile network signals and transmitted to the recipient.
To end the call, the server instructs both the gateway and the device to stop. For both the caller and the recipient, the experience is the same as any regular call.
What makes VoIP work well for businesses
The performance and reliability of VoIP solutions in a business setting depend heavily on the quality of the network they run on. While the underlying technology is efficient, real-world performance is shaped by how the business configures its broadband, LAN, and WAN infrastructure.
Sufficient bandwidth
VoIP is remarkably lightweight. Standard codecs like G.711 or Opus use around 100 kbps (0.1 Mbps) per call. On most business-grade connections, this makes voice traffic a negligible fraction of total bandwidth.
However, when VoIP is bundled into UCaaS platforms with video, screen sharing, or live collaboration, demand can spike. For example:
- A standard video stream can easily use 700 kbps
- A 4K video feed can require 8 Mbps or more
This becomes more of a problem on asymmetrical connections, where upload speeds are significantly lower than download, or when the network is busy with large transfers or backups.
Even though QoS can prioritise calls, businesses should plan for scalable bandwidth to handle growth and concurrent usage without degradation.
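A quick way to plan for concurrent usage is to add up the per-stream figures quoted in this section and compare the total with the upload speed of the connection. The sketch below does exactly that. The 25% headroom and the example workloads are assumptions for illustration, not recommendations.

```python
VOICE_KBPS = 100        # per voice call
VIDEO_KBPS = 700        # per standard video stream
VIDEO_4K_KBPS = 8000    # per 4K video feed (8 Mbps)


def upload_demand_kbps(voice_calls: int, video_streams: int = 0,
                       video_4k_streams: int = 0) -> int:
    """Total upload demand of the given mix of concurrent streams."""
    return (voice_calls * VOICE_KBPS
            + video_streams * VIDEO_KBPS
            + video_4k_streams * VIDEO_4K_KBPS)


def fits(upload_mbps: float, demand_kbps: int, headroom: float = 0.25) -> bool:
    """True if demand fits while keeping `headroom` of the link free."""
    usable_kbps = upload_mbps * 1000 * (1 - headroom)
    return demand_kbps <= usable_kbps


# 20 voice calls plus 5 video streams on a 10 Mbps upload link
typical = upload_demand_kbps(voice_calls=20, video_streams=5)
print(typical, fits(10, typical))        # 5500 True

# The same load plus a single 4K feed
with_4k = upload_demand_kbps(voice_calls=20, video_streams=5,
                             video_4k_streams=1)
print(with_4k, fits(10, with_4k))        # 13500 False
```

Twenty voice calls alone would use only 2 Mbps; it is the video that pushes the link past its limit, which is why upload speed matters on asymmetrical connections.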
Consistently low latency
Latency (i.e. the delay between sending and receiving packets) directly affects voice call quality. VoIP can tolerate up to about 150 ms of one-way latency, but lower is always better.
Even copper-based and low-Earth-orbit satellite broadband connections can meet this threshold, but consistency is key. Business networks can reduce jitter and delay by:
- Implementing business broadband redundancy to ensure failover and load balancing
- Using Ethernet instead of WiFi where possible
- Supporting the latest WiFi standards (e.g. WiFi 6/6E)
- Ensuring strong on-site coverage via mesh networks
- Segmenting VoIP with VLANs
- Prioritising voice traffic using QoS policies
- Using SD-WAN for remote and multi-site optimisation
- Integrating cloud VoIP directly into the WAN
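To check whether measures like these are working, jitter has to be measured. The sketch below follows the running interarrival jitter estimate defined for RTP in RFC 3550, where each new packet nudges the estimate one sixteenth of the way towards the latest change in transit time. For readability it works in milliseconds rather than RTP timestamp units, and the send and receive times are invented for the example.

```python
def interarrival_jitter(send_times_ms: list[float],
                        recv_times_ms: list[float]) -> float:
    """Running jitter estimate in the style of RFC 3550.

    For each consecutive pair of packets, d is the absolute change in
    transit time; the estimate moves 1/16 of the way towards d each time.
    """
    jitter = 0.0
    previous_transit = None
    for sent, received in zip(send_times_ms, recv_times_ms):
        transit = received - sent
        if previous_transit is not None:
            d = abs(transit - previous_transit)
            jitter += (d - jitter) / 16
        previous_transit = transit
    return jitter


sent = [0, 20, 40, 60]                # one packet every 20 ms

steady = interarrival_jitter(sent, [30, 50, 70, 90])
uneven = interarrival_jitter(sent, [30, 58, 70, 90])
print(steady)   # 0.0
print(uneven)   # 0.908203125
```

In the first case every packet takes 30 ms, so jitter is zero even though latency is not. In the second, a single packet delayed by 8 ms is enough to register.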
Appropriate Quality of Service (QoS)
Quality of Service (QoS) configurations on routers and switches ensure that VoIP packets are prioritised over general internet traffic.
This prevents disruptions like choppy or delayed audio when the network is busy with large downloads, cloud syncs, or video conferencing.
In larger networks, VLANs work alongside QoS to isolate VoIP traffic and ensure more precise prioritisation and monitoring.
Strong last-mile connectivity
The last mile (i.e. the final leg between the business premises and the provider’s nearest access point) is often the weakest link in VoIP performance.
Here’s how different last-mile broadband and Ethernet technologies compare:
Best: Carrier-grade Ethernet
Dedicated, fibre-based business Ethernet services are best for VoIP because they offer consistently low latencies and are backed by robust SLAs.
This includes:
- Leased line business broadband, the top fibre-based Ethernet service offered by many business broadband providers
- Variants such as wireless leased lines and Ethernet First Mile (EFM)
Good: Fibre-based business broadband
Shared fibre-based broadband is suitable for small offices, but call quality may suffer from business broadband contention during peak times. Options include:
- Full fibre business broadband, which offers the best and most stable connection of the three
- Cable broadband, the part-coaxial, part-fibre connection offered exclusively by Virgin Media
- SoGEA business broadband, the lowest-tier, part-copper, part-fibre connection, suitable only for micro businesses that use phones infrequently
Least stable: Wireless broadband
Wireless broadband is the least desirable option. Modern 4G, 5G and satellite networks like Starlink or OneWeb-Eutelsat offer low-enough average latencies, but the connections are inherently too unreliable and unstable for enterprise VoIP.
Wireless options include:
- Business mobile broadband, which provides internet access via 4G (LTE) and 5G cellular networks. It is best kept for business broadband failover, as performance can be affected by the weather
- Business satellite broadband, which is suitable for rural or remote sites where no other options are available