Christian M. 5 min read

How does business VoIP work?

VoIP calls may sound indistinguishable from those made on a landline or mobile phone, but the underlying processes are entirely different, and the capabilities far superior. This guide provides a clear, step-by-step explanation of how VoIP technology works and why it is so effective in the business environment.

Step-by-step business VoIP call:

  1. Turning voice into digital data
  2. Navigating the local area network
  3. Navigating wide area networks
  4. Call reception at destination
  5. VoIP call management

Understanding how VoIP works

VoIP (Voice over Internet Protocol) is a digital telephony technology that is rapidly replacing traditional phone calls, both landline and mobile.

In fact, approximately half of UK calls already rely on VoIP, and apps like WhatsApp, Microsoft Teams, and Zoom have been using VoIP tech from the start.

VoIP technology works by converting your voice into small packets of digital data and sending them over the internet. Your phone, app, or headset captures speech, converts it into data, and transmits it to the recipient’s device, where it’s reassembled back into audio in real time.

Diagram of how VoIP works in three steps: 1. Voice capture: Speech is converted into digital data packets. 2. Data transmission: Packets are sent over an internet connection. 3. Data reception and decoding: The recipient’s device reassembles the packets into audio for the call.

This all happens in milliseconds, making VoIP calls indistinguishable from traditional calls. VoIP is also fully compatible with traditional landlines, mobiles, and other VoIP platforms, delivering a seamless calling experience across the board.

VoIP requires little internet bandwidth, but call quality can rapidly degrade if the connection is unstable or has high latency. For this and other reasons, continued improvements to the UK’s broadband infrastructure are key to full nationwide adoption.


How VoIP works

VoIP calls may seem simple, but behind the scenes, they depend on a series of complex, sensitive operations over the internet. This is particularly true in enterprise deployments, where maintaining both high call quality and uptime is critical.

Here is a detailed, step-by-step breakdown of what happens during a VoIP call, tracing a voice packet from caller to recipient:

1. Turning voice into digital data

The first task of VoIP is to turn voice sounds into digital data. This can be summarised in four steps:

  • Voice capture: When a VoIP device (e.g. desk phone, headset, or softphone app) captures speech, the microphone converts the sound waves into analogue electrical signals.
  • Voice digitisation: These signals are sampled thousands of times per second and digitised into binary code (1s and 0s) within the device or app.
  • Compression and optimisation: Next, a codec (coder/decoder) such as G.711, G.729, or Opus processes the audio. It compresses the data to reduce bandwidth while preserving voice quality.
  • Packetisation: The digital audio stream is then split into small audio packets, each typically carrying around 20 milliseconds of voice data. Each packet carries an RTP (Real-time Transport Protocol) header with a sequence number, a timestamp, and a payload type identifying the codec, which the receiver uses for reassembly. The source and destination addresses used for routing sit in the enclosing UDP and IP headers.
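
The packetisation step above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: it assumes G.711 audio sampled at 8 kHz (one byte per sample), and the function names and the SSRC value are hypothetical. The 12-byte header layout follows the RTP fixed header.

```python
import struct

SAMPLE_RATE_HZ = 8000      # G.711 narrowband sampling rate (assumption)
FRAME_MS = 20              # voice carried per packet, as described above
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 samples
PAYLOAD_TYPE_PCMU = 0      # static RTP payload type for G.711 mu-law


def build_rtp_packet(seq, timestamp, ssrc, payload, payload_type=PAYLOAD_TYPE_PCMU):
    """Prepend a 12-byte RTP fixed header to one encoded audio frame."""
    first_byte = 2 << 6                 # version 2, no padding, extension or CSRCs
    second_byte = payload_type & 0x7F   # marker bit left at 0
    header = struct.pack("!BBHII", first_byte, second_byte,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def packetise(encoded_audio, ssrc=0x12345678, first_seq=0):
    """Split G.711 bytes into 20 ms RTP packets (hypothetical helper)."""
    packets = []
    for offset in range(0, len(encoded_audio), SAMPLES_PER_FRAME):
        frame = encoded_audio[offset:offset + SAMPLES_PER_FRAME]
        seq = first_seq + offset // SAMPLES_PER_FRAME
        # The RTP timestamp counts samples, so it advances by 160 per packet
        packets.append(build_rtp_packet(seq, offset, ssrc, frame))
    return packets
```

With these assumptions, 60 ms of audio (480 bytes) becomes three packets of 172 bytes each: 160 bytes of voice plus the 12-byte header.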

Diagram explaining how VoIP converts voice into digital data, showing capture, digitisation, compression with codecs and packetisation into small RTP packets for transmission.


2. Navigating the local area network

VoIP packets leaving the calling device must first pass through the local area network (LAN) before reaching the public internet.

In small business VoIP setups, this journey is usually short and straightforward: packets travel over WiFi or Ethernet directly to the router or firewall.

In enterprise environments, where traffic volumes are higher and often sensitive, voice data is typically given its own ‘lane’ and must traverse several points before exiting to the wider web.

Here is what that journey looks like:

Diagram explaining how VoIP packets travel through a local area network, showing WiFi or Ethernet connections, VLAN segmentation, router and firewall handling with QoS, and the impact of broadband or Ethernet uplink quality on VoIP performance.

Local VoIP traffic transmission

Audio packets traversing the LAN are transmitted either over WiFi (wireless) or Ethernet cables (wired).

Ethernet is the more stable option, delivering consistent low latency and minimal interference. This makes it ideal for high-demand environments with fixed desk phones, such as call centres and large offices.

WiFi is more convenient but more susceptible to congestion and signal fluctuations. Mesh networks improve reliability by distributing the signal across multiple nodes, reducing VoIP packet dropouts and roaming delays.

Newer standards like WiFi 6 and 6E are further narrowing the gap with better traffic management and lower latency for wireless VoIP, though Ethernet remains more dependable.

Navigating local switches and VLANs

In larger office networks, VoIP packets are often routed through managed network switches that support VLANs (Virtual LANs). VLANs segment voice traffic from general data traffic, effectively creating a dedicated lane for VoIP.

This separation reduces congestion, ensures Quality of Service (QoS) policies can be applied consistently, and improves security by isolating sensitive voice streams from wider network activity.

It is considered a best practice in enterprise environments where both voice performance and VoIP security are critical.
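
To make the VLAN idea concrete, the sketch below builds the 4-byte 802.1Q tag that a switch or phone adds to an Ethernet frame to place it in a voice VLAN. The VLAN ID (100) and priority (5) used in the example are illustrative assumptions, not values from any particular deployment, and the function names are hypothetical.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def build_vlan_tag(vlan_id, priority):
    """Return the 4-byte 802.1Q tag for a VLAN ID (1-4094) and priority (0-7)."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 and 7")
    # Tag control field: priority (3 bits), drop-eligible flag = 0 (1 bit), VLAN ID (12 bits)
    tci = (priority << 13) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_vlan_tag(tag):
    """Decode a 4-byte 802.1Q tag back into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    return {"tpid": tpid, "priority": tci >> 13, "vlan_id": tci & 0x0FFF}


voice_tag = build_vlan_tag(vlan_id=100, priority=5)  # hypothetical voice VLAN
```

Because the tag carries both a VLAN ID and a priority value, a managed switch can separate voice frames from data frames and queue them ahead of other traffic.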

Receiving QoS settings at the router

Once VoIP packets reach the business broadband router, QoS settings (configured manually or via policy) give them priority over less time-sensitive traffic.

This ensures that large file transfers, software updates, or video streaming don’t impact call quality, particularly important in shared networks. Advanced routers may also support Dynamic QoS, which adjusts priorities in real time based on current traffic conditions.

Proper QoS implementation is one of the most effective ways to maintain clear, interruption-free VoIP calls, especially in offices with mixed workloads.
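
The effect of prioritisation can be illustrated with a strict-priority queue. This is a simplified sketch: real routers use hardware queues and more nuanced schedulers, and the traffic classes and their ranking here are assumptions made for illustration.

```python
import heapq
import itertools

# Hypothetical traffic classes; a lower number is sent sooner
PRIORITY = {"voice": 0, "video": 1, "bulk": 2}


class PriorityScheduler:
    """Sends the highest-priority packet first, oldest first within a class."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves arrival order

    def enqueue(self, traffic_class, packet):
        entry = (PRIORITY[traffic_class], next(self._arrival), packet)
        heapq.heappush(self._heap, entry)

    def dequeue(self):
        _, _, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self):
        return len(self._heap)


scheduler = PriorityScheduler()
scheduler.enqueue("bulk", "backup-1")
scheduler.enqueue("voice", "rtp-1")
scheduler.enqueue("bulk", "backup-2")
scheduler.enqueue("voice", "rtp-2")
sent = [scheduler.dequeue() for _ in range(len(scheduler))]
```

Here both voice packets leave before either backup packet, even though a backup packet arrived first.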

Passing local firewalls

Enterprise firewalls inspect both incoming and outgoing traffic, and will often process VoIP packets quickly without introducing significant latency.

However, misconfigured firewalls can disrupt VoIP traffic. Symptoms include one-way audio, dropped calls, or failed registrations caused by packets being blocked or misrouted.

VoIP traffic is often kept out of VPN tunnels, as these can add latency and overhead. Instead, businesses typically use Session Border Controllers (SBCs).

An SBC acts as a specialised firewall for VoIP, securing call signalling and media streams, enforcing policies, and protecting the network against threats such as denial-of-service (DoS) attacks or toll fraud.

Entering the internet (Exiting the LAN)

VoIP packets leave the local network via a broadband or Ethernet connection. The type and quality of this connection have a significant impact on VoIP performance.

High stability, reliability, and low latency are more important than raw bandwidth (speed). While each VoIP call uses only around 100 kbps, symmetrical speeds (equal upload/download) help prevent issues like choppy audio or dropped calls, especially during congestion.

For more details on the role of connectivity in VoIP quality, see our section on what makes VoIP work well for businesses.


3. Navigating wide area networks

When VoIP packets leave the local network, they are routed hop by hop across the internet to reach the recipient device. The path is not fixed in advance and depends on the type of call and the recipient’s location.

Here’s how VoIP packets move in typical VoIP scenarios:

Standard VoIP call to an external recipient

For standard internet-based calls, business VoIP providers route the call directly to the recipient over the public internet. This direct path, bypassing the VoIP server, is chosen to minimise latency.

Providers still orchestrate the call from the cloud using SIP signalling, but this takes place independently of the point-to-point transfer of voice data.

Internal VoIP call between business sites

Internal calls between employees within the same business are usually routed across private business wide area network (WAN) links such as MPLS, point-to-point leased lines or dark fibre. These dedicated routes ensure high call quality and security.

Voice packets of employees working remotely need to be routed over public internet links, but can still be optimised and secured through business SD-WAN solutions that find and route voice data over the optimal path in real time.
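
As a rough illustration of path selection, the sketch below picks the best link for voice from a set of measured paths. It is not how any specific SD-WAN product works: the scoring rule, the 1% loss limit, and all names are assumptions, while the 150 ms latency limit is the tolerance discussed later in this guide.

```python
from dataclasses import dataclass


@dataclass
class PathStats:
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float


def pick_voice_path(paths, max_latency_ms=150.0, max_loss_pct=1.0):
    """Choose the usable path with the lowest latency plus jitter."""
    usable = [p for p in paths
              if p.latency_ms <= max_latency_ms and p.loss_pct <= max_loss_pct]
    if not usable:
        # Nothing meets the targets: fall back to the least lossy path
        return min(paths, key=lambda p: (p.loss_pct, p.latency_ms))
    return min(usable, key=lambda p: p.latency_ms + p.jitter_ms)


# Hypothetical measurements for three links at a remote site
links = [
    PathStats("fibre", latency_ms=18, jitter_ms=3, loss_pct=0.1),
    PathStats("4g", latency_ms=45, jitter_ms=20, loss_pct=0.5),
    PathStats("dsl", latency_ms=30, jitter_ms=5, loss_pct=2.5),
]
best = pick_voice_path(links)
```

In practice the measurements would be refreshed continuously, so the chosen path can change during the working day as link conditions change.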

In highly regulated sectors such as finance and healthcare, this approach is often required to comply with cybersecurity regulations for protecting sensitive call data. On-premise VoIP servers typically orchestrate these setups.

Group VoIP calls

VoIP is also used for calls with multiple participants, such as conference calls, live presentations with video, or collaborative sessions on shared platforms.

In these cases, VoIP data needs to be routed through a centralised VoIP server (either cloud-based or on-premises) to coordinate communication between all parties.

This centralised routing involves more steps, so performance depends on optimised device-to-cloud connections. Employees connecting from the office generally enjoy better quality than remote participants, as their streams travel through dedicated private Ethernet links to the VoIP servers and back.


4. Call reception at destination

Once the VoIP server has set up the call, the voice packets reach the recipient’s endpoint, whether that’s another VoIP desk phone, a softphone, or a mobile app.

If the call is to a landline or mobile, it will already have been converted into the appropriate format by a VoIP–PSTN gateway before reaching the handset.

Packet reassembly and playback

At VoIP-enabled devices, incoming data packets are collected, ordered, and smoothed by a jitter buffer.

The codec then decodes the digital stream back into audio, which is played instantly through the speaker or headset, making the process indistinguishable from a traditional phone call.

Handling packet loss and delays

Modern VoIP systems use techniques such as jitter buffers and packet loss concealment (PLC) to keep conversations flowing naturally, even when packets are delayed or lost.

These methods fill in gaps or smooth timing variations so the user hears continuous speech rather than broken audio.
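
The two techniques can be sketched together as follows. This is a deliberately naive version: it conceals a lost packet by repeating the previous frame, whereas real codecs use more sophisticated concealment, and it ignores sequence number wraparound. The class name and buffer depth are assumptions.

```python
class JitterBuffer:
    """Reorders frames by sequence number and conceals gaps."""

    def __init__(self, depth=3):
        self.depth = depth       # frames to hold before playback starts
        self.frames = {}         # sequence number -> audio frame
        self.next_seq = None     # next sequence number due for playback
        self.last_frame = b""    # most recent frame played, reused for concealment

    def push(self, seq, frame):
        """Store an arriving frame unless its playback slot has already passed."""
        if self.next_seq is None or seq >= self.next_seq:
            self.frames[seq] = frame

    def ready(self):
        return len(self.frames) >= self.depth

    def pop(self):
        """Return the next frame in order, repeating the last one if it is missing."""
        if self.next_seq is None:
            if not self.frames:
                return self.last_frame
            self.next_seq = min(self.frames)  # start from the earliest frame held
        frame = self.frames.pop(self.next_seq, None)
        if frame is None:
            frame = self.last_frame           # naive packet loss concealment
        self.next_seq += 1
        self.last_frame = frame
        return frame


buffer = JitterBuffer(depth=3)
buffer.push(2, b"b")   # arrives out of order
buffer.push(1, b"a")
buffer.push(4, b"d")   # packet 3 never arrives
played = [buffer.pop() for _ in range(4)]
```

The listener hears frames a, b, b, d: the out-of-order packets are put back in sequence and the missing third frame is covered by a repeat of the second.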


5. VoIP call management

VoIP servers (sometimes known as PBX systems) orchestrate the entire journey of the data packets from caller to recipient and back, and host advanced VoIP call management and security features.

Here’s more detail about their roles:

Call setup and control (SIP)

VoIP systems use SIP (Session Initiation Protocol) for the signalling that makes the call possible. SIP:

  • Verifies caller identity
  • Sets up and ends the call
  • Negotiates which codec to use (e.g. G.711, Opus), using a session description (SDP) carried in the SIP messages
  • Coordinates features like call hold, transfer, or forwarding

These instructions are sent to the VoIP devices to ensure the data packets are sent according to the call requirements.
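
For illustration, the sketch below assembles the kind of SIP INVITE request that starts a call, with a session description offering Opus and G.711. Every address, port, tag, and identifier in it is a made-up example value, the Opus payload type (111) is an arbitrary choice from the dynamic range, and the helper names are hypothetical. A real SIP stack also handles authentication, responses, and retransmissions.

```python
def build_sdp_offer(ip, rtp_port):
    """Describe the audio stream on offer: where to send it and which codecs."""
    lines = [
        "v=0",
        f"o=- 0 0 IN IP4 {ip}",
        "s=-",
        f"c=IN IP4 {ip}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 111 0",   # preferred codec listed first
        "a=rtpmap:111 opus/48000/2",
        "a=rtpmap:0 PCMU/8000",                # G.711 mu-law
    ]
    return "\r\n".join(lines) + "\r\n"


def build_invite(caller, callee, local_ip, rtp_port, call_id, branch, tag):
    """Assemble a minimal INVITE request with the SDP offer as its body."""
    body = build_sdp_offer(local_ip, rtp_port)
    user = caller.split("@")[0]
    headers = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:5060;branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag={tag}",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{local_ip}:5060>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(body.encode('utf-8'))}",
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + body


invite = build_invite("alice@example.com", "bob@example.com", "192.0.2.10",
                      40000, "a84b4c76e66710", "z9hG4bK776asdhds", "1928301774")
```

The recipient’s answer lists the codecs it accepts from this offer, which is how both sides settle on a codec before any voice packets flow.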

Advanced VoIP functionality

VoIP servers also host key functionality that is only available to business internet calling. This includes VoIP features such as:


How VoIP calls connect to landlines and mobiles

Traditional landline and mobile calls run on networks separate from the internet, and therefore from VoIP packets.

Although they often share the same medium (e.g. copper wires for landlines, the airwaves for 4G/5G), they use different standards and protocols.

To enable interoperability between VoIP and landline or mobile calls, VoIP servers use a VoIP gateway. These specialised components translate between the two in real time.

Take, for instance, the packet in the diagram below. Once the VoIP server instructs the call to begin via SIP, VoIP data packets travel to the VoIP server, where they are converted at the gateway into mobile network signals and transmitted to the recipient.

Diagram showing how a VoIP call is connected to a mobile network through VoIP servers, SIP signalling, and a VoIP–PSTN gateway.

To end the call, the server instructs both the gateway and the device to stop. For both the caller and the recipient, the call feels like any regular call.


What makes VoIP work well for businesses

The performance and reliability of VoIP solutions in a business setting depend heavily on the quality of the network they run on. While the underlying technology is efficient, real-world performance is shaped by how the business configures its broadband, LAN, and WAN infrastructure.

Sufficient bandwidth

VoIP is remarkably lightweight. Even an uncompressed codec like G.711 uses under 100 kbps (0.1 Mbps) per call once packet overhead is included, and compressed codecs such as Opus typically use less. On most business-grade connections, this makes voice traffic a negligible fraction of total bandwidth.
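
The per-call figure can be checked with some simple arithmetic. The sketch below assumes 20 ms packets sent over IPv4 and Ethernet, so each packet carries 58 bytes of headers on top of the audio. The function name and the 24 kbps Opus bitrate are illustrative assumptions.

```python
def call_bandwidth_kbps(codec_kbps=64, frame_ms=20, overhead_bytes=58):
    """Approximate one-way bandwidth of a single call on Ethernet.

    overhead_bytes = 12 (RTP) + 8 (UDP) + 20 (IPv4) + 18 (Ethernet header and FCS)
    """
    packets_per_second = 1000 / frame_ms
    payload_bytes = codec_kbps * 1000 / 8 * frame_ms / 1000
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second / 1000


g711 = round(call_bandwidth_kbps(codec_kbps=64), 1)   # 87.2 kbps
opus = round(call_bandwidth_kbps(codec_kbps=24), 1)   # 47.2 kbps at an assumed 24 kbps bitrate
```

G.711 produces 64 kbps of audio, but the headers on each of the 50 packets per second lift the total to roughly 87 kbps in each direction, which is where the "around 100 kbps" planning figure comes from.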

However, when VoIP is bundled into UCaaS platforms with video, screen sharing, or live collaboration, demand can spike. For example:

  • A standard video stream can easily use 700 kbps
  • A 4K video feed can require 8 Mbps or more

This becomes more of a problem on asymmetrical connections, where upload speeds are significantly lower than download, or when the network is busy with large transfers or backups.

Even though QoS can prioritise calls, businesses should plan for scalable bandwidth to handle growth and concurrent usage without degradation.

Consistently low latency

Latency (i.e. the delay between sending and receiving packets) directly affects voice call quality. VoIP can tolerate up to about 150 ms of one-way latency, but lower is always better.

Even copper-based and low-Earth-orbit satellite broadband connections can usually meet this threshold (geostationary satellite links generally cannot), but consistency is key. Business networks can reduce jitter and delay by:

  • Implementing business broadband redundancy to ensure failover and load balancing
  • Using Ethernet instead of WiFi where possible
  • Supporting the latest WiFi standards (e.g. WiFi 6/6E)
  • Ensuring strong on-site coverage via mesh networks
  • Segmenting VoIP with VLANs
  • Prioritising voice traffic using QoS policies
  • Using SD-WAN for remote and multi-site optimisation
  • Integrating cloud VoIP directly into the WAN
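
Consistency of latency is usually measured as jitter. The sketch below uses the running interarrival jitter estimate defined in the RTP specification (RFC 3550), which smooths the change in transit time between consecutive packets. The function name and the sample timings are assumptions for illustration.

```python
def interarrival_jitter_ms(send_times_ms, arrival_times_ms):
    """Running jitter estimate from RFC 3550: J += (|D| - J) / 16."""
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        transit_prev = arrival_times_ms[i - 1] - send_times_ms[i - 1]
        transit_curr = arrival_times_ms[i] - send_times_ms[i]
        change = abs(transit_curr - transit_prev)   # D: variation in transit time
        jitter += (change - jitter) / 16
    return jitter


sent_at = [0, 20, 40, 60]                    # one packet every 20 ms
steady = interarrival_jitter_ms(sent_at, [50, 70, 90, 110])
bumpy = interarrival_jitter_ms(sent_at, [50, 78, 90, 110])
```

Both links have a similar average delay of about 50 ms, but only the second shows jitter, because one packet arrived 8 ms late. That variation, rather than the average, is what the measures in the list above aim to reduce.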

Appropriate Quality of Service (QoS)

Quality of Service (QoS) configurations on routers and switches ensure that VoIP packets are prioritised over general internet traffic.

This prevents disruptions like choppy or delayed audio when the network is busy with large downloads, cloud syncs, or video conferencing.

In larger networks, VLANs work alongside QoS to isolate VoIP traffic and ensure more precise prioritisation and monitoring.
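
One common way for routers and switches to recognise voice packets is a DSCP marking in the IP header. The sketch below shows how an application could request the Expedited Forwarding class (DSCP 46), which is conventionally used for voice media, on a UDP socket. It is a sketch under assumptions: the function names are hypothetical, some operating systems ignore or restrict this socket option, and the marking only helps if the network is configured to trust and act on it.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, conventionally used for voice media


def dscp_to_tos(dscp):
    """Shift a 6-bit DSCP value into the upper bits of the IP TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_marked_udp_socket(dscp=DSCP_EF):
    """Create a UDP socket whose outgoing IPv4 packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

Many VoIP phones and softphones apply a marking like this themselves, so the QoS policy on the router or switch only needs to match on it.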

Strong last-mile connectivity

The last mile (i.e. the final leg between the business premises and the provider’s nearest access point) is often the weakest link in VoIP performance.

Here’s how different last mile broadband and Ethernet technologies compare:

Best: Carrier-grade Ethernet

Dedicated, fibre-based business Ethernet services are best for VoIP because they offer consistently low latencies and are backed by robust SLAs.

This includes:

Good: Fibre-based business broadband

Shared broadband is suitable for small offices, but it may experience issues due to business broadband contention during peak times. This includes:

Least stable: Wireless broadband

Wireless broadband is the least desirable. Modern 4G, 5G and satellite networks like Starlink or OneWeb-Eutelsat offer low-enough average latencies, but the connections are inherently too unreliable and unstable for enterprise VoIP.

Wireless options include:
