Voice Over IP
The MGU provides Voice over IP according to the RTP and SRTP (secure VoIP) protocols. The MGU also supports DTMF relay in VoIP channels according to RFC 2833/RFC 4733.
VoIP channels are used to convert media between SIP/H.323 terminals and circuit-switched devices, and also to interconnect media gateways (“inter-GW calls”). Thus, a “circuit-switched” call between, for example, two analogue phones in two different gateways may take a path over VoIP channels, which adds a round-trip delay of about 100 ms that might need to be considered.
The VoIP channels in the MGU are dynamic resources in the MSP, and the amount of available resources depends on the actual MSP load and the configuration of each channel. The MSP load for a particular VoIP channel varies considerably depending on the choice of codec, packetization interval, VAD/CNG, echo canceler settings, crypto-suite, etc. For instance, use of VAD/CNG significantly reduces DSP load (and network bandwidth), whereas use of a crypto-suite significantly increases load.
Furthermore, the selection of codec, packetization interval and crypto-suite also has a great impact on speech latency (the packetization interval in particular). In general, latency-sensitive installations should consider a smaller packetization interval. Note that latency affects perceived audio quality as well, especially when echo is present.
Note: VoIP channels are also used for inter-GW media (links between Media Gateways), and these are not configured in the same way as, for example, SIP/H.323 endpoints.
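The effect of codec and packetization interval on payload size and network bandwidth can be illustrated with a little arithmetic. The following is a minimal sketch, not MGU code: it assumes IPv4 without link-layer framing, a 12-byte RTP header, and no SRTP authentication tag, and the function names are hypothetical. The packetization interval is also a lower bound on the delay added by the sender, since a packet cannot be sent before its audio has been collected.

```python
# Header overhead per RTP packet (assumption: IPv4, no link-layer framing):
# 20 bytes IPv4 + 8 bytes UDP + 12 bytes RTP (no CSRC list, no extension).
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12

# Nominal codec bit rates in bit/s.
CODEC_BITRATE = {"G.711": 64_000, "G.729": 8_000}


def payload_bytes(codec: str, ptime_ms: int) -> int:
    """Return the codec payload size in bytes for one packet of ptime_ms."""
    bits = CODEC_BITRATE[codec] * ptime_ms // 1000
    return bits // 8


def ip_bandwidth_bps(codec: str, ptime_ms: int) -> float:
    """Return the one-way IP-layer bandwidth in bit/s for a continuous stream."""
    packets_per_second = 1000 / ptime_ms
    packet_bytes = payload_bytes(codec, ptime_ms) + IP_UDP_RTP_HEADER_BYTES
    return packet_bytes * 8 * packets_per_second


if __name__ == "__main__":
    for codec in CODEC_BITRATE:
        for ptime in (10, 20, 40):
            kbps = ip_bandwidth_bps(codec, ptime) / 1000
            print(f"{codec} ptime={ptime} ms: "
                  f"payload={payload_bytes(codec, ptime)} B, "
                  f"bandwidth={kbps:.1f} kbit/s")
```

With these assumptions, halving the packetization interval from 20 ms to 10 ms reduces the packetization delay by 10 ms but doubles the packet rate, so the fixed header overhead weighs more heavily, particularly for a low-rate codec such as G.729.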
RTP
The Media Stream Processor encodes the PCM audio data from TDM timeslots (from the TDM switch) into packets sent to the streaming interfaces, and decodes packets from the streaming interfaces into PCM audio on the outgoing TDM line.
The same audio coding (codec) standard is used for both the encoder and the decoder.
The MGU supports Voice Activity Detection (VAD) and Comfort Noise Generation (CNG) for all supported codecs. The VAD function can be fine-tuned to emphasize either bandwidth saving or audio quality. Too aggressive bandwidth saving might cause audible artifacts such as choppy speech.
The MGU supports configuring the update rate of the comfort noise payload. The update rate can be configured to be between 0 and 60 seconds, where 0 means that the CN packet is sent only when there is a significant change in the background noise characteristic. For information about configuring the update rate, see VoIP.
The MGU supports the following codecs:
- G.711 A- and µ-law, Appendix I (PLC) and II (VAD/CNG).
- G.729a with G.729 annex B (VAD/CNG).
- Clear Channel (bitwise exact transfer between TDM and IP, without echo canceler). Clear Channel is intended only for data or fax traffic. It shall never be used for voice, where echo canceling might be needed. Clear Channel uses a dynamic payload type (PT in the RTP header) as set by the MX-ONE Service Node from end-point negotiation.
Codec, PLC and VAD/CNG can be selected on a per-call basis by the MX-ONE Service Node based on SIP/H.323 media negotiation. The MGU does not take part in any media negotiation beyond reporting its capabilities to the MX-ONE Service Node. Use of VAD/CNG ("Silence Suppression") may lower DSP load and thus increase channel density (and also lower network load). On the other hand, VAD/CNG might affect voice quality, so its use is a trade-off between density and voice quality. Unless the higher density or bandwidth saving is required, disabling VAD/CNG is recommended for best voice quality.
The MGU also supports Fax Pass Through. When modem or fax tones are detected on the TDM side, the MGU automatically switches to Pass Through mode using a predefined configuration that sets the RTP channel to the G.711 codec and a fixed jitter buffer, so that modem and fax data can be relayed.
The MGU allocates port numbers dynamically for RTP and RTCP from a port range that can be set by the MX-ONE Service Node. The RTCP port number is always the RTP port number + 1. Each newly allocated port pair uses the next higher port numbers after the previously allocated pair. When the highest configured port number is reached, allocation starts again from the lowest.
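The allocation scheme described above can be sketched as a simple round-robin allocator. This is an illustration only, not the MGU implementation: the class and method names are hypothetical, and the handling of pairs that are still in use (they are skipped) is an assumption that the text does not state.

```python
from __future__ import annotations


class RtpPortAllocator:
    """Hands out (RTP, RTCP) port pairs in ascending order, wrapping around."""

    def __init__(self, lowest: int, highest: int) -> None:
        if highest - lowest < 1:
            raise ValueError("range must hold at least one RTP/RTCP pair")
        self._lowest = lowest
        self._highest = highest
        self._next = lowest      # RTP port of the next pair to try
        self._in_use = set()     # RTP ports of pairs currently allocated

    def allocate(self) -> tuple[int, int]:
        """Return the next free (rtp_port, rtcp_port) pair."""
        pairs_in_range = (self._highest - self._lowest + 1) // 2
        for _ in range(pairs_in_range):
            rtp = self._next
            # Advance to the next higher pair; wrap when it no longer fits.
            self._next += 2
            if self._next + 1 > self._highest:
                self._next = self._lowest
            if rtp not in self._in_use:  # assumption: busy pairs are skipped
                self._in_use.add(rtp)
                return rtp, rtp + 1      # RTCP port is always RTP port + 1
        raise RuntimeError("no free RTP/RTCP port pair")

    def release(self, rtp_port: int) -> None:
        """Mark the pair starting at rtp_port as free again."""
        self._in_use.discard(rtp_port)
```

For example, with a configured range of 5000 to 5005, three successive allocations return (5000, 5001), (5002, 5003) and (5004, 5005), after which the allocator wraps around to 5000 once that pair has been released.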
DTMF detection and relay in RTP channels
Each RTP channel provides a DTMF detection and relay feature. In-band DTMF tones may be detected on the incoming TDM side and relayed to the packet side in one of three ways:
- Transparent: DTMF tones are passed as tones in the codec. This option is only useful when the codec is G.711.
- As named telephone events (NTE) according to RFC 2833 / RFC 4733. In this mode in-band DTMF tones are removed from the TDM side and converted to events at the IP side.
- Not relayed at all. Detected DTMF tones will be removed from the TDM side.
The selection of detection and relay mode is made by MX-ONE Service Node based on e.g. SIP/H.323 negotiation with remote end-point/gateway or use-case. Note that DTMF detection can be enabled or disabled by MX-ONE Service Node, regardless of relay mode selected.
Note: When the RTP channel removes a DTMF tone, there might be a leakage of less than 20 ms, which may be audible but is not detectable by any standard-compliant DTMF receiver. "Complete removal" can be enabled by configuration at the expense of longer channel latency (see Configuration). However, complete removal only removes qualified DTMF digits completely.
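To illustrate what a named telephone event looks like on the IP side, the sketch below encodes and decodes the 4-byte telephone-event payload defined in RFC 4733 (event code, end bit, reserved bit, volume, duration). It shows the payload format only and makes no claim about how the MGU generates these packets; the function names are hypothetical.

```python
import struct

# RFC 4733 event codes for DTMF: 0-9 for digits, 10 for *, 11 for #, 12-15 for A-D.
DTMF_EVENT_CODES = {
    **{str(d): d for d in range(10)},
    "*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15,
}


def encode_telephone_event(digit: str, duration: int,
                           volume: int = 10, end: bool = False) -> bytes:
    """Build the 4-byte telephone-event payload.

    duration is in RTP timestamp units (8 units per ms at an 8000 Hz clock);
    volume is the tone level in -dBm0 (0..63).
    """
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    event = DTMF_EVENT_CODES[digit.upper()]
    flags = (0x80 if end else 0x00) | volume  # E bit, R bit = 0, 6-bit volume
    return struct.pack("!BBH", event, flags, duration)


def decode_telephone_event(payload: bytes) -> tuple:
    """Return (event, end, volume, duration) from a telephone-event payload."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return event, bool(flags & 0x80), flags & 0x3F, duration
```

A 100 ms digit "5" at -10 dBm0 with the end bit set would be encoded as `encode_telephone_event("5", duration=800, end=True)`, which yields the bytes 05 8A 03 20.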
Secure RTP (SRTP)
See Security.
Jitter Buffer
RTP packets sent over the IP network are subject to random variation in delay, out-of-sequence arrival, and the risk of being dropped. These artifacts decrease audio quality, and the jitter buffer is used to mitigate them. However, the jitter buffer improves audio quality at the cost of increased voice delay. Long delays, especially in combination with echo at the far end (e.g. caused by 2-to-4 wire hybrids in analogue lines), make echoes more noticeable and disturbing. It may be tempting to minimize delays in echo situations (if the source of the echo cannot be removed), but reduced voice quality due to, for example, dropped packets might have a negative impact on the echo canceler. See Echo Canceler.
In MGU, the jitter buffer can be configured in adaptive or non-adaptive mode, and there are configuration parameters to adjust for actual network conditions.
Note: Configuration is per MGU and will affect all VoIP calls in that MGU, including inter-GW media over IP.
The configuration of the jitter buffer is a trade-off between audio quality and delay. By default, the jitter buffer in the MGU is adaptive, with settings for a fairly "normal" network, and favors preserving audio quality over minimizing delay. For a very delay-sensitive installation, where some audio quality can be traded away and/or the network is very good, re-configuration might be considered.
Although the jitter buffer is primarily intended to compensate for artifacts caused by the network, VoIP endpoints (phones, gateways, proxies, etc.) are also part of the network and can cause them. For example, soft SIP clients with no dedicated hardware (e.g. a DSP) for VoIP media will have substantially more jitter in outgoing RTP packets than hardware-based endpoints. This can cause the jitter buffer to grow and thus increase the delay even further. In these and similar scenarios, the delay through the MGU may appear longer than expected.
For non-voice calls over RTP, i.e. fax/modem, a non-adaptive jitter buffer is recommended. This is set automatically if the Clear Channel codec has been selected or if fax/modem tones have been detected in a voice call.
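When comparing endpoints, it can help to quantify the jitter in a captured RTP stream. The sketch below computes the interarrival jitter estimate defined in RFC 3550 (the value reported in RTCP receiver reports). It is not the MGU's adaptive jitter buffer algorithm, which is not described here, and the function name is hypothetical. Arrival times must be expressed in the same units as the RTP timestamps (for an 8000 Hz clock, 160 units per 20 ms).

```python
def interarrival_jitter(arrivals, rtp_timestamps):
    """Return the RFC 3550 interarrival jitter estimate in RTP timestamp units.

    For each packet the transit time is arrival - timestamp. The jitter is a
    running average of the absolute change in transit time between
    consecutive packets: J += (|D| - J) / 16.
    """
    jitter = 0.0
    prev_transit = None
    for arrival, timestamp in zip(arrivals, rtp_timestamps):
        transit = arrival - timestamp
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter
```

A perfectly paced stream gives an estimate of 0. A stream whose third packet arrives 80 units (10 ms at 8000 Hz) late, for example `interarrival_jitter([0, 160, 400], [0, 160, 320])`, gives 5.0 units after that packet, and the estimate keeps growing if the sender continues to emit packets irregularly.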
Echo Canceler
The Echo Canceler (EC) eliminates the possible echo of the send signal from the return signal (see figure below). Echo is normally caused by reflections in the transition from 2 to 4 wires, but acoustic echo in telephones can also occur. The EC is only used for calls over a packet-switched network (VoIP), as depicted in the figure below.
The EC has various configuration options, which are described further on as well as in Configuration.
Note: Configuration is per MGU and will affect all VoIP calls in that MGU, including inter-GW media over IP. Thus, the choice of EC settings might have to be a compromise.
The reason for using the EC on VoIP calls is that echo in combination with the (long) delays caused by a packet-switched network is much more disturbing than echo with little or no delay, as can be expected in the circuit-switched network (TDM).
How the user on the packet side perceives echo depends on the combination of echo source, echo canceler operation and packet network delays, including delays in media gateways (encoding, decoding, jitter buffers), the network (switches and routers) and endpoints (encoding, decoding, jitter buffers).
The MGU supports two EC types for VoIP GW calls: Standard EC and Dual-Filter EC (DFEC). The EC type and settings strongly affect the MSP load.
Note: Changing EC type will cause a restart of the Media Control Application (MCA) and the MSP.
The EC also includes a Non-Linear Processor (NLP), a sub-function that handles the non-linear part of the residual echo that the linear filter of the EC cannot cancel. By default, the NLP is disabled, and enabling it is recommended only when needed.
When the NLP is enabled and engaged, it generates comfort noise (CNG) towards IP. It can also be configured to generate silence if comfort noise is not wanted.
Standard EC
The Standard EC echo tail length can be configured from 8 to 128 ms in 8 ms steps, and the filter window is fixed at 24 ms. The advantage of this sparse EC is MIPS savings, which correspond to higher channel density. The Standard EC can be improved by enabling Echo Path Change Detection (EPCD), which improves adaptation of the filter window position.
Note: Setting EPCD will cause a restart of the Media Control Application.
Dual-Filter EC (DFEC)
The DFEC echo tail length can be configured from 8 to 128 ms in 8 ms steps. DFEC advantages:
- Avoids increased echo level caused by filter divergence during double-talk.
- Robust and fast detection of echo path changes.
The main disadvantage is that the DFEC reduces DSP channel density; see section Capacity and Limitations.
Standards Compliance
- The Standard EC without Echo Path Change Detection (EPCD) enabled is compliant with G.168/2002.
- The Standard EC with EPCD enabled is G.168/2004 compliant for a single reflection with echo dispersion less than 24 ms and echo tail span less than 128 ms.
- The DFEC is G.168/2004 compliant for echo span up to 128 ms (highest complexity).
A few recommendations in case of echo problems
Analyze the traffic scenario to make sure the echo canceler is turned on. Of special interest is the media path where the transition from IP to TDM takes place (for instance, the Media Gateway where the call is routed to/from a public ISDN trunk). Check that the Clear Channel (ClearMode) codec, which always has the EC turned off, is not used in RTP.
Turn on the NLP (Non-Linear Processor, a sub-function of the EC). By default NLP is disabled. The NLP will be able to take care of non-linear residual echoes that the linear filter cannot.
Turn on the DFEC (Dual-Filter EC). This is a more complex EC algorithm that significantly improves echo canceling. Note, however, that the DFEC has some impact on MGU capacity, as further described in section Capacity and Limitations.
If echo canceling is improved after (re-)configuration, but comes back again, make sure the configuration that was made is persistent, i.e. parameters are marked "reload" in the MX-ONE Service Node and that proper data backup has been made.
Sometimes the echo path delay is longer than the echo canceler can handle (the default is 64 ms, but it can be increased up to 128 ms). The echo delay is the time from when media (the talker's speech) is sent out on TDM until it returns (attenuated) to TDM. Usually this is quite short, e.g. less than 10 ms, but it might be much longer in some situations. If the echo path is longer than 64 ms but shorter than 128 ms, the EC filter length may be adjusted. Note, however, that a longer filter length will have some impact on MGU capacity.
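Given a measured echo delay, the smallest configurable tail length that covers it follows from the 8 ms step and the 8 to 128 ms range stated earlier. The sketch below is a hypothetical helper, not an MGU tool, and it adds no safety margin for echo dispersion; whether a margin is needed in a given installation is left as an assumption to verify.

```python
import math

EC_TAIL_MIN_MS = 8
EC_TAIL_MAX_MS = 128
EC_TAIL_STEP_MS = 8


def select_tail_length_ms(echo_delay_ms: float) -> int:
    """Return the smallest valid echo tail length covering echo_delay_ms."""
    if echo_delay_ms < 0:
        raise ValueError("echo delay cannot be negative")
    if echo_delay_ms > EC_TAIL_MAX_MS:
        raise ValueError("echo delay exceeds the longest configurable tail")
    # Round up to the next multiple of the step, but never below the minimum.
    steps = max(1, math.ceil(echo_delay_ms / EC_TAIL_STEP_MS))
    return steps * EC_TAIL_STEP_MS
```

For example, a measured echo delay of 70 ms exceeds the 64 ms default and would call for a tail length of 72 ms, while a typical 10 ms echo path is already covered by the default.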
SSRC Generation, Detection, and Collision Handling
For the outgoing (audio) RTP stream in a VoIP call, the MGU creates a random 32-bit SSRC (Synchronization Source) value. This value is used for all RTP packets in that stream for as long as the stream is active. If a call is put on hold, or any setting of the RTP stream is changed (e.g. the DTMF relay mode) by the MX-ONE Service Node, the current stream is closed and replaced by a new one. Hence, a new SSRC value is created for the new stream.
On the corresponding incoming RTP stream, the MGU validates all received RTP packets. Packets with any SSRC value are accepted as long as two packets with consecutive sequence numbers and the same SSRC are received. This allows the sender to change the SSRC value of an RTP stream. However, if the SSRC value changes too often within a short time period (about 1 second), this is referred to as an "SSRC violation", and the RTP port in use is blocked for a while to prevent the violating port from being re-used. This situation is usually caused by two or more interleaved RTP streams towards the same RTP port. If this happens, there is probably an RTP sender that has not properly closed its RTP stream. In situations where such violations are expected for a longer time, the "ssrc_violation_filter_time" can be increased (or disabled); see section System Configuration Files.
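The behaviour described above can be modelled roughly as follows. This is a sketch under stated assumptions, not the MGU implementation: the class, its parameters, the threshold for "too often" (`max_changes`), and the decision to drop packets while the port is blocked are all hypothetical. Only the two-consecutive-packets rule, the roughly 1 second window and the existence of a filter time come from the text.

```python
import secrets
from collections import deque


def new_ssrc() -> int:
    """Return a random 32-bit SSRC for a new outgoing stream."""
    return secrets.randbits(32)


class SsrcValidator:
    """Tracks the accepted SSRC on one RTP port and flags rapid changes."""

    def __init__(self, max_changes: int, window_s: float, filter_time_s: float):
        self.max_changes = max_changes      # hypothetical threshold
        self.window_s = window_s            # "about 1 second" in the text
        self.filter_time_s = filter_time_s  # models ssrc_violation_filter_time
        self.current_ssrc = None
        self.blocked_until = float("-inf")
        self._last_seq = {}                 # last sequence number seen per SSRC
        self._changes = deque()             # times of recent SSRC changes

    def on_packet(self, ssrc: int, seq: int, now: float) -> str:
        """Return "accept", "drop" or "violation" for a received packet."""
        if now < self.blocked_until:
            return "drop"                   # assumption: blocked port drops all
        prev = self._last_seq.get(ssrc)
        self._last_seq[ssrc] = seq          # unbounded in this sketch
        if ssrc == self.current_ssrc:
            return "accept"
        # A new source is accepted on its second consecutive packet.
        if prev is None or seq != (prev + 1) & 0xFFFF:
            return "drop"
        if self.current_ssrc is not None:
            self._changes.append(now)
            while self._changes and now - self._changes[0] > self.window_s:
                self._changes.popleft()
            if len(self._changes) > self.max_changes:
                self.blocked_until = now + self.filter_time_s
                self.current_ssrc = None
                self._last_seq.clear()
                self._changes.clear()
                return "violation"
        self.current_ssrc = ssrc
        return "accept"
```

In this model a single SSRC change is accepted on the second packet of the new source, whereas two streams interleaved on the same port make the accepted SSRC flip back and forth until the threshold is exceeded and the port is blocked for the filter time.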