How sending number tones (DTMF) works on VOIP


This topic contains 0 replies, has 1 voice, and was last updated by  RChandra 3 months, 3 weeks ago.



    Back in the days when VOIP was developed, the bandwidth of a typical Internet connection was paltry by today’s standards. For example, back in 2000 when I first got DSL, the upstream was 90 kbps. The usual codec for the Public Switched Telephone Network (PSTN), G.711u, requires 64,000 bps. (That is 8,000 samples per second, each compressed from roughly 14 bits of linear audio down to 8 bits.) IP, UDP, and RTP each add their own (comparatively small) overhead on top of that. As a result, 90 kbps is enough for only one voice channel (so a single call). A three-way call (two participants plus yourself) would not be possible.
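
    To put rough numbers on that, here is a small sketch of the arithmetic. The 20 ms packet interval and the header sizes (IPv4 without options, plain UDP, RTP without extensions) are my assumptions, not figures from the post, and any link-layer overhead on the DSL line is ignored.

```python
# Rough one-way bandwidth for a single G.711 call over RTP/UDP/IPv4.
# Assumptions (not from the post): 20 ms packets, minimal headers,
# no link-layer (ATM/PPPoE/Ethernet) overhead.

CODEC_BPS = 64_000   # G.711: 8 bits x 8000 samples per second
PACKET_MS = 20       # assumed packetization interval
IP_HEADER = 20       # bytes, IPv4 without options
UDP_HEADER = 8       # bytes
RTP_HEADER = 12      # bytes, no CSRC list or header extension


def call_bandwidth_bps(codec_bps=CODEC_BPS, packet_ms=PACKET_MS):
    """Return the IP-layer bit rate of one voice stream in one direction."""
    packets_per_second = 1000 // packet_ms
    payload_bytes = codec_bps // 8 // packets_per_second
    packet_bytes = payload_bytes + IP_HEADER + UDP_HEADER + RTP_HEADER
    return packet_bytes * 8 * packets_per_second


def max_streams(upstream_bps, per_stream_bps):
    """Return how many whole streams fit into the upstream link."""
    return upstream_bps // per_stream_bps


if __name__ == "__main__":
    per_call = call_bandwidth_bps()
    print(per_call)                       # bits per second for one stream
    print(max_streams(90_000, per_call))  # streams that fit in 90 kbps
```

    Under those assumptions one stream needs about 80 kbps, so a 90 kbps upstream carries exactly one.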

    Therefore other codecs were developed, mostly for the budding digital mobile phone market, which sacrifice some quality for a reduced bit rate. The problem is that Touch-Tones (DTMF) do not survive the encode-then-decode process without severe distortion, which leaves them unrecognizable. So what mobile and VOIP setups configured for reduced bandwidth usage do is continuously monitor the outgoing audio, before the codec, for DTMF. A recognized digit is subtracted from the audio signal, and only THEN is the audio fed to the codec. The sending system then sends merely a message that the digit was pressed. (In the case of mobile phones, there is no DTMF recognition, because the handset doesn’t put the tones into the outgoing audio; it plays them only through the speaker for the handset user and never sends the tone itself over the mobile network.)
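
    As an illustration of the "monitor the audio for DTMF" step, here is a minimal sketch using the Goertzel algorithm, a common way to measure energy at a handful of known frequencies. I don't know what any particular ATA actually runs; the block length, the thresholds, and the function names are all made up for the example, and it only detects a digit (it does not remove the tone from the audio).

```python
import math

SAMPLE_RATE = 8000                    # Hz, telephony sample rate
ROWS = (697, 770, 852, 941)           # DTMF low-group frequencies (Hz)
COLS = (1209, 1336, 1477, 1633)       # DTMF high-group frequencies (Hz)
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate=SAMPLE_RATE):
    """Return the squared magnitude of the block's spectrum at freq."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def detect_digit(samples, min_fraction=0.8, max_twist=4.0):
    """Return the DTMF key found in one block of samples, or None.

    min_fraction and max_twist are illustrative thresholds, not values
    taken from any standard or product.
    """
    n = len(samples)
    energy = sum(x * x for x in samples)
    if n == 0 or energy == 0.0:
        return None
    row_p = [goertzel_power(samples, f) for f in ROWS]
    col_p = [goertzel_power(samples, f) for f in COLS]
    r = max(range(4), key=lambda i: row_p[i])
    c = max(range(4), key=lambda i: col_p[i])
    # Fraction of the block's energy explained by the two strongest tones.
    fraction = 2.0 * (row_p[r] + col_p[c]) / (n * energy)
    if fraction < min_fraction:
        return None
    # Reject blocks where one tone is far stronger than the other.
    if max(row_p[r], col_p[c]) > max_twist * min(row_p[r], col_p[c]):
        return None
    return KEYS[r][c]


def make_tone(key, n=400, sample_rate=SAMPLE_RATE):
    """Generate n samples of the ideal tone pair for key (for testing)."""
    r = next(i for i, row in enumerate(KEYS) if key in row)
    c = KEYS[r].index(key)
    return [0.5 * math.sin(2 * math.pi * ROWS[r] * t / sample_rate)
            + 0.5 * math.sin(2 * math.pi * COLS[c] * t / sample_rate)
            for t in range(n)]


if __name__ == "__main__":
    print(detect_digit(make_tone("5")))
```

    A real detector would also check that the tone persists across several blocks and would have to cope with speech that briefly looks like a tone pair; loosening the thresholds here makes such false detections more likely.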

    Many VOIP systems (such as analog telephone adapters, or ATAs) listen for a while to find out how long the user is pressing the digit, and send the other end an indication, from 100 to 5000 ms, of how long the digit is supposed to last. The protocol also allows sending just the digit pressed, in which case the receiver assumes 250 ms (1/4 sec.).
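
    The numbers above look like the "application/dtmf-relay" body that some equipment sends in a SIP INFO request, with a Signal line and an optional Duration line. That match is my assumption, so treat the sketch below as an illustration of the idea rather than a reference for any specific device. The function names are hypothetical.

```python
# Sketch of a "digit pressed" message with an optional duration.
# Assumption: a Signal=/Duration= text body in the style of
# application/dtmf-relay; limits and default are the ones quoted above.

DEFAULT_MS = 250
MIN_MS, MAX_MS = 100, 5000
VALID_SIGNALS = set("0123456789*#ABCD")


def dtmf_relay_body(signal, duration_ms=None):
    """Build the message body for one digit; the duration is optional."""
    if signal not in VALID_SIGNALS:
        raise ValueError(f"not a DTMF key: {signal!r}")
    lines = [f"Signal={signal}"]
    if duration_ms is not None:
        # Clamp to the 100..5000 ms range mentioned in the post.
        clamped = max(MIN_MS, min(MAX_MS, int(duration_ms)))
        lines.append(f"Duration={clamped}")
    return "\r\n".join(lines) + "\r\n"


def parse_dtmf_relay(body):
    """Return (signal, duration_ms), applying the 250 ms default."""
    fields = dict(
        line.split("=", 1) for line in body.splitlines() if "=" in line
    )
    signal = fields["Signal"].strip()
    duration = int(fields.get("Duration", DEFAULT_MS))
    return signal, duration


if __name__ == "__main__":
    print(repr(dtmf_relay_body("5", 160)))
    print(parse_dtmf_relay(dtmf_relay_body("#")))
```

    The receiving side regenerates the tone locally for the stated (or default) duration, which is why the digit arrives clean no matter how lossy the voice codec is.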

    For the caller, I cannot explain why tones would be latched “on.” I could not find a reference to the protocol saying there is a message to turn generation of the digit on and another to turn it off.

    This also explains why, if you’re talking with someone on VOIP, you sometimes hear “random” DTMF while the person to whom you’re listening is talking. Their voice momentarily approximates the tones of some digit, the ATA (erroneously) detects it as the talker pressing a number on their keypad, and it sends the “digit pressed” message to their VOIP provider. The VOIP provider dutifully generates that digit once the call hits the PSTN. If you’re both on the same VOIP provider, the message may be passed (basically) unaltered to your own ATA, and your ATA may generate the DTMF.
