The mobile phone effect in linguistics

Today, most Europeans own a mobile phone and communicate over vast geographical distances. In his Telsur project, Labov showed how useful this fact is for sociolinguistic data gathering when he used landline telephones to collect his speech samples. However, the project did not address how telephone transmission could affect speech. This question has been investigated especially in forensic phonetics, where the phenomenon has been termed the telephone effect. The telephone effect might ring a bell for most linguists, and some are even familiar with its acoustic implications. However, technology has evolved, and landline telephones have been replaced by digital, wireless mobile phones. So, can we assume that the mobile phone effect has the same implications for speech as the telephone effect? You might have guessed it, but the short answer is no.

My aim with this blog entry is to elaborate on why the answer to the question above is indeed no, and why we as linguists should care about this fact. I will do so by first introducing you to some of the technicalities of telephone transmission and why it is fundamentally different from mobile phone transmission of speech – trust me, technology is not only for engineers.

When you are equipped with this technical understanding of telephone versus mobile phone transmission of speech, I will go a bit more in depth with what constitutes the telephone and the mobile phone effect as linguistic concepts. Finally, I will take a moment to consider how linguistics as a field could benefit from a better understanding of the mobile phone effect.

Now, the technical differences between telephones, which use landline transmission, and mobile phones, which use the Global System for Mobile Communication (GSM) for transmission, are in broad strokes parallel to the differences between analogue and digital signals.

Landline transmission transfers the input speech signal in a continuous manner while maintaining its wave structure. This entails a significant limitation in bandwidth, and it means that background noise and other types of interference cannot be filtered out of the signal. The latter is what causes the difficulty when speaking on a mobile phone by a busy road to someone on a landline telephone. If you have ever tried this, you will know that the listener on the other end will very quickly tell you that it sounds like you are in the middle of the motorway. I will leave this point for now, but return to it when considering mobile phones, where this situation rarely occurs. The bandwidth, on the other hand, is essential to consider further.

A spectrogram showing frequencies from 0 to 10,000 Hertz

In telephones the bandwidth is limited between 300 Hz and 3400 Hz. It is clear from a linguistic perspective that this is far from adequate to transfer all possible speech sounds. It is in fact not even adequate to transfer all vowels. Studies concerned with the telephone effect have found vowel formants to be compressed and vowel quality to be altered because of the limited bandwidth. Consequently, this has become the primary feature of the telephone effect.
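To make the band limit concrete, here is a minimal Python sketch that checks whether a frequency component falls inside the 300 Hz to 3400 Hz telephone band. The formant values are illustrative placeholders I have assumed for the example, not measurements from any of the studies mentioned here, and the function name is hypothetical. A real telephone channel also rolls off gradually at the band edges rather than cutting off sharply, so treat this as a simplification.

```python
# Telephone passband as stated in the text (in Hz).
TELEPHONE_BAND_HZ = (300.0, 3400.0)


def in_band(frequency_hz, band=TELEPHONE_BAND_HZ):
    """Return True if a frequency lies inside the passband (edges included)."""
    low, high = band
    return low <= frequency_hz <= high


# Illustrative formant values in Hz (assumed for this example only).
example_formants = {"F1": 280.0, "F2": 2250.0, "F3": 3600.0}

for name, freq in example_formants.items():
    status = "inside" if in_band(freq) else "outside"
    print(f"{name} at {freq:.0f} Hz is {status} the telephone band")
```

With these assumed values, the low first formant and the high third formant both fall outside the band, which is the kind of situation in which formant measurements from telephone speech become unreliable.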

Mobile phone transmission presents a completely different set of components to consider when talking about bandwidth and speech alterations. Firstly, the GSM network is digital, which introduces both improvements and significant challenges from a linguistic perspective. Generally, this means that the speech input needs to be translated into numbers in order to be transferred. The process of translating the original speech input into numbers is called encoding, and the interpretation back into an intelligible speech signal at the receiving end is called decoding. For this process to work properly, the speech input is divided into smaller sequences of 20 milliseconds each, which are transferred successively. As a result, the constraints on bandwidth are completely different from those of telephones. Under optimal conditions, the GSM network offers a bandwidth between 50 Hz and 7000 Hz.
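The division into 20 millisecond sequences can be sketched in a few lines of Python. This is only an illustration of the framing step, not of the encoding itself. The 16 kHz sampling rate and the function name are my assumptions for the example.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=20):
    """Split a sequence of samples into consecutive frames of frame_ms milliseconds.

    A trailing partial frame is kept as-is.
    """
    frame_length = int(sample_rate_hz * frame_ms / 1000)
    if frame_length <= 0:
        raise ValueError("frame length must be at least one sample")
    return [samples[i:i + frame_length] for i in range(0, len(samples), frame_length)]


# One second of placeholder samples at an assumed 16 kHz sampling rate.
sample_rate_hz = 16000
samples = [0.0] * sample_rate_hz

frames = split_into_frames(samples, sample_rate_hz)
print(len(frames), len(frames[0]))
```

Under these assumptions, one second of speech becomes 50 frames of 320 samples each, and each of those frames is handled separately on its way through the network.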

However, this quality is not guaranteed to be stable, as the speech sequences need to be carried between the phones and the capacity to do so is not constant, as you might have noticed if you have ever traveled to the lovely Danish countryside. If not, I can tell you that the quality of your mobile phone call there is definitely not guaranteed to match the standards you are accustomed to in the city. The reason for this is the limited availability of cell phone towers and hence the limited capacity to carry the speech sequences to the receiving phone. It is in fact also possible to experience variation in signal quality in the city, depending on location and the amount of traffic on the network.

Since the speech sequences are 20 milliseconds long, it is in theory possible to find variation in signal quality every 20 milliseconds. So what are the linguistic consequences of this?

Studies have found that vowel quality is also affected by mobile phone transmission, but in comparison to the telephone effect, the mobile phone effect is far more unpredictable. This is where my point about background noise and interference becomes relevant again. The GSM network contains a sub-installation called the Adaptive Multi-Rate (AMR) codec. This codec filters the speech input and only allows frequencies which pass a pre-set threshold to be encoded. These frequencies are meant to be the ones recognised as speech, so that no non-speech sounds take up space in the transmission or disturb the output.
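To give a feel for what a threshold-based filter does, here is a deliberately simplified Python sketch. It is not the actual mechanism used in the GSM network. It only gates whole frames on their energy, and the threshold, the toy sample values and the function names are all assumptions made for illustration.

```python
def frame_energy(frame):
    """Mean squared amplitude of one non-empty frame."""
    return sum(s * s for s in frame) / len(frame)


def gate_frames(frames, threshold):
    """Keep frames whose energy reaches the threshold; replace the rest with silence."""
    gated = []
    for frame in frames:
        if frame_energy(frame) >= threshold:
            gated.append(list(frame))
        else:
            gated.append([0.0] * len(frame))
    return gated


# Three toy frames: a loud vowel-like frame, a weak fricative-like frame, and silence.
frames = [
    [0.5, -0.5, 0.5, -0.5],
    [0.02, -0.01, 0.02, -0.01],
    [0.0, 0.0, 0.0, 0.0],
]

gated = gate_frames(frames, threshold=0.001)
kept = [any(s != 0.0 for s in frame) for frame in gated]
print(kept)
```

In this toy setup the weak, noise-like frame is silenced together with the true silence, which hints at why a filter designed to remove background noise can also remove quiet speech sounds.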

Transmission. Illustration from Guillemin and Watson (2008).

From a linguistic perspective, two problems arise. First, the AMR-WB codec has been found to misinterpret the input and leave out speech sounds. Second, consonants in many cases consist of exactly the type of sound which, when singled out, could be described as noise. My current research focuses on exactly these issues. So far, the results suggest that if one looks at the mobile phone effect from an acoustic perspective, some overall patterns seem to appear. At the moment, I have found indications of alterations in spectral moment values and in duration and, in a few cases, complete deletion of consonant sounds caused by the mobile phone effect. This is all based on the comparison between directly recorded speech and the same speech simultaneously transferred and recorded from a mobile phone.
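For readers unfamiliar with spectral moments, the following Python sketch shows one common way to compute them, by treating the power spectrum as a distribution over frequency. It is a generic illustration under that assumption and not necessarily the exact procedure used in my own analyses. The function name and the toy spectrum are hypothetical.

```python
import math


def spectral_moments(frequencies_hz, power):
    """Return centre of gravity, standard deviation, skewness and excess kurtosis.

    The power spectrum is normalised and treated as a distribution over frequency.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum must contain some energy")
    weights = [p / total for p in power]
    pairs = list(zip(frequencies_hz, weights))

    cog = sum(f * w for f, w in pairs)
    variance = sum(((f - cog) ** 2) * w for f, w in pairs)
    sd = math.sqrt(variance)
    if sd == 0:
        # All energy at one frequency: shape measures are undefined.
        return cog, 0.0, float("nan"), float("nan")

    skewness = sum(((f - cog) ** 3) * w for f, w in pairs) / sd ** 3
    kurtosis = sum(((f - cog) ** 4) * w for f, w in pairs) / sd ** 4 - 3
    return cog, sd, skewness, kurtosis


# Toy symmetric spectrum with most energy at 2000 Hz (values assumed).
cog, sd, skewness, kurtosis = spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0])
print(cog, sd, skewness, kurtosis)
```

Comparing such values between a direct recording and the simultaneously recorded mobile phone signal is one way to quantify how a consonant has been altered in transmission.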

Spectrographic illustration, adapted from my Master's thesis, of a deleted /h/: participant 10 producing the word in the same single utterance in the direct recording and in the mobile phone recording (recorded simultaneously).

All in all, I hope to have broadened your perspective on what you might call technical linguistics, and maybe even gotten you interested in the mobile phone effect. If nothing else, it should now be clear to you why the answer to my initial question was no. Regardless of its advantages, with the research available at the moment, it is still ill-advised to use mobile phones to gather data in linguistic research as an alternative to direct recording.

It is essential that we as linguists understand the basic workings of the communication technology we surround ourselves with, in order to make informed choices in our research and understand the language around us. Technology and technical considerations do belong in linguistic research, and when we as linguists realise this, our research only improves.

 

A few references

Alzqhoul, Esam A. S., Balamurali Nair, and Bernard J. Guillemin. 2012. “Speech Handling Mechanisms of Mobile Phone Networks and Their Potential Impact on Forensic Voice Analysis.” In 13th Australian International Conference on Speech Science & Technology. Sydney.

Bessette, Bruno, and Redwan Salami. 2002. “The Adaptive Multirate Wideband Speech Codec (AMR-WB).” IEEE Transactions on Speech and Audio Processing 10 (8): 620–36.

Byrne, Catherine, and Paul Foulkes. 2004. “The ‘Mobile Phone Effect’ on Vowel Formants.” Speech, Language and the Law 11 (1): 83–102.

ETSI. 1992. “Voice Activity Detection.” Recommendation GSM 06.32 V3.0.0. https://www.etsi.org/deliver/etsi_gts/06/0632/03.00.00_60/gsmts_0632sv030000p.pdf.

Guillemin, Bernard J., and Catherine I. Watson. 2006. “Impact of the GSM AMR Speech Codec on Formant Information Important to Forensic Speaker Identification.” In 11th, 483–88. Auckland University: New Zealand.

Guillemin, Bernard J., and Catherine I. Watson. 2008. “Impact of the GSM Mobile Phone Network on the Speech Signal: Some Preliminary Findings.” Journal of Speech, Language and the Law 15 (2): 193–218.

International Telecommunication Union (ITU). 2019. “ITU_key_2005-2019_ICT_data_with LDCs_28oct2019_final.” https://www.itu.int/en/ITU-D/Statistics/Documents/statistics/2019/ITU_Key_2005-2019_ICT_data_with%20LDCs_28Oct2019_Final.xls: ITU.

Künzel, H. J. 2001. “Beware of the ‘Telephone Effect’: The Influence of Telephone Transmission on the Measurement of Formant Frequencies.” Forensic Linguistics 8 (1): 80–99.

Labov, William. 2000. “The Telsur Project at the Linguistics Laboratory.” Database. http://www.ling.upenn.edu/phonoatlas.

Nolan, Francis. 2002. “The ‘Telephone Effect’ on Formants: A Response.” Forensic Linguistics 9 (1): 74–82. https://doi.org/10.1558/sll.2002.9.1.74.

 

Krestina Vendelbo Christensen is currently a PhD student doing a joint degree between Aarhus University and the University of York, with a project titled “Understanding the technical implications of digital transmission on consonants”. She has previously worked with the mobile phone effect in linguistics as part of both her Bachelor’s and Master’s degrees.
