Picking apart polyglottal practices on the web

March 19, 2019



Currently, our planet speaks 7,097 languages. Twenty percent of the approximately 7.5 billion people living on the planet, around 1.5 billion, speak English.

Although Mandarin and Spanish have more speakers than English, technology’s favourite language seems to be English. The reason for this linguistic preference might be the fact that English remains the most commonly studied foreign language in the world.

According to Hackernoon, English-centric development is most noticeable in the Google Translate app, which gives more accurate translations when a language is paired with English than when it is paired with languages like Korean, Hebrew, or Bahasa Indonesia.


While European languages are seeing some improvement, other languages lag behind even after Google boosted its tech with zero-shot translation and a ‘multilingual neural machine translation’ mechanism.

Amazon’s Alexa supports three languages: English, German, and Japanese. Within English alone, however, it offers special support for five dialects or accents: those of Australia, Canada, India, the UK, and the US. Google Home currently supports four languages overall: English, French, German, and Japanese. It also supports four English dialects, from Australia, Canada, the UK, and the US.

However, tech giants, like Facebook, Google, and Amazon, whose businesses thrive on a population’s access to the internet, are realizing fast that sticking only to English is not a good business model. Let’s look at some numbers.

Facebook is available in more than 100 languages, and more than 1 billion people already use it in a language other than English. Over 70% of Facebook users are located outside the US. In 26 countries, the social media giant reaches more than 10% of the national population.

In 2008, after Facebook brought out an Italian-language version of the service, its user base rose from 350,000 to around 8 million within a year. Similarly, after 25,000 volunteers helped translate Facebook into Turkish, 9 million Turkish-language users signed up within a year.

Other examples include Snapchat, which offers its users 22 languages, including Arabic, Norwegian Bokmål, Romanian, Turkish, and Filipino. Also, Amazon Polly reads text aloud in 24 languages.

India, a Special Case

In countries like India and Pakistan, English is an official language without being the primary language. So, while English is used in business, education, and official work, the majority of residents don’t use it for everyday communication.

A 2017 report by KPMG and Google found 234 million Indian-language internet users and 175 million English users. By 2021, Indian language users are projected to rise to 536 million, while English users will only reach 199 million. Also, 75% of India’s internet user base will consist of Indian language internet users by 2021. That’s a huge market!

To reach this audience, companies are working hard to support as many vernacular languages as they can, so that more and more locals can access the internet in languages they are comfortable with.

For example, WhatsApp, which has 200 million monthly active users in India, supports 11 Indian languages. Facebook, WhatsApp’s parent and the second most used app in India in 2017, supports 13 local languages. Indian Android WhatsApp users can even change languages from within the app.

Google Search also offers nine Indian languages, while Google Translate offers three more. Google has also added eight Indian languages to its voice search feature.

Microsoft started an initiative called Project Bhasha two decades ago to promote computing in the vernacular in India. The company has made email addresses available in 15 Indian languages.

Indian startups and small local apps are also tapping this market to their advantage. The Indian content discovery app ShareChat allows users to choose from 10 languages and 27 dialects. Leveraging WhatsApp’s popularity in India, ShareChat also lets users share photos and videos on WhatsApp.

Challenges in Adopting Languages

Developing products for multiple languages is not easy, since many languages use different scripts, which can be complex to translate into. Sunder Srinivasan, general manager, artificial intelligence (AI) and research, Microsoft India, told The Economic Times, “Customising and training technology involves putting in massive amounts of high-quality data to execute translations. For accurate translations, the system demands millions of parallel sentences in each language pair, in all permutations and combinations.”
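To get a feel for why “each language pair, in all permutations and combinations” is such a heavy demand, here is a rough back-of-the-envelope sketch in Python. The language list and the figure of one million sentences per pair are assumptions chosen purely for illustration; they do not come from Microsoft or The Economic Times.

```python
from itertools import combinations, permutations


def language_pairs(languages):
    """Unordered pairs: each one would need its own parallel corpus."""
    return list(combinations(languages, 2))


def translation_directions(languages):
    """Ordered (source, target) pairs: each is a direction to translate in."""
    return list(permutations(languages, 2))


# Hypothetical sample: English plus three Indian languages.
languages = ["English", "Hindi", "Tamil", "Bengali"]

pairs = language_pairs(languages)
directions = translation_directions(languages)

# Assumed for illustration only: one million parallel sentences per pair.
SENTENCES_PER_PAIR = 1_000_000
total_sentences = len(pairs) * SENTENCES_PER_PAIR

print(f"{len(languages)} languages -> {len(pairs)} pairs, "
      f"{len(directions)} translation directions, "
      f"{total_sentences:,} parallel sentences")
```

The number of pairs grows roughly with the square of the number of languages, which is one reason pairing everything with English is the cheaper path.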

How many clicks

Factors such as how many clicks it takes a user to change the language on an Android smartphone, and the fact that smartphones ship with English as the default setting, have led internet businesses to rethink their strategy.

Trust issues

Local people often hesitate to adopt a technology offered in a foreign language. They tend to trust content in their own language more, and hence are more likely to use the internet when it is available in that language.

Accents and dialects

Problems with accents and dialects also persist. Not only do languages have different accents, but speech also varies with the age or health of the speaker. English dialects, such as West Virginian English and African-American Vernacular English (AAVE), differ not only in pronunciation but even in grammar. Similarly, Germany has many varieties of German, such as Bavarian, Alemannic, and Swabian (Schwäbisch), and Italy is home to Sicilian, Neapolitan, Sardinian, and more.

Solutions for Multilinguistic Problems

Developments in this sphere include companies like Beyond Verbal and Affectiva, which use emotion analytics. While Beyond Verbal measures changes in a subject’s health, Affectiva derives data on drivers’ emotions to improve road safety.


Last year, Amazon got a patent for its Accent Translator, which can help Alexa identify a user’s physical characteristics, language accent, ethnic origin, emotion, gender, and age. Sheesh! That’s too close for comfort, Alexa.

A study called ‘Multi-Dialect Machine Translation (MuDMaT)’ addresses the need for researching “machine translation systems for less resourced languages and their variants or dialects.” The project successfully used the Tunisian dialect in a rule-based system for translating texts into Arabic and French.

Multilingual Chatbots

Africa boasts around 1,500 to 2,000 languages, divided into four main language groups. South Africa alone has eleven official languages, and many African countries are multilingual.


To ease business in such a multilingual environment, Microsoft recently worked with Ovamba, a Growth as a Service (GaaS) fintech platform, to build a multilingual bot that could simplify its funding application intake process.

Ovamba, co-founded by Marvin Cole, is dedicated to connecting investors with African businesses and distinguishes itself by delivering capital via its mobile app and web-based platform, according to its blog.

Vocal Apps

Apart from vernacular languages, another catalyst driving technology to think outside the box when it comes to promoting the internet is a sad one: illiteracy. A third of the population in Sub-Saharan African countries like Chad or Mali, and almost half the population in South Asian countries like Pakistan and Bangladesh, is illiterate.

Since these markets are emerging ones, mobile phone use is rising, even though literacy rates aren’t rising as fast. Textual content is of little use to such a population, which has given rise to many voice-based apps. The Ghana-based Voto Mobile is one such app; it organizes surveys for populations living in remote areas who have access to mobile phones but low literacy rates.

The app is helping garner higher participation rates in surveys that help NGOs and institutions recognize the requirements and resources of rural populations. It offers services ranging from one-way communication between institutions and citizens to two-way interactions such as citizen-centric reporting of broken infrastructure or corruption cases.

Another vocal app is trying to help schools and educational institutions extend the reach of teaching beyond the school in rural areas of emerging markets. Students can reach out to the app via SMS. The service then calls the student back and offers vocal browsing of short lessons pre-recorded by their teachers.

After the lesson, students can also answer quizzes using their voice. Apart from African countries, the app is now being used by the Startup Chile program, an accelerator based in Santiago.

Towards a Polyglot Society

As the world moves towards multiculturalism, and societies become more and more polyglot, the internet will have to learn to adopt not just new languages, but also the unique nuances and inflections of accents and dialects.

