Google Translate & Gender Bias

By Kurt Kazan | Magazine | December 3, 2021

Cover Illustration: Google Translate translates the Finnish gender-neutral pronoun hän into feminine or masculine pronouns in English based on gender bias.

Reporter Kurt Kazan examines how the Google Translate algorithm has contributed to the perpetuation of gender stereotypes.


It goes without saying that most people hold preconceptions about gender. After all, gender stereotypes are imposed on us everywhere, from schools to the workplace to the media, allowing all kinds of injustices to persist. It should come as no surprise, then, that many of the products and services we use in our daily lives are shaped by these preconceptions. A notable case that recently entered the public sphere is Google Translate’s apparent gender bias when translating between certain languages, for example from Finnish to English.

Finnish is a genderless language, that is, a language without distinctions of grammatical gender, and it has only one third-person singular personal pronoun, hän. The pronoun is epicene: it refers to men, women, and people of all other genders alike, and thereby treats everyone uniformly. Having a single gender-neutral personal pronoun creates space for inclusive language that respects everyone’s self-identity. However, because many languages instead use two or more gendered personal pronouns, genderless languages are not always adequately translated. This has been clearly demonstrated by the encoded biases of Google’s neural machine translation service.

Earlier this year, a Twitter user demonstrated this bias by running Finnish sentences containing hän through the service’s Finnish-to-English translation, with revealing results. Several people set out to verify the phenomenon and found that the English output switched between the feminine pronoun she and the masculine pronoun he, depending on the wording of the input. For example, “hän maksaa laskut” translated to “he pays the bills,” whereas “hän siivoaa” became “she cleans.” In other words, Google Translate chose which pronoun to use based on typical gender stereotypes.
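Curious readers can approximate the experiment themselves. Below is a minimal sketch in Python using the unofficial googletrans package, which wraps Google Translate’s public web endpoint; it assumes version 4.0.0rc1 of that library, and because it queries the live service, today’s output may well differ from what the original tweets showed.

```python
# Hedged sketch: reproducing the single-sentence test with the unofficial
# googletrans package (pip install googletrans==4.0.0rc1). The library
# queries Google's live web endpoint, so results can change over time.
from googletrans import Translator

translator = Translator()
for sentence in ["hän maksaa laskut", "hän siivoaa"]:
    result = translator.translate(sentence, src="fi", dest="en")
    print(f"{sentence!r} -> {result.text!r}")

# The outputs reported at the time were:
#   'hän maksaa laskut' -> 'he pays the bills'
#   'hän siivoaa'       -> 'she cleans'
```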

Google Translate translates the Finnish gender-neutral pronoun hän into feminine or masculine pronouns in English based on gender bias. Anna Brchisky / Twitter

The behavior of Google Translate shows how the outputs of data-driven services are, in many cases, intrinsically biased as a result of our own human biases. After extensive criticism, Google eventually made changes to its translation service: instead of giving a single output for sentences containing hän, it now gives two, one for each of the two main gendered pronouns. Nonetheless, gender bias can still be seen when translating several sentences at once. For example, “Hän maksaa laskut. Hän hoitaa lapsia.” is still translated into “He pays the bills. She takes care of the children.” This goes to show that the slight modification does not necessarily prevent similar problems from emerging.

After all, the foundations of data-driven services remain largely the same: the biases embedded in algorithms via machine learning reflect human knowledge and data. Because translation models learn from vast corpora of human-written text, they reproduce the statistical associations found there. Hence, issues pertaining to algorithmic bias should not be pinned solely on programmers. The sources from which data are accumulated should be taken into consideration as well, as many of these biases originate there.
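The multi-sentence case described above can be probed the same way. This hedged variation of the earlier sketch submits both sentences as a single passage; again, the live service may have changed since this behavior was reported.

```python
# Hedged sketch: the same test with both sentences in one request, which is
# where the residual bias described above was reported.
from googletrans import Translator

translator = Translator()
passage = "Hän maksaa laskut. Hän hoitaa lapsia."
print(translator.translate(passage, src="fi", dest="en").text)
# Reported output: "He pays the bills. She takes care of the children."
```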

For instance, the Oxford English Dictionary has previously been criticized for the sexist synonyms it lists for women. Specifically, the word “woman” has been associated with condescending terms such as “bitch.” It should be noted, however, that such ascriptions reflect more on the lack of revision of outdated words and definitions than on the wrongdoings of contemporary wordsmiths.

Biases are intrinsically human. However, this built-in partiality should not be an excuse for embedding them within the products and services available to citizens of this modern world. At the end of the day, the tools that populate our digital landscapes influence people’s behavior and perceptions, whether consciously or unconsciously. Therefore, the extent to which biases are incorporated into services such as translation software should be actively scrutinized in order to diminish their potential ramifications. Biases should not be left lingering in algorithms, as they may have wide-ranging societal repercussions, such as creating preconceptions about gender or perpetuating existing stereotypes.

“Biases are intrinsically human. However, this built-in partiality should not be an excuse for embedding them within the products and services available to citizens of this modern world.”

Kurt Kazan is a student at the University of Amsterdam. The views expressed here are not necessarily those of The Amsterdammer. 
