UI Localization to Chinese: What You Need to Know


Chinese has more native speakers than any other language in the world: 1,197 million people (roughly 1.2 billion) speak its various dialects across 33 countries. Given that number, it is crucial to approach UI localization to Chinese with due respect and consideration.

As Chinese, Korean, and Japanese are often grouped together, in this article, we will touch upon all these languages with the main focus on Chinese.

We will start by discussing challenges in Chinese software localization. We will then move to issues in Korean software localization. We will wrap up by looking at issues common to both languages as well as Japanese.

 

Challenges in Chinese Software Localization

Selecting Simplified Chinese, Traditional Chinese, or both

Choosing the correct written form of Chinese is vital for successful localization to Chinese.

To understand why, we need to look at recent Chinese history. To encourage literacy in China in the 1950s and 1960s, the government of the People’s Republic of China created simplified versions of Chinese characters. These characters, which use fewer strokes, are known as Simplified Chinese. They were adopted in mainland China as well as in Singapore and Malaysia.

However, Hong Kong and Taiwan continue to use the original form, known as Traditional Chinese. Many Chinese-speaking communities outside of Asia (for example, in North America) also use Traditional Chinese.

In addition to the differences in the character version, there are also differences in terminology, vocabulary and idiomatic expressions between Simplified and Traditional Chinese.  

This is a complex issue with many political elements, so it is important to get it right.

You may be thinking to yourself: what about Mandarin and Cantonese? 

They are spoken dialects of Chinese, not written formats. Both can be written in either Simplified or Traditional Chinese. To use the correct vocabulary, we will need to define your target country.
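
As a minimal sketch of what "defining the target country" can look like in code, the snippet below maps a region code to a locale tag. The region list follows the split described above; the table name REGION_TO_TAG and the function chinese_locale_for are hypothetical names chosen for this illustration, and the tags follow the usual BCP 47 pattern (zh-Hans for Simplified, zh-Hant for Traditional).

```python
# Hypothetical lookup table for this illustration: region code -> locale tag.
# "Hans" marks Simplified Chinese, "Hant" marks Traditional Chinese.
REGION_TO_TAG = {
    "CN": "zh-Hans-CN",  # mainland China: Simplified
    "SG": "zh-Hans-SG",  # Singapore: Simplified
    "MY": "zh-Hans-MY",  # Malaysia: Simplified
    "HK": "zh-Hant-HK",  # Hong Kong: Traditional
    "TW": "zh-Hant-TW",  # Taiwan: Traditional
}


def chinese_locale_for(region: str) -> str:
    """Return the locale tag for a region code, or fail loudly if none was chosen."""
    try:
        return REGION_TO_TAG[region.upper()]
    except KeyError:
        raise ValueError(f"No Chinese variant chosen for region {region!r}") from None


print(chinese_locale_for("tw"))  # zh-Hant-TW
print(chinese_locale_for("CN"))  # zh-Hans-CN
```

Failing on an unknown region, rather than silently falling back to one variant, forces the decision to be made explicitly for each market.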

(Note: In this article, we discuss issues that apply to both Simplified and Traditional Chinese. To make reading easier, we will refer to them jointly as “Chinese.”)

 

Sorting with Chinese characters

Another issue during Chinese software localization is how to sort information. For languages that use an alphabet, developers organize information in alphabetical order. 

So how do you organize a dropdown menu or a support index in Chinese, which uses characters instead of letters?

There are multiple methods of sorting. Some developers sort by pronunciation, using pinyin (the standard romanization of Mandarin) to put entries in alphabetical order. Others sort by the number of strokes in a character, then sort further by the individual strokes. Microsoft, for example, prefers this method.

Either method will work. What is important is to select one method and apply it consistently during the localization process.
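
To make the difference concrete, here is a small sketch that sorts the same four surnames both ways. The CHAR_INFO table is hand-written purely for this illustration and ignores tones; a real product would more likely rely on a collation library (for example, ICU's pinyin and stroke collations) than on a table like this.

```python
# Hand-written illustration data: character -> (pinyin without tones, stroke count).
# This is an assumption-laden toy table, not a real collation source.
CHAR_INFO = {
    "丁": ("ding", 2),
    "王": ("wang", 4),
    "刘": ("liu", 6),
    "李": ("li", 7),
}

surnames = ["王", "李", "丁", "刘"]


def pinyin_key(ch: str) -> str:
    """Sort key for a pronunciation-based order."""
    return CHAR_INFO[ch][0]


def stroke_key(ch: str) -> int:
    """Sort key for a stroke-count order (no tie-breaking by individual strokes)."""
    return CHAR_INFO[ch][1]


by_pinyin = sorted(surnames, key=pinyin_key)
by_strokes = sorted(surnames, key=stroke_key)

print(by_pinyin)   # ['丁', '李', '刘', '王']
print(by_strokes)  # ['丁', '王', '刘', '李']
```

The two orders disagree, which is why mixing methods within one product produces lists that look unsorted to users.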

 

Using typefaces correctly in Chinese software localization

When creating a font for English, the bare minimum is 26 letters (upper and lower case), 10 numerals, and special characters. Add in characters with accent marks, and the total is around 100-150 characters.

On the other hand, Chinese fonts need to display upwards of 20,000 characters.

This means that there are few fonts that display every character in a readable fashion.

The best fonts for a successful Chinese software localization process are system fonts. System fonts are the Chinese equivalent of Arial, Helvetica, or Times New Roman. They are not exciting from a design perspective, but they are guaranteed to be readable.

Using fonts such as SimHei and SimSun decreases the chances that a character will not render correctly.

 

Challenges in Korean Software Localization

Handling particles

The Korean language uses particles that follow nouns to indicate whether they are the subject or object of a sentence. These particles are necessary to communicate information.

Particles take different forms depending on whether the noun ends in a consonant or a vowel. For example, the subject particle is 이 after a consonant and 가 after a vowel.

As you can imagine, this makes it challenging to translate UI strings with variables, such as “{Name} liked your comment.”

Translators experienced with Korean software localization can often come up with creative workarounds. To do so, they will need context as well as variables that are part of a complete string (and not separated from it).
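
When the variable is known to hold Hangeul text, the choice of particle can be sketched in code. The example below relies on the layout of precomposed Hangeul syllables in Unicode, where each syllable's code point encodes whether it has a final consonant. The function names are hypothetical, and the sketch deliberately refuses non-Hangeul input (such as a name typed in Latin letters), which is one reason translator workarounds are still needed.

```python
HANGUL_FIRST = 0xAC00     # code point of 가, the first precomposed syllable
HANGUL_LAST = 0xD7A3      # code point of 힣, the last precomposed syllable
FINALS_PER_SYLLABLE = 28  # 27 possible final consonants plus "no final consonant"


def ends_in_consonant(word: str) -> bool:
    """Return True if the last Hangeul syllable of word has a final consonant."""
    if not word:
        raise ValueError("empty word")
    code = ord(word[-1])
    if not HANGUL_FIRST <= code <= HANGUL_LAST:
        raise ValueError(f"{word!r} does not end in a Hangeul syllable")
    # Within each group of 28 code points, offset 0 means "no final consonant".
    return (code - HANGUL_FIRST) % FINALS_PER_SYLLABLE != 0


def attach_particle(noun: str, after_consonant: str, after_vowel: str) -> str:
    """Append the particle form that matches the noun's ending."""
    return noun + (after_consonant if ends_in_consonant(noun) else after_vowel)


print(attach_particle("지민", "이", "가"))  # 지민이 (민 ends in the consonant ㄴ)
print(attach_particle("민수", "이", "가"))  # 민수가 (수 ends in a vowel)
```

Logic like this only works if the particle is part of the same string as the variable, which is the point made above about keeping variables inside complete strings.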

 

Displaying Hangeul correctly

The Korean language uses a sophisticated writing system called Hangeul. Hyelim Chang of Google’s localization team explained how it works for Multilingual Magazine:

Hangeul is different from most writing systems in that it is half-alphabetic and half-syllabic. It is alphabetic in that one letter corresponds to one sound. This way of writing a word coincides with the convention of linear writing in English.

 

However, the letters are not written linearly as in English. Instead, they are grouped into syllabic blocks of two to three letters — initial, medial and final — which makes the writing nonlinear. These blocks are arranged horizontally from left to right or vertically from top to bottom. 

 

To write Hangeul, for example, you would not write h-a-n-g-e-u-l, you would write in blocks: han-geul, 한 (han) and 글 (geul). […] although 한 looks like one letter, it is actually composed of three letters: ㅎ, ㅏ and ㄴ, each representing the sound h, a and n. In 글, the letters ㄱ, ㅡ and ㄹ represent the sounds g, eu and l respectively.

This means that text wrapping is critical to ensuring that words display correctly. Korean texts should not break in the middle of words. They need to break on spaces between words, and the UI must allow translators to insert breaks where needed.
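
The short sketch below illustrates both points with Python's standard library: first it splits each syllable block into its letters using Unicode normalization, then it wraps a Korean sentence on spaces only. The sample sentence and the width of 12 are arbitrary choices for this illustration, and textwrap counts characters rather than on-screen width, so a real UI would measure rendered text instead.

```python
import textwrap
import unicodedata


def letters_in_block(block: str) -> list:
    """Return the Unicode names of the letters that make up one syllable block."""
    # NFD normalization decomposes a precomposed syllable into its letters (jamo).
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", block)]


for block in "한글":
    print(block, letters_in_block(block))

# Wrap on spaces only; break_long_words=False prevents splitting inside a word.
sentence = "한국어 문장은 단어 사이의 공백에서 줄을 바꿔야 합니다"
lines = textwrap.wrap(sentence, width=12, break_long_words=False)
for line in lines:
    print(line)
```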

 

UI localization issues common to Chinese, Korean and Japanese

 

Concatenating strings should be avoided

Just as we saw with Spanish and Russian (among others), translators working on UI localization benefit from having each sentence as a single string.

Sentence structure in Korean, Chinese, and Japanese is quite different from English. This means that elements of the sentence move around in translation.

For example, let’s look at the sentence: 

                       You have 5 items in your cart.

After localization into Chinese, this reads:

                      您的购物车中有 5 件商品。
                      (您的[your]购物车[cart]中[in]有[have] 5 件商品[5 items])

As mentioned above, using a single string per sentence in Korean software localization helps translators handle particles.
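
A minimal sketch of the difference, using the cart sentence above. The MESSAGES catalog, the key cart.items, and the translate function are hypothetical names for this illustration, not the API of any particular localization library, and plural handling is left out.

```python
# Hypothetical message catalog: one complete sentence per string,
# with a named placeholder the translator can move freely.
MESSAGES = {
    "en": {"cart.items": "You have {count} items in your cart."},
    "zh-Hans": {"cart.items": "您的购物车中有 {count} 件商品。"},
}


def translate(locale: str, key: str, **values) -> str:
    """Look up the full sentence for a locale and fill in its placeholders."""
    return MESSAGES[locale][key].format(**values)


def concatenated(count: int) -> str:
    """Anti-pattern: fragments glued together in English word order.

    A translator who receives only "You have " and " items in your cart."
    cannot move the number to where the Chinese sentence needs it.
    """
    return "You have " + str(count) + " items in your cart."


print(translate("en", "cart.items", count=5))       # You have 5 items in your cart.
print(translate("zh-Hans", "cart.items", count=5))  # 您的购物车中有 5 件商品。
```

Both approaches produce the same English output, but only the first gives the translator a whole sentence to reorder.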

 

Making sure one- and two-byte characters display correctly

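In English (and many other Western languages), we use letters to make up words. In common encodings, each basic Latin letter takes up one byte of space.

In Chinese and Japanese, a single character can convey the meaning of a whole word. These scripts contain thousands of characters, far more than the 256 values a single byte can hold, so each character needs more than one byte: two in legacy encodings such as GBK, Big5, or Shift JIS, and typically three in UTF-8.

Multi-byte characters are best supported by UTF-8. This is an encoding of Unicode, a standard that assigns every character a unique code point so that the correct characters display.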

What happens if an outdated or mismatched encoding is used during Chinese software localization? The text appears as ��� or as empty boxes (□□□), signaling that the system does not know what to display.

As discussed earlier, Korean is not a character-based language. However, its syllables still require more than one byte of data to display. In other words, successful Korean software localization also requires the use of UTF-8.
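
The byte counts, and the garbling that results from an encoding mismatch, can be checked with a few lines of Python. This is only an illustration: the sample characters are arbitrary, and GBK and EUC-KR stand in here for the older two-byte encodings.

```python
# How many bytes does one character need in UTF-8?
samples = {"A": "Latin letter", "中": "Chinese character", "한": "Hangeul syllable"}
for ch, label in samples.items():
    print(f"{label} {ch}: {len(ch.encode('utf-8'))} byte(s) in UTF-8")

# The same characters in older, region-specific encodings.
print(len("中".encode("gbk")))     # 2
print(len("한".encode("euc_kr")))  # 2

# Mismatch: bytes written as GBK but read as UTF-8.
text = "中文"
gbk_bytes = text.encode("gbk")
garbled = gbk_bytes.decode("utf-8", errors="replace")
print(garbled)                   # replacement characters instead of 中文
print(gbk_bytes.decode("gbk"))   # 中文 (correct when the encodings match)
```

Using UTF-8 consistently from the source files through the database to the rendered page avoids this class of mismatch.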

When building software, it is also important to select fonts that contain glyphs for the characters you need to display. If the system font only covers Latin or Cyrillic characters, the dreaded boxes will reappear.

For an in-depth look at the history of Unicode and how it accommodates character-based languages, we recommend this article from Smashing Magazine.

 

Localizing culture as part of UI translation

We have already discussed the importance of adapting software for the target culture. 

This is especially vital for Chinese software localization as well as for Korean. Certain symbols are politicized, and making incorrect assumptions can lead to PR problems.

Specific elements that need to be carefully examined during the localization process are:

  • Flags
  • Maps
  • Place names

Many UI designers use flags to represent language. Choosing the wrong flag can lead to problems. For example, selecting the Chinese flag for Traditional Chinese because it is used in Hong Kong may offend Taiwanese readers.

Other assets, including images and video, should be reviewed as well. For example, an American user might not blink at an image of a person with tattoos. The same is not true in Japan, where tattoos are linked to the Yakuza and are socially unacceptable.

Likewise, colours have significantly different meanings in China or Japan than they do in Western countries. 

Airbnb learned this the hard way when they combined a poorly chosen Chinese name with bright pink branding. Chinese users reacted on social media by saying that the choice made Airbnb seem like a matchmaking website.

 

Adapting inputs

Inputs such as text boxes need to be able to accommodate Chinese characters or Korean Hangeul.

(Note: you may have read that Chinese, Japanese, and Korean can all be written top-to-bottom as well as left-to-right. While this is historically true, modern digital use favors left-to-right. Users will not expect to be able to input text top-to-bottom.)

The format and content of inputs may also need to be adjusted for cultural differences during the localization process.

The simplest example is names. In all three languages, the family name (or surname) comes first, followed by the given name. Developers need to ensure that these fields map correctly into their databases. If not, they risk addressing users incorrectly.
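
One way to avoid this, sketched below, is to store the family name and given name in separate, clearly labeled fields and to decide the display order per language. The class PersonName, the set FAMILY_NAME_FIRST, and the function display_name are hypothetical names for this illustration, and joining the two parts without a space is a simplifying assumption.

```python
from dataclasses import dataclass

# Languages in which the family name is displayed first (per the text above).
FAMILY_NAME_FIRST = {"zh", "ko", "ja"}


@dataclass
class PersonName:
    family: str  # stored explicitly, never inferred from "first" or "last" position
    given: str


def display_name(name: PersonName, language: str) -> str:
    """Format a name in the order users of the given language expect."""
    if language in FAMILY_NAME_FIRST:
        # Simplifying assumption: no space between family and given name.
        return name.family + name.given
    return f"{name.given} {name.family}"


print(display_name(PersonName(family="王", given="小明"), "zh"))      # 王小明
print(display_name(PersonName(family="Smith", given="Anna"), "en"))  # Anna Smith
```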

Addresses may also be formatted differently. A Chinese user who resides in a large apartment complex may need to input their development’s name, the specific building, their floor, and their apartment number. But they might not even know their postcode, as it is not commonly used.

This means that developers need to assess which inputs are required and ensure that all information provided is stored correctly.
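
As a rough sketch of such an assessment, the example below models the address fields mentioned above and treats the postcode as optional. The class name, the field names, and the choice of which fields are required are assumptions made for this illustration; the sample address is invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChineseAddress:
    city: str
    development: str   # name of the apartment complex
    building: str
    floor: str
    apartment: str
    postcode: Optional[str] = None  # optional: many users do not know it


# Assumption for this sketch: everything except the postcode is required.
REQUIRED_FIELDS = ("city", "development", "building", "floor", "apartment")


def missing_fields(address: ChineseAddress) -> list:
    """Return the names of required fields that were left empty."""
    return [name for name in REQUIRED_FIELDS if not getattr(address, name).strip()]


address = ChineseAddress(
    city="上海市",
    development="阳光花园",
    building="3号楼",
    floor="12",
    apartment="1203",
)
print(missing_fields(address))  # [] even though no postcode was given
```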


Getting these elements right is vital to successful localization into Chinese, Korean, or Japanese.

Users in countries where these languages are spoken (among them China, Hong Kong, Taiwan, South Korea, and Japan) are digitally savvy and expect high-quality products. If localization does not meet their expectations, they will not hesitate to hop on social media and let others know.

Both Chinese software localization and localization to Korean can be especially high stakes. Each language is shared by two (or more) countries with a complex political history. By working with an experienced localization team, companies can avoid issues that might damage their reputation in these critical markets.

Would you like to have your software localized to Chinese, Korean, or any other Asian language? 

Contact us today to discuss your project and see how our expert team can help you ensure an exceptional user experience.
