As a developer, have you ever been overwhelmed by this bewildering array of codes?
- In a user registration form, should the country list use CN or CHN?
- For multilingual translation (i18n), should the folder be named zh or zh-CN?
- When handling video subtitles, a spec demands an unfamiliar three-letter code, sometimes zho and sometimes chi. What's the difference?
- Not to mention seemingly random time zone identifiers like Asia/Shanghai.
After reading this, you will thoroughly understand the logic behind these codes and be able to confidently use them correctly in your projects.
Core Idea: Divide and Conquer
These standards seem chaotic because we try to understand them with a vague concept of "region." But the principle of the computer world is precision. Therefore, international standards organizations "divide and conquer" the fuzzy concept of "region," breaking it down into several specific, orthogonal (mutually independent) dimensions, and establishing a gold standard for each.
Our journey of exploration begins with understanding these dimensions.

1. Geography: Where Am I? - ISO 3166-1
This is the foundation of all codes, answering the simplest question: "What is this country/region?"
- Standard Name: ISO 3166-1
- Core Mission: Provide unique identifiers for countries and regions worldwide.
- Primary Formats:
  - alpha-2 (Two-letter code): e.g., US, CN, JP. This is the most commonly used format.
  - alpha-3 (Three-letter code): e.g., USA, CHN, JPN. More readable, often used in statistics and official documents.
Developer Practical Guide:
- Database Design: When storing a country in a user table, create a country_code field of type CHAR(2) to hold the two-letter (alpha-2) code. For example:
CREATE TABLE users (
id INT PRIMARY KEY,
name VARCHAR(255),
country_code CHAR(2)
);
- API Design: Region-related APIs (e.g., e-commerce shipping ranges) should use two-letter codes as parameters. For example:
GET /api/v1/shipping?country=CN HTTP/1.1
- Frontend Development: In a country selection dropdown, the value attribute of each <option> should use the two-letter code. For example:
<select name="country">
<option value="CN">China</option>
<option value="US">United States</option>
<option value="JP">Japan</option>
</select>
Learn More
- Wikipedia: ISO 3166-1
- Official Standard Query: ISO Online Browsing Platform
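The storage and API advice above assumes that incoming values are already clean alpha-2 codes. A minimal sketch of normalizing input before it reaches the CHAR(2) column might look like the following. The three-entry lookup table and the name normalize_country_code are illustrative assumptions; a real project would load the full ISO 3166-1 list from a maintained source.

```python
# Illustrative subset only; a real table would cover every ISO 3166-1 entry.
ALPHA3_TO_ALPHA2 = {"CHN": "CN", "USA": "US", "JPN": "JP"}
KNOWN_ALPHA2 = set(ALPHA3_TO_ALPHA2.values())


def normalize_country_code(raw: str) -> str:
    """Return the upper-case alpha-2 code for an alpha-2 or alpha-3 input.

    Raises ValueError for anything missing from the (hypothetical) table.
    """
    code = raw.strip().upper()
    if len(code) == 2 and code in KNOWN_ALPHA2:
        return code
    if len(code) == 3 and code in ALPHA3_TO_ALPHA2:
        return ALPHA3_TO_ALPHA2[code]
    raise ValueError(f"unknown country code: {raw!r}")


print(normalize_country_code("cn"))
print(normalize_country_code(" USA "))
```

Rejecting unknown values at the boundary keeps the database column limited to codes the rest of the system can rely on.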
2. Language: What Do I Speak? - ISO 639
This standard cares about only one thing: Which language are we using?
- Standard Name: ISO 639
- Core Mission: Encode the world's languages.
- Primary Formats:
  - ISO 639-1 (Two-letter code): e.g., en, zh, ja. It covers about 184 major world languages and is conventionally written in lowercase.
  - ISO 639-2 (Three-letter code, with T and B variants): e.g., eng, zho, jpn. It covers nearly 500 languages and language groups, addressing the limited coverage of two-letter codes.
  - ISO 639-3 (Three-letter code): e.g., eng, zho, jpn. ISO 639-3 extends ISO 639-2 and aims to cover all individual languages (more than 7,000).
Learn More
- Wikipedia: ISO 639
- Official Code List (ISO 639-1 & 639-2): Library of Congress
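To make the relationship between the two-letter and three-letter forms concrete, here is a small sketch that upgrades an ISO 639-1 code to its three-letter equivalent. The mapping holds only the three examples from this section, and the name to_three_letter is a hypothetical helper, not part of any standard library.

```python
# Hypothetical excerpt: ISO 639-1 codes mapped to their three-letter equivalents.
ISO_639_1_TO_3 = {"en": "eng", "zh": "zho", "ja": "jpn"}


def to_three_letter(code: str) -> str:
    """Return the three-letter code for a two- or three-letter language code."""
    code = code.strip().lower()
    if len(code) == 3:
        return code  # already three letters; passed through without validation
    try:
        return ISO_639_1_TO_3[code]
    except KeyError:
        raise ValueError(f"no mapping for language code: {code!r}") from None


print(to_three_letter("zh"))
print(to_three_letter("eng"))
```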
3. Precise Localization: Where Am I and What Do I Speak? - Locale
Now, we combine the first two to answer a more precise question: "What specific language is the user using in a specific region?" This is the concept of Locale.
- Standard Name: No single standard; typically follows the IETF BCP 47 specification, combining ISO 639 and ISO 3166-1.
- Core Mission: Precisely describe language variants in specific regions to handle differences in spelling, vocabulary, date formats, currency symbols, etc.
- Format: language code, a hyphen, then country code (language-COUNTRY)
  - en-US: English used in the United States.
  - en-GB: English used in the United Kingdom.
  - zh-CN: Chinese used in Mainland China (specifically Simplified).
  - zh-TW: Chinese used in Taiwan, China (specifically Traditional).
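The language-COUNTRY shape described above can be split and normalized with a few lines of code. The sketch below handles only the two simplest forms (a bare language, or language plus a two-letter region) and rejects everything else. SimpleLocale and parse_simple_tag are illustrative names, and full BCP 47 tags with script, variant, or extension subtags are deliberately out of scope.

```python
from typing import NamedTuple, Optional


class SimpleLocale(NamedTuple):
    language: str          # lower-case ISO 639 code, e.g. "zh"
    region: Optional[str]  # upper-case ISO 3166-1 alpha-2 code, e.g. "CN"


def parse_simple_tag(tag: str) -> SimpleLocale:
    """Parse 'language' or 'language-REGION' and normalize the casing.

    Underscores are accepted as separators because many systems emit zh_CN.
    """
    parts = tag.strip().replace("_", "-").split("-")
    if len(parts) > 2:
        raise ValueError(f"only language or language-REGION is supported: {tag!r}")
    language = parts[0].lower()
    if not (language.isascii() and language.isalpha() and len(language) in (2, 3)):
        raise ValueError(f"not a language code: {parts[0]!r}")
    region = None
    if len(parts) == 2:
        region = parts[1].upper()
        if not (region.isascii() and region.isalpha() and len(region) == 2):
            raise ValueError(f"not a two-letter region code: {parts[1]!r}")
    return SimpleLocale(language, region)


print(parse_simple_tag("zh-cn"))
print(parse_simple_tag("EN_us"))
print(parse_simple_tag("ja"))
```

Normalizing the casing once at the edge (lowercase language, uppercase region) avoids treating zh-cn and zh-CN as different locales later on.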
Developer Practical Guide:
- Software Internationalization (i18n): Place resource files (e.g., translation strings) in folders named by Locale. Each platform has its own spelling: Android, for instance, writes the region qualifier with an r prefix, so zh-CN becomes values-zh-rCN/strings.xml. For example:
res/
  values/
    strings.xml
  values-zh-rCN/
    strings.xml
- HTTP Request Header: Parse the Accept-Language header to return the most suitable language version for the user. For example:
Accept-Language: zh-CN,zh;q=0.9
- Date/Currency Formatting: Most modern programming languages ship libraries that accept a Locale as a parameter. For example, in Java:
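// Requires java.text.DateFormat, java.util.Date, java.util.Locale
Locale locale = Locale.forLanguageTag("zh-CN"); // parses a BCP 47 tag; the Locale constructors are deprecated since Java 19
DateFormat dateFormat = DateFormat.getDateInstance(DateFormat.DEFAULT, locale);
String dateStr = dateFormat.format(new Date());
Learn More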
- Wikipedia: IETF language tag
- Official Standard Definition (BCP 47): IETF Tools - BCP 47
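As a rough illustration of the Accept-Language parsing mentioned in the guide above, the following sketch orders the client's language ranges by their q weight and picks the first one the application supports, falling back from zh-CN to zh when needed. The function names and the supported-locale lists are assumptions made for the example. It ignores the * wildcard and the more elaborate matching rules of BCP 47, so treat it as a starting point rather than a complete negotiator.

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (tag, q) pairs sorted by descending q; a missing q means 1.0."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((tag.strip().lower(), q))
    # sorted() is stable, so equal weights keep the order the client sent.
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def pick_locale(header: str, supported: List[str], default: str) -> str:
    """Pick the best supported locale for the header, else the default."""
    by_lower = {tag.lower(): tag for tag in supported}
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag in by_lower:  # exact match, e.g. zh-cn
            return by_lower[tag]
        language = tag.split("-")[0]
        if language in by_lower:  # fall back from zh-cn to zh
            return by_lower[language]
    return default


print(pick_locale("zh-CN,zh;q=0.9", ["en-US", "zh"], "en-US"))
print(pick_locale("fr-FR,fr;q=0.8", ["en-US", "zh-CN"], "en-US"))
```

Comparing in lowercase while returning the application's own spelling keeps the match case-insensitive without changing how supported locales are named internally.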
4. Professional Fields & Special Cases: Subtitles, Multimedia & T/B Codes - ISO 639-2
Why don't video subtitles simply use zh or en? Because professional fields require broader language coverage, and this is also the root of the "one language, multiple codes" problem.
- Standard Name: ISO 639-2 (Three-letter code)
- Key Knowledge Point: T/B Codes (Terminology/Bibliographic Codes). About 20 languages have two three-letter codes in ISO 639-2, for historical reasons:
  - B Code (Bibliographic): Derived from the English name, primarily used for library cataloging, a historical legacy. For example, German -> ger.
  - T Code (Terminology): Derived from the language's native name, recommended for use in modern computer applications. For example, Deutsch -> deu.
- The most common example is Chinese: chi is the B code (from Chinese), and zho is the T code (from 中文, Zhōngwén).
| Language | English Name | Native Name | B Code (Old/Cataloging) | T Code (New/Terminology) | Recommended Use |
|---|---|---|---|---|---|
| Chinese | Chinese | 中文 | chi | zho | zho |
| German | German | Deutsch | ger | deu | deu |
| French | French | Français | fre | fra | fra |
| Tibetan | Tibetan | བོད་ཡིག | tib | bod | bod |
Developer Practical Guide:
- Golden Rule: Prefer the T code! It is designed for technical applications. However, when dealing with legacy systems or external data, your code should recognize both T and B codes.
- Media Processing: Use the T code with FFmpeg. For example:
ffmpeg -i input.mp4 -metadata:s:s:0 language=zho output.mp4
- Data Cleaning: When receiving data from external sources, use a mapping function to unify codes. For example, in Python:
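# Map ISO 639-2 bibliographic (B) codes to their terminology (T) equivalents.
language_map = {
    "chi": "zho",
    "ger": "deu",
    "fre": "fra",
    "tib": "bod",
}

def normalize_language_code(code):
    # Codes without a B/T distinction (e.g. "eng") pass through unchanged.
    return language_map.get(code, code)
5. Ultimate Challenge: Time and Time Zones - IANA Time Zone Database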
Why can't we use the country code US to represent US time? Because the continental US has 4 time zones, plus complex daylight saving time rules.
- Standard Name: IANA Time Zone Database (also known as tz database or Olson database)
- Core Mission: Precisely define the boundaries of all time zones worldwide, their offsets from UTC, and all historical daylight saving time change rules.
- Format: Continent/Representative_City (Area/Location)
  - Asia/Shanghai
  - America/New_York
  - Europe/London
Developer Practical Guide:
- Golden Rule: Never calculate time zones or daylight saving time yourself!
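- Backend Development: On the server, store all times in UTC and apply an IANA identifier only when converting to local time. For example, in Java:
// Requires java.time.Instant, java.time.ZoneId, java.time.ZonedDateTime
Instant instant = Instant.now();
String timestamp = instant.toString(); // ISO 8601 text in UTC
ZonedDateTime local = instant.atZone(ZoneId.of("Asia/Shanghai")); // convert only for display
- Frontend Development: Browser APIs can get the user's time zone. For example, in JavaScript: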
const timeZone = Intl.DateTimeFormat().resolvedOptions().timeZone;
Learn More
- Wikipedia: tz database
- Official Data Source: IANA Time Zones
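To show why an IANA identifier carries more information than a fixed offset, here is a short sketch that converts the same UTC clock time in winter and in summer to three zones. It assumes Python 3.9 or later, where zoneinfo is in the standard library, and a system with time zone data installed. The helper name to_local and the sample dates are illustrative choices.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def to_local(utc_moment: datetime, tz_name: str) -> datetime:
    """Convert a timezone-aware datetime to the given IANA zone."""
    if utc_moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return utc_moment.astimezone(ZoneInfo(tz_name))


# The same UTC clock time on a winter day and on a summer day.
winter = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
summer = datetime(2024, 7, 15, 12, 0, tzinfo=timezone.utc)

for zone in ("Asia/Shanghai", "America/New_York", "Europe/London"):
    # The library applies the daylight saving rules for each date.
    print(zone, to_local(winter, zone).isoformat(), to_local(summer, zone).isoformat())
```

Asia/Shanghai keeps the same offset on both dates, while America/New_York and Europe/London shift by an hour in summer. The zone database encodes exactly that difference, so application code never has to.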
Quick Reference Cheat Sheet
| Task Scenario | What Do I Need? | Use Standard | Example Code | Key Developer Point |
|---|---|---|---|---|
| Select Country | Unique country ID | ISO 3166-1 alpha-2 | CN, US | Database CHAR(2) storage, API param |
| Webpage or Simple Translation | Identify a major language | ISO 639-1 | zh, en | HTML lang attribute, i18n foundation |
| Precise Localization | Distinguish regional language variants | IETF BCP 47 | zh-CN, en-US | i18n folder naming, HTTP header, formatting |
| Subtitle/Audio Track Tagging | Cover as many languages as possible | ISO 639-2 | zho (recommended) | Prefer T code, be compatible with B code |
| Handle Local Time | Precisely calculate time & DST | IANA Time Zone DB | Asia/Shanghai | Server stores UTC, client uses IANA ID for conversion |
Now, the fog has cleared. These codes are not the product of chaos but a well-designed, clearly divided system. Mastering them will enable you to:
- Build a Clear Mental Model: Understand the applicable scenarios for each code and the historical reasons behind special cases like zho/chi.
- Write More Robust Code: Gracefully handle global user needs while maintaining compatibility with legacy data.
- Collaborate Efficiently: Communicate with your team using precise terminology.
