Knowledge Base

Chinese and Japanese Sorting

Article ID: 123271

Article Last Modified on 11/6/1999


This article was previously published under Q123271

SUMMARY

Chinese and Japanese use ideographic characters. There are at least three major ways to sort ideographic characters:

  • By strokes within radicals. Characters are sorted by radical first, then by the number of strokes within the radical. Radicals themselves are sorted by the number of strokes, in increasing order.
  • By radicals within strokes. Characters are sorted by number of strokes first, then by the order of the radicals.
  • By pronunciation. Characters are sorted by their pronunciation (phonetic order). Note that many Chinese characters have more than one pronunciation.
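
The three orderings above can be sketched as sort keys over a small character table. This is only an illustrative sketch: the Entry type, the sample table, and the function names are hypothetical, the radical numbers follow the Kangxi numbering (which lists radicals by increasing stroke count), and each character is given a single romanized reading even though real characters often have several.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    char: str
    radical: int   # Kangxi radical number; radicals are numbered by increasing stroke count
    strokes: int   # total stroke count of the character
    reading: str   # one romanized reading, picked for illustration only


# Hand-entered sample data (an assumption for this sketch, not a real sorting table).
SAMPLE = [
    Entry("林", 75, 8, "rin"),
    Entry("山", 46, 3, "san"),
    Entry("休", 9, 6, "kyuu"),
    Entry("木", 75, 4, "moku"),
    Entry("人", 9, 2, "jin"),
]


def by_strokes_within_radicals(entries):
    # Radical first, then stroke count among characters sharing that radical.
    ordered = sorted(entries, key=lambda e: (e.radical, e.strokes))
    return "".join(e.char for e in ordered)


def by_radicals_within_strokes(entries):
    # Stroke count first, then radical order among characters with equal strokes.
    ordered = sorted(entries, key=lambda e: (e.strokes, e.radical))
    return "".join(e.char for e in ordered)


def by_pronunciation(entries):
    # Phonetic order, here approximated by comparing the romanized readings.
    ordered = sorted(entries, key=lambda e: e.reading)
    return "".join(e.char for e in ordered)


if __name__ == "__main__":
    print(by_strokes_within_radicals(SAMPLE))
    print(by_radicals_within_strokes(SAMPLE))
    print(by_pronunciation(SAMPLE))
```

Even on five characters the three keys produce three different orders, which is why a full implementation needs a per-character table for each method.
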

Common Kanji dictionaries use all three sorting methods. Currently, most applications sidestep these issues because sorting tables for Asian code pages are extremely large. Instead, they usually sort by code point, which works reasonably well. On Chinese and Japanese Windows, the lstrcmp() function compares two strings by code point.

Japanese Windows uses Shift-JIS, Traditional Chinese Windows uses Big-5, and Simplified Chinese Windows uses GB as their respective code pages.
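
As a rough illustration of code-point sorting, the sketch below orders the same characters by their encoded bytes in each of the three code pages. It does not call lstrcmp(); it assumes that a byte-wise comparison of the encoded strings is a fair stand-in for a code-point comparison of these double-byte characters. The function name sort_by_code_page and the sample list are hypothetical, and "shift_jis", "big5", and "gb2312" are the Python codec names for the three encodings.

```python
def sort_by_code_page(strings, encoding):
    """Sort strings by their byte values once encoded in the given code page."""
    return sorted(strings, key=lambda s: s.encode(encoding))


# Sample characters chosen for illustration; all exist in the three code pages.
SAMPLE_CHARS = ["人", "休", "山", "木", "林"]

if __name__ == "__main__":
    for codec in ("shift_jis", "big5", "gb2312"):
        ordered = sort_by_code_page(SAMPLE_CHARS, codec)
        print(codec, "".join(ordered))
```

The same input can come out in a different order under each code page, because each character set lays out its common characters differently: Shift-JIS level-1 kanji are arranged by reading, Big-5 by stroke count, and GB 2312 level-1 characters by pinyin. That layout is a likely reason a plain code-point sort is often acceptable within a single code page.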

Additional query words: fesdk DBCS

Keywords: KB123271