Character Code as mainly used in Japan – ソフトウェアエンジニアの技術ブログ：Software engineer tech blog

Character Code
It refers to the correspondence between the characters used on the computer and the numbers in bytes assigned to each letter. Character codes have some to be used in many language spheres by computers, and the variety has increased. Typical character codes are said to be more than 100 kinds.

Mainly used in Japan
JIS Code
The official name is “ISO-2022-JP”. It is widely used.

SJIS(Shift-JIS) Code
It is ASCII code plus Japanese, and it is used in Japan domestic mobile phones.

EUC
It is widely used on UNIX.

Unicode consists of “encoded character set” and “character encoding method(encoding)”.

“Character set” refers to letters put together by certain rules such as “All Hiragana” and “All Alphabet”, for example. A rule in which a unique code is associated with the character set is called a “coded character set”. The associated numerical value is called a code point and it is displayed in the form of “U + xxxxx”.

“Character encoding method” is a method of converting a coded character set into another byte sequence so that it can be handled by a computer. The encoding methods include UTF-8 and UTF-16.

Unicode code point
0x0000 – 0x007f : ASCII
0x0080 – 0x07ff : Country alphabet
0x0800 – 0xffff : Indian characters, punctuation marks, academic symbols, pictograms, East Asian characters, double-byte, half-size

OK!