External

https://en.wikipedia.org/wiki/Character_encoding

Internal

Overview

Character encoding is the process through which characters within a text document are represented by numeric codes. Depending on the character encoding used, the same text will end up with different binary representations. Common character encoding standards are ASCII, Unicode and UCS.
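As a small illustration of how the same text ends up with different bytes, the sketch below encodes one character with three codecs from Python's standard library. The sample character and the choice of codecs are assumptions made for the example, not part of the original text.

```python
# Illustrative sketch: the sample character and codec choices are assumptions.
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

utf8_bytes = text.encode("utf-8")        # two bytes
utf16_bytes = text.encode("utf-16-be")   # one 16-bit code unit, big-endian, no byte order mark
latin1_bytes = text.encode("latin-1")    # one byte

print(utf8_bytes.hex())    # c3a9
print(utf16_bytes.hex())   # 00e9
print(latin1_bytes.hex())  # e9
```

Decoding bytes with a different codec than the one used to encode them is the usual cause of garbled text.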

Concepts

Character Set

Character Code

Unicode is a character code.

Code Point

Code Space

Character Encoding Standards

ASCII

ASCII stands for American Standard Code for Information Interchange. It is a seven-bit encoding scheme that encodes letters, numerals, symbols, and device control codes as fixed-length integer codes in the range 0 to 127.
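A minimal sketch of the seven-bit limit, using only Python built-ins. The sample strings are hypothetical and chosen only to show one letter, one symbol, one control code, and one character outside ASCII.

```python
# Illustrative sketch: the sample strings are assumptions.
text = "Hi!\n"

# Every ASCII character maps to an integer that fits in seven bits (0..127).
codes = [ord(ch) for ch in text]
print(codes)  # [72, 105, 33, 10]

fits_in_seven_bits = all(code < 128 for code in codes)

# A character outside the ASCII range cannot be encoded as ASCII.
try:
    "\u00e9".encode("ascii")
    non_ascii_rejected = False
except UnicodeEncodeError:
    non_ascii_rejected = True
```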

Unicode

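Unicode supports a far larger character set than ASCII: ASCII defines 128 codes, while the Unicode code space runs from U+0000 to U+10FFFF, which is 1,114,112 code points.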

Unicode Transformation Format (UTF)

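The binary representation of a text represented in Unicode depends on the "transformation format" used. UTF stands for "Unicode Transformation Format", and the number after the dash in the format name is the size in bits of the format's code unit, not the number of bits used for every character. UTF-8 and UTF-16 are variable-length: a code point takes one to four 8-bit units in UTF-8 and one or two 16-bit units in UTF-16. UTF-32 is fixed-length and uses one 32-bit unit per code point.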
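The number in a format name is the size of one code unit, and a single character may need several units. The sketch below measures this with Python's built-in codecs. The four sample characters are assumptions picked to cover different ranges of the code space.

```python
# Illustrative sketch: the sample characters are assumptions.
samples = {
    "A": "\u0041",           # basic Latin letter
    "e-acute": "\u00e9",     # Latin-1 range
    "euro": "\u20ac",        # euro sign
    "emoji": "\U0001F600",   # above U+FFFF
}

byte_counts = {}
for name, ch in samples.items():
    byte_counts[name] = (
        len(ch.encode("utf-8")),      # 1 to 4 bytes
        len(ch.encode("utf-16-le")),  # 2 or 4 bytes
        len(ch.encode("utf-32-le")),  # always 4 bytes
    )

for name, counts in byte_counts.items():
    print(name, counts)
# A (1, 2, 4)
# e-acute (2, 2, 4)
# euro (3, 2, 4)
# emoji (4, 4, 4)
```

The "-le" codec variants are used so that no byte order mark is added to the counts.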

UTF-8

UTF-16

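UTF-16 can represent every Unicode character, so it covers both Western and Eastern letters and symbols. Characters in the range U+0000 to U+FFFF take one 16-bit code unit; characters above U+FFFF take two code units, known as a surrogate pair.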
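For characters above U+FFFF, UTF-16 splits the code point into two 16-bit code units. The sketch below computes that pair by hand and compares it with Python's built-in codec. The function name to_surrogate_pair and the sample code point are hypothetical, introduced only for this example.

```python
# Illustrative sketch: to_surrogate_pair is a hypothetical helper, not a library function.
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low

high, low = to_surrogate_pair(0x1F600)
print(hex(high), hex(low))  # 0xd83d 0xde00

# Cross-check against the built-in big-endian UTF-16 codec.
encoded = "\U0001F600".encode("utf-16-be")
print(encoded.hex())  # d83dde00
```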

Character Encoding

Contents

External

Internal

Overview

Concepts

Character Set

Character Code

Code Point

Code Space

Character Encoding Standards

ASCII

Unicode

Unicode Transformation Format (UTF)

UTF-8

UTF-16

UTF-32

Universal Character Set (UCS) ISO 10646

Western

Latin-US
