Character Encoding
Revision as of 20:31, 25 June 2018
=External=

=Internal=

=Overview=
Character encoding is the process through which characters within a text document are represented by numeric codes. Depending on the character encoding used, the same text ends up with different binary representations. Common character encoding standards are ASCII and Unicode.
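For example, the same string produces different byte sequences under different encodings. A minimal Python sketch (the string and the choice of encodings are illustrative):

```python
text = "Hello"

# ASCII: one byte per character
print(text.encode("ascii").hex())      # 48656c6c6f

# UTF-16 big-endian: two bytes per code unit
print(text.encode("utf-16-be").hex())  # 00480065006c006c006f
```

The underlying text is identical; only the binary representation differs.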
=Character Set=

=Character Encoding Standards=

==ASCII==

==Unicode==
Unicode supports a larger character set than [[#ASCII|ASCII]].
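A character outside the ASCII range makes the difference concrete. A short Python sketch (the sample string is hypothetical):

```python
text = "café"  # 'é' is U+00E9, outside ASCII's 7-bit range

# Unicode (here serialized as UTF-8) represents the character fine
print(text.encode("utf-8").hex())  # 636166c3a9

# ASCII cannot represent it and raises an error
try:
    text.encode("ascii")
except UnicodeEncodeError:
    print("'é' is not representable in ASCII")
```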
===Unicode Transformation Format (UTF)===
The binary representation of a Unicode text depends on the "transformation format" used. UTF stands for "Unicode Transformation Format", and the number after the dash in the format name is the size, in bits, of the code unit. A single character may occupy one or more code units: UTF-8 and UTF-16 are variable-width encodings, while UTF-32 uses a fixed four bytes per character.
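The effect of the code-unit size can be seen by encoding the same two characters in each format. A minimal Python sketch (the sample string is illustrative; big-endian variants are used so no byte-order mark is emitted):

```python
text = "A€"  # 'A' is U+0041, '€' is U+20AC

for enc in ("utf-8", "utf-16-be", "utf-32-be"):
    data = text.encode(enc)
    print(enc, len(data), data.hex())

# utf-8     4 41e282ac          -> 'A' takes 1 code unit, '€' takes 3
# utf-16-be 4 004120ac          -> one 16-bit code unit per character here
# utf-32-be 8 00000041000020ac  -> fixed 4 bytes per character
```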