UTF-8 (U from Universal Character Set + Transformation Format, 8-bit) is a character encoding capable of encoding every valid Unicode character (called a ''code point''). The encoding is variable-length: each code point is represented by one to four 8-bit ''code units''. It was designed for backward compatibility with ASCII, so ASCII text is already valid UTF-8, and to avoid the complications of endianness and byte order marks that arise in UTF-16 and UTF-32.
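The variable-length scheme can be illustrated with a small hand-written encoder. This is a minimal sketch for illustration only, not a replacement for a library codec: the function name `utf8_encode` is hypothetical, and the result is compared against Python's built-in `str.encode` as a sanity check.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (illustrative sketch)."""
    # Surrogates (U+D800..U+DFFF) and values above U+10FFFF are not encodable.
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError(f"not a valid Unicode scalar value: {code_point:#x}")
    if code_point < 0x80:
        # 1 byte: 0xxxxxxx (identical to ASCII)
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# One sample character for each encoded length (1 to 4 bytes).
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    encoded = utf8_encode(ord(ch))
    # The hand-rolled result should match the built-in codec.
    assert encoded == ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex(' ')}")
```

The single-byte case is what gives ASCII compatibility: code points below U+0080 are stored as the same byte value ASCII uses. Because every code unit is a single byte, there is no byte order to choose, which is why no byte order mark is needed.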