
Character encoding

What is character encoding?

A character encoding tells the browser how the stored values of a document map to characters, so that it can interpret them correctly. Put simply, it is the lookup table the browser uses to read the text (example: in an "alpha" encoding, 1 = A and 2 = B; in a "beta" encoding, 10 = A and 2 = B).
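As a rough illustration of the idea above, the sketch below decodes the same byte with two real encodings, Latin-1 and Windows-1251. The choice of byte and of these two encodings is an assumption made for the example; the point is only that one stored value can stand for different characters depending on the table used.

```python
# The same stored byte, read with two different lookup tables.
raw = b"\xe9"

as_latin1 = raw.decode("latin-1")  # Western European table
as_cp1251 = raw.decode("cp1251")   # Cyrillic table

print(as_latin1)  # é
print(as_cp1251)  # й
```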

Why is this important?

Character encoding is important to prevent the browser from displaying incorrect characters. This can happen, for example, when a page uses characters specific to a language (such as Chinese characters) that the declared encoding does not handle.
For example, "My source code" translated into Japanese, 私のソースコード, displays correctly if the encoding is correctly specified. If the encoding is incorrect, the same sentence may show garbled characters (私a!!ソ§--スコ#€).
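The effect described above can be reproduced with a short sketch. It assumes the page was saved as UTF-8 and then read as Latin-1; Latin-1 is picked only because it accepts any byte, so the mistake shows up as garbled text rather than as an error.

```python
text = "私のソースコード"

# What is actually stored on disk when the page is saved as UTF-8.
raw = text.encode("utf-8")

# Reading the bytes with the right encoding gives the original sentence back.
correct = raw.decode("utf-8")

# Reading the same bytes with the wrong encoding yields one wrong
# character per byte instead of one Japanese character per three bytes.
garbled = raw.decode("latin-1")

print(correct == text)          # True
print(len(text), len(garbled))  # 8 24
```

The damage is reversible here only because no bytes were lost: re-encoding the garbled string as Latin-1 and decoding it as UTF-8 restores the sentence.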

How to correct it

Every document that contains text (.txt, .html, or any other plain-text file) is saved using a specific character encoding. This is the actual encoding of the document.

It is recommended to use the "utf-8" encoding for your pages, since it can encode every Unicode character and therefore covers virtually all languages.
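A small sketch of why UTF-8 copes with so many scripts: it uses between one and four bytes per character. The sample characters below are chosen arbitrarily for illustration.

```python
# One sample character per UTF-8 length (1 to 4 bytes).
samples = ["A", "\u00e9", "私", "😀"]  # "\u00e9" is the letter é

byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, size in byte_lengths.items():
    print(ch, size)
```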

To check which encoding your HTML pages use, look in the configuration settings of your text editor (e.g. Atom, Sublime Text).
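If you prefer to check from a script rather than an editor, one possible approach is to try decoding the file's bytes as UTF-8. The helper name `looks_like_utf8` is hypothetical, and the check only tells you whether the bytes are valid UTF-8; it cannot identify which other encoding was used when they are not.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8 without errors."""
    try:
        data.decode("utf-8")  # strict decoding: raises on invalid sequences
    except UnicodeDecodeError:
        return False
    return True


# Simulated file contents, saved with two different encodings.
saved_as_utf8 = "私のソースコード".encode("utf-8")
saved_as_shift_jis = "私のソースコード".encode("shift_jis")

print(looks_like_utf8(saved_as_utf8))       # True
print(looks_like_utf8(saved_as_shift_jis))  # False
```

For a real file, the bytes could be read with `open(path, "rb").read()` before calling the helper.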

Don't forget to declare your character encoding to the browser. You can declare it:

  • in the HTTP header dedicated to the document type: "Content-Type: text/html; charset=utf-8"
  • in an HTML tag: <meta charset="utf-8">
  • in an HTML tag: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  • in several of these places at once, provided the value is the same everywhere and matches the actual encoding of the document

Finally, the encoding must be declared in at least one of these ways.
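The two rules above (at least one declaration, and the same value everywhere) can be checked with a minimal sketch built on Python's standard library. The names `charset_from_header`, `MetaCharsetParser`, `charsets_from_html` and `declarations_agree` are hypothetical helpers written for this example, and the parsing is deliberately simpler than what a real browser does.

```python
from email.message import Message
from html.parser import HTMLParser


def charset_from_header(content_type):
    """Return the charset of a Content-Type value, or None if absent."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, e.g. "utf-8"


class MetaCharsetParser(HTMLParser):
    """Collect the charsets declared by <meta> tags."""

    def __init__(self):
        super().__init__()
        self.charsets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if attrs.get("charset"):
            # <meta charset="utf-8">
            self.charsets.append(attrs["charset"].lower())
        elif (attrs.get("http-equiv") or "").lower() == "content-type":
            # <meta http-equiv="Content-Type" content="...; charset=...">
            found = charset_from_header(attrs.get("content"))
            if found:
                self.charsets.append(found)


def charsets_from_html(html):
    parser = MetaCharsetParser()
    parser.feed(html)
    return parser.charsets


def declarations_agree(header_value, html):
    """True if at least one charset is declared and all of them match."""
    declared = charsets_from_html(html)
    from_header = charset_from_header(header_value)
    if from_header:
        declared.append(from_header)
    return len(declared) >= 1 and len(set(declared)) == 1


page = (
    '<meta charset="utf-8">'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'
)
print(declarations_agree("text/html; charset=utf-8", page))       # True
print(declarations_agree("text/html; charset=iso-8859-1", page))  # False
```

This only compares the declarations with each other; confirming that they also match the file's actual encoding is a separate check.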