Declaring Your Page's Character Encoding
Once you've decided which character encoding you're going to use, you should declare that encoding at the beginning of your Web page.
To declare your page's character encoding:
At the top of the head section of your page, type <meta http-equiv="content-type" content="text/html; (including the dash).
Then type charset=code", where code is the name of the encoding with which you saved your page (for example, charset=utf-8").
Type /> to complete the meta tag.
Figure 21.6. In the head section of your Web page, create a meta tag that describes the encoding you used to save the file.
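Put together, the three steps yield a single tag. Here is a minimal sketch of where it sits, assuming the page was saved as UTF-8. The doctype, title, and body text are placeholders for illustration, not part of the steps above.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <!-- The encoding declaration comes first, before any text the browser must decode -->
  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
  <title>Sample page (placeholder title)</title>
</head>
<body>
  <!-- Placeholder text with accented characters, to check that they display correctly -->
  <p>Café, naïve, señor</p>
</body>
</html>
```

If the file was saved in a different encoding, say windows-1252, the value after charset= should change to match.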
Which encoding should you choose? My first choice would be UTF-8. It's more flexible than regional encodings, which are my second choice.
The encoding you declare must match the encoding with which your page was saved. Otherwise, characters that differ between the encodings will display incorrectly.
If you don't explicitly choose an encoding when saving your files, your text editor probably uses the default encoding for your system. You must still declare that encoding using the meta tag as described above. On Windows in the U.S. and Western Europe, the default encoding is windows-1252. On Macintosh in the U.S. and Western Europe, it's x-mac-roman.
If you don't specify your page's encoding, the browser (and search engines) will guess, based on the visitor's preferences, information from the server (see next tip), the charset attribute (see last tip), or an examination of the document itself. You have a better chance of the browser getting it right if you just make its life easy and tell it.
Apache server users can add a line to their .htaccess file to declare the encoding of all files with a particular extension. The line should look like AddType 'text/html;charset=code' .html, where code is the character encoding (utf-8, for example). The .htaccess file overrides what you set with the meta tag. If you're having trouble, it may be that the server administrator has already adjusted the .htaccess file. For more details, see http://www.w3.org/International/questions/qa-htaccess-charset (including both dashes).
To write a page in a different language from that of your operating system, you may also need keyboard layouts (or perhaps an IME, input method editor) that let you input the characters, and a text editor that supports the desired languages and that can save the page in the proper encoding. If the text editor can at least save the file in Unicode, you can use IE for Windows to convert the file to any of the encodings that it recognizes (by opening the file and choosing File > Save As).
IE 6 for Windows, when browsing a page with an encoding it's not set up for, will automatically ask the visitor if they'd like to download the appropriate resources. This is an important reason to declare the proper encoding.
You can theoretically add the charset attribute to a link or script to describe the associated files' encodings. However, this feature is not yet widely supported.
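As a rough sketch of what that would look like, assume a linked page saved as shift_jis and an external script saved as utf-8. Both file names are made up for illustration, and given the limited support, browsers may simply ignore the attribute.

```html
<!-- Hypothetical link to a page saved in a different encoding from this one -->
<a href="page_ja.html" charset="shift_jis">Japanese version</a>

<!-- Hypothetical external script whose file was saved as UTF-8 -->
<script type="text/javascript" src="menu.js" charset="utf-8"></script>
```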
Figure 21.7. When you tell the browser what encoding to expect, the characters display properly, as long as the browser supports that encoding and the visitor's system has an appropriate font.
Figure 21.8. If you don't tell the browser what to expect, it makes an attempt, but often doesn't know how to display the characters properly.