Fixing Dreamweaver Encoding

14 comments | Posted: 22 November 05 in Tutorials, by Nathan Smith

UTF-8 vs. ISO 8859-1:

Update: For a great example of foreign characters being utilized with UTF-8, go here: HanziSmatter.com. If you view source, you will notice that these characters are simply there in their raw form, and are not inline images.

If you are a standards-compliant, accessibility-minded web developer, then one of your main priorities is making sure that your pages work with the widest variety of browsers and operating systems. You care about this because you want to communicate to the widest possible audience.

If you are anything like me, you own a variety of Macromedia software products, including Studio MX or version 8. Something that has always bothered me about Dreamweaver, and one of the reasons I rarely use it for hand-coding, is that it is very Western-centric, and particularly American-centric.

What I am referring to of course is the encoding that is selected by default. When you fire up Dreamweaver, and select a new XHTML page, it starts you out with a default document template, including this line of code for content-type in the head:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

In short, technically that line should read like this:

<meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8" />

For purposes of this article, let’s lay aside the argument of text/html versus application/xhtml+xml for usage in XHTML documents. There is enough argument on either side that I don’t want to get into that now. I will simply attempt to explain why changing the charset in that line is necessary, and persuade you to do it. First of all, if you go to Wikipedia and read their article on ISO 8859-1, you will see:

In June 2004, the ISO/IEC working group responsible for maintaining eight-bit coded character sets disbanded and ceased all maintenance of ISO 8859, including ISO 8859-1, in order to concentrate on the Universal Character Set and Unicode.

Now, I don’t know about you, but if an auto manufacturer did a recall on a car I drive, I would quit driving that thing and get the newer model. This is analogous to what has been done with ISO 8859-1, in favor of Unicode. The most common and universally compatible form of Unicode is UTF-8, which stands for: 8-bit Unicode Transformation Format.

Without going overboard with geeky details, suffice it to say that Unicode is the preferred standard by which to encode accessible websites. In fact, if you fail to specify a Content-Type in your web pages, the W3C validator will use UTF-8 as the fallback, not ISO 8859-1. Again, quoting Wikipedia:

In computing applications, encodings that provide full UCS support (such as UTF-8 and UTF-16) are finding increasing favor over encodings based on ISO 8859-1.
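To make the difference a bit more concrete, here is a rough sketch in Python (my choice for illustration, nothing to do with Dreamweaver itself). It assumes Python 3 and a hypothetical helper named byte_length, and it only compares how many bytes the same text takes under each encoding. Plain English text comes out identical either way, which is why the switch is painless for most pages.

```python
# Illustration only: compare how the same text is stored under each encoding.
# "caf\u00e9" is "cafe" with an acute accent on the e, written as an escape.
samples = ["Dreamweaver", "caf\u00e9", "漢字"]


def byte_length(text, charset):
    """Return the encoded length in bytes, or None if the charset
    has no way to represent the text at all."""
    try:
        return len(text.encode(charset))
    except UnicodeEncodeError:
        return None


# Map each sample to a pair: (bytes in ISO 8859-1, bytes in UTF-8).
report = {
    text: (byte_length(text, "iso-8859-1"), byte_length(text, "utf-8"))
    for text in samples
}

# report["Dreamweaver"] is (11, 11): plain ASCII is byte-for-byte the same.
# report["caf\u00e9"]   is (4, 5):   the accented letter takes two bytes in UTF-8.
# report["漢字"]         is (None, 6): ISO 8859-1 cannot represent these at all.
```

The None in the last pair is the whole argument in miniature: an eight-bit character set simply has no slot for most of the world's writing systems.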

Big Deal?

I know you’re probably thinking, “Well, my pages display correctly, so who cares?” To some extent, that’s true. If your target audience is located in the United States, and you don’t do much business outside of that country, then you’re probably fine. However, one of the bigger drawbacks cited with ISO 8859-1 is the lack of the Euro character: €. Imagine trying to do an e-commerce site in the UK without that character. That would be like having all prices lack the dollar sign $ in the United States.
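If you want to see the missing Euro character for yourself, the following Python sketch checks it directly. It assumes Python 3, and the supports function is a made-up name for this example, not part of any library.

```python
EURO = "\u20ac"  # the euro sign, written as an escape so the file encoding does not matter


def supports(charset, character):
    """Return True if the given charset can encode the character."""
    try:
        character.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


euro_in_latin1 = supports("iso-8859-1", EURO)  # False: no code point for it
euro_in_utf8 = supports("utf-8", EURO)         # True
euro_bytes = EURO.encode("utf-8")              # b'\xe2\x82\xac', three bytes
```

In a page declared as ISO 8859-1, the usual workaround is a character reference such as &euro;, which works but is one more thing to remember.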

This is one of many examples, in which countries and languages are alienated by our (lack of) choices made regarding content-type encoding. I think that this sort of explains my feelings on foreign web developers as opposed to American ones. We as US citizens tend to be comfortable and self-satisfied.

To us, America = the world. I would use this analogy: We have the baseball World Series but we really only include one other country: Canada. Whereas, other countries tend to be a bit more adaptive, and realize the need for more open communication. Think of how many countries love soccer!

So, if you take away anything from this article, let it be this: You need to start encoding your pages in an accessible manner. It’s called Unicode for a good reason: “Universal Code.” UTF-8 is more accessible than the outdated ISO 8859-1. So, next time you decide to fire up Dreamweaver to make a web page, be sure to change that one line of code in your page’s head as a common courtesy to our world-neighbors. It will also help you look less like a WYSIWYG newbie.

Note: For more in-depth info, go here:

Discuss This Topic

  1. 1 Yannick

    Thanks for this writeup Nathan. To be honest I had no idea about this. I usually just load up my template and go. Next time I’ll be sure to pay attention to this and fix it too.

     
  2. 2 Nathan Smith

    Actually, your page is being served as UTF-8 already. But yeah, it’s something to watch out for if you use Dreamweaver on a semi-regular basis. I just use Araneae and have my content-type set up in the default page template.

     
  3. 3 Yannick

    Yeah well I think that was Textpattern’s doing, however I’m not sure if that is the case for the websites I have done at work. I will have to check that out.

     
  4. 4 Ray

    Very nice article, Nathan. I knew to use UTF-8 (don’t shoot me if my personal website doesn’t right now, I haven’t gone back to look) and all pages I create these days use UTF-8, but that really explains to me why it’s necessary. You’ve got me really curious about the application/xhtml+xml part, though.

     
  5. 5 Nathan Smith

    Ray: There’s this ongoing argument as to whether XHTML should be served as application/xhtml+xml or as the older text/html. Technically, it’s supposed to be served as the newer format, but cruddy browsers like Internet Explorer 6 don’t officially support it, and rumor has it neither will IE7. For the most part, it’s a non-issue, as the browser defaults back to what it does know, text/html.

    The way to get around this is to use XHTML, I’m assuming for its code-strictness, including lowercase and closed tags, but still serve it as text/html. This is what Stopdesign.com does. The other alternative is to forget about XHTML altogether, and simply go with old-school HTML until Internet Explorer catches up, while still adhering to the modern standards of XHTML, which is what sites like 456BereaStreet.com do.

    Me, I guess I like to live dangerously, because I prefer not to let an obsolete company hold us back with their out-dated browser. So, I’m gambling with XHTML 1.1 served as application/xhtml+xml. Who knows, maybe Microsoft will wise up, realize they’re no good at making browsers, and just slap their logo on the Gecko engine. I mean, hey – It worked for Flock, didn’t it?

     
  6. 6 Mark Priestap

    Thanks for the info Nathan. I’ve always wondered what that line of gobbledy-gook meant (too lazy to go find out).

     
  7. 7 Ray

    Heh. Gotta love that tooltip you used for IE7. :)

     
  8. 8 Tilde

    Actually, your site is being served as text/html, Nathan—see Firefox’s “View Page Info”.

    You can’t change your page’s MIME type using the meta element. The meta element is only supposed to be an echo, so to speak, of how you’re intending to serve it. You’ll need to change your server settings to actually change the MIME type.

    (Living dangerously, always a fun approach! :D )

    —~

     
  9. 9 Nathan Smith

    Tilde: Good point. Perhaps I should roll it back to text/html. I’ll have to look into that a bit more, and figure out whether it’s worth changing server settings, and having to tweak back and forth based on newer or older browsers. ;)

    Ray: Thanks, wasn't sure if anyone would pick up on that tool-tip.

     
  10. 10 Per Liedman

    I agree completely with the idea behind this article, but let me just correct you on one point. You say “however, one of the bigger drawbacks cited with ISO 8859-1 is the lack of the Euro character: €. Imagine trying to do an e-commerce site in the UK without that character.”

    Well, I can imagine doing that… The UK has not converted to the Euro, and still uses British pounds as its currency (decimal code 163 in ISO-8859-1). So the example is a bit flawed, unfortunately :)

    Anyway, a great article otherwise!

     
  11. 11 Chris

    I’m not sure if you’re aware, but in the preferences of Dreamweaver, there’s a New Document setting where you can change your default page encoding for every new page.

    Mine’s set to UTF-8. :-D

    Cheers,
    ~Chris

     
  12. 12 Theophan

    Regarding the referenced HanziSmatter.com site, I don’t see the characters you’re talking about, either in IE6 or Firefox 1.0.7. In Firefox I see question marks, and in IE6 I see empty boxes. In both browsers it seems to have identified utf-8 as the encoding.

    I obviously don’t understand this issue, but what good did the utf-8 encoding do, when I don’t see the result?

     
  13. 13 Tilde

    “I obviously don’t understand this issue, but what good did the utf-8 encoding do, when I don’t see the result?”

    You can’t see the characters unless you have some appropriate font installed (many free ones exist).

    This has nothing to do with ISO-8859-1 vs. UTF-8, mind you. If you don’t have a good range of fonts to choose from, your browser won’t be able to do anything with the characters it runs into.

    So the only way they could achieve maximum compatibility is to use images for the characters, which is the way it’s been done throughout the last decade. This is the future, though, and that just isn’t a desirable solution anymore. We’re at a point in the Internet’s history when it’s safe to use real characters and expect that people will have at least one wide-range font installed.

     
  14. 14 Nathan Smith

    Good point guys, I didn’t realize that I was one of the few actually seeing the characters. I’m not sure why though, as I don’t remember installing extra language support. I do have quite a few fonts installed though, for doing graphics / logos. Still, I like that site all the same, being half Japanese. :)

     

Comments closed after 2 weeks.