Nimbupani Designs

The Long Road from XHTML to HTML

XHTML - A New Hope!

I started developing websites in 2003. At that time, XHTML was the flavour of the season (or rather the official W3C recommendation to move forward from HTML 4.x) and Deepak introduced me to the world of web design with XHTML and CSS. It was even written about favourably by the Godfather of Web Design. The reasons that were given to adopt XHTML were these:

  1. XHTML has a smaller set of rules (since it is based on XML).
  2. Using XHTML will make your document well formed (not the tag soup HTML allows you to write).
  3. XHTML will be future compatible.
  4. XHTML lets you integrate content from other markup languages (like SVG and MathML) using XML namespaces.
  5. You can use standard XML tools to work with your XHTML documents.

This was in 2000-2003. With the benefit of hindsight, we now know that XML was not the future. The biggest disadvantage of XHTML is its error handling: when a user agent (e.g. the browser) comes across an unescaped ampersand (&) in a URL or a mismatched character encoding in a trackback message, the entire page fails to render, so most of today’s public web applications can’t safely be incorporated in a true XHTML page.
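To make that failure mode concrete, here is a minimal Python sketch using only the standard library. The markup string and the helper names are made up for illustration, and Python's html.parser is only a stand-in for a browser's forgiving HTML parser, not an HTML 5 conformant implementation.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical page fragment with an unescaped ampersand in a URL.
MARKUP = '<p><a href="search?q=xhtml&page=2">next</a></p>'


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # An XML parser must stop at the first well-formedness error.
        return False


class LinkCollector(HTMLParser):
    """Forgiving HTML parser that records every href it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href":
                self.hrefs.append(value)


def hrefs_as_html(markup):
    """Parse the markup leniently and return the collected hrefs."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.hrefs


if __name__ == "__main__":
    print("accepted as XML:", parses_as_xml(MARKUP))
    print("links found as HTML:", hrefs_as_html(MARKUP))
```

Escaping the ampersand as &amp;amp; makes the same fragment acceptable to the XML parser, which is exactly the kind of discipline every piece of third-party content on a true XHTML page would have to follow.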

Internet Explorer has never supported XHTML (not even IE 8), which sends all arguments about standards compliance down the drain. XHTML served with the content type text/html is not treated as XHTML at all; browsers parse it as HTML (read more on why sending XHTML as text/html is harmful). Moreover, XHTML 2 is not compatible with XHTML 1.1, which throws the future compatibility argument out of the window. Others had doubts over XHTML, and XHTML 2.0 in particular (1, 2, 3, 4, 5), which paved the way for HTML 5.

HTML Strikes Back!

HTML 5 takes an evolutionary path with focus on backwards compatibility along with some innovations. HTML 5 aims to replace XHTML 1.0 and HTML 4.0. HTML 5 and XHTML 2 are opposites in a way, as support for HTML 5 does not imply support for XHTML 2 (browsers need to support these separately).

HTML 5 defines two syntaxes: an HTML syntax, which is compatible with HTML 4 and with XHTML 1 documents served as text/html, and an XML syntax, which needs to be served as application/xhtml+xml. Even better, the HTML syntax of HTML 5 allows MathML and SVG elements to be used inside an HTML document (reducing the need to use XML at all).

With HTML 5, you can continue using the XHTML-style self-closing syntax for empty elements (e.g. <br/>) and it will still be acceptable in HTML syntax.
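As a rough illustration of why the self-closing form is the safe choice for both syntaxes, the Python sketch below feeds the same paragraph, written once with <br/> and once with <br>, to a lenient HTML parser and to a strict XML parser. The sample strings and helper names are hypothetical, and the standard library's html.parser merely approximates how a browser treats text/html; it does not implement the HTML 5 parsing algorithm.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records the name of each element the lenient HTML parser opens."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # For <br/>, HTMLParser's default handle_startendtag calls this too.
        self.tags.append(tag)


def html_tags(markup):
    """Return the element names seen when parsing leniently as HTML."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.tags


def is_well_formed_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample fragments.
XHTML_STYLE = "<p>line one<br/>line two</p>"
HTML_STYLE = "<p>line one<br>line two</p>"

if __name__ == "__main__":
    for sample in (XHTML_STYLE, HTML_STYLE):
        print(sample)
        print("  tags seen as HTML:", html_tags(sample))
        print("  well-formed XML:  ", is_well_formed_xml(sample))
```

Under these assumptions the lenient parser sees the same elements in both versions, while only the <br/> version is well-formed XML.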

HTML 5 defines different rules for user agents and authors. The user agents are required to be backwards compatible, but the authors are required not to use “deprecated” elements for HTML 5 conformance.

Safari, Opera and Firefox have openly declared support for HTML 5 (and Internet Explorer is on the team drafting the HTML 5 standard). XHTML 2 does not have any browser support, and Safari has explicitly opted out of drafting XHTML 2.

Mike Malone has written a great review of XHTML 2 vs. HTML 5 and it seems like HTML 5 will win this war.

Personally, I gave up on XHTML when I learnt to be more forgiving (and less of a pedant) about HTML documents (it is not their fault!). Ironically, though, this blog is XHTML 1.0 Strict.

What should you do?

If you are new to web design or working on large portals, my suggestion would be to use HTML 4 rather than XHTML. You can code well-formed HTML with HTML 4 and you don’t need XHTML to whip you into shape to do that. Better still, use HTML 5 - without using the newly introduced elements (as they are not supported universally).

Read More

UPDATE: Dave Shea talks about why he is switching to HTML 4.01

Comments

Comments are closed for this post. Please message me on twitter, if you would like to comment.
 
Guest's picture

Nice read!

Just curious: an HTML document that contains self-closing void elements and has the HTML5 doctype can be served with either media type, text/html or application/xhtml+xml, right?

Which means a well-formed HTML5 document (with self-closing elements) will be HTML5 when sent as text/html and XHTML5 when sent as application/xhtml+xml, and both are valid pages (without the former being tag soup)?

 
divya's picture

The HTML 5 author reference states that such documents can be created and are called “Polyglot documents”, but they need to be created carefully and with a full understanding of what each parser is capable of.

Interesting question. I think HTML 5 is the first standard that makes it possible to do so.

 
Guest's picture

I just marked up my homepage in HTML5. Though the markup seems straightforward and semantic, I wonder if it will have the same adoption issues as XHTML.

 
divya's picture

I doubt it. HTML 5 (using HTML syntax) is backwards compatible, so at least it won’t throw a garbled error message (like XHTML does), but the new elements may not pick up the CSS.

 
Guest's picture

True. We might not see a Yellow Screen of Death with HTML5. But you need to add the html5.js hack to make sure older browsers display the newer elements…

 
divya's picture

Technically no browser supports HTML 5 elements yet; Firefox, Safari and Opera allow inserting and styling any custom elements, which is why HTML 5 elements display fine. In any case, we need that hack not to display the custom element but only to style it and its children.

 
Guest's picture

Jax, I have been using XHTML5 served as application/xhtml+xml for over a year now. It does return the Yellow Screen of Death in Firefox when there is a malformed tag.

 
Guest's picture

Interesting. It's about time I started serving my homepage as text/html instead of application/xhtml+xml; I was worried about it being tag soup (at least according to the HTML5 spec, not current browser implementations).

 
Guest's picture

I’m confused. You claim “the biggest disadvantage to using XHTML” is in how it handles incorrect quoting (entirely under the control of the server), yet in the very next paragraph you admit that “Internet Explorer has never supported XHTML” (entirely in the hands of random users). Does the latter seriously seem like a smaller issue to you?

In similar news, the biggest disadvantage to starting a restaurant that serves rocks for dinner is that I have a slight quality control issue with my quarry. Oh also people never eat rocks!

 
Guest's picture

I don’t know a single commercial client who has asked me to code up in HTML4.01 so far in the last 5 years and the trend of using XHTML of some sort is continuing.

Advising people to learn HTML4.01 and use only it is a bit weird since people would be cutting out their job perspectives proactively that way.

I don’t know what makes you proclaim that somehow XML is dead, when some of the major ISO formats are in XML (ODF, RSS, etc.)

XML is arguably what makes major aspects of the web work today.

 
divya's picture

Can you explain how they are cutting out their job prospects (is that what you mean)? The only differences in actually writing XHTML are the “/>”, closing all open tags and using lowercase for tag names. You can do all this in HTML 4.01 Strict as well.

Btw, what major aspects of the web work through XML? I don’t see any major application using XML (except for the feed readers). Neither websites nor web apps use XML for displaying their HTML pages, essentially because XML is not fail-safe, and that is the problem.

 
Guest's picture

divya wrote:

Technically no browser supports HTML 5 elements yet

Correction: Opera supports most of the new form input types (which aren’t new elements, they’re new attributes of the input element), canvas and (in experimental builds) video. Firefox 3.1 supports video, as (I believe) does Safari 4 beta. They all support canvas, as does Chrome.

Jason, no-one is saying XML is dead. HTML 5 is also serialised as XHTML 5 (think html and xhtml) which is true XML and will yellow-screen-of-death to your heart’s content ;-)

 
divya's picture

Oops, I should have known better! Thanks for the correction Bruce!

 
Guest's picture

All part of the service, ma’am

 
Guest's picture

Well, this blog WOULD be XHTML 1.0 if it didn’t have several validation errors.