Divya Manian


The Long Road from XHTML to HTML

XHTML - A New Hope!

I started developing websites in 2003. At that time, XHTML was the flavour of the season (or rather the official W3C recommendation to move forward from HTML 4.x) and Deepak introduced me to the world of web design with XHTML and CSS. It was even written about favourably by the Godfather of Web Design. The reasons that were given to adopt XHTML were these:

  1. XHTML has a smaller set of rules (since it is based on XML).
  2. Using XHTML will make your document well formed (not the tag soup HTML allows you to write).
  3. XHTML will be future compatible.
  4. XHTML lets you integrate content from other markup languages (like SVG and MathML) using XML namespaces.
  5. You can use standard XML tools to work with your XHTML documents.

This was in 2000-2003. With the gift of hindsight we now know that XML was not the future. The biggest disadvantage of XHTML is its error handling: when a user agent (e.g. the browser) comes across an unescaped ampersand (&) in a URL, or a mismatched character encoding in a trackback message, the entire page fails. As a result, most of today’s public web applications can’t safely be incorporated in a true XHTML page.
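
To make that failure mode concrete, here is a minimal Python sketch (not part of the original argument). It feeds one fragment with an unescaped ampersand in a URL to a strict XML parser and to a lenient HTML parser. The fragment, `LinkCollector` and the two helper functions are made up for illustration, and Python's standard-library parsers are only stand-ins for what a browser does with application/xhtml+xml and text/html.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical fragment: the "&" in the query string should have been "&amp;".
FRAGMENT = '<p><a href="search?q=xhtml&page=2">next</a></p>'


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # One well-formedness error rejects the whole document.
        return False


class LinkCollector(HTMLParser):
    """Lenient HTML parser that records the href of every <a> it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if tag == "a" and name == "href":
                self.hrefs.append(value)


def hrefs_as_html(markup):
    """Parse the markup leniently and return the collected link targets."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.hrefs
```

With these definitions, `parses_as_xml(FRAGMENT)` is false, while `hrefs_as_html(FRAGMENT)` still recovers the link. Escaping the ampersand as `&amp;` is enough to make the XML parser accept the fragment again.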

Internet Explorer has never supported XHTML (not even IE 8), making all arguments about standards compliance go down the drain. XHTML is not compliant HTML when it is served with the content type text/html (read more on why sending XHTML as text/html is harmful). Moreover, XHTML 2 is not compatible with XHTML 1.1, which throws the future compatibility argument out of the window. Others had doubts over XHTML, and XHTML 2.0 in particular (1, 2, 3, 4, 5), which paved the way for HTML 5.

HTML Strikes Back!

HTML 5 takes an evolutionary path with a focus on backwards compatibility along with some innovations. HTML 5 aims to replace XHTML 1.0 and HTML 4.0. HTML 5 and XHTML 2 are opposites in a way, as support for HTML 5 does not imply support for XHTML 2 (browsers need to support these separately).

HTML 5 defines 2 syntaxes: an HTML syntax, which is compatible with HTML 4 and XHTML 1 documents served as text/html, and an XML syntax, which needs to be served as application/xhtml+xml. Even better, the HTML syntax of HTML 5 allows MathML and SVG elements to be used inside an HTML document (reducing the need to use XML at all).

With HTML 5, you can continue using XHTML closing for empty elements (e.g. <br/>) and it will still be acceptable in HTML syntax.
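
As a rough illustration of why the XHTML-style closing is a safe habit to keep, the Python sketch below parses an empty element written both ways. The two sample strings, `TagCollector` and the helper functions are hypothetical names for this example only. Python's `html.parser` is a lenient parser, not a conforming HTML 5 parser, so treat it as an analogy for the HTML syntax rather than a reference implementation.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

XHTML_STYLE = "<p>one<br/>two</p>"  # empty element closed with a slash
HTML_STYLE = "<p>one<br>two</p>"    # HTML 4 style, no closing slash


class TagCollector(HTMLParser):
    """Lenient HTML parser that records the name of every start tag."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # The default handle_startendtag also routes "<br/>" through here.
        self.tags.append(tag)


def start_tags(markup):
    """Return the start tags a lenient HTML parser sees in the markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False
```

The lenient parser reports the same `p` and `br` start tags for both spellings, but only `XHTML_STYLE` is well-formed XML. Markup written with the closing slash can therefore be read under either syntax, while the bare `<br>` works only in the HTML syntax.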

HTML 5 defines different rules for user agents and authors. The user agents are required to be backwards compatible, but the authors are required not to use “deprecated” elements for HTML 5 conformance.

Safari, Opera and Firefox have openly declared support for HTML 5 (and Internet Explorer is on the team drafting the HTML 5 standard). XHTML 2 does not have any browser support, and Safari has explicitly opted out of drafting XHTML 2.

Mike Malone has written a great review of XHTML 2 vs. HTML 5 and it seems like HTML 5 will win this war.

Personally, I gave up on XHTML when I learnt to be more forgiving (and less of a pedant) about HTML documents (it is not their fault!). This blog, though, is ironically XHTML 1.0 Strict.

What should you do?

If you are new to web design or working on large portals, my suggestion would be to use HTML 4 rather than XHTML. You can code well-formed HTML with HTML 4 and you don’t need XHTML to whip you into shape to do that. Better still, use HTML 5 - without using the newly introduced elements (as they are not supported universally).

Read More

UPDATE: Dave Shea talks about why he is switching to HTML 4.01