owen

Dan asks:

what's up with it being HTML 4.01 Transitional//EN in the only template that comes in the zip? i was just perusing the code and that caught my eye because of the supposed concentration on new technologies and best practices... what gives?

Dan, talk about walking into a minefield. I will try to distill the hundred-plus messages it took us to reach a decision on this topic into a concise reply to your query.

In order to serve correct XHTML, the server must send not only correct markup but also a correct content type. XHTML is XML, so it is supposed to be served with an XML content type. If it is not, the browser will interpret the XHTML code as poorly formed HTML.

What really happens when you receive XHTML markup with a text/html content type (this is how WordPress serves pages, for the most part) is that your browser ignores all of the extra characters and invalid non-HTML markup that are part of the XHTML. That it does this is a blessing for you, because otherwise your beautiful but improperly served XHTML code would render like garbage. A byproduct of browsers having to deal with sloppy HTML code over the years is that they are used to taking garbage-like XHTML and making it look nice in spite of itself.

Serving XHTML with a text/html content type rather than a valid XML content type (application/xhtml+xml is the registered one) is allowed by the W3C spec, but it causes the browser to render the XHTML as if it’s HTML. So why not write it as HTML in the first place, which the browser is less likely to misinterpret when it arrives as text/html?
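To make the content-type question concrete, here is a minimal sketch of how a server could pick a type from the request's Accept header. The function name choose_content_type is hypothetical and not part of Habari. It assumes you only want to send application/xhtml+xml to clients that explicitly ask for it, and it ignores relative q-value ordering for simplicity.

```python
def choose_content_type(accept_header: str) -> str:
    """Return the XML type only if the client explicitly lists it.

    Hypothetical helper for illustration; not Habari's code.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # per HTTP, a missing q parameter means q=1
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        if media_type == "application/xhtml+xml" and q > 0:
            return "application/xhtml+xml"
    # Wildcards such as */* do not count as asking for XML.
    return "text/html"


print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
print(choose_content_type("text/html, */*"))
```

Even with negotiation like this, the markup you send under the XML type still has to parse as XML, which is where the real trouble starts.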

Serving with a correct content type is possible, but it requires that your markup be well-formed XML. There is so much user-provided content on a blog that would need to be filtered to make it well-formed that doing so is prohibitive. Note that a single misplaced tag or unencoded character or entity (remember, XML doesn’t have all of the entities that HTML does!) would cause your entire page, and possibly your entire site, not to render in the browser at all.
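You can see how unforgiving an XML parser is with a few lines of Python. This is only an illustration using the standard library's xml.etree.ElementTree, not what a browser runs internally. The helper name is_well_formed and the sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the string parses as XML, False on any parse error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


samples = {
    "escaped ampersand": "<p>Fish &amp; chips</p>",
    "bare ampersand": "<p>Fish & chips</p>",
    # &nbsp; is an HTML entity; plain XML does not define it.
    "HTML-only entity": "<p>Fish&nbsp;and chips</p>",
    "misnested tags": "<p><b>bold <i>both</b> italic</i></p>",
}

for label, markup in samples.items():
    print(label, "->", is_well_formed(markup))
```

Only the first sample parses. A browser handling application/xhtml+xml applies the same kind of strictness to the whole page, so one bad comment is enough to take the page down.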

After consulting with experts in markup and standards, who recommended that we not use XHTML because it’s mostly a broken standard, we decided that HTML works, it does what we want, and we can serve pages with it that validate. Rather than invalidly serve XHTML markup as text/html, we would prefer to serve content that is valid for the content type we specify. Rather than try to ensure that all themes, posts, and comments are somehow converted to valid XML before they are served, we would prefer to focus on a standard that is well-traveled and has a future with WHATWG and HTML5.

That said, it is entirely possible to create and serve XHTML pages from Habari with any content type you like, whether the result is valid, semi-valid, or invalid. However, there are no tools in Habari for validating your output as XML before it is sent, so if you screw it up, it’s on you. The only place you will continue to find HTML even if you change your public site’s content type is in Habari’s admin.

A proper blogging tool that outputs true XHTML would have an XML parser/validator and force you to add new content nodes to a DOM before output. It’s simply not practical to expect that concatenated strings will always result in valid XHTML, given the abundance of user-supplied content on a blog.
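As a rough sketch of what "add nodes to a DOM before output" means, the following builds a comment from user input with Python's standard xml.etree.ElementTree instead of concatenating strings. The function render_comment and the element names are hypothetical. Note the trade-off: user input is treated as text, so any markup a commenter types is escaped rather than rendered.

```python
import xml.etree.ElementTree as ET


def render_comment(author: str, body: str) -> str:
    """Build a comment as a node tree and serialize it.

    Hypothetical helper for illustration; user input is only ever
    assigned to .text, so the serializer escapes it.
    """
    comment = ET.Element("div", {"class": "comment"})
    cite = ET.SubElement(comment, "cite")
    cite.text = author
    para = ET.SubElement(comment, "p")
    para.text = body
    return ET.tostring(comment, encoding="unicode")


# Input that would break a string-concatenated XHTML page.
output = render_comment("Dan & co", "Is 1 < 2? <b>unclosed")
print(output)

# The serialized result parses cleanly, whatever the user typed.
ET.fromstring(output)
```

Because the tree is serialized by the library, the output is well-formed by construction. Doing this for every theme, post, and comment is exactly the overhead described above.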