[The following are highlights of comments I made in response to Anil Dash's proposal to use XHTML as a syndication format instead of RSS. I've lightly edited the text for relevancy.]
A set of tags already exists: they're called RSS. The point of XHTML (or one of them at least) is that it can leverage the full range of XML toolkits and specifications. This includes XML Namespaces, which allow tags from other schemas to be included, thereby extending the original beyond its designed purpose. Some have already been experimenting with combining RSS and XHTML tags in their pages here. (Do a view source to see what I mean.)
What Anil proposes (H3 with this class...) was done back before we had XML -- it's called screen scraping. It is a more refined form of screen scraping, but screen scraping nonetheless. It all seems rather retro to me.
I don't mean to sound harsh. I really don't. It's an interesting notion that has its merits. I can understand the argument that ideally content authors shouldn't have to create two versions. Still, I think that in practice its limitations will outweigh its benefits.
A separate syndication file should be more bandwidth efficient, especially with aggregators and the like banging away on it frequently. Aggregators have recently improved from their early days of brute force updates -- downloading a feed on some interval regardless of changes. RSS is more about data (that just so happens to be about content), whereas XHTML is more about display. Combining the two is fine, but inefficient in that the information necessary for one task must be ignored when the document is used for the other.
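To make the "regardless of changes" point concrete, here is a minimal sketch of how an aggregator might avoid re-downloading an unchanged feed. I'm assuming HTTP conditional GET (the ETag and Last-Modified validators) as the mechanism; the function names and the `cached` dictionary are hypothetical, not taken from any particular aggregator.

```python
import urllib.request
import urllib.error


def conditional_headers(cached):
    """Build request headers from validators saved on the previous fetch.

    `cached` is a hypothetical dict such as
    {"etag": '"abc"', "last_modified": "Sat, 01 Jan 2000 00:00:00 GMT"}.
    """
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def fetch_if_changed(url, cached):
    """Return the new feed body as bytes, or None if the server says 304."""
    request = urllib.request.Request(url, headers=conditional_headers(cached))
    try:
        with urllib.request.urlopen(request) as response:
            # Remember the validators so the next poll can be conditional.
            cached["etag"] = response.headers.get("ETag")
            cached["last_modified"] = response.headers.get("Last-Modified")
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: nothing new to download
            return None
        raise
```

A brute force aggregator would call the equivalent of `fetch_if_changed(url, {})` every time; one that keeps `cached` between polls only pays for the body when the feed has actually changed.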
I completely object to the claim that RSS exists out of laziness, as Anil says. If a content author is "too lazy" to generate two versions of their content, I'd suggest that they author their content in RSS. You can convert RSS into XHTML more easily, efficiently, and reliably than the other way around. RSS is for machine processing while XHTML is designed for display. In fact I could be really lazy as a content author and have multiple XHTML pages generated from one RSS file.
I think of RSS files as more of a Web service than a web page. That may help provide a different perspective.
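As a rough illustration of generating XHTML from RSS, here is a small sketch using only Python's standard library. The sample feed, the `rss_to_xhtml` name, and the choice of `div`/`h2`/`h3`/`p` markup are my own assumptions for the example; a real site would pick its own template.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# A made-up RSS 0.91-style feed, for illustration only.
SAMPLE_RSS = """<rss version="0.91">
  <channel>
    <title>Example Channel</title>
    <link>http://example.com/</link>
    <description>A made-up channel.</description>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description>Hello &amp; welcome.</description>
    </item>
  </channel>
</rss>"""


def rss_to_xhtml(rss_text):
    """Render a channel and its items as an XHTML fragment."""
    channel = ET.fromstring(rss_text).find("channel")
    lines = ['<div class="channel">',
             "  <h2>%s</h2>" % escape(channel.findtext("title", ""))]
    for item in channel.findall("item"):
        # Escape text and quote the attribute so the output stays well-formed.
        title = escape(item.findtext("title", ""))
        link = quoteattr(item.findtext("link", ""))
        desc = escape(item.findtext("description", ""))
        lines.append("  <h3><a href=%s>%s</a></h3>" % (link, title))
        lines.append("  <p>%s</p>" % desc)
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(rss_to_xhtml(SAMPLE_RSS))
```

Swapping the template inside `rss_to_xhtml` is all it would take to produce a second or third XHTML page from the same feed, which is the "really lazy" workflow described above.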
In responding to my comments Scott Andrew LePera writes:
Properly-structured XHTML is far more robust than RSS for providing syntactic structure for a Web document, and is just as machine readable. The fact that an <EM> is rendered as italicized text in a browser is completely incidental.
The more I learn about these issues, the more I become convinced that it's wrong to ask authors to jump through additional hoops to support formats for alternate endpoints like RSS newsreaders. At the end of the day, I'm paying the RSS tax through additional bandwidth and ensuring that what I put in my XHTML won't break my RSS (like matching character encodings and avoiding relative links).
Syntactic structure of a Web document is not the intended purpose of RSS. RSS was designed to syndicate a collection of online resources called a channel. Admittedly the RSS format and its documentation have been a disaster and are lacking, so let me clarify: RSS files were never intended to contain or transport HTML. If you go back to the version 0.91 format documents, you will see that
I'm all for experiments such as Anil's proposal, though I'm personally skeptical that it will be more successful than (proper) RSS and XHTML working together.
[UPDATE: Related to this growing discussion, Mark Pilgrim writes in "The rebellion will be syndicated": Tantek Çelik: XHTML vs. the world. Bet on the world. Another good point.]
