XML: What You Need to Know (26-March-2000)
An interview with well-known XML wizard, Simon St. Laurent. Conducted via email by John S. Rhodes (21-March-2000)
Web Design and XML
How is XML related to usability? Or more generally, how is XML related to web site design?
Usability is a pretty broad topic, and I'd rather not get into a debate with Jakob Nielsen's fans, but I think XML may eventually blow the doors off usability as it's currently discussed. Basically, XML makes it possible for Web designers to send information in a form completely separate from the presentation of that information. This means that Web site designers can present the same information in several different forms, customized to the needs of particular readers. If readers find the 'Nielsen' approach most usable, they can select a style sheet that supports that view (maybe that would be the default). If they'd rather go with a funked-out view of the information, they can select that. This isn't limited to sheer appearance, either - with tools like Extensible Stylesheet Transformations (XSLT), you can reorganize information according to different needs.
At this point, however, we aren't there yet. I enjoy dreaming about this stuff, but there are a few more years to go in browser design, and in designers and programmers getting used to the idea of giving users these kinds of choices. I'm not sure that certain kinds of designers - those who demand absolute control over what the user sees - are ever going to be comfortable with this approach.
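The idea of one body of content with several reader-selected presentations can be sketched in a few lines. This is only an illustration in Python using the standard library's ElementTree module, not XSLT or a style sheet; the `catalog` vocabulary and the two view functions are hypothetical names invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical content document: it carries no presentation information.
CONTENT = """<catalog>
  <item><name>Widget</name><price>9.95</price></item>
  <item><name>Gadget</name><price>4.50</price></item>
</catalog>"""


def plain_view(root):
    """Default view: items in document order, one per line."""
    return [f"{item.findtext('name')}: {item.findtext('price')}"
            for item in root.findall("item")]


def bargain_view(root):
    """Reorganized view: the same items, cheapest first."""
    items = sorted(root.findall("item"),
                   key=lambda item: float(item.findtext("price")))
    return [f"{item.findtext('price')} for a {item.findtext('name')}"
            for item in items]


root = ET.fromstring(CONTENT)
for view in (plain_view, bargain_view):
    print(view.__name__)
    for line in view(root):
        print("  " + line)
```

Both views read the same unchanged document; only the code that presents it differs, which is roughly the role a selectable style sheet or transformation would play.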
CSS is an important ingredient in the process of separating form from function, but this separation can go much further with XML than it has with HTML. In a lot of ways, HTML's built-in style information (shared understandings about what headers, paragraphs, tables, etc. look like) has limited CSS. CSS1 was about supplementing HTML's built-in capabilities, and only with CSS2 did the W3C really start thinking about working from a clean slate and using CSS to describe all of the formatting necessary to display a document. XML itself doesn't bring any formatting to a Web browser. Certain XML vocabularies - like Scalable Vector Graphics (SVG), XSL Formatting Objects (XSL-FO), or even XHTML - describe information about presentation, but there's no assumption that an XML document has to carry presentation information. In fact, mixing content and presentation is discouraged for the most part. This means that developers can create their own document formats and apply CSS or XSL style sheets to the documents without any legacy information about formatting interfering with the results. CSS2, for instance, includes display properties that let you build lists, tables, block elements, and inline elements without ever having to use a UL, TABLE, DIV, or SPAN element. It also means that designers and programmers can get out of each other's way when building dynamically generated sites. The designers no longer need direct control over the structure of the generated output - they just need to know what it's going to look like so they can build a style sheet for it.
XML uses the same basic markup concepts - elements, attributes, and entities - as HTML, since they have a shared parent in SGML. HTML uses markup pretty loosely, and has more or less a single vocabulary (though one that's evolved a few times). XML uses an extremely strict set of rules for how to use markup, but lets you pick your own vocabulary. You can build rules describing that vocabulary, and even share it with other people, but XML gives you a lot more freedom as far as how you describe your information. Instead of marking up a price as <B><FONT COLOR="RED">, you can mark it up as a <PRICE>. You can use both HTML and XML to mark up documents - you can even combine them, using the stricter XML syntax rules and an HTML vocabulary. On the other hand, you can use XML to mark up pretty much any kind of data, though marked-up text isn't a great way to store multimedia files like graphics, video, and sound.
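To make the <PRICE> point concrete, here is a small sketch, again in Python with the standard ElementTree module. The ORDER, ITEM, NAME, and PRICE element names are assumptions made up for this example, not a published vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in a made-up vocabulary; the element names are
# illustrative assumptions, not a published standard.
ORDER = """<ORDER>
  <ITEM><NAME>Teapot</NAME><PRICE>12.00</PRICE></ITEM>
  <ITEM><NAME>Cups</NAME><PRICE>8.50</PRICE></ITEM>
</ORDER>"""

root = ET.fromstring(ORDER)

# Because prices are labeled as prices, finding them needs no guessing
# about which bold red text happens to be a number.
prices = [float(p.text) for p in root.iter("PRICE")]
total = sum(prices)
print(prices, total)
```

With presentational markup such as <B><FONT COLOR="RED">, a program would only know that some text is bold and red; here the label itself says what the text means.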
Sorry to disappoint you, but in the long term, I'm pretty much hoping, even expecting, that XML does replace HTML. It won't be this year or next, but I don't think HTML - today's HTML - is going to survive except as a lowest common denominator format. Some parts of HTML will undoubtedly continue to be important, since too many people 'speak' HTML to throw it away immediately, but that surviving vocabulary is going to be mixed with other vocabularies, most likely in an XML universe.
The W3C is pushing this pretty hard with XHTML. If you read their HTML Working Group Roadmap, it's pretty much an "HTML is dead! Long live XHTML!" kind of approach. XHTML 1.0 was just an XMLization of HTML 4.0, but 1.1 and beyond are much more drastic measures. Developers who've spent years learning the ins and outs of HTML creation, battled all the browser variations, and cursed for years about the difficulties inherent in this seemingly simple HTML are probably going to be irritated about having to retool. For the most part, it's going to mean learning CSS and cleaning up code.
Learning XML and CSS, along with the few bits of XLink needed to get started with hyperlinking, isn't hard, though, and the advantages are incredible. You can describe your information in terms that make sense to you or your organization. You can write those terms in nearly any other language. CSS is still English-biased, as is XSL, but the markup doesn't have to be. You can start doing markup with the expectation that others will be able to take your work and do more to it than read it on a screen.
The other exciting aspect to replacing HTML with XML is that agents might finally achieve reality. Instead of having to sort through piles of formatting information, they have some chance of finding the content they're looking for. It won't be easy, but at least the signs pointing the way might mean something.
It's going to be a while before we have browsers capable of supporting these visions. Mozilla is getting close, and Internet Explorer is getting better, but it's not going to be safe to ship XML to browsers on a regular basis for a few more years yet.
There are a few reasons. First of all, Java's object structures and XML's markup structures are a pretty good match. Java supports Unicode directly, which has helped it leapfrog other environments for developing XML-based applications. There's lots of free and open-source Java code available for working with XML, which also helps. On the political side, the relationship between XML and Java isn't clear. Sun really pushed for XML's early development, and talks loudly about XML, but their explicit support for XML parsing and processing in Java has remained pretty weak, with little sign of a clear direction. Microsoft, on the other hand, seems to see XML as a potential Java-killer. Once data becomes interoperable across platforms, why worry about cross-platform code? I love working with XML and Java, but the politics are disastrous and best avoided. The best news is that you can use plain old Web development tools to do a lot of real work with XML. As XML support reaches the browsers, the need for custom programming to get basic things done should diminish. You'll still have the full power of Java available - but only if you need it.
In a nutshell, what is XML? What is the real strength of XML?
XML describes a syntax for marking up text that makes it easy to describe complex structures. Those structures have to follow some strict rules (about nesting properly), but the rules are simple and the structures are incredibly flexible. The real strength of XML is that these labeled structures can make a great foundation for all kinds of processing, from styled presentation for human readers to collection by agents seeking best bargains to interchange between databases or even businesses. Perhaps its greatest strength from my perspective is that it provides a much better balance between human- and machine-readability than any of its predecessors. While XML files aren't always easy to read directly, they're usually a lot easier to work with than their binary equivalents.
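The strict nesting rules mentioned above are easy to see in practice. The following is a minimal sketch, assuming Python's standard ElementTree parser as the checker; the helper name `is_well_formed` is invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested: each element closes inside its parent.
print(is_well_formed("<a><b>text</b></a>"))
# Overlapping tags: <b> is still open when </a> arrives.
print(is_well_formed("<a><b>text</a></b>"))
# HTML-style unclosed paragraphs are not acceptable either.
print(is_well_formed("<p>one<p>two"))
```

An XML parser rejects the second and third inputs outright rather than guessing at what was meant, which is the main practical difference from the loose handling HTML markup usually gets.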
If you're a Web developer, I tend to recommend my own XML: A Primer, John Simpson's Just XML, or Elliotte Rusty Harold's XML Bible, though there are plenty of other good books. I'm not sure that a true "programmer's intro to XML" is out yet, though I know some are on the way. There's also XML: A Manager's Guide - I haven't read it, but I hear it does well for its target audience. As far as Web sites go, there's a lot to choose from. XML.com has its own content and links to other sites. The Web Developer's Virtual Library has some really good tutorial information. There's lots of good material out there, though putting it together is often hard for beginners. The one thing I don't recommend is trying to learn XML from the W3C spec. It can be done, but it's not easy.
XML was created by a group at the World Wide Web Consortium (W3C). The W3C is still home to the main thread of development for XML and its related standards, though they focus only on the infrastructure and applications of XML that solve narrowly-defined Web needs. For more on their activity, visit http://www.w3.org/XML. The W3C's recommendations and drafts - lots and lots of them - are available at http://www.w3.org/TR. Anyone can build their own vocabulary, however. This is already leading to a good deal of competition and chaos. While it may be frustrating to companies used to single-source solutions, I'm quite pleased to see a return to more open models. The most visible duel is between XML.org, a consortium backed by IBM, Sun, Oracle, and others, and BizTalk.org, a framework backed by Microsoft. I don't expect either of these ventures to control the market for XML - in fact, I hope they get outflanked by smaller and more independent ventures that focus on domain-specific needs. Right now, this means that it's hard to look in a single place for information, but in the long term it should bring us better vocabularies. There's certainly been an explosion of work - http://schema.net and http://www.oasis-open.org/cover provide plenty of evidence of that.
Will XML revolutionize business? In particular, how does XML relate to Business to Business (B2B) transactions?
XML revolutionizes business because it lets you use commodity components and existing Web infrastructures to transfer information. There's nothing XML can do that couldn't be done by other formats riding other protocols, but XML will let you do it cheaper. You never have to be locked into tools coming from a single vendor, and you can monitor and manage the flows using whatever tools you like. Effectively, this reduces the level of agreement that businesses need to reach in order to exchange information. XML can travel over email, over HTTP, over whatever. The processing involved is pretty generic - there are no magic tricks to be figured out. You can layer cryptography or workflow on top of XML without having to change the information inside the document. None of these layers are tightly bound. There's no special X-HTTP that Web servers need to support in order to transfer XML. If you and I have a basic understanding of the kind of information we're sending, we can use any communications style we like, as long as XML shows up at the right time and the right place. Transforming XML is easy enough that we aren't confined to using the exact same vocabulary and structures, an important issue when you may be talking with your competition. Implementing these systems still takes significant integration, and I wouldn't set up ordering systems on unsecured Web servers, for instance, but at least it's more about putting together different kinds of LEGO blocks rather than sculpting things out of gold.
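The point about not being confined to the exact same vocabulary can be illustrated with a small sketch. It uses plain tag renaming in Python's standard ElementTree module rather than XSLT, and both vocabularies, the `TAG_MAP` table, and the `translate` function are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one partner's element names to another's.
TAG_MAP = {"order": "PurchaseOrder", "sku": "PartNumber", "qty": "Quantity"}


def translate(elem):
    """Copy an element tree, renaming tags that appear in TAG_MAP."""
    new = ET.Element(TAG_MAP.get(elem.tag, elem.tag), dict(elem.attrib))
    new.text = elem.text
    new.tail = elem.tail
    for child in elem:
        new.append(translate(child))
    return new


source = ET.fromstring("<order id='42'><sku>A-100</sku><qty>3</qty></order>")
result = translate(source)
print(ET.tostring(result, encoding="unicode"))
```

Real exchanges usually need more than renaming (restructuring, unit conversion, validation), but the data and attributes pass through untouched here, and neither side has to adopt the other's element names.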
I wouldn't suggest that anyone leap into XML with the expectation of becoming stinking rich. There's money to be made, certainly, but the models for making money with XML aren't likely to be get-rich-quick stories. The whole point of XML is opening up information, and using commodity components to process that information. The promise of XML from a business perspective is that it offers a way to do a lot of things much more cheaply without locking your project into eternal dependence on a single tool vendor. This makes it a lot harder for the first vendors in the market to abuse customers for extortionate profits.
That said, there are still a lot of gaps in the XML market. User-oriented software has been slow to develop - two years since XML 1.0 came out, we're finally getting acceptable (heck, any) support for viewing XML in a browser. Tools for editing XML documents still feel primitive to me, and many of them are oriented to a single platform, making it harder to use across an organization.
I think the best way to make money in this market is either consulting, helping clients through the maze of decisions required to choose XML formats and implement them, or creating tools that take care of lots of the integration in advance. Building applications that process and transfer XML on a large scale requires a lot of integration and balancing parts. By prepackaging that integration, you can save companies a lot of development work based on free tools and sell them a package. All kinds of tools for processing XML are freely available, though, and I have a very hard time believing the kinds of price tags some companies are putting on their integration. I think there's plenty of room for an approach that either provides consulting support for open source code or creates commodity tools, like editors and browser/viewers, that get sold at a much lower price. Ubiquity is important. While it's a lot easier to market only to the Fortune 500, XML's low-cost approach makes it appropriate to an incredibly wide audience that shouldn't be ignored.
If by sites you mean Web sites, there aren't very many of them. http://www.xmltree.com keeps track of them, and more sites may be using XSLT to generate HTML from XML in the back-end, but XML hasn't yet caught on as a Web technology. The main reason, I think, is the incredibly slow move to support XML in browsers. Netscape pretty much collapsed as XML was coming on the scene, and Microsoft's initial IE 5 release provided a minimal and rather broken set of tools. I've heard of a few intranets that use IE 5 and XML to build client-server applications, but for the most part, there's been very little activity. Lately, the pace has picked up enormously. Mozilla appears finally to be making good on their promises, and may finally give Microsoft some real competition again. Opera's also getting on board with XML, adding support for it to the 4.0 browser. It'll probably be a few years before these browsers become ubiquitous, so I'd say we're still pretty far from XML as a viable Web site framework. When it comes, it'll be interesting. Web designers should probably take a look at XLink to get an idea of how much more powerful their hypertext linking will become, and how much easier it will be to maintain it.
Do you have any XML heroes?
The usual group of XML's creators - Jon Bosak, Tim Bray, and many more - certainly deserves a lot of credit. Most of that 'first generation' of XML seems to be retiring from the W3C and moving on to other projects, and a lot of people are wondering how that will affect future development. At the same time, though, I think the real 'hero' is the XML community, which includes that core group but goes well beyond it. Filled with strong-minded people, arguing and building on lists like XML-dev, this larger group of people has built a lot of (largely open source) software and helped define best practices for this new technology. While it's a fractious group, I think it's given XML a chance to live up to its hype.
I seem to keep going to conferences lately - I teach tutorials, make presentations, and cover the shows as press on XML.com and xmlhack.com. I spend far too much of my time on mailing lists, with XML-dev as a home base. I've been working on an Internet Draft describing XML media types (MIME types) with Murata Makoto and Dan Kohn, and that seems to be moving toward completion. I'm also fairly active in a group working on simplifying XML further. My latest released book is XML Elements of Style, which I really intended as a second book on XML. It fills in the details, maybe too many details, of XML 1.0 and XML Namespaces, and points out best practices and pitfalls. There are lots of beginner's books on XML, but all the advanced books seemed to require an understanding of SGML. It seemed like it was time. Right now I'm working on a book on XHTML, hoping to help Web developers make the transition from plain old HTML and get them thinking about the new possibilities XHTML and eventually XML will open up.
Final thoughts, shameless plugs?
I just wish more of this work was paying gigs! As much as I encourage others to take on community projects, I have to admit I'm pretty well drowning in them.
Simon, thank you for sharing your thoughts. This is excellent material.
WebWord.com
URL: http://webword.com/interviews/stlaurent.html
© 1999-2000 by John S. Rhodes. All rights reserved.
Do not reproduce or redistribute any material from this document, in whole or in part, without explicit written permission from John S. Rhodes.