Getting information off the Internet is like taking a drink from a fire hydrant.
Mitchell Kapor
Is XML the Answer to Everything?
Rian Schmidt is the president and CEO of Fine Brand Media. With his MBA and BSEE background, he is often asked to translate technical concepts for business audiences. His philosophy is that communication improves when the client is well informed and comfortable with the underlying concepts of Internet technologies.

Seems like you hear more and more about XML coming into its own these days. In fact, some folks tout it as the universal data format. This article provides a brief introduction to XML, its uses, and some of its problems.

What is XML?
XML stands for "eXtensible Markup Language" and is, basically, a format for representing information in a structured, neutral way. 'Structured' means that all data exists in a "parent-child" relationship. 'Neutral' means that the data is defined according to an open standard, so developers can see exactly how XML is defined and even participate in its evolution. An XML file is a simple text file, which can be transferred across the Internet just like HTML.

Here is a sample of a simple XML document:

<?xml version="1.0"?>
<cars>
   <car id="1">
      <nickname>Old Junky</nickname>
      <engine type="4 cylinder" />
   </car>
</cars>

While this looks quite a bit like HTML, there are major differences. The most fundamental difference is that XML tags tell you "what something is" whereas HTML tags tell you "how something should be laid out." In HTML, an <i> tag means that some text should be displayed in italics. In contrast, the <cars> tag in XML means that we're talking about things called "cars," and you can display them however you want.

Another difference you might notice is that the <engine> tag has a closing slash at the end. Whereas in HTML you can use tags like <BR> with no closing tag, XML requires that every tag be closed, either with a closing tag, as in <car></car>, or with a closing slash within the tag, as in <engine type="4 cylinder" />. XML that follows these rules, along with a few others such as proper nesting and a single root element, is called "well-formed" XML.
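To see the closing-tag rule in action, here is a minimal sketch using the parser from Python's standard library (xml.etree.ElementTree). The helper name is_well_formed and the two test strings are made up for illustration; the point is only that a parser refuses text that breaks the rule.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser raises ParseError on unclosed or badly nested tags.
        return False

# Illustrative inputs (not from any real system).
closed = '<car id="1"><engine type="4 cylinder" /></car>'
unclosed = '<car id="1"><engine type="4 cylinder"></car>'  # <engine> is never closed

print(is_well_formed(closed))    # True
print(is_well_formed(unclosed))  # False
```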

A Document Type Definition (DTD), usually kept as a separate document, specifies what data an XML document may and must contain. In the example above, the DTD might tell you that every "car" must have one, and only one, "engine," but it can have zero or more "nicknames." If your XML document meets the requirements of the DTD, it is said to be "valid."
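As a rough illustration, the sketch below shows what a DTD for the sample document might look like, followed by a hand-written check of the "exactly one engine" rule. Two caveats: the DTD text is my own guess at a reasonable definition, not one published anywhere, and the parser in Python's standard library does not validate against a DTD, so the function car_rule_violations is a stand-in for what a validating parser would do for you.

```python
import xml.etree.ElementTree as ET

# A DTD someone might write for the sample document (illustrative only).
# It is shown for reference and is not read by the code below.
CARS_DTD = """
<!ELEMENT cars (car*)>
<!ELEMENT car (nickname*, engine)>
<!ATTLIST car id CDATA #REQUIRED>
<!ELEMENT nickname (#PCDATA)>
<!ELEMENT engine EMPTY>
<!ATTLIST engine type CDATA #REQUIRED>
"""

def car_rule_violations(xml_text):
    """Hand-written stand-in for DTD validation: each car needs exactly one engine."""
    problems = []
    for car in ET.fromstring(xml_text).iter("car"):
        engines = car.findall("engine")
        if len(engines) != 1:
            problems.append(
                f"car {car.get('id')}: expected 1 engine, found {len(engines)}"
            )
    return problems

VALID = '<cars><car id="1"><nickname>Old Junky</nickname><engine type="4 cylinder" /></car></cars>'
TWO_ENGINES = '<cars><car id="2"><engine type="V6" /><engine type="V8" /></car></cars>'

print(car_rule_violations(VALID))        # []
print(car_rule_violations(TWO_ENGINES))  # ['car 2: expected 1 engine, found 2']
```

Note that both documents are well-formed; only the first would be valid under this DTD.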

What can you do with XML?
An XML file contains specially formatted information, as shown above. The software that goes through the XML and breaks the information up for other software to use is called a "parser." The great thing about parsers is that the same parser can be used on well-formed XML no matter where it came from. That means that different systems, or people, can put information into an XML format, and any other system can read and understand it.
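Here is a small sketch of a parser at work on the sample document from earlier, again using Python's standard xml.etree.ElementTree module. The function name read_cars and the dictionary layout are my own choices; any other program could break the same XML up differently.

```python
import xml.etree.ElementTree as ET

# The sample document from earlier in this article.
SAMPLE = """<?xml version="1.0"?>
<cars>
   <car id="1">
      <nickname>Old Junky</nickname>
      <engine type="4 cylinder" />
   </car>
</cars>"""

def read_cars(xml_text):
    """Parse the XML and hand back plain Python data for other code to use."""
    cars = []
    for car in ET.fromstring(xml_text).findall("car"):
        cars.append({
            "id": car.get("id"),                                      # attribute on <car>
            "nicknames": [n.text for n in car.findall("nickname")],   # zero or more
            "engine": car.find("engine").get("type"),                 # attribute on <engine>
        })
    return cars

print(read_cars(SAMPLE))
# [{'id': '1', 'nicknames': ['Old Junky'], 'engine': '4 cylinder'}]
```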

The independence of the parser from the data source means that XML is good for data exchange. When I request information from your database, your database can reply in XML without knowing anything about my system. This allows developers to "de-couple" systems and use XML as the neutral glue between them. Even old or "legacy" systems can be fitted with XML adapters and learn to communicate with newer technologies.

For example, suppose I'm in California and have a website which lists cars for sale, and you are a car dealer in Oklahoma. I want to include your current cars for sale on my site (along with many other dealers). Every time somebody brings up my "Cars for Sale" page, my system can make a request to your system for your current inventory. If you present your information in XML format as shown above, my parser can easily break it up and I can display it in my own site's "look and feel." I might use XSL to transform the data into my look and feel.
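The car listing scenario might be sketched as follows. Everything specific here is an assumption: the dealer names, the second dealer's inventory, and the HTML layout are invented, and the feeds are hard-coded strings rather than live requests. I also use plain Python string formatting where a real site might use XSL, since the standard library has no XSL processor.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical XML responses from two dealers (hard-coded for the sketch).
DEALER_FEEDS = {
    "Oklahoma Dealer": '<cars><car id="1"><nickname>Old Junky</nickname>'
                       '<engine type="4 cylinder" /></car></cars>',
    "Second Dealer": '<cars><car id="7"><engine type="V8" /></car></cars>',
}

def render_listing(feeds):
    """Turn every dealer's XML into one HTML list in my site's own layout."""
    rows = []
    for dealer, xml_text in feeds.items():
        for car in ET.fromstring(xml_text).findall("car"):
            # A car may have no nickname, so fall back to a placeholder.
            name = car.findtext("nickname", default="(no nickname)")
            engine = car.find("engine").get("type")
            rows.append(
                f"<li>{html.escape(name)}, {html.escape(engine)} "
                f"({html.escape(dealer)})</li>"
            )
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"

print(render_listing(DEALER_FEEDS))
```

The same parsing code handles both dealers, even though neither dealer knows anything about my site.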

Currently, with the advent of Microsoft's .NET, the press is beginning to talk about other uses of XML, like SOAP (Simple Object Access Protocol). SOAP is a way to package a request to a remote system in XML format. Because XML is just simple text, the request can be transferred across the Internet just like HTML and parsed by the remote system, which then packages its results in XML and sends them back.
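To give a feel for what "packaging a request in XML" means, here is a minimal sketch that builds a SOAP-style request as text. The envelope namespace is the one used by SOAP 1.1; the GetInventory operation, the state parameter, and the example.com namespace are hypothetical, and nothing is actually sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
DEALER_NS = "http://example.com/dealer"                # hypothetical service namespace

def build_inventory_request(state):
    """Wrap a made-up GetInventory request in a SOAP envelope and return it as text."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{DEALER_NS}}}GetInventory")
    ET.SubElement(request, f"{{{DEALER_NS}}}state").text = state
    return ET.tostring(envelope, encoding="unicode")

message = build_inventory_request("OK")
print(message)

# The receiving system can parse the text back into the same structure.
received = ET.fromstring(message)
print(received[0][0][0].text)  # OK
```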

The Downsides of XML
But all is not roses with XML. For one thing, XML parsers are notoriously slow. It is faster for your database to deliver data directly to your application without transforming it into XML. That means, for example, that a Content Management System (CMS) handling information in XML format will require a lot of processing power. If it is going to be served on a busy intranet or Web site, it may need to pre-process the information off-line into a more familiar format (like HTML).
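One way to picture the pre-processing idea is to do the XML work once and serve the finished HTML from then on. The sketch below assumes a tiny in-memory cache and an invented content document; a real CMS would more likely write the HTML to disk or to a dedicated cache, but the principle is the same.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical content stored in XML form.
CONTENT_XML = '<cars><car id="1"><nickname>Old Junky</nickname></car></cars>'

parse_count = 0   # how many times the XML was actually parsed
_page_cache = {}  # XML text -> rendered HTML

def render_page(xml_text):
    """Render XML to HTML, parsing only the first time a document is seen."""
    global parse_count
    if xml_text not in _page_cache:
        parse_count += 1
        names = [n.text for n in ET.fromstring(xml_text).iter("nickname")]
        items = "".join(f"<li>{html.escape(name)}</li>" for name in names)
        _page_cache[xml_text] = f"<ul>{items}</ul>"
    return _page_cache[xml_text]

# Simulate a busy site: many requests, but only one parse.
for _ in range(1000):
    page = render_page(CONTENT_XML)

print(page)         # <ul><li>Old Junky</li></ul>
print(parse_count)  # 1
```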

Another practical issue with XML is the need for skilled developers who understand how to use the various technologies correctly, and who fully understand the applications and limitations of these technologies. For example, while some browsers are beginning to understand XML and XSL, most currently don't. That means serving raw XML would probably not be a good way to present your site. On the server side, if a native database connection is practical for large volumes of data, it will almost always be faster than using XML as an intermediate format. The best technology in the world isn't any good if no one at your company knows how to apply it correctly.

In summary, XML is great for some things and terribly misplaced for others. Programmers often view it with the reverence previously reserved for Java, claiming that it's the "only way to go." However, just as you would probably not want to write a Java servlet to process a simple form, XML may not be the best choice for your site or application. Consider the positive and negative features summarized above, and make your decision based on your specific situation.

