by Billie Peterson, Baylor University

Dear Tech Talk--
In a previous column you discussed Cascading Style Sheets and their impact on web pages. I've recently heard about a new mark-up language, XML. Will I have to completely redesign my library's instructional web pages so they work with this new standard?

--Xpecting an Xplanation about XML

Dear XXX--

In the beginning (1986), there was SGML (Standard Generalized Markup Language, ISO 8879), an international standard for defining descriptions of the structure and content of different types of electronic documents. SGML is the "mother tongue" used for describing thousands of different document types from transcriptions of ancient languages to technical documentation of sophisticated machines. HTML is only one of these SGML document types. It defines a single, fixed type of document that lets you describe a simple, office-style report (headings, paragraphs, lists, illustrations, etc.), with some provision for hypertext and multimedia.

HTML is relatively easy to learn, but as was mentioned in the December 1997 Tech Talk column on Cascading Style Sheets, HTML is rife with limitations. (See Mace, "Weaving a Better Web," for additional details.) To reduce these limitations, HTML needs to be extended, and there are only two ways to "extend" HTML:

1. The World Wide Web Consortium could approve a new HTML standard, a very slow and cumbersome process.
2. Browser developers could implement new features that are not part of the HTML standard and, therefore, are not uniformly supported by other browsers, causing incompatibility and design problems.

Because SGML is completely extensible, it could be used to overcome all of the limitations associated with HTML, but SGML is very complex and difficult for a lay person to use. Hence, the development of the eXtensible Markup Language, XML, a bridge between the rich complexity of SGML and the restrictive simplicity of HTML. Whereas HTML describes how information is presented (to a certain extent), XML describes the content and the hierarchy of the information that is presented. XML relies on three companion pieces: the Document Type Definition (DTD), which defines a document's elements and their attributes as well as the relationships among them; the eXtensible Style Language (XSL), which provides style sheets for XML documents; and the eXtensible Link Language (XLL), which increases the power of links in web pages.

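To make "content and hierarchy" concrete, here is a minimal sketch in Python, using the standard library's XML parser. The handout document and its tag names (`handout`, `audience`, `section`, `step`) are invented for illustration; they come from no standard and no real library page. The point is only that author-defined tags name what each piece of information is, and that their nesting records how the pieces relate.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: every tag name here is made up by the "author".
HANDOUT = """
<handout>
  <title>Finding Journal Articles</title>
  <audience level="beginner"/>
  <section>
    <heading>Choosing a database</heading>
    <step>Pick a subject index.</step>
    <step>Enter your keywords.</step>
  </section>
</handout>
""".strip()


def outline(element, depth=0):
    """Return the tag names of element and its descendants, indented by depth."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    root = ET.fromstring(HANDOUT)
    print("\n".join(outline(root)))
```

Running the sketch prints an indented outline of the tags, which is the hierarchy an XML application sees regardless of how (or whether) the document is displayed.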
The beauty of XML is that new tags and hierarchies can be developed by any web page author, without waiting for a new standard; and, as long as the document is "well-formed," any XML (or SGML) application will be capable of interpreting the information. What's the catch? As of this writing, no HTML browser is optimized for XML. Internet Explorer currently offers limited support for XML, and Netscape promises that the next major upgrade (5.0) of its Navigator software will be XML compliant.

In many ways, XML documents appear similar to HTML documents, except for the provision of non-standard HTML tags. So, does that mean that HTML pages are automatically XML compliant? If the HTML document is "well-formed," then it is XML compliant; otherwise, it is not. But what makes a document "well-formed"?

1. All tags must be properly nested and must match, and there must be an enclosing element for the whole document.
2. All attribute values must be enclosed in quotes, for example, <font size="5" color="blue"> is correct but <font size=5 color=blue> is incorrect.
3. All elements with empty content must end with "/>" instead of ">". For example, the HTML tags <br>, <hr>, and <img> would have to be changed to <br/>, <hr/>, and <img src="picture.gif"/>. This is required in XML because the "parser" needs to know that the <br> tag is empty so it won't look for a matching </br> tag later in the document.

If all web page developers precisely followed the HTML standards, their web pages would be XML compliant. However, web browsers are purposely forgiving of "incorrectly" written HTML code, so there are millions of web pages that work with current browsers but are not XML compliant. With some time and patience, any HTML page can be converted to XML, but it's probably not necessary because XML and HTML are meant to complement, not compete with, each other. For details on converting HTML documents to "well-formed" XML documents, see the XML FAQ (Frequently Asked ...).

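The well-formedness rules above can be checked mechanically by handing a fragment to an XML parser and seeing whether it objects. The sketch below uses Python's standard `xml.etree.ElementTree` module. The helper name `is_well_formed` and the sample fragments are invented for illustration, and this is one way to test a page, not a conversion tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical fragments: one that obeys all the rules, then one per violation.
SAMPLES = {
    "well-formed": '<p><font size="5" color="blue">Hello<br/></font></p>',
    "tags not properly nested": "<p><b><i>Hello</b></i></p>",
    "no single enclosing element": "<p>one</p><p>two</p>",
    "attribute values not quoted": "<p><font size=5 color=blue>Hello</font></p>",
    "empty element not closed": "<p>Hello<br></p>",
}

if __name__ == "__main__":
    for label, markup in SAMPLES.items():
        print(f"{label}: {is_well_formed(markup)}")
```

Only the first fragment is accepted. Each of the others breaks one rule, and the parser rejects it rather than guessing at the author's intent, which is exactly the forgiveness that HTML browsers extend and XML parsers do not.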
According to Jon Bosak, chair of the XML Working Group, which developed XML, the best applications for XML will be those that can't be accomplished within the current limitations of HTML:

1. Applications that require the Web client to mediate between two or more heterogeneous databases.

Another real advantage to XML is the power of the eXtensible Link Language. According to Neil Randall, with XLL, web authors "can provide a link that will take users to a particular resource" just as HTML currently does; but in XML, "a cross-reference link will then show all the links that lead to that resource, and the user can follow these links to their sources." In addition, "XML authors can . . . specify what happens when a link is not found," with possibilities including following the link without further action on the user's part or perhaps even embedding the linked document within the original (319).

Where does XML fit in with library web pages? To a certain extent, it's too early to say. However, a couple of possibilities come to mind:

1. Designing library or library instruction web pages that change, based on the user's sophistication or physical capabilities.
2. Instructing users to use Internet search agents or databases that employ the XML standard. Use of tags (fields) that describe the content of specific elements of the database should result in the retrieval of more relevant information, just as it does in standard library databases. Given the three examples of XML documents in the sidebar, imagine the difference in search results for the topic "chip" if your search agent could look for "chip" as part of the <computer> tag or the <processor> tag.

How will libraries make use of XML? Only time will tell, as browsers become XML compliant and XML development tools evolve.

For more information:

Beale, Stephen. "XML Ascends on the Web: New Web Authoring Standard Offers Advantages over HTML." Macworld 15 (February 1998): 28-29.

Bosak, Jon. "XML, Java, and the Future of the Web." 10 March 1997. <http://sunsite.unc.edu/pub/sun-info/standards/xml/why/xmlapps.htm> (19 April 1998).

Connolly, Dan. XML Principles, Tools, and Techniques. Sebastopol, CA: O'Reilly & Associates, 1997.

Extensible Mark-up Language. 22 April 1998. <http://www.w3.org/XML> (23 April 1998).

Extensible Mark-up Language (XML) 1.0. 10 Feb. 1998. <http://www.w3.org/TR/1998/REC-xml> (19 April 1998).

Frequently Asked Questions About the Extensible Markup Language: The XML FAQ. 3 Feb. 1998. <http://www.ucc.ie/xml/> (19 April 1998).

Gee, William and John Gartner. "Xpand Your Site With XML." TechTools 25 March 1998.

Light, Richard. Presenting XML. Indianapolis, IN: Sams.net Publisher, 1997.

Mace, Scott, et al. "Weaving a Better Web." BYTE (March 1998): 58-68. <http://www.byte.com/art/9803/sec5/sec5.htm> (19 April 1998).

Randall, Neil. "XML: A Second Chance for Web Markup." PC Magazine 16 (November 4, 1997): 319-320.

XML.com. 11 April 1998. <http://xml.com> (19 April 1998).

As always, send questions and comments to: