Generation XML
by Chris Feola

You just can't get there from here. We take our transportation infrastructure for granted now, but its early days were marked by technologies as incompatible as modern computer file formats. Take trains, for example. How far apart should the rails be on a train track? In the 19th century, every company had its own answer, forcing passengers to disembark at the end of each line, walk across the platform and get on another coach. Markets forced an end to that practice--trains now run on standard tracks.

Now the same forces are bringing an end to the tangled web of computer file formats, a knot so Gordian that different versions of the same software often use incompatible formats. The computer world is shifting to XML--Extensible Markup Language--and in many ways, computers will finally start working the way we think they should.

Making everything work together seems simple. Aren't words words? Well, yes, but the words aren't the problem. A universal file format for raw text already exists. You can move content in ASCII files between just about any two programs on any computer platform or operating system--as long as you don't mind losing fonts, formatting, graphics, photos, tables... You get the picture.

Even if you could preserve these things between mediums, would you really want to? Ask yourself this: Would you really want the size and layout of a broadsheet newspaper transplanted to the Web? Imagine how much fun your readers would have scrolling around.

Organizations with heavy documentation requirements, such as the U.S. government and Boeing, actually solved this problem years ago with the implementation of Standard Generalized Markup Language. SGML is a "metalanguage"--a universal language used to create specific languages. Each specific language is defined by a Document Type Definition. Any program that understands the DTD can read and write these files. Sound wild?
This is a technology millions use every day--Hypertext Markup Language is nothing more than a DTD based on SGML. The good news is that the HTML DTD allows any HTML document to be read by any Web browser on any computer platform. The bad news is that limiting yourself to a single DTD is like buying a big 18-wheel rig--and then only using first gear.

HTML is a fixed Document Type Definition, meaning nothing new is added until the oversight committee can work it through. That creates enormous work for webmasters, since anything out of the ordinary must be done using scripts or applets.

There are good reasons, though, why everyone didn't simply adopt SGML. Face it: When you want to run down to the bank drive-through and cash your check, you don't want to take the Mack truck. SGML was designed for handling things like the documentation for a Boeing 777--documents hundreds of pages long, with heavy footnoting and cross-referencing. While we have reporters handing in stories like that, we like to chop them down into something workable.

Enter Extensible Markup Language, an "SGML-lite" of sorts--almost all the power, with much less overhead. Like SGML, XML is a metalanguage used to create Document Type Definitions like HTML. You can create your own DTDs, and any XML-compliant application can use your documents--instantly.

This is good news for webmasters and others who care about new media. But it's life-saving news for content creators like newspapers. Think of that broadsheet page again. Let's say you have a table that shows local stocks, an index of stories or whatever. With XML, you can simply tag that puppy as, say, TABLE1. Then you can create a DTD called MYNEWSPAPER and a second called MYWEBSITE. In each DTD you define TABLE1. You can make it 150 pixels wide in MYWEBSITE, and 16 picas wide in MYNEWSPAPER. Now both your Web site and your raster-image processor know what to do with it--without separate layout teams for each product or translation software.

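The TABLE1 idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any front-end system actually works: the sample story, the element names other than TABLE1, the `PRODUCT_RULES` lookup and the `layout_for` function are all hypothetical. Note also that in practice a DTD describes a document's structure, and sizing details like these would more typically live in a stylesheet or in the application itself; here a plain dictionary stands in for the MYWEBSITE and MYNEWSPAPER definitions.

```python
import xml.etree.ElementTree as ET

# Hypothetical story file: the content is tagged once, with no layout baked in.
STORY_XML = """
<PAGE>
  <HEADLINE>Local stocks close higher</HEADLINE>
  <TABLE1>
    <ROW><CELL>Acme Corp.</CELL><CELL>41.50</CELL></ROW>
    <ROW><CELL>Widget Inc.</CELL><CELL>12.25</CELL></ROW>
  </TABLE1>
</PAGE>
""".strip()

# Hypothetical stand-in for the two product definitions of TABLE1.
# Widths are the ones used in the article: 150 pixels and 16 picas.
PRODUCT_RULES = {
    "MYWEBSITE": {"TABLE1": {"width": 150, "unit": "pixels"}},
    "MYNEWSPAPER": {"TABLE1": {"width": 16, "unit": "picas"}},
}


def layout_for(product, xml_text):
    """Return one layout instruction per tagged element the product defines."""
    root = ET.fromstring(xml_text)
    rules = PRODUCT_RULES[product]
    plan = []
    for element in root.iter():
        if element.tag in rules:
            rule = rules[element.tag]
            rows = len(element.findall("ROW"))
            plan.append(
                f"{element.tag}: {rows} rows, {rule['width']} {rule['unit']} wide"
            )
    return plan


# The same tagged content drives both products.
for product in ("MYWEBSITE", "MYNEWSPAPER"):
    print(product, layout_for(product, STORY_XML))
```

The point of the sketch is that the story file never changes: only the per-product definition of TABLE1 does, so adding a third product would mean adding one more entry to the lookup rather than re-tagging the content.
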
Even better, industries can define Document Type Definitions allowing universal use of data while preserving each company's individual implementation. In other words, there could be universal formats for classifieds and display ads without a single newspaper being forced to change its layout.

XML sounds wonderful, doesn't it? But the $3 million question (priced a new front-end system lately?) is whether any of this will be available before we retire. Yes.

Current browsers offer limited XML support, and both Microsoft and Netscape have promised robust implementations in upcoming iterations. More importantly, Microsoft sees advantages in implementing XML in its desktop software. XML will not only be a fully supported file format, but also an alternate native format. You'll be able to do all your work in Microsoft Office applications--Word, Excel and so forth--without ever using a proprietary file format. This version of Office is scheduled for beta release in the second quarter of 1998.

So here's the bottom line. If in the next year or two you buy or upgrade one of the seemingly bazillions of front-end systems that use Microsoft Word, you're getting a front end with XML built into it.

Print-side implementation may seem problematic because of the relative dearth of newspaper-industry research and development. But most current front ends use a plethora of off-the-shelf computer software. No matter what the newspaper industry does, the computer industry developing these underlying technologies is jumping onto the XML bandwagon with both feet.

Christopher J. Feola is the director of the Media Center at the American Press Institute. E-mail, feola@apireston.org; phone, (703) 715-3333.

TechNews Volume 4, Number 2: March/April 1998
©1998 Newspaper Association of America. All rights reserved.