
Into the Digital Future

By separating content from format, advanced editorial systems are automating the process of publishing in multiple channels

by Andrew Bowser

Phil Rugile has seen the future of editorial systems, and it is browsers, the same software now used by millions to cruise the World Wide Web.

"We're trying to get away from users having five applications on the desktop," says Newsday's director of information systems. "The ultimate goal is for everyone to have a browser on their desktop to access production tools."

Okay, so maybe you won't be able to lay out your front page using Netscape Navigator. But if the components of that page (the text, images, graphics and design characteristics) are all sliced and diced into databases, you may be able to manage all that content from a single desktop, as well as handle print, online and other products simultaneously.

"When I talk to developers and publishers, the biggest move is toward content being thought of as a valuable commodity that needs to be repurposed or re-expressed across different media," says Donna Conner, media and publishing-industry manager for Microsoft Corp.

The clock is ticking. In just five years' time, perhaps half the world's information could be stored, archived and published digitally, available anywhere at any time. If newspapers aren't prepared to make the digital leap, they risk becoming casualties of the nascent Information Age.

"Publishers have to stop viewing what they produce as a printed publication, and start viewing it as an assembly of textual, graphical and illustrative documents," says Charles Abrams, a strategic-planning analyst with the Gartner Group of Stamford, Conn.

Vendors, already morphing from proprietary technology to open systems, are now starting to develop content-neutral systems equally compatible with online publishing and ink-on-paper production. With a slew of product introductions anticipated in the next 24 months, sales of products supporting this emerging architecture are projected to grow from under $2 billion in 1998 to well over $12 billion in 2002, according to the Gartner Group.

Already, newspapers are rewarding vendors moving in that direction; witness the growth of such technology providers as Danish major-metro darling CCI Europe and Unisys Corp. of Irving, Tex. In February, Philadelphia Newspapers Inc. awarded a multi-million dollar contract to Unisys for an integrated editorial system, multimedia archive and remote communications-management system. "The needs of The Philadelphia Inquirer and Philadelphia Daily News have become increasingly sophisticated, and we need to have equally sophisticated technology at our disposal," says Tom Sims, PNI vice president of systems and technology.

Others are looking outside the traditional newspaper stable for solutions. Early this year, Knight Ridder New Media turned to Seattle software developer Alki Software Corp. to help design its own multi-format production system.

While industry pundits have long stressed the need for such soup-to-nuts systems, partial solutions are now emerging. And while no complete system exists today, current products offer a road map of sorts to the digital future. "You need to seize the initiative, see the pieces as they are emerging and put them together," Matt Cohen, then chief technology officer of the New Century Network, advised NEXPO®97 attendees.

The Archiving Answer

Looking for a glimpse of the future? Then take a look at your past, as stored in your archiving system. (You do have an archive, don't you?) As these systems turn into potential money-makers, they're becoming far more than mere dumping grounds for text and images.

Take Newsday, which mined its archives for photos of the New York Yankees' 1996 season, selling them to fans for $60,000 in additional revenues. Now the paper is using the archive's enabling technology, the MediaSphere database developed by Cascade Systems Inc. of Andover, Mass., to tag images and articles about Long Island from an extensive print series documenting the region's history from the Ice Age to the present.

Rugile envisions the history archive supporting a host of business opportunities, from CD-ROMs to coffee-table books. At the same time, Newsday plans to deliver all archives online using a pay model, much like Central Newspapers and other publishers.

But staffers at the Melville, N.Y., daily are also looking beyond business models, seeing database-driven technology as ultimately driving a big part of how the paper is put together. Already production staffers track components through the MediaSphere system using Web-browsing software, eliminating the need for specialized computer systems and costly software front-ends.

"Part of our goal is to make the archive part of the production process," Rugile says.

In western Pennsylvania, The Pittsburgh Post-Gazette is moving to a building-wide network bringing text, photo, Web page and Portable Document Format (PDF) archives to every desktop. It's an ambitious goal, admits Tim Rozgonyi, assistant technology systems editor, but the value is obvious to anyone who's sifted through piles of CD-ROMs or Syquest cartridges to track down an obscure image.

The greatest benefit may be moving the document-selection process closer to editors, asserts Cascade product manager Francesco Rietti. For example, editors who once filed requests for photos with a librarian can use products like MediaSphere to submit a real-time, plain-language search to the database and view a host of thumbnail images, all through a browser interface.

Doing so also brings new players into the process. "The big picture is to make sure everything is available on a digital basis and easily accessed," Rozgonyi says. "Then it becomes a bigger resource, not just the editorial department's resource."

An enticing proposition, selling access to photo and text archives, is one byproduct of the change. The Post-Gazette's aging LeafDesk system will be replaced by MediaSphere and Rochester, N.Y.-based Applied Graphics Technologies' DigitalLink, providing both database and archive tools.

"Instead of a back-end archive, it's a front-end production tool," Rozgonyi says.

Future Present

The next step is applying a similar philosophy to every front-end production tool. A host of software solutions are now stepping up to the plate, promising to simplify information transfer across different formats and products, be they ink-on-paper or electronic.

Abrams points to the architecture of Adobe Systems Inc.'s PDF browser-enabled collaborative suites, which facilitate document creating, editing, formatting, archiving and outputting to printers or the Internet. "We need to move away from a print-based environment where all we look at is the page and nothing else," he says. "We must move toward true multi-channel publishing."

While complete end-to-end systems wait in the wings, the pieces are starting to fall into place. Andrew Bart, president of Bethesda, Md., integrator Publishing Connections Inc., says the transition from print-centric to multi-channel will be simplified by solutions that extract data from traditional pagination systems and store disparate elements in a central, content-neutral database.

"That elegance is where the world is heading," Bart says. "It is incumbent upon technology providers to adopt the single-database concept."

That concept remains at the heart of what's called the "fifth wave" of publishing tools (hot lead was the first wave; cold type, the second; proprietary front-ends, the third; and desktop-publishing tools like QuarkXPress, the fourth). CCI demonstrated its fifth-wave MediaServer database at NEXPO'97; a few booths away, Orem, Utah-based Digital Technology International showed how a then-new Web-client tool could pull information out of the content-neutral database DT systems have used for years.

With that approach, the Web "becomes just another edition," said Don Oldham, DT's chief executive officer.

Vendors stress the value of content not published in the traditional print newspaper, including reference material, editorial revisions and comments, sign-offs from key departments, and rights and permissions. "What you print is just a one-dimensional view of your intellectual property," says Sebastian Holst, vice president of electronic-publishing product management for Boston-based Inso Corp. "The focus is to recognize that your management system should be more than a mirror image of what you publish."

Most multi-channel solutions coming to market promise automatic version tracking of data across any format. The Dallas Morning News has licensed Inso's DynaBase, a Web content-management system and scripting-development environment that places print and online documents into an object-oriented database that stores every version of every article. "Object-oriented databases lend themselves well to retaining archives," says Holst.
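
To make the idea of version tracking concrete, here is a minimal sketch in Python. It is not DynaBase or any vendor's design; the class name `VersionedStore`, its methods and the sample article are all hypothetical, and a production system would persist versions in a real database rather than in memory. The point it illustrates is simply that saving a story appends a new version instead of overwriting the old one.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VersionedStore:
    """Hypothetical in-memory store that never overwrites an article."""

    _versions: dict = field(default_factory=dict)

    def save(self, article_id: str, text: str) -> int:
        """Append a new version and return its 1-based version number."""
        history = self._versions.setdefault(article_id, [])
        history.append(text)
        return len(history)

    def get(self, article_id: str, version: Optional[int] = None) -> str:
        """Return one specific version, or the latest when version is None."""
        history = self._versions[article_id]
        return history[-1] if version is None else history[version - 1]

    def history(self, article_id: str) -> list:
        """Return a copy of every stored version, oldest first."""
        return list(self._versions[article_id])


store = VersionedStore()
store.save("city-budget", "Council to vote on budget.")
store.save("city-budget", "Council approves budget, 5-2.")

print(store.get("city-budget"))             # latest version
print(store.get("city-budget", version=1))  # the original filing
```

Because nothing is ever discarded, the same store can answer both "what is running now?" and "what did we say at first?", which is the archive-friendly property Holst describes.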

One publisher used Inso technology to produce a CD-ROM that instantly created corresponding teacher and student editions from a single publication. In a print context, such tools might allow easy repurposing to different-language editions, zoned editions and, of course, more targeted advertising.

Pagination Progress

To get an idea of just how much of a hurdle current fourth-wave production systems pose to the fifth-wave vision, consider the term one vendor applies to decoupling text from pagination systems.

"It's a concept we call 'de-Quarking,'" says Cascade's Rietti, whose MediaSphere product does just that in "almost fully automated" fashion. "Quark might be great for layout, but it's somewhat restricting, so we dissect it at the end." In April, British business-magazine publisher Miller Freeman Inc. will launch a Web version of The Engineer using the technology.

Which tools will drive layout in a multi-channel world remains open to speculation. Quark's present-day dominance cannot be ignored, but the Denver company may soon face steep competition. As unnamed sources told MacWeek late last year, Adobe is radically revamping a number of its publishing products, including a new page-layout application internally referred to as a "Quark Killer" and a new workflow-management, client-server package code-named Stilton, a reported direct competitor to the Quark Publishing System.

"It's simply a rumor, and I can't comment on rumors. We haven't announced anything at this point," says Bryan Lamkin, Adobe's vice president of marketing.

While the March 1998 Computer Shopper called the newest 4.0 version of 'XPress a "DTP star" that "re-establishes Quark as the leader" in desktop publishing, reporter Susan Glinert noted a "surprising" omission of Internet features. "QuarkXPress' competition has not been as shy about incorporating HTML and PDF awareness into the core program," she wrote.

Quark, however, is hardly ignoring the Internet. The company has entered an alliance with Oracle Corp. to marry its publishing products to Oracle's server technology.

Oracle Chief Executive Officer Larry Ellison says the various solutions that will emerge, such as a database connectivity tool to link QuarkImmedia to Oracle's Universal Server and WebServer, will allow QuarkXPress users to migrate "an enormous reservoir of content" onto the Internet.

Stay tuned.

Online and On the Fly

So you've managed to pry text and images from your creaky print-based front-end. Now getting them on the Web is a piece of cake, right?

Wrong. The dirty secret of Internet publishing is that it's a remarkably unautomated process. Although software promising automation is readily available, talented individuals are often still hired (or drafted) to code HTML or massage text.

Despite having Web templates to automate the process, Newsday added a third shift in its library to massage text and images for the Web. Rugile says he hopes to install a new editorial system (possibly CCI or Unisys) to facilitate generating Web pages on the fly.

Numerous vendors are tossing their hats into the ring. One of the latest is Baseview Products Inc., whose LiveIQue is an editorial system add-on that serves newspaper copy to the Web without staff intervention. "It's ideal for someone who doesn't want to hire five people to run a Web site," says marketing associate Scott Sigler.

Such systems generally work in one of two ways: they either automatically extract articles from front-ends, code them with HTML and upload them to the Web, or place stories into a content-neutral database and dynamically create new pages each time a document is requested.
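
The difference between the two approaches can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's implementation: the `STORIES` records, the field names and the functions `export_static` and `serve_dynamic` are hypothetical, and a real system would write files to a Web server or answer HTTP requests rather than return strings.

```python
import html

# Hypothetical content-neutral records: plain fields, no layout codes.
STORIES = {
    "budget": {"headline": "Council approves budget", "body": "The vote was 5-2."},
    "storm": {"headline": "Storm closes schools", "body": "Classes resume Monday."},
}


def render_html(story: dict) -> str:
    """Wrap one story's fields in minimal HTML, escaping special characters."""
    return (
        f"<html><body><h1>{html.escape(story['headline'])}</h1>"
        f"<p>{html.escape(story['body'])}</p></body></html>"
    )


def export_static(stories: dict) -> dict:
    """Approach 1: convert every story once, ahead of time.

    Returns a mapping of file name to page text; a real system would
    write these files out and upload them.
    """
    return {f"{story_id}.html": render_html(s) for story_id, s in stories.items()}


def serve_dynamic(story_id: str, stories: dict) -> str:
    """Approach 2: build the page only when a reader requests it."""
    return render_html(stories[story_id])


pages = export_static(STORIES)
print(sorted(pages))                    # file names produced in one batch
print(serve_dynamic("storm", STORIES))  # page generated on request
```

Both paths share one rendering function, so the output is identical. The trade-off is timing: the batch export does its work once per production cycle, while the dynamic path always reflects whatever is in the database at the moment of the request.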

On-the-fly data conversion is not limited to Web republication, however. Miller Freeman uses MediaSphere to quickly convert articles to Lexis-Nexis format at the end of production cycles. According to Cascade's Rietti, automatic formatting requires fewer bodies and cuts delivery of content to the commercial database from seven days to one.

The Database Dilemma

Whether in print or online, emerging solutions all rely on some type of central information repository. But whether a unified database of past and current elements is desirable depends on whom you ask.

"The lines between live news and archival material have really blurred," says Allen Miller, Atex Media Solutions' vice president of marketing, "but I would question whether the integration of those two types of materials should happen at a database level or an application level.

"It's still the live database that really gets crunched in a production environment," Miller explains. "You need to optimize database design to get the performance you need in a highly demanding production environment."

Accordingly, such vendors as CCI and DT have backpedaled on the central database theory, instead selling their systems as replicated databases, one of which serves pre-press and the other the Internet, according to David M. Cole, editor and publisher of the Cole Papers and NewsInc.
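
As a rough illustration of the replicated approach, the Python sketch below writes each story to two separate databases, one standing in for pre-press and one for the Internet. It is only a toy: the table layout and the helpers `make_db`, `publish` and `headline` are hypothetical, the databases are in-memory SQLite instances, and commercial systems would rely on the database product's own replication rather than application code like this.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create one independent in-memory database with a stories table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stories (id TEXT PRIMARY KEY, headline TEXT)")
    return conn


# Two discrete copies, each free to be tuned for its own workload.
prepress = make_db()
web = make_db()


def publish(story_id: str, text: str, replicas: list) -> None:
    """Write the same record to every replica, replacing any earlier copy."""
    for conn in replicas:
        conn.execute(
            "INSERT OR REPLACE INTO stories VALUES (?, ?)", (story_id, text)
        )
        conn.commit()


def headline(conn: sqlite3.Connection, story_id: str):
    """Return the stored headline, or None if the story is not in this replica."""
    row = conn.execute(
        "SELECT headline FROM stories WHERE id = ?", (story_id,)
    ).fetchone()
    return row[0] if row else None


publish("budget", "Council approves budget", [prepress, web])
print(headline(prepress, "budget"))
print(headline(web, "budget"))
```

Each channel reads only its own copy, so a surge of Web traffic never competes with the production database on deadline, which is the performance concern Miller raises.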

"A much more desirable environment would be one where a common user interface could be used to dip into discrete databases for all of these services," says Cole (see What Next, p. 31).

Alphabet Soup

Dipping into databases with the precision that newspaper production demands, however, requires not only separating content from form, which many systems already do, but also adding new information to make that content searchable. A database that can't be searched, after all, isn't very valuable, whether it's sitting in your newsroom or on the Internet.

Consider the limitations of the Internet's common coding language, HyperText Markup Language (HTML). It can tell a heck of a lot about a document's format, but nothing about what's being formatted, unless programmers lump such descriptive metadata as searchable keywords on top of coded pages. Thankfully, its would-be successor is much smarter. Extensible Markup Language (XML), a subset of the Standard Generalized Markup Language (SGML), has generated such excitement and widespread support that many observers predict it will replace HTML as the primary Internet-publishing language within the next five years (see story, p. 21).
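
A short Python sketch shows why descriptive markup makes content searchable. The element names below (`article`, `headline`, `section`, `keyword`) are invented for illustration and do not come from any published news-industry schema; the parsing itself uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tags describe what each piece of content is,
# not how it should look on a page.
ARCHIVE_XML = """<archive>
  <article id="a1">
    <headline>Yankees clinch title</headline>
    <section>Sports</section>
    <keyword>Yankees</keyword>
    <keyword>baseball</keyword>
  </article>
  <article id="a2">
    <headline>Long Island in the Ice Age</headline>
    <section>History</section>
    <keyword>Long Island</keyword>
  </article>
</archive>"""


def find_by_keyword(xml_text: str, keyword: str) -> list:
    """Return the headlines of all articles tagged with the given keyword."""
    root = ET.fromstring(xml_text)
    matches = []
    for article in root.findall("article"):
        keywords = [k.text for k in article.findall("keyword")]
        if keyword in keywords:
            matches.append(article.findtext("headline"))
    return matches


print(find_by_keyword(ARCHIVE_XML, "Long Island"))
```

An HTML page could display the same two stories, but its tags would say only "heading" or "paragraph." Here the markup itself says which text is a headline and which is a keyword, so a program can answer the query without guessing.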

"XML gives documents the same structure, rigor and standardization capabilities as relational database technology," says Holst, touting Inso as the developer of the first end-to-end XML publishing solution. "SGML enabled this notion of multi-format publishing, but XML is what's going to bring it to the masses."

Sure, that's been said countless times before, but many key developers are signing on, and last month, XML was made an official standard by the World Wide Web Consortium (W3C), in effect giving it the same level of endorsement as HTML.

"It's very strict in implementation. You can't really stray away from it too much. It will force standardization in digital content," says Cascade's Rietti.

XML is already gaining acceptance in the newspaper industry. In July, well in advance of the W3C statement, Atex agreed to license NuDoc, the XML composition system developed by BitStream Inc. of Cambridge, Mass. "It has the potential to be a dominant standard in the long-term, and we think it's key to the separation of form and content," says Miller.

Tribune Media Services may also incorporate the smarter standard into its WebPoint syndicated online-content packages as soon as this summer. "Right now, we don't have a simple mechanism to distribute content," says Jay Brodsky, Tribune's technology-development manager. "XML is a great way to package and describe content, and it's hopefully a standard we're going to take advantage of."

The emerging standard's usefulness extends beyond Web publishing, however. For example, XML will allow end users to "get into a publication's library and assemble a document according to their own needs," says Gartner's Abrams. In addition, XML will make it simpler for a publisher to deliver targeted print products, as well as personalized content directly to a consumer's desktop. Finally, it promises a way for publishers to share information, a key issue as the online-classified battle heats up, with national aggregators taking on local newspapers.

Still to Come

A host of related changes, both big and small, are now waiting in the wings. Observing how they develop may provide publishers with clues on which direction to take moving forward.

Watch for changes in newsfeeds that speed up repurposing content. Dave Stonehill, director of systems development for The Associated Press, says AP is readying an SGML-format test feed allowing easier identification of content. AP plans to share the format with vendors later this year.

"A nice side benefit is that the information we transmit will be easily displayable in a Web browser without any modification," Stonehill says.

Another puzzle piece that should help fuel dynamic publishing is Apple's upcoming Rhapsody operating system, slated for an initial developer's release this summer. Touted to newspapers as the glue to tie together publishing applications, Rhapsody's drag-and-drop development environment will be easily adaptable to any emerging technology, from XML to Web channels. The new OS will also make current dilemmas like version-tracking of stories breaking on the Web "almost trivial," says product manager Ernie Prabhakar.

On the Microsoft side, look for news on an "industry standard" framework to facilitate development of dynamic publishing tools for newspapers and other publishers. Along with six vendors-all of which have an interest in the newspaper industry-Microsoft is developing what it calls an "asset-management solution," or a common architecture similar to its banking-industry data-sharing framework (see http://www.microsoft.com/industry). The publishing standard is scheduled to be announced in late March.

Also expected to emerge is a spectrum of categories that defy the standard view of "Internet publishing" as HTML-page downloads to a computer monitor. The Gartner Group has identified on-demand printing over the Internet, where end users can print custom news content on standard networked printers, as one such market.

Merging workflow with dataflow applications promises to add another dimension. Companies including Cascade are expected to announce new applications this year that will allow works-in-progress to be archived and searched. For instance, a manager could call up proofs of all pages two hours from deadline, again from a Web-browser interface.

Expect paradigm shifts in workflow itself. Key to developing a true multi-channel publishing process, especially for smaller newspapers, will be moving away from the traditional "workstation-centric" production environment, according to Abrams. With databases opening the door to vast new amounts of information, Bart says time spent today on maneuvering pictures on the page will give way to filtering data for publication.

"The publisher with the most compelling product will be the winner," he says.

Moving forward, bear this in mind: multi-format publishing isn't just Internet publishing. Atex's Miller notes that many new products now thought of as "Web tools" will become valuable in every medium, because they reposition data into new contexts.

"Newspapers have an opportunity to create new products that can appeal to new customers and increase revenue," he says. "That's where the focus should be, rather than, 'Oh my God, I've got to put something on the Internet!'"

Andrew Bowser is a free-lance writer based in New Orleans. E-mail, andyb@comm.net; phone, (504) 897-4026.

Sources

Charles Abrams, The Gartner Group, 56 Top Gallant Road, Box 10212, Stamford, Conn. 06904-2212; (203) 964-0096, fax, (203) 316-3590.

Andrew Bart, Publishing Connections Inc., 4940 Hampden Lane, Suite 106, Bethesda, Md. 20814. E-mail, abart@pcipage.com; phone, (301) 951-1014; fax, (301) 951-0927.

Jay Brodsky, Tribune Media Services, 435 North Michigan Ave., Suite 1300, Chicago, Ill. 60611. E-mail, jbrodsky@tribune.com; phone, (312) 222-4140, fax, (312) 222-2816.

Sebastian Holst, Inso Corp., 31 St. James Ave., Boston, Mass. 02116; (800) 733-5799; fax, (617) 753-6620.

Allen Miller, Atex Media Solutions Inc., 15 Crosby Drive, Bedford, Mass. 01730. E-mail, amiller@atex.com; phone, (617) 276-1171; fax, (617) 276-1256.

Francesco Rietti, Cascade Systems Inc., 300 Brickstone Square, Andover, Mass. 01810. E-mail, rietti@cascadenet.com; phone, (978) 749-7000; fax, (978) 749-7099.

Tim Rozgonyi, Pittsburgh Post-Gazette, 34 Boulevard of the Allies, Pittsburgh, Pa. 15222. E-mail, rozgonyi@post-gazette.com; phone, (412) 263-1923; fax, (412) 263-2014.

Phil Rugile, Newsday, 235 Pine Lawn Road, Melville, N.Y. 11747. E-mail, rugile@newsday.com; phone, (516) 843-2362; fax, (516) 843-4801.

Dave Stonehill, Associated Press, 50 Rockefeller Plaza, New York, N.Y. 10020. E-mail, dave_stonehill@ap.org; phone, (212) 621-1813; fax, (212) 621-1512.


TechNews Volume 4, Number 2: March/April 1998

©1998 Newspaper Association of America. All rights reserved.