Learn from this publishing heavyweight how to navigate the content-management path so the perils are few, the benefits fruitful and the content flows freely to multiple media.
It has been almost 100 years since James McGraw and John Hill merged the book departments of their two companies to form the McGraw-Hill Book Co. It has since become a world leader in educational and professional publishing.
One of McGraw-Hill's best-known publications—the Encyclopedia of Science & Technology—is considered a classic.
According to Mark Licker, vice president and publisher, science, "It is the most authoritative and comprehensive guide available for the broad spectrum of knowledge in science." But due to its scope and size—20 volumes, 15,600 pages, and 12,000 digital images (plus thousands of equations, chemical structures and tables)—the information in the encyclopedia had become unwieldy to manage and update with the systems formerly in place.
Thousands of large, interrelated SGML files held the encyclopedia's content for its print edition, and thousands of other files held the content for its CD-ROM version (the Multimedia Encyclopedia of Science & Technology). Although the two sets of files used the same document type definition (DTD) and tagging scheme, and contained much the same content, they were maintained separately.
Keeping separate SGML files for the print and CD-ROM versions of the encyclopedia doubled the work for McGraw-Hill staff whenever the content required additions or changes. When an article appeared both in print and on CD-ROM, for example, McGraw-Hill had to create two separate SGML files for it. Moreover, no editing tool could manipulate these numerous, disparate SGML files as a group, because they were not logically connected in any way. As a result, the encyclopedia's editors had to write and run custom batch scripts each time they wanted to process the files, a cumbersome and highly error-prone process.
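To see why this workflow invited errors, consider a minimal sketch of the kind of custom batch script the editors had to write. This is an illustration only: the directory names, file extension, tag usage, and encoding are assumptions, not McGraw-Hill's actual setup.

```python
from pathlib import Path

# Hypothetical layout: one directory of SGML files per edition.
# (Assumed names for illustration; the article does not give the
# real file structure.)
PRINT_DIR = Path("print_sgml")
CDROM_DIR = Path("cdrom_sgml")


def update_article(article_id: str, old: str, new: str) -> list[Path]:
    """Apply the same text change to both copies of an article.

    Because the print and CD-ROM editions kept separate SGML files,
    every correction had to be applied twice, once per file tree.
    """
    touched = []
    for root in (PRINT_DIR, CDROM_DIR):
        path = root / f"{article_id}.sgm"
        if not path.exists():
            # A missing counterpart was easy to overlook, which is
            # one way the two editions could drift out of sync.
            continue
        text = path.read_text(encoding="latin-1")
        if old in text:
            path.write_text(text.replace(old, new), encoding="latin-1")
            touched.append(path)
    return touched
```

Every correction runs through two parallel file trees, and nothing in the files themselves links the two copies of an article, so a script that silently skips a missing or mismatched counterpart leaves the editions inconsistent with no warning.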