Avoid 4 Costly Link Management Pitfalls
Link management is a vital component of a publisher’s XML-based system. Innodata Isogen, the Hackensack, N.J.-based provider of digital content services and solutions, recently issued a white paper, “Don’t Break the Link: Avoid Four Costly Pitfalls in Linking and Reuse,” that lays out some of the challenges of link management and offers practical advice for incorporating an effective strategy into a publishing environment.
Pitfall #1—Failing to Maintain Link Paths and Dependencies
Dependency tracking plays a key role in managing links and maintaining their validity. A link relationship includes the source document (which originates the link), the target document, the link path location information, and various other properties. As the source and target XML documents are moved between computers, systems, repositories, and file locations, the dependencies must be updated so that the links remain valid.
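The dependency tracking described above can be sketched as a small index of link records. This is a minimal illustration, not the white paper's design: the `Link` and `DependencyIndex` names, their fields, and the `move` operation are all hypothetical, and documents are identified by simple path strings.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """One link relationship (hypothetical record layout)."""
    source: str                      # document that originates the link
    target: str                      # document the link points to
    fragment: str = ""               # optional address inside the target
    properties: dict[str, str] = field(default_factory=dict)


class DependencyIndex:
    """Tracks which documents depend on which, so moves can be propagated."""

    def __init__(self) -> None:
        self._links: list[Link] = []

    def add(self, link: Link) -> None:
        self._links.append(link)

    def dependents_of(self, target: str) -> list[str]:
        """Return the source documents that link to `target`, sorted."""
        return sorted({link.source for link in self._links if link.target == target})

    def move(self, old: str, new: str) -> int:
        """Record that a document moved; return how many link records changed."""
        updated = 0
        for link in self._links:
            changed = False
            if link.target == old:
                link.target = new
                changed = True
            if link.source == old:
                link.source = new
                changed = True
            if changed:
                updated += 1
        return updated
```

In a real system the index would also have to rewrite the location identifiers stored inside each affected source file; the sketch only keeps the bookkeeping consistent.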
The location paths that identify link or reuse targets are usually stored in the XML file that is the source of the link. These location identifiers typically contain two parts. The first part defines where to find the target file. The second part, when applicable, gives the location of the desired content within the target XML document.
Often the path portion of the location identifier mimics file system paths, such as C:/folder/target.xml or ../folder/target.xml. These identifiers are examples of two ways to specify a target location: the first uses a full (absolute) file path, whereas the second uses a relative path. The internal address portion of the location may be an ID attribute or it may be an XPath(12) or XPointer(10) expression. For Web-based systems, the location identifier may target a Web resource, such as http://www.company.com/folder/target.xml. This same XML file may be stored in many different physical locations at different points in time.
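As a rough illustration of the two-part identifier, the sketch below splits a location into its file part and its internal address, then resolves a relative path against the location of the source document. It assumes URL-style identifiers in which a `#` separates the two parts, a convention the text above does not specify; the function names and the `source.xml` location are hypothetical. Windows drive paths such as C:/folder/target.xml are outside its scope.

```python
from urllib.parse import urldefrag, urljoin


def split_location(location: str) -> tuple[str, str]:
    """Split an identifier into (path to target file, internal address)."""
    path, fragment = urldefrag(location)
    return path, fragment


def resolve_target(source_url: str, location: str) -> tuple[str, str]:
    """Resolve a (possibly relative) target path against the source document."""
    path, fragment = urldefrag(location)
    return urljoin(source_url, path), fragment


# A relative link stored in a source document served from /docs/.
print(resolve_target(
    "http://www.company.com/docs/source.xml",
    "../folder/target.xml#sec-2",
))
```

Because the relative form is resolved against wherever the source document currently lives, moving the source without moving the target changes what the same identifier points to.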
Pitfall #2—Confusing Entities with Reuse
The XML specification lets a document include content from another file by referencing external entities. Although this mechanism is part of the specification, it is not suitable for reuse in a complex system. The external entity reference mechanism merely imports text content, verbatim, into your XML file. An entity file is not an XML document in its own right: its contents are not treated as a standalone XML document, are not validated separately, and have no other special reuse properties. Many systems make the fundamental mistake of treating this as a robust XML reuse mechanism. It is not.
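To make the point concrete, the sketch below shows a typical external entity declaration and then checks whether the referenced entity file could stand alone as an XML document. The file name `chapter1.ent`, its contents, and the helper `is_standalone_xml` are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document that pulls in an external entity.
# Shown for context only; it is not parsed in this sketch.
SOURCE_DOC = """<!DOCTYPE book [
  <!ENTITY chapter1 SYSTEM "chapter1.ent">
]>
<book>&chapter1;</book>"""

# Hypothetical contents of chapter1.ent: a fragment with two top-level
# elements and no single root, which is common for entity files.
ENTITY_FILE = "<para>First paragraph.</para><para>Second paragraph.</para>"


def is_standalone_xml(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document by itself."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_standalone_xml(ENTITY_FILE))
```

A fragment like this only makes sense once it has been pasted into the referencing document, which is why it cannot be checked, versioned, or reused as an XML document of its own.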