6 Things You Need to Know About Unstructured Content
Now think about all the content that flows in and around a publishing company: the articles, books (and chapters), images, slideshows, Web sites, tweets, user comments … the stuff we refer to simply as "content." Now think about how tough it is to find that content so we can repurpose it.
So here are six things you need to know about unstructured content.
1. It's everywhere. Analysts, pundits and people in the know estimate that more than 80 percent of the content produced in an enterprise, let alone a publishing company, is unstructured.
2. Content is containerized. Unstructured content resides in containers like .doc, .ppt, .tiff and .html, and you must have the right software application to read or edit it.
3. Managing unstructured content is hard. Because content resides in containers, it is hard to know what is in each one without opening it.
4. XML is crucial for reuse and sharing. Sometimes called an atomic or neutral format, XML is a language used to transmit content without the burden of the container. Neutral content can then be "poured" into any template (Word, Web, PDF, even mobile apps) for easier repurposing. If you have unstructured content (and most likely you have lots of it), it should be stored in an XML format.
5. Good metadata is essential. Once content is in an XML format, enrich it with semantic metadata: contextually relevant information about the content, such as concepts, people, places and organizations. Having a strong taxonomy can assist with embedding semantic metadata. For example, this article is obviously about XML, but it can also be classified as "Tools for Publishing." Creating semantic metadata (manually, by machine and ideally both) is a critical step so editors and readers can find the most contextually relevant information.
6. Native XML databases provide agility and efficiencies. Relational databases (RDBMS) are great for organizing and querying structured data, while XML databases rock for unstructured content. You can make an RDBMS work with XML, but you will lose a lot in database performance (Forrester estimates upwards of 30 percent). Heck, I will use a knife to tighten a screw, but sometimes I need to go and get the Phillips-head.
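
For readers who want to see points 4 and 5 in practice, here is a minimal sketch in Python, not a definitive implementation. It assumes a made-up XML layout (the article, metadata, concept, body and para element names are hypothetical, not a publishing standard) and uses only Python's standard library. One neutral XML record is "poured" into two different outputs, and the embedded metadata is used to find the article by concept. The lookup at the end is a plain in-memory filter that stands in for the kind of query an XML database would run; it is not an XML database.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical neutral-format article. The element names are illustrative
# assumptions, not a real publishing schema.
ARTICLE_XML = """
<article id="a1">
  <title>6 Things You Need to Know About Unstructured Content</title>
  <metadata>
    <concept>XML</concept>
    <concept>Tools for Publishing</concept>
  </metadata>
  <body>
    <para>Unstructured content is everywhere.</para>
    <para>XML is crucial for reuse and sharing.</para>
  </body>
</article>
"""


def to_html(article):
    """Pour the neutral content into a simple Web template."""
    title = html.escape(article.findtext("title"))
    paras = "".join(
        f"<p>{html.escape(p.text)}</p>" for p in article.iter("para")
    )
    return f"<h1>{title}</h1>{paras}"


def to_plain_text(article):
    """Pour the same content into a plain-text template."""
    title = article.findtext("title")
    lines = [title.upper()] + [p.text for p in article.iter("para")]
    return "\n".join(lines)


def has_concept(article, concept):
    """Check the semantic metadata for a given concept label."""
    return any(c.text == concept for c in article.iterfind("metadata/concept"))


# Parse once, then reuse the same record for every output and lookup.
article = ET.fromstring(ARTICLE_XML.strip())

if __name__ == "__main__":
    print(to_html(article))
    print(to_plain_text(article))
    print(has_concept(article, "Tools for Publishing"))
```

The point of the sketch is that the content is written once and never tied to Word, the Web or any other container. Adding a new output means adding one more small function, and the metadata travels with the content wherever it goes.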