One of the problems I have with traditional media on the web is that their articles are informational dead ends. Many articles from traditional media, especially newspapers, that have been moved to the web do not link to anything else. Such an article implicitly claims to be the last bit of information I need, but that is very seldom the case. An article is about something, and I would like to be able to get to that something from the article, but I cannot do so directly.

This dovetails with a conversation I’ve been having with myself, especially after re-acquainting myself with Clay Shirky’s Ontology is Overrated: Categories, Links, and Tags. I ran into Shirky’s essay again because I’ve been thinking about the list extensions to RSS and OPML.

RSS and OPML both enforce a hierarchical relationship between elements. To some extent, I feel this is a problem with XML as well. The structure implies a relationship that is bounded by a single-meaning link. By single-meaning, I mean that the only context for the relationship is “this thing follows this other thing” in the chain of information. Certainly, these relationships can result in recursive links if the next page, or somewhere else along the chain, links back to a previous node. However, there’s no other kind of link.

Some pages, especially blogs, have attempted to qualify links with sections that collect backward links and referrers, but these have to be explained in the context of the content, not by the links themselves. So one has to determine the relationship from context, which is inefficient and lossy.

The extension to Shirky’s observation about ontologies might be that a useful next step would be a thesaurus to describe links. By this I mean that a link could express specific information about the relationship, such as that one page is “more specific” than the other, or “more general” than it, and so on.

Introduction to thesauri, ISO-2788, LCSH offers an example of these relationships:

“The relationships specified by the standard, and their abbreviations are presented below:

SN Scope note; a note attached to a term to indicate its meaning within an indexing language

USE The term that follows the symbol is the preferred term when a choice between synonyms or quasi-synonyms exists

UF Use for; the term that follows the symbol is a non-preferred synonym or quasi-synonym

TT Top term; the term that follows the symbol is the name of the broadest class to which the specific concept belongs; sometimes used in the alphabetical section of a thesaurus

BT Broader term; the term that follows the symbol represents a concept having a wider meaning

NT Narrower term; the term that follows the symbol refers to a concept with a more specific meaning

RT Related term; the term that follows the symbol is associated, but is not a synonym, a quasi-synonym, a broader term or a narrower term”

When I was thinking about developing relationship terms to provide a thesaurus for links between community asset records, I thought of similar options but included the notion of human-understandable, free-text descriptions that co-exist with the specific terms. For example, one might represent the second record’s relationship to the first as “more general” but also provide a free-form description: article two “offers more information about the topic in” article one. Certainly, all of this would be optional on the link.
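The pairing of a controlled term with an optional free-text description can be sketched as a small data structure. This is only an illustration: the names `Relation` and `TypedLink`, the record identifiers, and the choice to map “more general” onto BT are all my assumptions, not part of any standard or existing tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Relation(Enum):
    """Relationship terms taken from the thesaurus abbreviations quoted above.

    SN (scope note) is left out because it annotates a single term
    rather than relating two of them.
    """
    USE = "use"
    UF = "use for"
    TT = "top term"
    BT = "broader term"
    NT = "narrower term"
    RT = "related term"


@dataclass(frozen=True)
class TypedLink:
    """A link between two records, qualified by a controlled term."""
    source: str                 # hypothetical record identifier
    target: str                 # hypothetical record identifier
    relation: Relation          # the controlled thesaurus term
    note: Optional[str] = None  # optional free-text description

    def describe(self) -> str:
        # Prefer the human-written note; fall back to the bare term.
        if self.note:
            return f"{self.source} {self.note} {self.target} [{self.relation.name}]"
        return f"{self.source} [{self.relation.name}] {self.target}"


# Assumption: article two is the "more general" (broader) record.
link = TypedLink(
    source="article-two",
    target="article-one",
    relation=Relation.BT,
    note="offers more information about the topic in",
)
print(link.describe())
```

Because the note is optional, a link that carries only the controlled term still describes itself, which matches the idea that none of this should be required.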

There is, in fact, already a way to express link relationships in the HTML specification, as described in LINK – Document Relationship, but it is not granular to anchor tags and only provides information about the entire HTML document.

And now I feel pretty silly, because there is the option to place the same information in the actual anchor tag, as described in A – Anchor. However, the options for link types do not include useful entries for describing the relationship between documents outside of a local set. The link types include things such as index, toc, next and previous. These are not useful for describing thesaurus relationships between different sets of documents. The document does point out that authors may use other link types.
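To see what author-defined link types on anchors might look like in practice, here is a minimal sketch that reads the `rel` attribute from anchor tags using Python’s standard `html.parser` module. The link types `broader` and `narrower` are made up for this example and are not defined by the HTML specification; the class name `RelLinkCollector` and the example.org URLs are likewise hypothetical.

```python
from html.parser import HTMLParser


class RelLinkCollector(HTMLParser):
    """Collect (href, [link types]) for every anchor that carries a rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel = attributes.get("rel")
        if href and rel:
            # rel holds a space-separated list of link types.
            self.links.append((href, rel.lower().split()))


# "broader" and "narrower" are invented link types used only for illustration.
page = (
    '<p>See <a href="https://example.org/file-formats" rel="broader">file formats</a> '
    'and <a href="https://example.org/opml-2" rel="narrower next">OPML 2.0</a>, '
    'or <a href="https://example.org/plain">an untyped link</a>.</p>'
)

collector = RelLinkCollector()
collector.feed(page)
for href, types in collector.links:
    print(href, types)
```

The untyped link is skipped, which illustrates the point: an ordinary anchor says nothing about the relationship, while a typed one can be read by a tool without guessing from context.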

Bringing the elocution safari back home, a de facto ontology based on simple links is not nearly as interesting as one that actually expresses the nature of the relationship. However, a de facto ontology of simple, undifferentiated links has the advantage of not forcing the linear hierarchy inherent in RSS lists or OPML.

One of the productivity tips on concept mapping I have heard is to provide text on the link which makes a sentence. For example, if I have two nodes, I might write something along the linking line which creates a useful sentence: [OPML] — is a file format for –> [OUTLINERS]. Being able to provide this kind of description might allow for more robust visualization, such as the way that Omnigroup’s OmniGraffle 4 can now automatically display outlines as graphical maps.
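The sentence-on-the-link idea can be sketched as a list of labeled edges, where each edge reads as a sentence. This is a rough illustration only: `LabeledLink`, `as_sentence`, and `outgoing` are hypothetical names, and the two extra edges beyond the OPML example are my own additions.

```python
from typing import List, NamedTuple


class LabeledLink(NamedTuple):
    """One edge of a concept map: source node, linking phrase, target node."""
    source: str
    label: str
    target: str


def as_sentence(link: LabeledLink) -> str:
    """Render an edge so that the nodes and the label read as a sentence."""
    return f"[{link.source}] {link.label} [{link.target}]"


def outgoing(links: List[LabeledLink], node: str) -> List[str]:
    """Return the sentences for every edge that starts at the given node."""
    return [as_sentence(link) for link in links if link.source == node]


links = [
    LabeledLink("OPML", "is a file format for", "OUTLINERS"),
    LabeledLink("OPML", "is written in", "XML"),  # added for illustration
    LabeledLink("RSS", "is written in", "XML"),   # added for illustration
]

for sentence in outgoing(links, "OPML"):
    print(sentence)
```

Unlike an outline, nothing here forces a single parent: both OPML and RSS point at XML, and a visualization tool could draw that as a graph rather than a tree.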

The notion of describing the nature of the relationship implied by a link, especially a link to an external document, seems to be a useful next step in allowing strong user-created ontologies to be constructed. Even if the description is free-form, there would be advantages.

Tools like Technorati, Flickr, etc. use tags to describe content, but there’s just no robust way to describe the relationships. I feel attracted to the notion of having a controlled thesaurus to describe these, but perhaps the lesson from LiveJournal and Technorati is that the thesaurus develops on its own as people adopt terms used by others when they wish to do so. Either way, there should be such a thing.