The Semantic Web: Taxonomies vs. ontologies

"The Semantic Web: Differentiating Between Taxonomies and Ontologies." Online. 26 n4 (July/August 2002): 20.

    Computer scientists--along with librarians--are working to solve problems of information retrieval and the exchange of knowledge between user groups. Ontologies or taxonomies are important to a number of computer scientists by facilitating the sharing and reuse of digital information.
Katherine Adams' article in ONLINE (ironically, not available online) talks about the Semantic Web and the subtle difference in the approaches that computer science (CS) and library and information science (LIS) have taken toward making information findable using structured hierarchical vocabularies -- ontologies for CS and taxonomies for LIS.

The article generalizes one difference between CS and LIS by saying that "software developers focus on the role ontologies play in the reuse and exchange of data while librarians construct taxonomies to help people locate and interpret information". Both hopefully remain focussed on the end result of making data findable and usable.

    Some of the traditional skills of librarianship--thesaurus construction, metadata design, and information organization--dovetail with this next stage of Web development. Librarians have the skills that computer scientists, entrepreneurs, and others are looking for when trying to envision the Semantic Web. However, fruitful exchange between these various communities depends on communication.
    Commonalities exist--as do differences--between librarians who create taxonomies and computer scientists who build ontologies. Mapping concepts, skills, and jargon between computer scientists and librarians encourages collaboration.
I'm quoting a few large blocks from the article because they're probably important for us to read (fair use!). One of the sections discusses differing views on inheritance, and the last discusses topic maps.

    DIFFERENT POINTS OF EMPHASIS: INHERITANCE

    In general, those in computer science (CS) are concerned with how software and associated machines interact with ontologies. Librarians are concerned with how patrons retrieve information with the aid of taxonomies. Software developers and artificial intelligence scholars see hierarchies as logical structures that help machines make decisions, but for library science workers these information structures are about mapping out a topic for the benefit of patrons. For librarians, taxonomies are a way to facilitate certain types of information-seeking behavior. It would be a mistake to overemphasize this point since one can point to usability experts in the CS camp who advocate user-centered Web design or librarians who are fascinated with cataloging theory to the exclusion of flesh-and-blood patrons. Yet, as an overarching generalization, software developers focus on the role ontologies play in the reuse and exchange of data while librarians construct taxonomies to help people locate and interpret information.

    This difference is illustrated by the concept of inheritance. Computer scientists build hierarchies with an eye toward inheritance, one of the most powerful concepts in software development. Machines can correctly understand a number of relationships among entities by assigning properties to top classes and then assuming subclasses inherit these properties. For example, if Ricky Martin is a type of "Pop Star" in a hierarchy marked "Singers," then a software program can make assumptions about Mr. Martin even if the details of his biography are not explicitly known. An ontology may express the rule, "If an entertainer has an agent or a business manager and released an album last year, then assume he or she has a fan club." A program could then readily deduce, for example, that Ricky Martin has a fan club and process information accordingly. Inference rules give ontologies a lot of power. Software doesn't truly understand the meaning of any of this information, but inference rules allow computers to effectively use language in ways that are significant to the human users.

    By contrast, librarians think of inheritance in terms of hierarchical relationships and information retrieval for patrons. Taking the example above, the importance of the taxonomy rests in its ability to educate patrons. Someone who's been tuned out of popular culture might use the Pop Star hierarchy to learn the identities of singers who are currently in vogue. A searcher could also uncover the various types of Pop Stars that exist in mass culture: Singers, Movie Stars, Television Stars, Weight-Loss Gurus, Talk Show Hosts, etc. Finally, a patron could hop from one synonym to another--from "Singer" to "Warbler" to "Vocalist"--and discover associative relationships that exist within this category.
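
A rough Python sketch of the CS view of inheritance described above. It is not taken from the article: the class names follow the quoted example, but the properties, the helper functions, and the facts asserted about Ricky Martin are hypothetical and chosen only for illustration.

```python
# Hypothetical class hierarchy: child -> parent (None marks the root).
PARENT = {
    "Entertainer": None,
    "Singer": "Entertainer",
    "Pop Star": "Singer",
}

# Properties are asserted once, on the highest class they apply to.
# These property names are assumptions made up for this sketch.
CLASS_PROPERTIES = {
    "Entertainer": {"performs_for_audience": True},
    "Singer": {"records_music": True},
}


def inherited_properties(cls):
    """Collect properties from cls and every ancestor above it."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PARENT[cls]
    props = {}
    for ancestor in reversed(chain):  # root first, so subclasses can override
        props.update(CLASS_PROPERTIES.get(ancestor, {}))
    return props


def is_a(cls, ancestor):
    """True if cls is the ancestor class or one of its descendants."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT[cls]
    return False


def has_fan_club(individual):
    """The quoted rule: an entertainer with an agent or a business manager
    who released an album last year is assumed to have a fan club."""
    return bool(
        is_a(individual["type"], "Entertainer")
        and (individual.get("has_agent") or individual.get("has_business_manager"))
        and individual.get("released_album_last_year")
    )


# Illustrative facts only; nothing here is real biographical data.
ricky = {
    "name": "Ricky Martin",
    "type": "Pop Star",
    "has_agent": True,
    "released_album_last_year": True,
}

print(inherited_properties(ricky["type"]))
print(has_fan_club(ricky))
```

Nothing about recording music or fan clubs is stored on the individual record. The program derives both from the hierarchy and the rule, which is the kind of machine-side deduction the article attributes to ontologies.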

    TOPIC MAPS AS NEW WEB INFRASTRUCTURE

    Topic maps are closely related to the Semantic Web and point the way to the next stage of the Web's development. Topic maps hold out the promise of extending nimble-fingered distinctions to large collections of data. Topic maps are navigational aids that stand apart from the documents themselves. While topic maps do not include intelligent agents, other aspects of this technology--metadata, vocabularies, and hierarchies--fit well within the Semantic Web framework. According to Steve Pepper, senior information architect for Infostream in Oslo, Norway, in "The TAO of Topic Maps: Find the Way in the Age of Infoglut", his presentation at IDEAlliance's XML Europe 2000 conference, topic maps are important because they represent a new international standard (ISO 13250). Topic maps function as a super-sophisticated system of taxonomies, defining a group of subjects and then providing hypertext links to texts about these topics. Topic maps lay out a structured vocabulary and then point to documents about those topics. Even OCLC is looking to topic maps to help its project of organizing the Web by subject.

    An important advantage of topic maps is that Web documents do not have to be amended with metadata. While HTML metatags are embedded in the documents described, topic maps are information structures that stand apart from information resources. Topic maps can, therefore, be reused and shared between various organizations or user groups and hold great promise for digital libraries and enhanced knowledge navigation among diverse electronic information sources.
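
The "stands apart from the documents" idea can be sketched in a few lines of Python. This is a toy model, not the ISO 13250 data model or any real topic map library: the `Topic` and `TopicMap` classes, their method names, and the example.org URLs are all hypothetical. It borrows only the three notions in Pepper's title (topics, associations, occurrences).

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    synonyms: list = field(default_factory=list)
    # Occurrences are links to documents; the documents are never modified.
    occurrences: list = field(default_factory=list)


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)
    associations: list = field(default_factory=list)  # (source, relation, target)

    def add_topic(self, name, synonyms=()):
        self.topics[name] = Topic(name, list(synonyms))

    def add_occurrence(self, name, url):
        self.topics[name].occurrences.append(url)

    def associate(self, source, relation, target):
        self.associations.append((source, relation, target))

    def find(self, term):
        """Return document links for a topic, matched by name or synonym."""
        wanted = term.lower()
        for topic in self.topics.values():
            names = [topic.name] + topic.synonyms
            if wanted in (n.lower() for n in names):
                return list(topic.occurrences)
        return []

    def related(self, name):
        """Return (relation, target) pairs leading out of a topic."""
        return [(rel, tgt) for src, rel, tgt in self.associations if src == name]


tm = TopicMap()
tm.add_topic("Singer", synonyms=["Vocalist", "Warbler"])
tm.add_topic("Pop Star")
tm.associate("Pop Star", "related to", "Singer")
tm.add_occurrence("Singer", "https://example.org/docs/singers.html")  # placeholder URL

print(tm.find("warbler"))
print(tm.related("Pop Star"))
```

Because the map holds only names, relationships, and pointers, the same structure could in principle be handed to another organization and aimed at a different set of documents, which is the reuse argument the article makes.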

Other articles mentioned:
Tim Berners-Lee, "The Semantic Web," Scientific American, May 2001.

Natalya Fridman Noy and Deborah L. McGuinness. "Ontology Development 101: A Guide to Creating Your First Ontology," Knowledge Systems Laboratory Stanford University, March 2001.

Tom Gruber, "What is an Ontology," [September 2001].

Steve Pepper, "The TAO of Topic Maps: Find the Way in the Age of Infoglut," XML Europe 2000.

Talking to the CS folks

This was a great entry. This article is a great intersection of the two camps. Hopefully we can get some CS folks to read it as well and perhaps get on the same page and really start to innovate in this space.

I look forward to the great products that come out of this intersection :)

@name: Madonnalisa G. Chan
@label: Information Architect