

Organizing Information in the Digital World: Implications of Authority Control for Web Resources

With so many disparate assets, publications, and data on the Web, it’s a wonder we can find anything at all; and when we do find something we’re searching for, are we really sure we’ve exhausted all resources, made full use of search operators, and can trust our results? Factor in user tags and varying search algorithms and the Web begins to look like a tangled network of information. Employing authority control for Web resources can add a level of accuracy, completeness, and confidence to a user’s search.

For this paper I used a variety of articles on three distinct subject areas that I felt painted a complete view of the aspects of authority control and how it relates to Web resources: authority control for the purpose of bibliographic control on the Web, authority control as a model for organizing the Web, and authority control coupled with specific methodologies and coding of Web resources. I also elected to use some articles using the vision of the Semantic Web in an effort to be forward thinking.

To make clear the distinction between the Web and the Semantic Web: the current Web is like a giant text file in which you can search for instances of particular words, while the Semantic Web is like a database, where every item of information is categorized and new queries can combine categories in any imaginable way (Hardesty, 2010). On the Semantic Web, “everything is based on Resource Description Framework (RDF) triples which exhibit the simplest possible semantic structure” (Dunsire, 2008, p. 3). I bring up RDF because it is important in the discussion of authority control in Web resources. Although machines can quickly process many chains of RDF triples, they cannot ‘think’, so humans must still be involved in creating RDF resources.
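To make the triple idea concrete, here is a minimal sketch in plain Python. It is my own illustration, not something taken from Hardesty or Dunsire: the identifiers (person:twain, hasCreator, and so on) and the helper functions match and titles_by_name are hypothetical, and a real system would use an RDF library and proper URIs. It only shows how chaining simple subject, predicate, object statements lets a query combine categories.

```python
# Hypothetical triples (subject, predicate, object); identifiers are made up.
TRIPLES = [
    ("person:twain", "hasName", "Mark Twain"),
    ("book:huckfinn", "hasCreator", "person:twain"),
    ("book:huckfinn", "hasTitle", "Adventures of Huckleberry Finn"),
    ("book:tomsawyer", "hasCreator", "person:twain"),
    ("book:tomsawyer", "hasTitle", "The Adventures of Tom Sawyer"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return every triple that agrees with the given parts; None is a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def titles_by_name(triples, name):
    """Chain three patterns: name -> person -> work -> title."""
    titles = []
    for person, _, _ in match(triples, predicate="hasName", obj=name):
        for work, _, _ in match(triples, predicate="hasCreator", obj=person):
            for _, _, title in match(triples, subject=work, predicate="hasTitle"):
                titles.append(title)
    return titles
```

Calling titles_by_name(TRIPLES, "Mark Twain") walks from the name to the person, then to each work, then to each title. Searching for "Samuel Clemens" returns nothing, because no one has recorded a statement connecting that name to the person; the code follows the triples it is given but does not infer new ones.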

I wanted to get as broad a view as I could so I deliberately took the pains to find articles from sources both inside and outside the United States. As a general rule, I reviewed articles published within the past four years, but I did make some exceptions where I found relevant material.

Literature Reviews

UNIMARC, RDA and the Semantic Web – by G. Dunsire

Well-trained professional catalogers, combined with internationally agreed-upon authority control, are at the core of developing usable, easy-to-index, and relatable metadata and of making the Web data-driven, says the author.

Leaning on the International Federation of Library Associations and Institutions (IFLA), Dunsire (2010) suggests that professional catalogers should apply Resource Description and Access (RDA) as the content standard for metadata encoded in the Universal MARC Format (UNIMARC) to facilitate the exchange of bibliographic data on the Semantic Web.

This human-focused view is close to what needs to happen to help bridge the Web to a data-driven reality, but can we expect professional catalogers to be the only ones classifying Web resources, and can we expect UNIMARC to be the metadata standard and RDA the content standard? My hunch is that this will not be sustainable in the long term, but that is just speculation on my part. If you had asked me 15 years ago whether I thought I’d get my news from somewhere other than newspapers, magazines and television, I would have laughed at you. I do think that, despite the acceptance of RDA, the greater challenge here will be for the global population to agree universally on content and metadata standards – whether they’re a part of IFLA or not.

Authority Control of People and Organizations on the Semantic Web
by Kurki, J., & Hyvonen, E.

Kurki and Hyvonen (2009) discuss a bold and sweeping national initiative known as the Finnish General Upper Ontology. It is intended to be the main ontology in Finland, interlinking domain and instance ontologies, and is focused on using authority control for classifying individuals and organizations. It is based on the widely used Finnish General Thesaurus that is maintained by the National Library of Finland. This is part of the nine-year National Semantic Web Ontology Project in Finland that is due to close in 2012 (Finnish Ontology Library Service, 2010).

The manifestation of this initiative is a repository named ONKI People that is a service for finding and clarifying the identities of persons, groups, and organizations. The ontology is developed collaboratively between individuals and organizations and has a multifaceted search component and graph visualizer component (Kurki and Hyvonen, 2009).

This is a prime example of brave and visionary thinking – and it’s working. Although it’s limited in that it covers only brief name and organization information, it provides a solid foundation for focused experimentation and development. Having a national initiative that has the support of its citizenry is highly unusual and can only happen in an environment where there is a high level of literacy (Internet and reading) and ample availability of Internet access. This limits which countries or regions may be able to undertake such an initiative, but nonetheless provides a good framework for development.

Interestingly, this system fits in well with a concept proposed by Barbara B. Tillett nearly a decade ago in her speech titled ‘Authority Control on the Web’ at the Bicentennial Conference on Bibliographic Control for the New Millennium (2001). In her speech, she talks of linking existing authority record control numbers and other nations’ authority files to display uniform results for users (Tillett, 2001, p. 9).
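Tillett’s idea of linking control numbers across national authority files can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the file codes, control numbers, simplified headings, and the functions find_cluster and display_heading are all hypothetical, and neither Tillett nor ONKI People describes this exact structure.

```python
# Hypothetical national authority files: control number -> preferred heading.
# The control numbers and simplified headings are made up for illustration.
AUTHORITY_FILES = {
    "us": {"us-001": "Tolstoy, Leo"},
    "de": {"de-042": "Tolstoj, Lev"},
    "fi": {"fi-007": "Tolstoi, Leo"},
}

# Each cluster links the records that describe the same person,
# keyed by the authority file each record comes from.
LINKS = [
    {"us": "us-001", "de": "de-042", "fi": "fi-007"},
]


def find_cluster(links, file_code, control_number):
    """Return the cluster containing this record, or None if it is unlinked."""
    for cluster in links:
        if cluster.get(file_code) == control_number:
            return cluster
    return None


def display_heading(files, links, file_code, control_number, user_file):
    """Show the heading from the user's own authority file when a link exists."""
    cluster = find_cluster(links, file_code, control_number)
    if cluster is None or user_file not in cluster:
        # No link available: fall back to the heading as originally recorded.
        return files[file_code][control_number]
    return files[user_file][cluster[user_file]]
```

A record catalogued against the "de" file can then be shown to a user of the "us" file under the form that user expects, while a user whose file has no linked record simply sees the original heading.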

Supporting Name Authority Control in XML Metadata:
A Practical Approach at The University Of Tennessee – by M. Veve

There is an ongoing effort by libraries and other institutions to develop automated systems to execute routine tasks. Though this is possible in some instances, it is not yet possible to execute an automated process for name authority control in XML metadata. In this paper, Veve defines a system that “consists of a simple manual approach to extract and create name access points that effectively reduces time and research efforts by efficiently setting priorities, identifying critical descriptive areas in the digital transcriptions, and identifying the most appropriate biographical resources to consult” (2009, p. 42).

Finding ways to integrate these initiatives with existing mechanisms for name authority control in libraries can help to bring library catalogues into the mix of tools available on the Web (Harper & Tillett, 2007, p. 62).

Here again (as in the earlier Dunsire paper) we see an article with a call for the human factor – not machines – to serve an important role in authority control for Web resources.

Encoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web – by C.A. Harper

This paper discusses the importance of simplifying authority control systems, converting information to a model for data interchange, and utilizing the formatting style detailed in the Simple Knowledge Organization System (SKOS) project (The World Wide Web Consortium, 2010). According to Harper (2006), the Library of Congress Subject Headings Authority Records provide a ready-made framework that is suitable for making a useful and accurate search and retrieval system across the Web and its resources.

The author goes on to say that by paring down any existing MAchine-Readable Cataloging (MARC) or Metadata Authority Description Schema (MADS) records and translating them into Library of Congress Subject Headings Authority Records, we will have a uniform and easily classifiable system that will make the Web a data-driven and easy-to-index network (Harper, 2006, p. 4). He adds that using Library of Congress Subject Headings will give users and Web developers a system that has been honed over decades by librarians and information professionals.
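As a rough picture of what encoding a subject heading in SKOS involves, the sketch below turns one simplified authority record into SKOS triples. The SKOS property names (prefLabel, altLabel, broader) and the namespace come from the W3C vocabulary; everything else is my own assumption for illustration, including the dictionary layout of the record, the example.org identifiers, the sample values, and the function record_to_skos. It is not Harper’s actual conversion.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/subjects/"  # hypothetical namespace for headings

# A simplified, made-up authority record (not parsed from real MARC data).
SAMPLE_RECORD = {
    "id": "cats",
    "heading": "Cats",                 # the authorized heading
    "used_for": ["Domestic cats"],     # variant forms that point to it
    "broader": ["felidae"],            # ids of broader headings
}


def record_to_skos(record):
    """Translate one simplified authority record into a list of SKOS triples."""
    concept = BASE + record["id"]
    triples = [
        (concept, RDF_TYPE, SKOS + "Concept"),
        (concept, SKOS + "prefLabel", record["heading"]),
    ]
    for variant in record.get("used_for", []):
        triples.append((concept, SKOS + "altLabel", variant))
    for broader_id in record.get("broader", []):
        triples.append((concept, SKOS + "broader", BASE + broader_id))
    return triples
```

The point of the exercise is that the authorized heading, its variants, and its place in the hierarchy each become a separate statement that other Web resources can link to.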

I found this article to oversimplify existing content issues, authority control and Semantic Web formatting. I agree with parts and pieces, but as a general solution or roadmap to the development of the Semantic Web, it falls very short. Library of Congress Subject Headings are without question an excellent, valid, and reliable system for organizing information, but there is definitely more information that would need to be tagged in different ways, and MARC or MADS will probably not be sufficient for all resources. Resources such as program code or databases and their multiple iterations, for instance, would be hard to classify with LCSH alone.

Re-inventing Subject Access for the Semantic Web – by R.A. Franklin

This article deals with subject access and authority control as they relate to scholarly and academic research. The author forecasts the future of the Web as an entity that models “subject access with library science principles of bibliographic control and cataloging” (Franklin, 2003, p. 94) and adds that as we enter the next generation of scholarly research on the Web, the foundation, based on authority, is there (Franklin, 2003, p. 100).

As the development of the Web quickens, developers will need to “incorporate standards that are basic to the integrity of data” (Franklin, 2003, p. 94). Employing authority control, enabling interoperability and establishing thesauri are at the core of this development.

Although this is not the most recent article, the way the author ties the quality of academic and scholarly research to sound authority control on the Web is timeless. The article is seven years old, yet it remains a good reference point because little has changed in this area of research as it relates to the Web. There is, however, little relevance to the statements concerning the Web closely modeling library science principles of bibliographic control and cataloging.

Closing Thoughts

The implications of authority control for Web resources can be both positive and negative.

It’s absolutely necessary to develop a universal system of authority control, one that is mutually agreed upon and committed to, if we choose to maximize the resources on the Web. This is the only way to have a complete, uniform, and thorough indexing and retrieval system for Web resources. However, librarians and information management specialists are among the few who have not just the structural understanding of authority control, but also the theoretical understanding of what authority control is and how it benefits the organization of the panoply of resources on the Web. This knowledge also dovetails into realizing how critical authority control is in forming the bedrock for the ongoing development of the Semantic Web.

There will have to be a bridge across the various camps of authority control thought to make an effective marriage of authority control and the Web. Some embrace RDA and are loath to accept MARC; some of the MARC diehards see no use for RDA; Dublin Core holds promise, but some professionals find it too limiting; LCSH is tried and true, but does it offer enough classification schemes to make the Web as useful as it can possibly be? Without a united, formalized classification system, there will only be awkward steps toward interoperability among existing Web resources and, without interoperability, there will be no unification of terms and possibly little or no disambiguation.

Regarding metadata, current practices simply do not support name disambiguation. As the number of resources and types of metadata grows, so will this challenge, and so will the need for humans to use authority control to map Web resources.

The irony of my research is that while the benefits of authority control are many, without universal agreement on authority control any efforts may end up having adverse implications for Web resources. We may see the same frustrations and challenges currently going on in some camps of bibliographic control (MARC vs. RDA, UNIMARC, etc.) spill over to the organization of Web resources.

Bibliography

Dunsire, G. (2008). Said the spider to the fly: identity and authority in the Semantic Web. CILIP Cataloguing and Indexing Group Conference Programme (p. 2). Glasgow: University of Strathclyde.

Dunsire, G. (2010). UNIMARC, RDA and the Semantic Web. International Cataloguing and Bibliographic Control, 37-40.

Eckert, K., & Haffner, A. (n.d.). Use Case Authority Data Enrichment. Retrieved December 2, 2010, from The World Wide Web Consortium: http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Authority_Data_Enrichment

Finnish Ontology Library Service. (n.d.). ONKI2 :: Finnish general upper ontology – yso. Retrieved December 2, 2010, from Finnish Ontology Library Service ONKI: http://www.yso.fi/onki2/overview?o=http%3A%2F%2Fwww.yso.fi%2Fonto%2Fyso&l=en

Franklin, R. A. (2003). Re-inventing subject access for the Semantic Web. Online Information Review, 94-101.

Hardesty, L. (2010, June 22). Toward the semantic web. Retrieved December 3, 2010, from MIT News: http://web.mit.edu/newsoffice/2010/semantic-web-0622.html

Harper, C. A. (2006). Encoding Library of Congress Subject Headings in SKOS: authority control for the Semantic Web. DC-2006: Proceedings of the International Conference on Dublin Core, (p. 5). Manzanillo: Universidad de Colima.

Harper, C., & Tillett, B. (2007). Library of Congress controlled vocabularies and their application to the Semantic Web. Cataloging & Classification Quarterly, 43(3-4), 47-4.

Keizer, J. (n.d.). Use case FAO authority description concept scheme. Retrieved December 2, 2010, from The World Wide Web Consortium: http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_FAO_Authority_Description_Concept_Scheme

Kurki, J., & Hyvonen, E. (2009). Authority control of people and organizations on the Semantic Web. Helsinki: Semantic Computing Research Group.

Rapoza, J. (2007, June 4). Spinning the semantic web. eWeek, pp. 31-33.

RIF Developers. (n.d.). RIF Working Group. Retrieved December 3, 2010, from The World Wide Web Consortium: http://www.w3.org/2005/rules/wiki/RIF_Working_Group

The World Wide Web Consortium. (2010, December 1). SKOS: simple knowledge organization for the web. Retrieved from The World Wide Web Consortium: http://www.w3.org/2004/02/skos/

Tillett, B. B. (2001). Authority Control on the Web. Proceedings of the Bicentennial Conference on Bibliographic Control for the New Millennium (p. 12). Washington, DC: Library of Congress.

Veve, M. (2009). Supporting name authority control in XML metadata: A practical approach at the University of Tennessee. Library Resources & Technical Services, 41-52.

Posted in Architecture, Library, Syracuse University.


Excellus BlueCross BlueShield is CLUELESS

So I get this thoughtful (NOT) mailing from Excellus Blue Cross Blue Shield advising that, now that I’ve turned 65, I may be eligible for Medicare. Too bad they’re about 30 years off. I wonder how many trees they wasted sending this out to incorrectly aged recipients.

Posted in Uncategorized.


Big in Japan…REALLY big

McDonald’s is introducing the equally mammoth ‘Idaho’, ‘Manhattan’, ‘Miami’, and ‘Texas 2’ burgers (among others) in Japan. These things are huge and have some pretty varied and unexpected toppings (tortilla chips and hash browns, to name a couple). The WSJ has the full story here.

Should I (or you) find the general chumminess (check out the picture on the WSJ site) between McDonald’s Japan President and the U.S. Agriculture Trade Office’s Japan chief disturbing?

Posted in Uncategorized.


Circadian rhythm test results

Not sure who’s interested in my circadian rhythm test results, but you’re welcome to look. If you’d like to test yourself, Philips has the questionnaire.

Posted in Uncategorized.


Good vs. evil – both experiencing downturn; good losing more quickly

WOW. While both are in a downturn (in English at least), good is making marked, steady gains over evil in Russian. (At least, that is how often those words have occurred in a corpus of books (e.g., “British English”, “English Fiction”, “French”) over the selected years.)

Posted in Uncategorized.



