Volume 8 / Issue 6

DOI: 10.3217/jucs-008-06-0581


Automated Retrieval of Information in the Internet by Using Thesauri and Gazetteers as Knowledge Sources

Wolf-Fritz Riekert (University of Applied Sciences Stuttgart - School of Media, Germany)

Abstract: An immense number of information resources on the Internet can be used free of charge, and many knowledge workers try to draw on this information in their daily tasks. Nevertheless, it is very hard to find the relevant information on the Internet with the full-text retrieval techniques offered by most existing search engines.

This paper demonstrates that thesauri, which have long been used in established online retrieval systems, also open up new methods for the automated search for information on the Internet. In addition, thesaurus-like structures known as gazetteers allow the geographical references of information resources to be handled very effectively. The knowledge represented in thesauri and gazetteers can be used to process a variety of thematic and geographical queries and to retrieve the information of interest from the Internet. Convenient ways of specifying queries can be offered to users, e.g., by navigating a hierarchical tree of descriptors, by using synonymous, related, or foreign-language terms rather than fixed elements of a controlled vocabulary, or by indicating a geographical region of interest on a map.

In addition to the general principles, the paper presents examples of powerful query processors and advanced user interfaces that demonstrate effective use of the knowledge stored in thesauri and gazetteers. The implemented solutions turn out to be considerably more convenient than the "black box search" offered by most existing library catalogs and Internet search engines.
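
The query techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Descriptor` structure, the functions `resolve`, `expand`, and `places_in`, the sample terms, and the approximate bounding boxes are all assumptions made for illustration. The sketch shows how a free-text term (a synonym or foreign-language word) might be mapped to a controlled descriptor, how a descriptor might be expanded along the hierarchy to its narrower terms, and how a gazetteer might answer a region-of-interest query by bounding-box intersection.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """One controlled-vocabulary term with its relations (hypothetical structure)."""
    name: str
    synonyms: set[str] = field(default_factory=set)    # synonymous or foreign-language terms
    narrower: list[str] = field(default_factory=list)  # names of child descriptors


# Toy thesaurus; every term here is illustrative and not taken from the paper.
THESAURUS = {
    "water": Descriptor("water", {"wasser", "h2o"}, ["groundwater", "surface water"]),
    "groundwater": Descriptor("groundwater", {"grundwasser"}),
    "surface water": Descriptor("surface water", set(), ["river"]),
    "river": Descriptor("river", {"fluss", "stream"}),
}

# Toy gazetteer: place name -> (min_lon, min_lat, max_lon, max_lat).
# The coordinates are rough, assumed values for illustration only.
GAZETTEER = {
    "Stuttgart": (9.04, 48.69, 9.32, 48.87),
    "Ulm": (9.90, 48.33, 10.05, 48.45),
    "Hamburg": (9.73, 53.39, 10.33, 53.74),
}


def resolve(term: str) -> str | None:
    """Map a free-text term to its descriptor name, or None if unknown."""
    key = term.strip().lower()
    for descriptor in THESAURUS.values():
        if key == descriptor.name or key in descriptor.synonyms:
            return descriptor.name
    return None


def expand(term: str) -> set[str]:
    """Return the matching descriptor plus all of its narrower terms, transitively."""
    root = resolve(term)
    if root is None:
        return set()
    found: set[str] = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name not in found:
            found.add(name)
            stack.extend(THESAURUS[name].narrower)
    return found


def places_in(region: tuple[float, float, float, float]) -> list[str]:
    """Return gazetteer places whose bounding box intersects the given region."""
    min_lon, min_lat, max_lon, max_lat = region
    hits = []
    for place, (p_min_lon, p_min_lat, p_max_lon, p_max_lat) in GAZETTEER.items():
        overlaps_lon = p_min_lon <= max_lon and min_lon <= p_max_lon
        overlaps_lat = p_min_lat <= max_lat and min_lat <= p_max_lat
        if overlaps_lon and overlaps_lat:
            hits.append(place)
    return sorted(hits)
```

In such a design, a query for "Wasser" would be resolved to the descriptor "water" and expanded to all narrower descriptors before being matched against indexed resources, while a rectangle drawn on a map would be translated into the set of place names returned by `places_in`.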

Keywords: Gazetteer, Internet, Thesaurus, information retrieval

Categories: H.3.3