Searching ... in a Web
Ian H. Witten (University of Waikato, New Zealand)
Abstract: Search engines ("web dragons") are the portals through which we access society's treasure trove of information. They do not publish the algorithms they use to sort and filter information, yet what they do and how they do it are amongst the most important questions of our time. They do not just deal with information per se, but evaluate it in order to prioritize it for the user. To do this they assess the prestige of each web page in terms of who links to it. This article explains in non-technical terms what is known about how web search engines work. We describe the dominant way of measuring prestige, relating it to the experience of a surfer condemned to click randomly around the web forever, and also to standard techniques of bibliometric evaluation. We review alternatives: some strive to identify subcommunities of the web; others learn from implicit user feedback. We also take a critical look at how people use search engines, and identify issues of bias, privacy, and personalization that crucially affect our world of information today.
Keywords: PageRank, information ethics, personalization, privacy, search bias, search engines, web search
Categories: H.3.3, K.4.0, K.4.1, K.4.2