Category Value
Available via http://dbpubs.stanford.edu/pub/2000-5
Submitted on 26th of February 2000
Author Paepcke, A.; Garcia-Molina, H.; Rodriguez-Mula, G.; Cho, J.
Title Beyond Document Similarity: Understanding Value-Based Search and Browsing Technologies
Date of publication 2000
Citation A. Paepcke, H. Garcia-Molina, G. Rodriguez-Mula, J. Cho: Beyond Document Similarity: Understanding Value-Based Search and Browsing Technologies. SIGMOD Records, 29(1): March 2000
Language English
Project Digital Libraries
Type Conference or Journal Paper
Subject group Digital Libraries
Abstract In the face of small, one or two word queries, high volumes of diverse documents on the Web are overwhelming search and ranking technologies that are based on document similarity measures. The increase of multimedia data within documents sharply exacerbates the shortcomings of these approaches. Recently, research prototypes and commercial experiments have added techniques that augment similarity-based search and ranking. These techniques rely on judgments about the 'value' of documents. Judgments are obtained directly from users, are derived by conjecture based on observations of user behavior, or are surmised from analyses of documents and collections. All these systems have been pursued independently, and no common understanding of the underlying processes has been presented. We survey existing value-based approaches, develop a reference architecture that helps compare the approaches, and categorize the constituent algorithms. We explain the options for collecting value metadata, and for using that metadata to improve search, ranking of results, and the enhancement of information browsing. Based on our survey and analysis, we then point to several open problems.
Keywords Information retrieval, information filters, metadata, relevance, World-Wide Web, search engines, ranking, links, hypertext, collaborative filtering
Fulltext source
  • Postscript (ps, ps.gz, ps.zip)
  • PDF (pdf, pdf.gz, pdf.zip)
  • Plain text (text, text.gz, text.zip)
  • Management of the document by pubs@db.stanford.edu

    Stanford InfoLab Publication Server