On the Update of Term Weights in Dynamic Information Retrieval Systems

Report
Authors: Viles, CL, Department of Computer Science, University of Virginia; French, JC, Department of Computer Science, University of Virginia
Abstract:

Using the vector space information retrieval model, we show that the update of term weights under document insertions is computationally expensive for weighting schemes that use collection statistics and normalization by document vector lengths. In the dynamic setting, we argue that strict adherence to such schemes is impractical and unnecessary, as long as retrieval effectiveness commensurate with strict adherence is attained. Experiments using standard test collections as a source of document insertions support this argument. These experiments indicate that term weights may drift from their mathematically defined values without a serious loss of retrieval effectiveness. The only problematic setting arises when newly inserted documents contain new terms: ignoring these terms can degrade effectiveness.
Note: Abstract extracted from PDF file via OCR
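
Illustrative sketch (not from the report): the cost described in the abstract can be seen in a small example. The code below assumes a plain tf-idf weighting with cosine (vector length) normalization, which is one common scheme of the kind the abstract mentions; the report's exact formulas are not given in this record. The function names `idf` and `weights` and the toy collection are hypothetical. Inserting a single document changes the collection statistics, so the strictly defined weights of an already indexed document change too, even though that document shares no terms with the new one.

```python
import math
from collections import Counter

def idf(term, docs):
    # Inverse document frequency, log(N / df); assumed form, for illustration only.
    n = len(docs)
    df = sum(1 for d in docs if term in d)
    return math.log(n / df) if df else 0.0

def weights(doc, docs):
    # tf * idf for each term, then normalization by the document vector length.
    tf = Counter(doc)
    raw = {t: tf[t] * idf(t, docs) for t in tf}
    norm = math.sqrt(sum(w * w for w in raw.values()))
    return {t: (w / norm if norm else 0.0) for t, w in raw.items()}

# Toy collection: each document is a list of terms.
collection = [
    ["term", "weight", "update"],
    ["term", "retrieval"],
    ["vector", "space", "retrieval"],
]

# Weights of the first document as stored at indexing time.
before = weights(collection[0], collection)

# Insert one new document that shares no terms with the first document.
collection.append(["vector", "insertion"])

# Strictly recomputed weights of the same, unchanged first document.
after = weights(collection[0], collection)

# Per-term gap between the stored (drifted) and strictly defined weights.
drift = {t: after[t] - before[t] for t in before}
```

Under strict adherence, every stored vector would have to be recomputed this way after each insertion. Keeping `before` in place corresponds to letting the weights drift, which the abstract argues is acceptable as long as retrieval effectiveness holds up.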

Rights:
All rights reserved (no additional license for public reuse)
Language:
English
Source Citation:

Viles, CL, and JC French. "On the Update of Term Weights in Dynamic Information Retrieval Systems." University of Virginia Dept. of Computer Science Tech Report (1995).

Publisher:
University of Virginia, Department of Computer Science
Published Date:
1995