On the Update of Term Weights in Dynamic Information Retrieval Systems
Report
Using the vector space information retrieval model, we show that updating term weights under document insertions is computationally expensive for weighting schemes that use collection statistics and normalization by document vector lengths. In the dynamic setting, we argue that strict adherence to such schemes is impractical and unnecessary as long as retrieval effectiveness remains commensurate with that of strict adherence. Experiments using standard test collections as a source of document insertions support this argument. These experiments indicate that term weights may drift from their mathematically defined values without a serious loss of retrieval effectiveness. The only problematic setting arises when newly inserted documents contain new terms. Ignoring these terms can degrade retrieval effectiveness.
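To illustrate why strict updates are costly, here is a minimal sketch in Python. It is not taken from the report: the weighting scheme (term frequency times log(N/df), with cosine length normalization) and the function names `weights` and `affected_by_insert` are assumptions chosen for illustration, since the abstract does not name a specific scheme. The sketch recomputes strict weights before and after one insertion and reports which existing documents changed.

```python
import math
from collections import Counter


def weights(docs):
    """Cosine-normalized tf-idf vectors for a list of token lists.

    Assumed scheme (illustrative only): weight = tf * log(N / df),
    then each document vector is divided by its Euclidean length.
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    vecs = []
    for d in docs:
        tf = Counter(d)
        raw = {t: tf[t] * math.log(n / df[t]) for t in tf}
        # Guard against an all-zero vector (every term occurs in every document).
        norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
        vecs.append({t: w / norm for t, w in raw.items()})
    return vecs


def affected_by_insert(docs, new_doc):
    """Indices of existing documents whose strict weights change on insertion."""
    before = weights(docs)
    after = weights(docs + [new_doc])[: len(docs)]
    return [
        i
        for i, (b, a) in enumerate(zip(before, after))
        if any(not math.isclose(b[t], a[t], rel_tol=1e-9, abs_tol=1e-12) for t in b)
    ]


if __name__ == "__main__":
    collection = [["cat", "dog"], ["dog", "fish"], ["bird"]]
    print(affected_by_insert(collection, ["dog", "emu"]))
```

Under this assumed scheme, a single insertion changes N and therefore the idf of every term, so the stored weights of the multi-term documents change even where they share no term with the new document. The single-term document is unaffected only because length normalization cancels its one weight. The term `emu` is an example of a new term that no existing statistic covers.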
Note: Abstract extracted from PDF file via OCR
All rights reserved (no additional license for public reuse)
Viles, CL, and JC French. "On the Update of Term Weights in Dynamic Information Retrieval Systems." University of Virginia Dept. of Computer Science Tech Report (1995).
University of Virginia, Department of Computer Science