Database Selection Using Document and Collection Term Frequencies

Report
Authors:
Srinivasa, Rashmi, Department of Computer Science, University of Virginia
Phan, Tram, Department of Computer Science, University of Virginia
Mohanraj, Nisanti, Department of Computer Science, University of Virginia
Powell, Allison, Department of Computer Science, University of Virginia
French, Jim, Department of Computer Science, University of Virginia
Abstract:

We examine the impact of two types of information, document frequency (df) and collection term frequency (ctf), on the effectiveness of database selection. We introduce a family of database selection algorithms based on this information and compare their effectiveness with that of two existing database selection approaches, CORI and gGlOSS. We demonstrate that a simple selection algorithm that uses only document frequency information is more effective than gGlOSS and achieves effectiveness very close to that of CORI.

Rights:
All rights reserved (no additional license for public reuse)
Language:
English
Source Citation:

Srinivasa, Rashmi, Tram Phan, Nisanti Mohanraj, Allison Powell, and Jim French. "Database Selection Using Document and Collection Term Frequencies." University of Virginia Dept. of Computer Science Tech Report (2000).

Publisher:
University of Virginia, Department of Computer Science
Published Date:
2000