Database Selection Using Document and Collection Term Frequencies
Report
We examine the impact of two types of information, document frequency (df) and collection term frequency (ctf), on the effectiveness of database selection. We introduce a family of database selection algorithms based on this information and compare their effectiveness to that of two existing database selection approaches, CORI and gGlOSS. We demonstrate that a simple selection algorithm that uses only document frequency information is more effective than gGlOSS and achieves effectiveness very close to that of CORI.
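The abstract does not give the scoring functions of the proposed family, so the following is only an illustrative sketch of what df-based database selection can look like, not the report's actual algorithm. It assumes each database exposes a table of per-term document frequencies and scores a database by summing the df values of the query terms. The function name `rank_databases_by_df`, the summation rule, and the sample data are all hypothetical.

```python
def rank_databases_by_df(query_terms, df_by_database):
    """Rank databases by the summed document frequency of the query terms.

    Assumption (not from the report): a database's score is the plain sum
    of df values for the query terms; terms absent from a database count 0.
    """
    scores = {
        name: sum(df_table.get(term, 0) for term in query_terms)
        for name, df_table in df_by_database.items()
    }
    # Highest score first; ties are broken by database name for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical df tables: term -> number of documents containing the term.
example_df = {
    "db_a": {"database": 120, "selection": 15},
    "db_b": {"database": 40, "selection": 60},
    "db_c": {"retrieval": 200},
}

ranking = rank_databases_by_df(["database", "selection"], example_df)
```

With these made-up tables, `ranking` lists `db_a` first (score 135), then `db_b` (100), then `db_c` (0). A ctf-based variant would have the same shape, with collection term frequencies in place of the df tables.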
All rights reserved (no additional license for public reuse)
Srinivasa, Rashmi, Tram Phan, Nisanti Mohanraj, Allison Powell, and Jim French. "Database Selection Using Document and Collection Term Frequencies." University of Virginia Dept. of Computer Science Tech Report (2000).
University of Virginia, Department of Computer Science