Effective and Efficient Automatic Database Selection

Authors: French, James, Department of Computer Science, University of Virginia; Powell, Allison, Department of Computer Science, University of Virginia; Callan, Jamie, Department of Computer Science, University of Virginia

We examine a class of database selection algorithms that require only document frequency information. The CORI algorithm is an instance of this class. In previous work, we showed that CORI is more effective than gGlOSS when evaluated against a relevance-based standard. In this paper, we introduce a family of other algorithms in this class and examine components of these algorithms and of the CORI algorithm to begin identifying the factors responsible for their performance. We establish that the class of algorithms studied here is more effective and efficient than gGlOSS and is applicable to a wider variety of operational environments. In particular, this methodology is completely decoupled from the database indexing technology, so it is as useful in heterogeneous environments as in homogeneous environments.
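As a rough illustration of what "requiring only document frequency information" can look like in practice, the following minimal sketch ranks databases for a query using per-database document frequencies and word counts alone. It follows the commonly cited form of the CORI belief formula; the constants (50, 150, and the default belief 0.4), the function names, and the data layout are assumptions for illustration and are not taken from this abstract or confirmed as the authors' implementation.

```python
import math

def cori_style_scores(query_terms, df_tables, word_counts, b=0.4):
    """Score each database using only document-frequency statistics.

    df_tables:   {db_name: {term: document frequency of term in that db}}
    word_counts: {db_name: total number of words in that db}
    b:           default belief (assumed value, not from the abstract)
    """
    if not query_terms or not df_tables:
        return {}
    n_dbs = len(df_tables)
    avg_cw = sum(word_counts.values()) / n_dbs
    scores = {}
    for db, table in df_tables.items():
        total = 0.0
        for term in query_terms:
            df = table.get(term, 0)
            # cf: number of databases that contain the term at all
            cf = sum(1 for t in df_tables.values() if t.get(term, 0) > 0)
            if cf == 0:
                total += b  # term unseen everywhere: fall back to default belief
                continue
            t_part = df / (df + 50 + 150 * word_counts[db] / avg_cw)
            i_part = math.log((n_dbs + 0.5) / cf) / math.log(n_dbs + 1.0)
            total += b + (1 - b) * t_part * i_part
        scores[db] = total / len(query_terms)
    return scores

def rank_databases(query_terms, df_tables, word_counts):
    """Return database names ordered from most to least promising."""
    scores = cori_style_scores(query_terms, df_tables, word_counts)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical statistics for three databases.
df_tables = {
    "A": {"retrieval": 100, "index": 10},
    "B": {"retrieval": 5},
    "C": {"cooking": 50},
}
word_counts = {"A": 1000, "B": 1000, "C": 1000}
ranking = rank_databases(["retrieval"], df_tables, word_counts)
```

Because the sketch consumes only term-level document frequencies and collection sizes, it never inspects how each database builds its index, which is the kind of decoupling from indexing technology that the abstract describes.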

French, James, Allison Powell, and Jamie Callan. "Effective and Efficient Automatic Database Selection." University of Virginia Dept. of Computer Science Tech Report (1999).

University of Virginia, Department of Computer Science
