Dissemination of Collection Wide Information in a Distributed Information Retrieval System
Report
We find that dissemination of collection wide information (CWI) in a distributed collection of documents is needed to achieve retrieval effectiveness comparable to that of a centralized collection, but that complete dissemination is unnecessary. The required dissemination level depends on how documents are allocated among sites: a low level suffices when documents are allocated randomly, whereas higher levels are needed when documents are allocated based on content. We define parameters to control dissemination and document allocation and present results from four test collections. We also define the notion of iso-knowledge lines with respect to the number of sites and the level of dissemination in the distributed archive, and show empirically that iso-knowledge lines are also iso-effectiveness lines when documents are randomly allocated.
All rights reserved (no additional license for public reuse)
Viles, Charles, and James French. "Dissemination of Collection Wide Information in a Distributed Information Retrieval System." University of Virginia Dept. of Computer Science Tech Report (1995).
University of Virginia, Department of Computer Science