Experiments on the Efficiency of Cluster Searches
Abstract
The efficiency of various cluster-based retrieval (CBR) strategies is analyzed. The possibility of combining CBR and inverted index search (IIS) is investigated. A method for combining the two approaches is proposed and shown to be cost-effective in terms of paging and CPU time. The observations show that the new method is much more efficient than conventional approaches. In the experiments, the effects of the number of selected clusters, centroid length, page size, and matching function are considered. The experiments show that the storage overhead of the new method would be moderately higher than that of IIS. The paper also examines the question: Is it beneficial to combine CBR and full search in terms of effectiveness?