Effect of Tunable Indexing on Term Distribution and Cluster-based Information Retrieval Performance
Abstract
This study investigates the effect of tunable indexing on the structure and information retrieval performance of a clustered document database. The generation
of all cluster structures and the calculation of term discrimination values are based upon the Cover Coefficient-Based Clustering Methodology. Information retrieval performance is
measured in terms of precision, recall, and E-measure. The relationship between term generality and term discrimination value is quantified using the Pearson Rank Correlation
Coefficient Test. The effect of tunable indexing on index term distribution and on the number of target clusters is examined.