DiVA: indexing high-dimensional data by Diving into Vector Approximations
Konstantinos Tsakalozos
Spiros Evangelatos
Alex Delis
Date published: 
Published In: Proc. of the 2011 IEEE Int. Conf. on Multimedia and Expo (ICME 2011)
Conference Article

Contemporary multimedia, scientific, and medical applications use indexing structures to access their high-dimensional data. Yet, in sufficiently high-dimensional spaces, conventional tree-based access methods are eventually outperformed by simple serial scans. Vector quantization has been used effectively to index data that are mostly uniformly distributed. However, in real-world applications, clustered data and skewed query distributions are the norm. In this paper, we propose DiVA, an approach that selectively adapts the quantization step to accommodate varying indexing needs. This adaptation mechanism triggers the restructuring and possible expansion of DiVA so as to provide finer indexing granularity and enhanced access performance in certain “hot” areas of the search space. User-supplied policies help both to identify such “hot” areas and to satisfy versatile application requirements. Experimentation with our detailed prototype shows that, on a real-world data set, DiVA reduces I/O by up to 64% compared to competing methods such as the VA-file and the A-tree.
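To make the idea of adapting the quantization step more concrete, the following is a minimal Python sketch, not the DiVA implementation. It assumes vectors with coordinates in [0, 1), a uniform grid with a fixed number of bits per dimension, and a simple hit-count policy for deciding when a cell is “hot”. The names `quantize`, `AdaptiveGrid`, `coarse_bits`, `fine_bits`, and `hot_threshold` are all hypothetical, and the lookup only returns vectors that share the query's cell; it does not perform the bound-based nearest-neighbour filtering that a VA-file carries out.

```python
from collections import defaultdict


def quantize(vector, bits):
    """Map each coordinate in [0, 1) to a cell index using `bits` bits per dimension."""
    cells = 1 << bits
    return tuple(min(int(x * cells), cells - 1) for x in vector)


class AdaptiveGrid:
    """Toy grid index that re-quantizes frequently queried cells more finely.

    Hypothetical illustration only: the hit-count policy stands in for the
    user-supplied policies mentioned in the abstract.
    """

    def __init__(self, coarse_bits=2, fine_bits=4, hot_threshold=3):
        self.coarse_bits = coarse_bits
        self.fine_bits = fine_bits
        self.hot_threshold = hot_threshold
        self.cells = defaultdict(list)  # coarse cell -> vectors in that cell
        self.hits = defaultdict(int)    # coarse cell -> number of lookups
        self.refined = {}               # coarse cell -> {fine cell -> vectors}

    def insert(self, vector):
        key = quantize(vector, self.coarse_bits)
        self.cells[key].append(vector)
        if key in self.refined:
            # Keep an already refined cell consistent with new data.
            self.refined[key][quantize(vector, self.fine_bits)].append(vector)

    def _refine(self, key):
        # Re-quantize the vectors of one coarse cell with more bits.
        fine = defaultdict(list)
        for v in self.cells.get(key, []):
            fine[quantize(v, self.fine_bits)].append(v)
        self.refined[key] = fine

    def lookup(self, query):
        """Return the vectors sharing the query's cell, refining hot cells."""
        key = quantize(query, self.coarse_bits)
        self.hits[key] += 1
        if key not in self.refined and self.hits[key] >= self.hot_threshold:
            self._refine(key)
        if key in self.refined:
            return self.refined[key].get(quantize(query, self.fine_bits), [])
        return self.cells.get(key, [])


grid = AdaptiveGrid(coarse_bits=1, fine_bits=3, hot_threshold=2)
for v in [(0.1, 0.1), (0.4, 0.4), (0.9, 0.9)]:
    grid.insert(v)

first = grid.lookup((0.1, 0.1))   # coarse cell still holds two candidates
second = grid.lookup((0.1, 0.1))  # cell is now hot and has been refined
print(len(first), len(second))
```

In this toy setting, repeated queries to the same region shrink the candidate set that must be examined, which is the intuition behind finer granularity in “hot” areas; how DiVA actually restructures and expands its index is described in the paper itself.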

Related files: ted-icme11.pdf (application/pdf, 103.57 KB)
