
DiVA: indexing high-dimensional data by Diving into Vector Approximations

Contemporary multimedia, scientific and medical applications use indexing structures to access their high-dimensional data. Yet, in sufficiently high-dimensional spaces, conventional tree-based access methods are eventually outperformed by simple serial scans. Vector quantization has been used effectively to index data that are mostly uniformly distributed. However, in real-world applications, clustered data and skewed query distributions are the norm. In this paper, we propose DiVA, an approach that selectively adapts the quantization step to accommodate varying indexing needs. This adaptation mechanism triggers the restructuring and possible expansion of DiVA so as to provide finer indexing granularity and enhanced access performance in certain “hot” areas of the search space. User-supplied policies help both identify such “hot” areas and satisfy versatile application requirements. Experimentation with our detailed prototype shows that, on a real-world data set, DiVA yields up to 64% reduced I/O compared to competing methods such as the VA-file and the A-tree.
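The general idea of a grid-based vector approximation with finer granularity in selected regions can be illustrated with a minimal sketch. This is not DiVA's actual data structure or algorithm, which the abstract does not specify. It assumes points normalized to the unit hypercube, a uniform grid with a fixed number of bits per dimension (in the spirit of the VA-file), and a hypothetical policy that simply names which coarse cells are "hot". The function names and the parameters `base_bits`, `hot_cells` and `extra_bits` are all illustrative assumptions.

```python
import math


def quantize(point, bits):
    """Map a point in [0, 1)^d to its grid cell using `bits` bits per dimension."""
    cells = 1 << bits
    return tuple(min(int(x * cells), cells - 1) for x in point)


def lower_bound(query, cell, bits):
    """Smallest possible Euclidean distance from `query` to any point in `cell`."""
    width = 1.0 / (1 << bits)
    total = 0.0
    for q, c in zip(query, cell):
        lo, hi = c * width, (c + 1) * width
        gap = max(lo - q, 0.0, q - hi)  # zero when q lies inside [lo, hi]
        total += gap * gap
    return math.sqrt(total)


def range_candidates(points, query, radius, base_bits, hot_cells, extra_bits):
    """Filter step of a range query over the approximations only.

    Points whose coarse cell is listed in `hot_cells` (a hypothetical,
    user-supplied policy) are approximated with `extra_bits` more bits
    per dimension, so their lower bounds are tighter.
    """
    survivors = []
    for i, p in enumerate(points):
        bits = base_bits
        if quantize(p, base_bits) in hot_cells:
            bits += extra_bits
        if lower_bound(query, quantize(p, bits), bits) <= radius:
            survivors.append(i)
    return survivors


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9)]
    q = (0.05, 0.05)
    # Coarse grid only, then the same query with cell (0, 0) marked as hot.
    print(range_candidates(pts, q, 0.05, 1, set(), 3))
    print(range_candidates(pts, q, 0.05, 1, {(0, 0)}, 3))
```

In this toy setting, marking a cell as hot lets the filter step discard more points before the exact vectors are fetched, which is the kind of I/O saving that finer granularity in frequently queried regions aims for. Survivors of the filter step are only candidates and would still need to be checked against the exact vectors.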

 

Citation
Konstantinos Tsakalozos, Spiros Evangelatos, Alex Delis, "DiVA: indexing high-dimensional data by Diving into Vector Approximations", Proc. of the 2011 IEEE Int. Conf. on Multimedia and Expo (ICME 2011), 2011
Published at
Proc. of the 2011 IEEE Int. Conf. on Multimedia and Expo ICME 2011