Thanks to the rapid development of the Internet and image-capturing devices, the
number of images available online has grown exponentially. Efficient indexing
and retrieval methods are crucial for leveraging this web-scale image collection. This
has an important impact on a number of research areas such as image recognition, image
retrieval and computer graphics. In this chapter, we review currently popular
image representations and the corresponding large-scale indexing technologies. For global
representations, we review tree-based and hash-based index structures. For local features,
which have recently received much attention for their invariance to lighting,
scale and rotation, we review inverted list indexing and the related “long query
problem”. Then we introduce an image decomposition approach to convert the local
feature representation from high-dimensional sparse feature vectors to (relatively)
low-dimensional dense feature vectors with residual information. We also discuss a
specially designed index structure to facilitate efficient storage and retrieval for this
image representation. At the end of the chapter, we present extensive experimental
results on a database of 2.3 million images to demonstrate the efficacy of the image
decomposition approach.
Keywords: Image retrieval, image indexing, data-driven image understanding, inverted list,
long query, search engine, image search, visual search, bag of words, dimension
reduction, latent Dirichlet allocation.