New Compression Techniques for Content-based Retrieval

Due to the proliferation of multimedia information over the internet, users are confronted with large amounts of content from many sources around the world. Techniques that enable users to search and exchange information efficiently are greatly needed. Content-based retrieval systems were proposed to automatically annotate and index multimedia information by its audio/visual content rather than by manually entered text keywords. Most current content-based retrieval systems are centralized: feature extraction, indexing, and querying are all performed on a single database server. This paradigm is computationally intensive and difficult to scale.

We argue that this limitation can be overcome with a distributed retrieval system in which users share data storage and query computation over the network. Users search and exchange information by transmitting features, which contain sufficient information for retrieval, to one another through the network. We further argue that compressing the features can greatly reduce both transmission bandwidth and storage space without degrading retrieval performance. Unlike traditional compression techniques, which are designed to provide the best perceptual quality under a given rate constraint, we design novel compression techniques tailored to specific classification purposes.
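
As a rough illustration of why compressed features can still support retrieval, the sketch below quantizes toy feature vectors to 4 bits per dimension and checks that a nearest-neighbor query returns the same item before and after compression. Everything in it is an assumption made for illustration: the uniform scalar quantizer is a generic stand-in and not the classification-tailored technique of this project, and the feature values, the helper names (`quantize`, `dequantize`, `nearest`), and the 32-bit baseline are hypothetical.

```python
# Illustrative sketch only: a uniform scalar quantizer stands in for the
# project's classification-tailored compression, which the text does not specify.

def quantize(vec, lo=0.0, hi=1.0, bits=4):
    """Map each feature value in [lo, hi] to an integer code of `bits` bits."""
    levels = (1 << bits) - 1
    codes = []
    for x in vec:
        x = min(max(x, lo), hi)  # clamp to the assumed feature range
        codes.append(round((x - lo) / (hi - lo) * levels))
    return codes

def dequantize(codes, lo=0.0, hi=1.0, bits=4):
    """Reconstruct approximate feature values from integer codes."""
    levels = (1 << bits) - 1
    return [lo + c / levels * (hi - lo) for c in codes]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(query, database):
    """Return the index of the database vector closest to the query."""
    return min(range(len(database)),
               key=lambda i: squared_distance(query, database[i]))

# Hypothetical toy features (e.g. normalized color or texture descriptors).
database = [
    [0.10, 0.80, 0.30],
    [0.90, 0.20, 0.50],
    [0.40, 0.40, 0.95],
]
query = [0.85, 0.25, 0.45]

compressed = [quantize(v) for v in database]        # what a peer would transmit
reconstructed = [dequantize(c) for c in compressed]  # what the receiver searches

match_original = nearest(query, database)
match_compressed = nearest(query, reconstructed)

bits_before = 32 * len(database[0])  # assuming 32-bit floats per dimension
bits_after = 4 * len(database[0])    # 4-bit codes per dimension

print(match_original, match_compressed, bits_before, bits_after)
```

In this toy case the retrieved item is unchanged while each vector shrinks by a factor of eight. A quantizer designed around the classification task, rather than around reconstruction error, would aim to keep that property at lower rates.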

NSF Report (Year 8)
Poster
