ABSTRACT

Consider the following common similarity search scenario. We are given a new image, and we need to determine whether it is similar to some image in the existing database, preprocessed beforehand. How do we capture the notion of similar? The standard approach is to use a suitable distance metric. For example, we can represent a 20 × 20-pixel image as a 400dimensional vector, one coordinate per pixel. Then we can measure the (dis)similarity using the Euclidean distance in R400.