SIFT database accuracy used in benchmarks #558

mesibo · 2024-12-18T10:03:55Z

I am looking for feedback on the accuracy of the SIFT dataset used in ann-benchmarks. Since it is used for benchmarking, I assumed it to be quite accurate, but we found some discrepancies and would like your input.

Some background: we are using ann-benchmarks to test the implementation of our new vector database with the SIFT dataset. However, we are noticing discrepancies between the ground truth provided by the SIFT dataset and our results. Note that we are also using the squared Euclidean distance, as specified by the dataset.

Here’s an example from the SIFT 10K sample dataset. According to the SIFT dataset, the ground truth for the test vector at index 0 (a vector of approximate nearest neighbors, sorted from closest to farthest) is:

[2176, 3752, 882, 4009, 2837, 190, 3615, 816, 1045, 1884, 224, 3013, 292, 1272, 5307, 4938, 1295, 492, 9211, 3625, 1254, 1292, 1625, 3553, 1156, 146, 107, 5231, 1995, 9541, 3543, 9758, 9806, 1064, 9701, 4064, 2456, 2763, 3237, 1317, 3530, 641, 1710, 8887, 4263, 1756, 598, 370, 2776, 121, 4058, 7245, 1895, 124, 8731, 696, 4320, 4527, 4050, 2648, 1682, 2154, 1689, 2436]

Our implementation returns:

[2176, 3752, 882, 4009, 2837, 190, 3615, 816, 1045, 1884, 224, 3013, 292, 1272, 5307, 4938, 1295, 492, 9211, 3625, 1254, 1292, 1625, 3553, 1156, 107, 146, 5231, 9541, 1995, 9806, 9758, 3543, 1064, 9701, 4064, 2456, 2763, 3237, 1317, 3530, 641, 1710, 8887, 4263, 1756, 598, 2776, 370, 121, 7245, 4058, 124, 1895, 8731, 696, 4320, 4050, 4527, 1682, 2436, 2648, 2154, 1689]

As you can see, some elements in our results do not follow the same order. For example, the SIFT ground truth indicates that vector 370 is closer than vector 2776, but our implementation suggests otherwise.
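To make the comparison concrete, here is a minimal sketch of how the two rankings can be diffed position by position. The helper name `rank_differences` and its `offset` parameter are hypothetical (they are not part of ann-benchmarks or hnswlib); the IDs are the entries at ranks 46 to 49 of the two lists above.

```python
def rank_differences(expected, actual, offset=0):
    """Return (rank, expected_id, actual_id) for every rank where the lists disagree."""
    return [
        (offset + i, e, a)
        for i, (e, a) in enumerate(zip(expected, actual))
        if e != a
    ]


# Ranks 46..49 copied from the ground truth and from our results above.
ground_truth_excerpt = [598, 370, 2776, 121]
our_excerpt = [598, 2776, 370, 121]

diffs = rank_differences(ground_truth_excerpt, our_excerpt, offset=46)
# True when both excerpts contain the same IDs and only the order differs.
same_ids = set(ground_truth_excerpt) == set(our_excerpt)

print(diffs)
print(same_ids)
```

For this excerpt the two lists contain the same IDs, so the disagreement is only in the ordering of neighboring entries.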

To rule out possible errors in our implementation, we cross-verified the results using hnswlib. Here’s the output from hnswlib, which aligns with our implementation:

[2176, 3752, 882, 4009, 2837, 190, 3615, 816, 1045, 1884, 224, 3013, 292, 1272, 5307, 4938, 1295, 492, 9211, 3625, 1254, 1292, 1625, 3553, 1156, 107, 146, 5231, 9541, 1995, 9806, 9758, 3543, 1064, 9701, 4064, 2456, 2763, 3237, 1317, 3530, 641, 1710, 8887, 4263, 1756, 598, 2776, 370, 121, 7245, 4058, 124, 1895, 8731, 696, 4320, 4050, 4527, 1682, 2436, 2648, 2154, 1689]

We also printed the distances, which confirm that vector 370 is farther than vector 2776 in our calculations. However, the SIFT ground truth suggests the opposite:

E1812-145923-905 (2488793344): index 46 id 598 distance: 0.432297
E1812-145923-905 (2488793344): index 47 id 2776 distance: 0.432492
E1812-145923-905 (2488793344): index 48 id 370 distance: 0.433172
E1812-145923-905 (2488793344): index 49 id 121 distance: 0.434176
E1812-145923-905 (2488793344): index 50 id 7245 distance: 0.435382
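One independent way to check an ordering like this is an exhaustive (brute-force) search that computes the squared Euclidean distance from the query to every base vector. The following is only a sketch under stated assumptions: the function names are hypothetical, the toy vectors are made up for illustration and are not SIFT data, and it is written in plain Python rather than with any particular library. Running it on the vectors exactly as stored in the dataset files, and again on the vectors after whatever preprocessing our pipeline applies (if any), would show whether both give the same order.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_neighbors(query, base, k):
    """Return the k closest (distance, id) pairs; ties are broken by the lower id."""
    scored = sorted((squared_euclidean(query, v), i) for i, v in enumerate(base))
    return scored[:k]


# Toy data (hypothetical, not taken from SIFT): four 2-dimensional base vectors.
base = [[0, 0], [3, 4], [1, 1], [6, 8]]
query = [0, 0]

neighbors = brute_force_neighbors(query, base, k=3)
for rank, (distance, vector_id) in enumerate(neighbors):
    print(f"index {rank} id {vector_id} distance: {distance}")
```

Sorting `(distance, id)` tuples makes the tie-breaking rule explicit, which matters when two candidates are at nearly the same distance.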

I was just wondering if you have noticed similar discrepancies with the SIFT dataset ground truth, or if we are doing something wrong.

We would appreciate your insights.
