Dec 24th, 2025: [EN] BBQ for Christmas Dinner

Ugo_Sangiorgi · December 24, 2025, 8:00am

What is VectorFaces?

Since today is December 24th, how about some BBQ for Christmas Dinner?

VectorFaces is a face recognition demo I developed to showcase Elasticsearch vector capabilities at conferences. It searches through 1.2 million celebrity face embeddings using Elasticsearch BBQ, DIskBBQ, int8 and int4 quantization techniques.

Click here to try the live demo with start-local on a single CPU (!), or run it yourself by checking out the github repo.

The dataset is an excerpt from the IMDB-WIKI dataset, which contains celebrity images from IMDb and Wikipedia. We generated 512-dimensional embeddings using InsightFace.

The demo leverages four quantization methods in parallel in four different indices with 318,526 vector embeddings each (totaling 1.2 million vectors), letting you compare their performance directly:

BBQ (Better Binary Quantization) - Elasticsearch's binary quantization combining high recall with significant compression
DiskBBQ - BBQ optimized for disk storage, reading vectors directly from disk during rescoring
int8 - 8-bit integer quantization reducing each dimension from 4 bytes to 1 byte
int4 - 4-bit quantization providing maximum compression (8× smaller than float32)

All quantization introduces some accuracy loss. In practice, int8 typically needs little to no rescoring, int4 often benefits from 1.5×–2× oversampling, and BBQ commonly requires 3×–5× oversampling for optimal recall. Each method balances accuracy, speed, memory footprint, and storage differently, you can play with the parameters and change the oversampling to verify that.

In Settings, you can experiment with these parameters to balance speed and accuracy, and select which indices will be considered on the query:

size: Number of results that we bring back from
Elasticsearch. Must be ≤ k.

k: Number of nearest neighbors to return globally
across all shards. The final top k results are merged from all shards.

num_candidates: Number of approximate neighbors to gather
per shard before selecting the top k. Higher values improve
recall and accuracy but increase latency.
This is the primary knob for optimizing the latency/recall trade-off.

For quantized indices, increasing num_candidates effectively
oversamples the search space, which helps recover accuracy lost during quantization.

After changing these parameters note how latency and relevance change. At the bottom right corner, the box shows the top result and the latency of the two most important steps: embedding generation of your snapshot (camera or picture) and Elasticsearch - it would say something like "Elasticsearch: 50ms".

The scores go from 0 to 1, or 0% to 100% similarity since it’s easier to understand. For this dataset, a match higher than 0.75 can be considered positive, test for yourself.

Another interesting thing to look into is how much memory and disk each index is using, we can use Elasticsearch _stats API and issue these requests in Kibana:

GET faces-bbq_hnsw-10.15/_stats?filter_path=*.primaries.dense_vector

GET faces-disk_bbq-10.15/_stats?filter_path=*.primaries.dense_vector

GET faces-int8_hnsw-10.15/_stats?filter_path=*.primaries.dense_vector

GET faces-int4_hnsw-10.15/_stats?filter_path=*.primaries.dense_vector

Changing more parameters

While k and num_candidates can be changed for our queries, some quantization techique have special settings we can change. Take a look at the full list of settings here, some of them are applicable only at index time (like m, ef_construction), which would require a re-index, but some are applicable for query time as well.

Perhaps the most important one is oversample. It controls how many candidates are rescored using the original float vectors. With oversample: 3.0, the search retrieves num_candidates per shard using quantized vectors, then rescores the top k * oversample candidates with the original vectors before returning the final top k results. Higher values improve accuracy at the cost of additional computation.

For BBQ we can change it in the index mapping at rescore_vector.oversample:

PUT faces-bbq_hnsw-10.15/_mapping
{
  "properties": {
    "face_embeddings": {
      "type": "dense_vector",
      "dims": 512,
      "index": true,
      "similarity": "cosine",
      "index_options": {
        "type": "bbq_hnsw",
        "m": 16,
        "ef_construction": 100,
        "rescore_vector": {
          "oversample": 3
        }
      }
    }
  }
}

For DiskBBQ we can change rescore_vector.oversample and also default_visit_percentage which controls what percentage of clusters to visit during the search phase. Lower values trade accuracy for speed by visiting fewer clusters.

PUT faces-disk_bbq-10.15/_mapping
{
    "properties": {
        "face_embeddings": {
          "type": "dense_vector",
          "dims": 512,
          "index": true,
          "similarity": "cosine",
          "index_options": {
            "type": "bbq_disk",
            "cluster_size": 384,
            "default_visit_percentage": 0,
            "rescore_vector": {
              "oversample": 3
            }
          }
        }
    }
}

Again, after changing these, note how latency and relevance change.

Next Steps

Check the full documentation for setup instructions on how to index and run this demo on your own infrastructure.

Merry Christmas everybody, and happy BBQ'ing!

Topic		Replies	Views
When Does BBQ Quantization Outperform Scalar Quantization Elasticsearch vector-search	1	91	April 23, 2025
No Observable Difference Between BBQ and Default Configurations in Elasticsearch – Help with Index Size Comparison Elasticsearch elastic-stack-machine-learning , vector-search	15	257	July 2, 2025
Dec 18th, 2025: [ES] Búsqueda Vectorial con ES\|QL Advent Calendar esql	0	23	December 18, 2025
Trouble implementing Elasticsearch BBQ Elasticsearch vector-search	1	91	April 18, 2025
Dense Vector Field Extremely Large Elasticsearch vector-search	12	207	October 6, 2025

Dec 24th, 2025: [EN] BBQ for Christmas Dinner

What is VectorFaces?

Changing more parameters

Next Steps

Related topics