"store":true improves aggregation speed. Why?

So I was recently running some benchmarks over a new index I built, where I set some fields to be stored in the index ("store": true) and set index.store.type: memory.
I expected some overall speed improvement, which there was, and faster retrieval of the fields that were now stored in the index, which was also the case.
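
For reference, the setup was roughly the following (index, type, and field names here are simplified stand-ins):

```
PUT /my_index
{
  "settings": {
    "index.store.type": "memory"
  },
  "mappings": {
    "my_type": {
      "properties": {
        "stored_field": { "type": "string", "index": "not_analyzed", "store": true },
        "plain_field":  { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}
```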

What I can't explain is that aggregating over the fields that are now stored in the index is faster than it should reasonably be. For the non-stored fields, aggregations were about 1.6x faster once I set the index store to memory; for the stored fields, they were 3.5x faster. My understanding is that aggregations work on the indexed data, so setting "store": true should garner no speed improvement. Is that not the case?

The only other explanation I can think of is that some property of the data in the stored fields just happens to benefit more from the in-memory store than the data in the other fields does.

What did your aggregation look like? Was it accessing field values using a script and _fields[] or _source syntax?
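
That is, something along these lines (just a sketch; index and field names are hypothetical):

```
POST /my_index/_search?search_type=count
{
  "aggs": {
    "via_stored_fields": {
      "terms": { "script": "_fields['stored_field'].value" }
    },
    "via_source": {
      "terms": { "script": "_source.stored_field" }
    }
  }
}
```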

What version of ES are you using? Are you using fielddata or doc values? If you are using fielddata, were you clearing the cache between each query?
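
For the fielddata case, that would mean hitting the clear-cache API between runs, e.g. (index name hypothetical):

```
POST /my_index/_cache/clear?fielddata=true
```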

Off the top of my head, stored fields really shouldn't have improved speed. Storing fields simply takes the original, non-analyzed version of the field and sticks it in a different Lucene field. This allows you to access that data without parsing the entire _source, for example when GETing a document. It isn't used to populate field data, since field data is post-analysis and comes from the inverted index.
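
For example, a GET that requests specific fields can serve them from stored fields instead of parsing _source (names hypothetical again):

```
GET /my_index/my_type/1?fields=stored_field
```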

The only explanation I can think of is that you are using _fields[] or _source in a script in your aggregation or search, since both of those access paths might benefit from stored fields versus parsing the source.
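
Roughly, the three ways a 1.x script can read a field value, and where each one comes from (a sketch):

```
doc['my_field'].value      // fielddata / doc values: unaffected by "store": true
_fields['my_field'].value  // stored fields: requires "store": true
_source.my_field           // parses _source on the fly
```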

A different explanation may simply be that ES and the OS were becoming accustomed to your benchmark and warming their caches. E.g. if you run the same query multiple times, the OS will tend to keep the relevant file and memory pages hot because they are repeatedly accessed, the JVM's JIT will have optimized the relevant execution paths in the code, ES will have fielddata loaded, global ordinals will have been populated into memory, and so on.

As an aside, memory indices are being removed in 2.0, so don't rely on them :wink: If you want to run in-memory, we recommend using a real RAM disk instead (better performance, simpler, etc.).
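
E.g. mount a tmpfs and point the data path at it (mount point and size are just placeholders):

```
# Linux: create a RAM-backed filesystem
mount -t tmpfs -o size=8g tmpfs /mnt/es-ramdisk

# elasticsearch.yml
path.data: /mnt/es-ramdisk
```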