Doc values vs inverted index


(Biswajit86) #1

I have a few fundamental questions regarding doc values:

  • If doc values are enabled for a field, does that mean there is no inverted index for that field? If so, does it slow down search?
  • Is enabling doc values equivalent to storing the data in a "columnar structure" instead of an inverted index?
  • Should doc values be enabled for all fields, or only for fields that are used for aggregations/sorting?

For example, if I have a representation of stock prices in the following layout:
Symbol, Price, Quantity, MV,     Rating
AAPL,   400,   1000,     400000, A
MSFT,   200,   3000,     600000, B

I would normally use the Symbol and Rating fields on the X-axis and aggregations on the Price, Quantity and MV fields on the Y-axis. In this case:

  • Should all my fields be stored as doc values, or
  • should I store Price, Quantity and MV as doc values and the rest of the fields as field data?
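
For reference, the chart described above corresponds roughly to a terms aggregation on the bucket field with one metric sub-aggregation per numeric field. The sketch below only builds the request body as a Python dict; the helper name `build_chart_query` and the aggregation names are made up for illustration, and the field names are assumed to match the column headers in the sample data.

```python
import json


def build_chart_query(bucket_field, metric_fields):
    """Build a search body: one terms bucket per value of bucket_field,
    with a sum sub-aggregation for each metric field.

    Hypothetical helper; field names are assumed from the sample layout.
    """
    sub_aggs = {
        "total_" + name.lower(): {"sum": {"field": name}}
        for name in metric_fields
    }
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "by_" + bucket_field.lower(): {
                "terms": {"field": bucket_field},
                "aggs": sub_aggs,
            }
        },
    }


query = build_chart_query("Symbol", ["Price", "Quantity", "MV"])
print(json.dumps(query, indent=2))
```

Both the bucket field and the metric fields are read per document at aggregation time, which is why the question of doc values versus field data applies to all of them, not just the numeric ones.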

(Mark Walkom) #2

I'd enable doc values on all of them. They don't really need to be analysed at all, as long as you are confident the data is normalised (i.e. the symbol and ratings are always capitalised). Using doc values instead of field data will save you heap for more aggs :smile:
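
As a rough sketch of what that could look like, here is a mapping built as a Python dict. It assumes the Elasticsearch 1.x/2.x mapping syntax (`string` fields with `"index": "not_analyzed"`); in later versions the closest equivalent is the `keyword` type, which has doc values enabled by default. The type name `stock`, the helper functions, and the choice of `double`/`long` for the numeric fields are assumptions for illustration only.

```python
import json


def keyword_field():
    # Kept as a single un-analysed value, with doc values enabled
    return {"type": "string", "index": "not_analyzed", "doc_values": True}


def numeric_field(es_type):
    # Numeric field with doc values enabled for sorting and aggregations
    return {"type": es_type, "doc_values": True}


# Hypothetical type name "stock"; field names follow the sample data.
mapping = {
    "mappings": {
        "stock": {
            "properties": {
                "Symbol": keyword_field(),
                "Rating": keyword_field(),
                "Price": numeric_field("double"),
                "Quantity": numeric_field("long"),
                "MV": numeric_field("double"),
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```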

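Note that doc values don't replace the inverted index. A non-analysed field is still indexed, just as a single term per value, so searching on it isn't slowed down; the inverted index is only skipped if you turn indexing off for the field. For the dataset that you are working with, it's probably not going to be a big deal either way.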
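
To make the two structures concrete, here is a toy sketch in plain Python (not Elasticsearch code, and not how Lucene stores things internally). An inverted index maps each term to the documents that contain it, which suits search; doc values map each document to its value, column by column, which suits sorting and aggregations. The function names are made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question; the list position acts as the document id.
docs = [
    {"Symbol": "AAPL", "Price": 400, "Quantity": 1000, "MV": 400000, "Rating": "A"},
    {"Symbol": "MSFT", "Price": 200, "Quantity": 3000, "MV": 600000, "Rating": "B"},
]


def build_inverted_index(docs, field):
    """term -> list of doc ids containing that term (suits search)."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        index[doc[field]].append(doc_id)
    return dict(index)


def build_doc_values(docs, field):
    """doc id -> value, kept as one column (suits sorting and aggregations)."""
    return [doc[field] for doc in docs]


symbol_index = build_inverted_index(docs, "Symbol")
mv_column = build_doc_values(docs, "MV")

print(symbol_index)    # {'AAPL': [0], 'MSFT': [1]}
print(sum(mv_column))  # 1000000
```

Looking up "which documents have Symbol AAPL" is a single dictionary access in the first structure, while "sum MV over all documents" is a single pass over one column in the second.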

