Use my own global ordinals with a Painless script?

Here's what one of my documents might look like:

{
  "CC":{"colors":["Blue","Green","Yellow"]},
  "CN":{"colors":["White","Green","Blue"]},
  "WA":{"colors":["Orange","Green","Blue"]}
}

I want a terms aggregation on the intersection of two fields, CC.colors and CN.colors. For this document, the intersection is ["Green", "Blue"], and the terms aggregation should run over those values.
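To make the desired result concrete, here is a minimal Python sketch of the semantics only. It runs outside Elasticsearch, and the function name `intersection_term_counts` is hypothetical. It counts, for each color, how many documents contain that color in every requested field, which is what the terms buckets should report.

```python
from collections import Counter

def intersection_term_counts(docs, fields):
    """Count, per color, how many documents have it in every listed field."""
    counts = Counter()
    for doc in docs:
        # One set of colors per requested field; a missing field counts as empty.
        color_sets = [set(doc.get(f, {}).get("colors", [])) for f in fields]
        common = set.intersection(*color_sets) if color_sets else set()
        # Each document contributes at most 1 to a color's bucket.
        counts.update(common)
    return counts

docs = [
    {
        "CC": {"colors": ["Blue", "Green", "Yellow"]},
        "CN": {"colors": ["White", "Green", "Blue"]},
        "WA": {"colors": ["Orange", "Green", "Blue"]},
    }
]

buckets = intersection_term_counts(docs, ["CC", "CN"])
```

For the single sample document, `buckets` holds one count each for "Green" and "Blue".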

As far as I understand, there are two ways to do it.

  1. A Painless script in the terms aggregation that returns the intersection of the two arrays for each document.
  2. A new field created at index time, maybe called CC_CN.colors, that holds the intersection for every document.
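For reference, option 1 could look roughly like the request body built below. This is a sketch under assumptions, not the exact script in use: it assumes the colors are indexed as keyword sub-fields named `CC.colors.keyword` and `CN.colors.keyword` (the default dynamic mapping for strings), and the helper name `intersection_terms_agg` and the aggregation name `common_colors` are made up for illustration. The Painless source copies the first field's doc values into a list and calls `retainAll` for each further field.

```python
import json

def intersection_terms_agg(fields, size=25):
    """Build a search body whose terms aggregation is driven by a Painless script.

    `fields` are doc-values field names (assumed to be keyword fields).
    """
    if len(fields) < 2:
        raise ValueError("need at least two fields to intersect")
    first, rest = fields[0], fields[1:]
    # Copy the first field's values, then keep only values present in the others.
    lines = [f"def common = new ArrayList(doc['{first}']);"]
    for field in rest:
        lines.append(f"common.retainAll(doc['{field}']);")
    lines.append("return common;")
    return {
        "size": 0,  # only the aggregation result is wanted, no hits
        "aggs": {
            "common_colors": {
                "terms": {
                    "script": {"lang": "painless", "source": " ".join(lines)},
                    "size": size,  # 25 colors exist in total
                }
            }
        },
    }

body = intersection_terms_agg(["CC.colors.keyword", "CN.colors.keyword"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request. Each field combination produces a different script source, so every new combination is compiled separately.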

I can't go with option 2 because there would be too many combinations. At search time I might need any of them: CC_CN, CC_WA, WA_CN_CC, and so on.

Option 1 works, but it gets painfully slow. One reason is that a scripted terms aggregation cannot use global ordinals.

Is there any trick to make Elasticsearch build custom global ordinals for my Painless terms aggregation? I know there are only 25 colors in my system, so I could hand the full list to Elasticsearch somewhere and "assure" it that my aggregation will never return anything outside that list.

Or, if I encoded the colors as numbers in the index instead of strings, would that be faster for Elasticsearch? For example, 0 instead of "Black", 1 instead of "Green", and so on.
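As an illustration of what such an encoding might look like, here is a small Python sketch. It goes one step beyond plain ids and packs each array into a bitmask, which is a technique swapped in for illustration, not something already in the index. The color order is an assumption that only matches the "0 for Black, 1 for Green" example, and the list is cut down to the colors named in this post. With at most 25 colors a mask fits in a 32-bit integer, so intersection becomes a bitwise AND and union a bitwise OR. Whether Elasticsearch would aggregate faster over such values is exactly the open question; the sketch does not answer it.

```python
# Assumed, shortened vocabulary; the real system has 25 colors.
COLORS = ["Black", "Green", "Blue", "Yellow", "White", "Orange"]
COLOR_ID = {name: i for i, name in enumerate(COLORS)}

def to_mask(colors):
    """Pack a list of color names into one integer, one bit per color id."""
    mask = 0
    for color in colors:
        mask |= 1 << COLOR_ID[color]
    return mask

def from_mask(mask):
    """Unpack a bitmask back into color names, ordered by color id."""
    return [name for i, name in enumerate(COLORS) if (mask >> i) & 1]

cc = to_mask(["Blue", "Green", "Yellow"])
cn = to_mask(["White", "Green", "Blue"])

intersection = from_mask(cc & cn)  # colors present in both fields
union = from_mask(cc | cn)         # colors present in either field
```

The decoded lists come back in vocabulary order, not in the order the colors were stored in the document.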

Besides intersection, my other use cases involve union and the like as well. Thanks for reading!
