Hey all!
We just upgraded our ES cluster from 2.4 to 5.2.2, and one of the changes in that release is the split of the string field type into text and keyword. A couple of our string fields used analyzers that were basically pulling values out of XML. These fields now require the fielddata property to be set to true before we can use them. We have an idea for a better way to send these values to ES, but it's a much bigger change and a lower priority, so we're kind of stuck doing it at the ES layer. Setting fielddata to true is easy enough, but I was trying to make the fields keywords instead. The problem is that keyword fields do not allow analyzers, and the filters and tokenizers that can be used in a normalizer are more limited. We had been using a pattern tokenizer in the analyzer, but that sort of thing does not appear to be supported by normalizers, because they need to operate on single characters and emit only one token (from what I understood of the docs).
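To make the current setup concrete, here is a rough sketch of the index body, written as a Python dict so it can be dumped to JSON. The analyzer and tokenizer names, the doc type, and the <user> tag are made up for illustration and are not our real mapping; the field names match the example I describe further down.

```python
import json

# Hypothetical names throughout: "xml_value", "xml_value_tokenizer", the "doc"
# mapping type, and the <user> tag are illustrative assumptions only.
index_body = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "xml_value_tokenizer": {
                    "type": "pattern",
                    # Emit capture group 1 of each match as a token,
                    # instead of splitting the text on the pattern.
                    "pattern": "<user>([^<]*)</user>",
                    "group": 1,
                }
            },
            "analyzer": {
                "xml_value": {
                    "type": "custom",
                    "tokenizer": "xml_value_tokenizer",
                }
            },
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "message": {
                    "type": "text",
                    "fields": {
                        # Sub-field that extracts the value from the XML.
                        "analyzed_field": {
                            "type": "text",
                            "analyzer": "xml_value",
                            # Needed since 5.x to aggregate on a text field.
                            "fielddata": True,
                        }
                    },
                }
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

This is the "set fielddata to true" variant that already works for us; the keyword type has no equivalent, since it accepts a normalizer rather than an analyzer.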
The only thing that seemed like it should work was adding a keyword sub-field (via fields) to the analyzed field, but I don't think I understand how fields work.
For example, let's say there is a message field that holds the full XML blob. With the analyzer, a field called message.analyzed_field can be created with type text, using an analyzer to grab whatever it needs from the message field. This is the field that has fielddata set to true and works in visualizations. What I thought could be done was to "pass" the value from the analyzed field into a sub-field of its own, i.e. a field message.analyzed_field.keyword of type keyword that holds just the analyzed value, but as a keyword instead of text. That does not appear to be the case: message.analyzed_field.keyword ends up with the full message field value. My assumption is that this has to do with how things are indexed in Lucene, and that the nested nature of the fields and their properties isn't what I expected.
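Here is a small Python stand-in for the behavior I'm seeing. It is not Elasticsearch code, just a simulation under the assumption that every sub-field is fed the original field value and analyzes it on its own; the sample document, the <user> tag, and the helper functions are hypothetical.

```python
import re


def pattern_tokenizer(text, pattern, group):
    """Rough stand-in for a pattern tokenizer that emits one capture group per match."""
    return [m.group(group) for m in re.finditer(pattern, text)]


def keyword_index(text):
    """A keyword field indexes the whole input as a single term."""
    return [text]


# Hypothetical document; the XML layout is an illustrative assumption.
message = "<event><user>alice</user><action>login</action></event>"

# Both fields receive the same original value; the keyword field never sees
# the tokens produced by the analyzed field.
indexed_terms = {
    "message.analyzed_field": pattern_tokenizer(message, r"<user>([^<]*)</user>", 1),
    "message.analyzed_field.keyword": keyword_index(message),
}

for field, terms in indexed_terms.items():
    print(field, "->", terms)
```

In this sketch the analyzed field yields only the extracted value, while the keyword field holds the entire XML blob, which matches what I observe in the index.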
I'm sort of at a loss as to what the proper way to do this in ES might be. Any suggestions would be greatly appreciated, thanks!