Elasticsearch array field of keywords - how to index it?


(depahelix) #1

I've got input that is analogous to tags: there are a couple of strings per record, and they should be treated as keywords, not tokenized, broken up, or analyzed in any way. I want each one to show up in faceting "as-is", including spaces, slashes, dashes and ampersands.

I don't think I need multi_field here. There is one input field per record, "keyPhrases", but its value is a simple JSON array of strings.

I want Elasticsearch to add each of the values to the facets, and tag the record with all of them.
Usually there are only one, two, or three phrases per record, but there could be more. The full set of keyPhrases is fairly small: around 30, or 50 at most. They could be thought of as "categories".
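
To make concrete what I'm after, here is a minimal sketch in plain Python (no Elasticsearch involved) of the facet counts I'd like to get back. The records and phrase values are made up for illustration, and `facet_counts` is just a hypothetical helper, not anything from the Elasticsearch API.

```python
from collections import Counter

# Hypothetical records; the keyPhrases values are invented for illustration.
records = [
    {"id": 1, "keyPhrases": ["R&D / Engineering", "Health-Care"]},
    {"id": 2, "keyPhrases": ["Health-Care"]},
    {"id": 3, "keyPhrases": ["R&D / Engineering", "Arts & Crafts", "Health-Care"]},
]


def facet_counts(records, field):
    """Count each whole string of an array field once per record,
    with no tokenizing and no lowercasing."""
    counts = Counter()
    for record in records:
        # set() so a phrase repeated inside one record is only counted once
        counts.update(set(record.get(field, [])))
    return counts


counts = facet_counts(records, "keyPhrases")
for phrase, n in counts.most_common():
    print(f"{n}  {phrase}")
```

Every record is counted under each of its phrases, and each phrase stays one intact term, punctuation and capitalization included.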

The faceting keeps breaking up the input strings and lowercasing them, even though I've tried specifying not_analyzed, the keyword tokenizer, the keyword analyzer, and similar settings.
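
For reference, this is roughly the shape of mapping and facet request I would expect to work, written as Python dicts so they can be dumped to JSON. It is a sketch under assumptions: the type name `record` is made up, and `"index": "not_analyzed"` on a `string` field is the older (pre-5.x) way of switching analysis off; newer versions use `"type": "keyword"` and terms aggregations instead of facets. As far as I understand, an array of strings needs no special array type in the mapping, and a mapping change only applies to documents indexed after it, so the data may have to be reindexed.

```python
import json

# Hypothetical type name "record"; "keyPhrases" is the field from my input.
mapping = {
    "record": {
        "properties": {
            "keyPhrases": {
                "type": "string",
                # index each array element as a single, unmodified term
                "index": "not_analyzed",
            }
        }
    }
}

# Terms facet over the same field (old facets API); the size of 50 is arbitrary.
facet_request = {
    "query": {"match_all": {}},
    "facets": {
        "keyPhrases": {"terms": {"field": "keyPhrases", "size": 50}}
    },
}

print(json.dumps(mapping, indent=2))
print(json.dumps(facet_request, indent=2))
```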

I have other fields that keep their spacing and capitalization in the returned facets, just as I want. Those fields are not_analyzed and also store: true, but each of them gets exactly one string per record, as opposed to several per record.

I could just take the top 1 keyPhrase per record and flatten it, but ideally all the tags would work and be available as facets.

Any ideas on how to do this?

