I've got a Tags field in my index, defined as:
"Tags" : {
    "type" : "multi_field",
    "fields" : {
        "Tags" : {
            "type" : "string"
        },
        "Raw" : {
            "type" : "string",
            "index" : "not_analyzed",
            "omit_norms" : true,
            "index_options" : "docs",
            "include_in_all" : false
        }
    }
},
I'm using this field to store a list of tags. Three sample documents:
{"Tags": ["SomeTag", "SomeOtherTag", "Some.Tag3"]},
{"Tags": ["SomeTag", "Some.Tag3"]},
{"Tags": ["SomeTag"]}
This works fine for storing, retrieving and searching.
The problem occurs when I want to retrieve a full list of all tags in the
system (and a count of their usage).
My initial approach was to facet on the Tags field. If I facet on Tags directly
(effectively, the Analysed field), I get...
"sometag": 3,
"someothertag": 1,
"some": 2,
"tag3": 2
Note that Some.Tag3 has been split by the tokenizer into two separate terms
("some" and "tag3"). I also lose the casing (although this is a lesser issue).
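
To illustrate why the analysed field produces those counts, here is a rough Python sketch. It is not Elasticsearch code: `simple_analyze` is a hypothetical stand-in that assumes the analyzer lowercases each value and splits it on any non-alphanumeric character, which is consistent with the output above but may not match the real analyzer in every case. Each term is counted once per document, as a facet would.

```python
import re
from collections import Counter

# The three sample documents from the post.
docs = [
    {"Tags": ["SomeTag", "SomeOtherTag", "Some.Tag3"]},
    {"Tags": ["SomeTag", "Some.Tag3"]},
    {"Tags": ["SomeTag"]},
]


def simple_analyze(text):
    """Hypothetical analyzer stand-in: lowercase, then split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def analyzed_facet(documents):
    """Count, for each analysed term, how many documents contain it."""
    counts = Counter()
    for doc in documents:
        terms = set()
        for tag in doc["Tags"]:
            terms.update(simple_analyze(tag))
        counts.update(terms)  # each term counted once per document
    return dict(counts)


analyzed_counts = analyzed_facet(docs)
print(analyzed_counts)
```

Under that assumption, "Some.Tag3" contributes the two terms "some" and "tag3", which is the split seen in the facet result.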
On the other hand, if I facet on the non-analyzed version of the field, I
get:
"["SomeTag", "SomeOtherTag", "Some.Tag3"]" : 1,
"["SomeTag", "Some.Tag3"]": 1,
"["SomeTag"]": 1
Which is, of course, absolutely correct but not what I want.
So... how can I tell Elasticsearch to treat the field as a list of separate
values, without analysing the contents of each entry?
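
To make the target concrete, here is a small Python sketch of the counts being asked for. Again, this is plain Python and not an Elasticsearch request; `per_entry_facet` is a hypothetical helper that treats each list element as one untouched value and counts the documents that contain it.

```python
from collections import Counter

# The three sample documents from the post.
docs = [
    {"Tags": ["SomeTag", "SomeOtherTag", "Some.Tag3"]},
    {"Tags": ["SomeTag", "Some.Tag3"]},
    {"Tags": ["SomeTag"]},
]


def per_entry_facet(documents):
    """Count each whole tag once per document, with no lowercasing or splitting."""
    counts = Counter()
    for doc in documents:
        counts.update(set(doc["Tags"]))  # the list is split, the entries are not
    return dict(counts)


desired_counts = per_entry_facet(docs)
print(desired_counts)
```

The casing and the dot in "Some.Tag3" survive because only the list itself is broken apart, never the individual entries.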