Aggregation on comma-separated text field

I have a field that contains comma-separated values. I am trying to find the right way to define the tokenizer and analyzer so that I can aggregate on the individual values of the field instead of the entire value.

I was able to do this successfully in 2.3 with the string type. However, when I migrated that mapping to 5.3, the type was changed to text, which behaves differently.

Here is what I tried to get the aggregation I need:

"tokenizer" : {
  "csv_token" : {
	"type" : "pattern",
	"pattern": ","
  }
},
"analyzer": {
	"csv": {
		"type": "custom",
		"tokenizer": "csv_token"
	}
}

"mappings": {
      "MyObject": {
        "properties": {
          "attributes": {
            "properties": {
              "csvvalues": {
                "type": "text",
                "analyzer": "csv"
              },
...

This doesn't seem to make the csvvalues field aggregatable in the Kibana index pattern, so it doesn't show up as an option for a Terms aggregation in Visualize.
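For reference, the equivalent terms aggregation I'd like to be able to run directly against Elasticsearch looks roughly like this (the index name myindex is just a placeholder):

GET myindex/_search
{
  "size": 0,
  "aggs": {
    "csv_values": {
      "terms": {
        "field": "attributes.csvvalues"
      }
    }
  }
}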

Any help will be appreciated!

Note that Kibana/Elasticsearch 5.3 hasn't been released yet. Are you sure that's the version you are using?

I would guess this is happening because, as of 5.0, Elasticsearch doesn't allow aggregations on analyzed text fields out of the box: fielddata is disabled by default. Your best bet is to split the values at index time and store them in a keyword field, which a terms aggregation can bucket on, something like the sketch below.
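A minimal sketch of what that could look like. The pipeline name split_csv and the index name myindex are made up; the field names match your mapping above. Map the field as keyword and split the comma-separated string into an array at ingest time with a split processor:

PUT _ingest/pipeline/split_csv
{
  "processors": [
    {
      "split": {
        "field": "attributes.csvvalues",
        "separator": ","
      }
    }
  ]
}

PUT myindex
{
  "mappings": {
    "MyObject": {
      "properties": {
        "attributes": {
          "properties": {
            "csvvalues": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

Documents indexed with ?pipeline=split_csv then carry csvvalues as an array of keyword values, and a terms aggregation buckets each array element separately, so the field should show up as aggregatable once you refresh the index pattern in Kibana. (You could also split the values in your application before indexing; the mapping stays the same.)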
