Aggregation on comma separated text field


#1

I have a field that contains comma-separated values. I am trying to find the right way to define the tokenizer and analyzer so that I can aggregate on the individual values of the field instead of on the entire value.

I was able to do this successfully in 2.3 with the string type. However, when I migrated that mapping to 5.3, the type was changed to text, which behaves differently.

Here is what I tried in order to get the aggregation I need:

"tokenizer": {
  "csv_token": {
    "type": "pattern",
    "pattern": ","
  }
},
"analyzer": {
  "csv": {
    "type": "custom",
    "tokenizer": "csv_token"
  }
}

"mappings": {
      "MyObject": {
        "properties": {
          "attributes": {
            "properties": {
              "csvvalues": {
                "type": "text",
                "analyzer": "csv"
              },
...

This doesn't seem to make the csvvalues field aggregatable in the Kibana index pattern, so it doesn't appear under the Terms option in Visualize.

Any help will be appreciated!


(Tim Sullivan) #2

Note that Kibana/Elasticsearch 5.3 hasn't been released yet. Are you sure that's the version you are using?

I would guess this is happening because in 5.0+, Elasticsearch doesn't allow aggregations on analyzed text fields by default: fielddata is disabled on them unless you explicitly enable it. Your best bet is to index the individual values as keyword fields and run the terms aggregation on those.
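As a rough sketch of that suggestion (not a tested recipe for this exact index), the comma-separated string could be split on the client before indexing, so that csvvalues is sent as an array and mapped as keyword. The helper names `split_csv_values` and `prepare_document` are hypothetical, and the mapping below keeps only the csvvalues property from the original post; the rest of the real mapping is assumed to stay unchanged. The second dict shows the alternative of keeping the text field with the csv analyzer and turning on fielddata, which uses more heap.

```python
import json


def split_csv_values(raw):
    """Split a comma-separated string into trimmed, non-empty values."""
    return [part.strip() for part in raw.split(",") if part.strip()]


# Option 1 (assumed trimmed-down mapping): store the values as keyword.
# A keyword field accepts an array, and a terms aggregation then
# buckets on each array element separately.
KEYWORD_MAPPING = {
    "mappings": {
        "MyObject": {
            "properties": {
                "attributes": {
                    "properties": {
                        "csvvalues": {"type": "keyword"}
                    }
                }
            }
        }
    }
}

# Option 2: keep the analyzed text field from the original post and
# enable fielddata on it so that it becomes aggregatable.
FIELDDATA_PROPERTY = {"type": "text", "analyzer": "csv", "fielddata": True}


def prepare_document(doc):
    """Return a copy of doc with attributes.csvvalues turned into a list."""
    prepared = dict(doc)
    if "attributes" in prepared:
        attributes = dict(prepared["attributes"])
        raw = attributes.get("csvvalues")
        if isinstance(raw, str):
            attributes["csvvalues"] = split_csv_values(raw)
        prepared["attributes"] = attributes
    return prepared


if __name__ == "__main__":
    # Hypothetical sample document; the values are made up for illustration.
    sample = {"attributes": {"csvvalues": "red, green,blue"}}
    print(json.dumps(prepare_document(sample)))
```

After reindexing with either mapping, the field list of the Kibana index pattern would likely need to be refreshed before csvvalues shows up as aggregatable.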

