Is it possible to count each term of a text instead of the complete text to display in Kibana?

Hi! I'm completely new to ES. I have the time and text of tweets stored in a CSV file, and I'm using Logstash to load them into ES. When I use Kibana's Tag Cloud to display the most common words in the tweets, every tweet is treated as a single term. Is it possible to make Kibana count every term in every tweet instead of counting whole tweets?

EXAMPLE:
[2017-02-01, foo foo bar]
[2017-02-01, foo bar]
[2017-02-01, foo bar]

RESULT:
2 counts: foo bar
1 count: foo foo bar

WHAT I ACTUALLY WANT:
4 counts: foo
3 counts: bar

and then display this in Kibana

I think you need to modify your index configuration and specify an analyzer that suits your needs.

Elasticsearch Reference [5.2] » Analysis » Analyzers

The field should be a normal text field, which is analyzed, and not a
keyword field, which is not. You can define further analysis if you want,
but the standard analyzer should be good for most scenarios.
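
To check how a tweet would be split into terms, the text can be run through the _analyze API. This is only a sketch: it uses the built-in standard analyzer mentioned above and the sample text from the question, and it is not tied to any particular index.

```
# Sketch: show the tokens the standard analyzer produces for one sample tweet
POST _analyze
{
  "analyzer": "standard",
  "text": "foo foo bar"
}
```

The response lists one token per word (foo, foo, bar), which is what a text field indexes. A keyword field would instead keep the whole string as a single term.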

After that, you simply run a terms aggregation over the field, but the
field needs to have fielddata enabled:
https://www.elastic.co/guide/en/elasticsearch/reference/current/fielddata.html
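
A terms aggregation of the kind described above might look like the following sketch. The index name twitter666 and the field name twitter_text are borrowed from the index created later in this thread, top_terms is an arbitrary label chosen here, and the field is assumed to be a text field with fielddata enabled.

```
# Sketch: return the 20 most common terms in the tweet text, without the hits themselves
GET twitter666/_search
{
  "size": 0,
  "aggs": {
    "top_terms": {
      "terms": { "field": "twitter_text", "size": 20 }
    }
  }
}
```

Each bucket's doc_count is the number of tweets that contain the term, not the number of occurrences, so for the example in the question foo would be reported with a count of 3 rather than 4.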

I do not use Kibana or Logstash, so I do not know how those systems are
configured.

I am trying with

PUT twitter/_mapping/twitter_text

{
   "twitter_text": {
      "properties": {
        "publisher": {
          "type": "text",
          "fielddata": false
        }
      }
   }
}

And the mapping says

GET /twitter/_mapping
"twitter": {
      "mappings": {
         "twitter_text": {
            "_all": {
               "enabled": true,
               "norms": false
            },
            "dynamic_templates": [
               {
                  "message_field": {
                     "path_match": "message",
                     "match_mapping_type": "string",
                     "mapping": {
                        "norms": false,
                        "type": "text"
                     }
                  }
               },
               {
                  "string_fields": {
                     "match": "*",
                     "match_mapping_type": "string",
                     "mapping": {
                        "fields": {
                           "keyword": {
                              "type": "keyword"
                           }
                        },
                        "norms": false,
                        "type": "text"
                     }
                  }
               }
            ],

And twitter_text is still not aggregatable. What am I doing wrong?

I figured out that I had to create the index, with its mapping, before populating it. I created it like this:

PUT twitter666
{
  "mappings": {
    "tweet": {
      "_all": { "enabled": false },
      "properties": {
        "twitter_text": { "type": "text", "fielddata": true },
        "user":         { "type": "text" },
        "country":      { "type": "text" },
        "latitude":     { "type": "float" },
        "longitude":    { "type": "float" }
      }
    }
  }
}
