Confused about how to use .raw fields and not analyze string fields

mackenza · August 26, 2015, 12:51pm

I apologize if this is too newbie a question but I am struggling with setting up a terms visualization in Kibana 4 because the string field with the terms is analysed.

I am using logstash with the elasticsearch output to populate my index. In my searching around the web, my understanding is that the default logstash template for ES creates a multi-field for each string field where one of the fields is .raw. So somewhere in my index is the ability to use a not_analyzed version of my field. I think I get that.

What I am really struggling with is how to make that .raw field show up in my field list in Kibana. I have searched around for it and there are vague references to "make the .raw known to Kibana and refresh field list" but I can't for the life of me figure out how to do that. I have also seen references to creating a new template for Logstash to replace the default, but there hasn't been enough detail for me to figure that out.

I am hoping someone with patience can help me out here? Thanks in advance.

tbragin · August 27, 2015, 5:11am

What Kibana version are you using?

Assuming it's Kibana 4, you won't see the raw values in "Settings >> Indices" and "Discover", but you should see them in metric selection of "Visualize" (see screenshot). The reason for that is that only Visualize deals with aggregations, where raw values matter.

If you don't see them there, I'd suggest the following:

Make sure you're using the default indexing template that comes with Logstash (you shouldn't have to modify it).
If you do have to re-index your data, refresh your mappings in Kibana (or delete and re-create the index pattern).
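
To illustrate why raw values matter for aggregations, here is a minimal Python sketch. It is not Elasticsearch code: the `analyze` function is a rough stand-in for an analyzer (lowercase, split on non-alphanumerics), and the sample values are hypothetical. It only shows that a terms count over an analyzed field works on individual tokens, while a not_analyzed field keeps each value whole.

```python
import re
from collections import Counter

def analyze(value):
    """Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]

def terms_on_analyzed(values):
    # Each document counts once for every distinct token it contains.
    counts = Counter()
    for value in values:
        counts.update(set(analyze(value)))
    return counts

def terms_on_raw(values):
    # not_analyzed: the whole value is a single term.
    return Counter(values)

# Hypothetical field values, one per document.
user_agents = ["Mozilla Firefox", "Mozilla Firefox", "Google Chrome"]
analyzed = terms_on_analyzed(user_agents)
raw = terms_on_raw(user_agents)

print(analyzed.most_common())  # buckets are single words
print(raw.most_common())       # buckets are the original strings
```

With the analyzed field, a terms visualization would show buckets such as "mozilla" and "firefox" rather than "Mozilla Firefox", which is usually not what you want.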

mackenza · August 27, 2015, 10:19am

Thank you for the reply. I don't see those raw fields as you show. I am
wondering if it's because my indices are not of the pattern logstash-*? I
will rename my indices and see.

Again tyvm

tbragin · August 27, 2015, 6:48pm

Index name shouldn't matter, to be honest.

mackenza · August 27, 2015, 7:01pm

It did... turns out the default elasticsearch template in Logstash has a
filter on it of logstash-*. When I changed the index names, all was good
and I can see the .raw fields now.

Thanks for your help!

tbragin · August 28, 2015, 3:49am

Cool! That makes sense. Thanks for the update

jerrac · October 23, 2015, 6:06pm

I have a similar issue. Did you change the default index pattern in the Kibana 4 settings to logstash-*? Or did you change something in the template that Logstash uses when it sends data to Elasticsearch?

mackenza · October 23, 2015, 6:37pm

@jerrac I had to rename my indices to match logstash-*. This meant changing my Logstash config to output to indices with that name pattern, so where I had events as the index in my Elasticsearch output, I had to change it to logstash-events. Kibana can only register indices that exist, so just changing Kibana isn't enough.

The reason is that the default template that ships with Logstash only adds the raw fields to indices that match the pattern logstash-*. You could change your default template to match all indices, but that is much harder than just renaming them to match.
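
As a quick illustration of that pattern filter, here is a small Python sketch. It uses `fnmatchcase` only as an approximation of how a template pattern containing `*` is compared to an index name; the helper name `template_applies` is made up for this example.

```python
from fnmatch import fnmatchcase

# Pattern used by the default Logstash template, as described above.
TEMPLATE_PATTERN = "logstash-*"

def template_applies(index_name, pattern=TEMPLATE_PATTERN):
    """Approximate check: does the index name match the template pattern?"""
    return fnmatchcase(index_name, pattern)

for name in ["events", "logstash-events", "logstash-2015.10.23"]:
    print(name, template_applies(name))
```

An index called events never picks up the template, so it gets no .raw sub-fields; logstash-events does.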

jerrac · October 23, 2015, 7:04pm

Hmm... All my indices are already named logstash-YYYY-MM-DD.

I do have some .raw fields. So maybe this isn't my problem. Thanks for the reply.

mackenza · October 23, 2015, 7:44pm

I see. Specifically, my problem was that I did not see any fieldname.raw fields in Kibana. This was a direct result of not naming my indices with the logstash-* pattern. As soon as I renamed them and rebuilt them in Elasticsearch, I was fine.

jerrac · October 24, 2015, 12:10am

Heh. Well, it was user error on my part. I needed to refresh my window...

jhnlsn · October 30, 2015, 7:45pm

In my case, it does not look like Logstash ships the template over to Elasticsearch when using the HTTP ES plugin. I don't even have a logstash template on my ES index. Not sure why, or what the _template is supposed to be?

zappe · November 20, 2015, 2:56pm

Is that the only option, naming them logstash-*?

zappe · November 27, 2015, 11:05am

Bump.

mackenza · November 27, 2015, 1:43pm

The alternative is to change the default template to not implement that filter. I have no idea how that is done, though. I think I saw it one time on StackOverflow but I just went with the flow and renamed my indices.

zappe · November 27, 2015, 2:15pm

Thanks for the info!

Anybody else know how to change that?

vtst2412 · November 27, 2015, 6:20pm

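If you don't want to rename your indices, you can add a new index template whose pattern matches them and which carries the same dynamic templates.

For example, if my indices are psm-*, I would PUT the "psm1" object below to /_template/psm1 (it is shown wrapped in its name, the way GET /_template returns it):

 {
     "psm1" : {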
         "order" : 0,
         "template" : "psm-*",
         "settings" : {
             "index" : {
                 "refresh_interval" : "5s"
             }
         },
         "mappings" : {
             "_default_" : {
                 "dynamic_templates" : [{
                         "message_field" : {
                             "mapping" : {
                                 "index" : "analyzed",
                                 "omit_norms" : true,
                                 "type" : "string",
                                 "fields" : {
                                     "raw" : {
                                         "ignore_above" : 256,
                                         "index" : "not_analyzed",
                                         "type" : "string"
                                     }
                                 }
                             },
                             "match_mapping_type" : "string",
                             "match" : "message"
                         }
                     }, {
                         "string_fields" : {
                             "mapping" : {
                                 "index" : "analyzed",
                                 "omit_norms" : true,
                                 "type" : "string",
                                 "fields" : {
                                     "raw" : {
                                         "ignore_above" : 256,
                                         "index" : "not_analyzed",
                                         "type" : "string"
                                     }
                                 }
                             },
                             "match_mapping_type" : "string",
                             "match" : "*"
                         }
                     }
                 ],
                 "_all" : {
                     "omit_norms" : true,
                     "enabled" : true
                 },
                 "properties" : {
                     "geoip" : {
                         "dynamic" : true,
                         "type" : "object",
                         "properties" : {
                             "location" : {
                                 "type" : "geo_point"
                             }
                         }
                     },
                     "@version" : {
                         "index" : "not_analyzed",
                         "type" : "string"
                     }
                 }
             }
         },
         "aliases" : {}

     }
 }

Now this is a pretty boilerplate template and you can customize it further to fit your needs. But the important stanza is:

 {
     "string_fields" : {
         "mapping" : {
             "index" : "analyzed",
             "omit_norms" : true,
             "type" : "string",
             "fields" : {
                 "raw" : {
                     "ignore_above" : 256,
                     "index" : "not_analyzed",
                     "type" : "string"
                 }
             }
         },
         "match_mapping_type" : "string",
         "match" : "*"
     }
 }

This gives you an extra "raw" sub-field that is not_analyzed for every string field in the index whose name matches the wildcard "*". You can narrow that "match" pattern if you only want certain string fields to have a "raw" sub-field.
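
If you prefer to generate the template rather than hand-edit JSON, here is a hedged Python sketch. It builds a trimmed-down body with only the string_fields stanza, using the string / not_analyzed mapping syntax from this thread. The function name `build_raw_template` and the "host*" field pattern are hypothetical; adjust both to your setup.

```python
import json

def build_raw_template(index_pattern, field_match="*"):
    """Build an index template body that adds a not_analyzed .raw sub-field.

    index_pattern: which indices the template applies to, e.g. "psm-*".
    field_match:   which string field names get the .raw sub-field.
    """
    return {
        "template": index_pattern,
        "mappings": {
            "_default_": {
                "dynamic_templates": [{
                    "string_fields": {
                        "match": field_match,
                        "match_mapping_type": "string",
                        "mapping": {
                            "type": "string",
                            "index": "analyzed",
                            "omit_norms": True,
                            "fields": {
                                "raw": {
                                    "type": "string",
                                    "index": "not_analyzed",
                                    "ignore_above": 256,
                                }
                            },
                        },
                    }
                }]
            }
        },
    }

# Only string fields whose name starts with "host" get a .raw sub-field here.
body = build_raw_template("psm-*", field_match="host*")
print(json.dumps(body, indent=2))
```

The printed JSON is what you would send as the body of the PUT request. Remember that a template only affects indices created after it is stored, so existing indices need to be re-created or re-indexed.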

ayasha88 · December 1, 2015, 10:20am

Hi, I have created an index like this:

<connectionString value="Server=10.30.1.63;Index=logstash;Port=9200;rolling=true"/>

and then I use the following fields:

<param name="ConversionPattern" value="%date - %level - %message %property{mstimeload} %property{applicationid} %property{applicationid} %property{page} 
           %property{ipclient} %property{browser} %property{browsersignature} %property{appversion} %property{sessionuniquecodetag} %property{globalcountertailsloaded} 
           %property{ipserveraddress} %newline" />

In Kibana I don't see any .raw fields. I would like some of the fields to be "not_analyzed". My index looks like this:

{
   "logstash-2015.12.01": {
   "aliases": {},
   "mappings": {
     "logEvent": {
        "properties": {
           "className": {
              "type": "string"
            },
           "domain": {
              "type": "string"
           },
           "exception": {
               "type": "object"
           },
           "fileName": {
              "type": "string"
           },
           "fix": {
              "type": "string"
           },
          "fullInfo": {
             "type": "string"
           },
           "hostName": {
              "type": "string"
           },
           "identity": {
              "type": "string"
           },
           "level": {
              "type": "string"
           },
           "lineNumber": {
              "type": "string"
           },
           "loggerName": {
              "type": "string"
           },
           "message": {
              "type": "string"
           },
          "messageObject": {
             "type": "object"
          },
          "methodName": {
             "type": "string"
          },
         "properties": {
             "properties": {
                "@timestamp": {
                    "type": "date",
                    "format": "strict_date_optional_time||epoch_millis"
                 },
                "applicationid": {
                   "type": "string"
                },
                "appversion": {
                   "type": "string"
                },
                "browser": {
                   "type": "string"
                },
               "browsersignature": {
                   "type": "string"
                },
               "ipclient": {
                   "type": "string"
                },
               "ipserveraddress": {
                  "type": "string"
                },
               "log4net:HostName": {
                 "type": "string"
               },
               "log4net:Identity": {
                 "type": "string"
               },
              "log4net:UserName": {
                 "type": "string"
               },
              "mstimeload": {
                  "type": "string"
               },
               "page": {
                  "type": "string"
                },
               "sessionuniquecodetag": {
                  "type": "string"
               }
            }
         },
        "threadName": {
             "type": "string"
        },
        "timeStamp": {
            "type": "date",
            "format": "strict_date_optional_time||epoch_millis"
         },
         "userName": {
          "type": "string"
       }
      }
    }
  },
 "settings": {
      "index": {
       "creation_date": "1448964809178",
       "number_of_shards": "5",
       "number_of_replicas": "1",
       "uuid": "fFRoaMN4QDSFNCta4lV_QQ",
       "version": {
          "created": "2000099"
       }
    }
 },
 "warmers": {}
  }
 }

How can I do it?
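
A mapping like the one above can be checked mechanically. The following Python sketch walks a "properties" dictionary and lists the string fields that have no "raw" sub-field. The helper name is made up, and the sample mapping is a hand-trimmed copy of a few fields from the output above.

```python
def string_fields_without_raw(properties, prefix=""):
    """Yield dotted paths of string fields that lack a "raw" sub-field."""
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object field: recurse into its child fields.
            yield from string_fields_without_raw(spec["properties"], path + ".")
        elif spec.get("type") == "string" and "raw" not in spec.get("fields", {}):
            yield path

# Trimmed copy of the "logEvent" properties shown above.
mapping = {
    "level": {"type": "string"},
    "timeStamp": {"type": "date"},
    "properties": {
        "properties": {
            "browser": {"type": "string"},
        }
    },
}

missing = sorted(string_fields_without_raw(mapping))
print(missing)  # ['level', 'properties.browser']
```

Every string field shows up in the list, which matches what Kibana reports: no template added the .raw sub-fields when this index was created.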

vtst2412 · December 1, 2015, 1:28pm

As this is a logstash-* index, I'm surprised that it is not using the default Logstash dynamic template (which gives you a .raw field for each string field). You can modify the template I posted above by changing "template" : "psm-*" to "template" : "logstash-*". Then any new indices created with a logstash-* name will start using that template, and all your string fields will have a .raw field. I am not sure about the nested properties field in your index, though: the wildcard match might handle the nested string fields, or it might not.

I'm also curious to see why the default logstash template is not being used here for your index. Can you run this and provide us with the output (formatted or Github Gist would be nice)?

GET /_template

ayasha88 · December 1, 2015, 2:05pm

It looks like my default template is packetbeat-*.. how is this possible??
{
  "packetbeat": {
    "order": 0,
    "template": "packetbeat-*",
    "settings": {
      "index": {
        "refresh_interval": "5s"
      }
    },
    "mappings": {
      "_default_": {
        "dynamic_templates": [
          {
            "template1": {
              "mapping": {
                "ignore_above": 1024,
                "index": "not_analyzed",
                "type": "{dynamic_type}",
                "doc_values": true
              },
              "match": "*"
            }
          }
        ],
        "_all": {
          "norms": {
            "enabled": false
          },
          "enabled": true
        },
        "properties": {
          "request": {
            "norms": {
              "enabled": false
            },
            "index": "analyzed",
            "type": "string"
          },
          "client_location": {
            "type": "geo_point"
          },
          "response": {
            "norms": {
              "enabled": false
            },
            "index": "analyzed",
            "type": "string"
          },
          "query": {
            "index": "not_analyzed",
            "type": "string",
            "doc_values": true
          },
          "params": {
            "norms": {
              "enabled": false
            },
            "index": "analyzed",
            "type": "string"
          },
          "timestamp": {
            "type": "date"
          }
        }
      }
    },
    "aliases": {}
  }
}
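
Output like this can be checked against an index name to see which stored templates would apply. The Python sketch below takes the parsed GET /_template response as a dictionary. It assumes the packetbeat pattern is "packetbeat-*" and approximates the pattern match with `fnmatchcase`; the function name is hypothetical.

```python
from fnmatch import fnmatchcase

def matching_templates(templates, index_name):
    """Return the names of templates whose "template" pattern matches index_name."""
    return sorted(
        name
        for name, body in templates.items()
        if fnmatchcase(index_name, body.get("template", ""))
    )

# Reduced form of the GET /_template output: only the fields the check needs.
templates = {"packetbeat": {"order": 0, "template": "packetbeat-*"}}

print(matching_templates(templates, "logstash-2015.12.01"))    # []
print(matching_templates(templates, "packetbeat-2015.12.01"))  # ['packetbeat']
```

The empty result for logstash-2015.12.01 suggests the packetbeat template is not a default at all: it simply is the only template stored, and no logstash template was ever installed, which would explain the missing .raw fields.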
