Field maxlength?


(Craig Foote) #1

I have a java log with some entries having stacktraces. I'm using the multiline option in filebeat and a grok filter in logstash with the pattern ending in GREEDYDATA, i.e.:

SNIP%{GREEDYDATA:logmessage}

This works fine, even though logmessage sometimes has up to 100 lines. In Kibana's Discover tab I see the whole stacktrace, but in visualizations the field appears empty, as if it contained no data. Could this be caused by a maximum field length in Kibana visualizations? I'm using version 4.4.0.


(Lee Drengenberg) #2

I'm going to try to reproduce your issue but need to create some test data... But let me know if you already figured it out.


(Lee Drengenberg) #3

Can you show a screenshot of what you see? Do you have the .raw field for the long string field?
In my test I don't (yet) and so I only get the first word of the long string field instead of the whole string.


(Lee Drengenberg) #4

Once I added the .raw field, I could see a very long string (over 4000 characters) in a data table visualization. I also tried one with a string over 8000 characters, and it all shows up.


(Craig Foote) #5

Thanks Lee. BTW, I'm speaking with Jay Greenberg about this. Sorry about the crosspost. The only difference I see compared to yours is that mine is a Java stacktrace, so it typically starts with something like "org.something.something.SomeException: some message. \n\t some other lines". That, and one stacktrace is 14000 characters.


(Jay Greenberg) #6

There is a maximum term length under Lucene.

You can avoid this behaviour by using the ignore_above setting to ensure you come in under the limit. Something like this should work:

{
  "exception": {
    "mapping": {
      "type": "string",
      "fields": {
        "raw": {
          "ignore_above": 10922,
          "index": "not_analyzed",
          "type": "string"
        }
      }
    }
  }
}

From the docs:

The value for ignore_above is the character count, but Lucene counts bytes. If you use UTF-8 text with many non-ASCII characters, you may want to set the limit to 32766 / 3 = 10922 since UTF-8 characters may occupy at most 3 bytes.
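
To make the arithmetic in that quote concrete, here is a minimal Python sketch (not part of Elasticsearch or Kibana) that compares a string's UTF-8 byte size with the 32766-byte term limit mentioned above. The helper names and the sample strings are made up for illustration; only the numbers 32766, 3 and 10922 come from the docs quote.

```python
# Term limit in bytes, taken from the docs quote above.
LUCENE_MAX_TERM_BYTES = 32766

# Conservative character limit from the same quote: 32766 / 3, rounded down.
SAFE_IGNORE_ABOVE = LUCENE_MAX_TERM_BYTES // 3


def utf8_len(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_in_one_term(text: str) -> bool:
    """True if the whole text could be stored as a single not_analyzed term."""
    return utf8_len(text) <= LUCENE_MAX_TERM_BYTES


# Hypothetical sample data: a 14000-character ASCII "stacktrace" and a
# string of the same character count made of 3-byte characters.
ascii_trace = "x" * 14000
cjk_trace = "例" * 14000

print(SAFE_IGNORE_ABOVE)            # 10922
print(utf8_len(ascii_trace))        # 14000
print(fits_in_one_term(ascii_trace))  # True
print(utf8_len(cjk_trace))          # 42000
print(fits_in_one_term(cjk_trace))  # False
```

Note that a pure-ASCII stacktrace of 14000 characters is under the byte limit but over 10922 characters, so with the mapping above it would presumably be skipped by ignore_above for the .raw field rather than indexed.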

