Ad hoc query regex (this time with sample data)

Here is all the consolidated information from the many splintered threads I've posted over the last week or two:

Here are my software versions:

  • logstash-5.2.2-1.noarch
  • elasticsearch-5.2.1-1.noarch
  • kibana-5.2.1-1.x86_64

Here is my as-basic-as-I-can-make-it Logstash config:

input {
  file {
    path => ["/var/local/test-logs/alb/alb-core.log"]
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter { grok { match => { "message" => "%{GREEDYDATA}" } } }
output { elasticsearch { hosts => ["localhost:9200"] } }

Here is the single line in the input file:

Feb 24 03:48:11 myServer alb-core: 2017-02-24 03:48:02;149 INFO T[pool-32-thread-1] net.myproject.api.messaging.RedisService: Redis Service Message Received - Host: Channel: bigbluebutton:meeting:participants Message: {"timestamp":"1487908082148","externalUserId":"1234567890@foo","internalUserId":"1234567890@foo","meetingId":"ea02e4418fd0709572417991578c281913f2085c296486c0c1d40f284fd33d9c-1487905970320","guest":"false","role":"MODERATOR","messageId":"UserJoinedEvent","fullname":"Doe, John"}

Here are the Analyzers I have tried by going into Kibana Management - Advanced Settings and editing query:queryString:options:

  • Standard
  • Simple
  • Whitespace
  • English

Here is my first problem:

  1. Super basic regex queries flat out don't work, regardless of analyzer.

This works:


This returns 0 results


I have no idea how to make this any more simple to isolate the problem.

Could you try using /.*meetingId.*/ to search for it and check if that reveals any results?


Returns 0 results

The field value seems rather long. I think it might be over the default length for ignore_above and thus not indexed for search. If you try to put a shorter document in there (usually below 256 chars), would that be found containing that string?

If so, you should most likely adjust the ignore_above value for that field in that index via the mapping for that index, if you know it will contain long values.
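To make the ignore_above idea concrete, here is a minimal Python sketch. It only models the length check (values longer than the limit are skipped for keyword fields) and prints a mapping fragment with a raised limit. The field name "message", its "keyword" subfield, and the helper name are assumptions for illustration; check the actual mapping of your index before relying on any of them.

```python
import json

# 256 is the limit mentioned above; 2048 is the value tried later in this thread.
DEFAULT_IGNORE_ABOVE = 256


def is_indexed_as_keyword(value, ignore_above=DEFAULT_IGNORE_ABOVE):
    """Model of ignore_above: keyword values longer than the limit are not indexed."""
    return len(value) <= ignore_above


# Hypothetical mapping fragment: a text field with a keyword subfield whose
# ignore_above is raised. Field and subfield names are assumptions.
mapping = {
    "properties": {
        "message": {
            "type": "text",
            "fields": {
                "keyword": {"type": "keyword", "ignore_above": 2048}
            },
        }
    }
}

short_value = "x" * 215   # same length as the shortened log line in this thread
long_value = "x" * 600    # stand-in for a long log line

print(is_indexed_as_keyword(short_value))                    # within the limit
print(is_indexed_as_keyword(long_value))                     # over the limit
print(is_indexed_as_keyword(long_value, ignore_above=2048))  # within the raised limit
print(json.dumps(mapping, indent=2))
```

Note that ignore_above is a setting on keyword fields; an analyzed text field is tokenized regardless of length, so this only matters when the query targets the keyword version of the field.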

I did a reindex of .kibana to tmp, then I deleted the .kibana index, then used the mappings API to change all ignore_above from 256 to 2048, then checked the tmp index to verify the change, then I reindexed tmp back to .kibana, then checked my new .kibana index, and the values are back to 256!

Reindex forces ignore_above back to 256. That can't possibly be by design, can it?

OK I cut down my single input line to this

"internalUserId":"1234567890@foo","meetingId":"ea02e4418fd0709572417991578c281913f2085c296486c0c1d40f284fd33d9c-1487905970320","guest":"false","role":"MODERATOR","messageId":"UserJoinedEvent","fullname":"Doe, John"

215 characters total. The basic regex still returns zero results:


OK I isolated the problem:

1.) I took out literally everything from the log message but this


Both queries work for this log message:

  • "meeting"
  • /meeting/

2.) I changed the log message to


This query does NOT work:

  • /meetingI/

This query DOES work:

  • /meeting./

Something about the capital letter I is screwing up the regex query.

3.) I put the log line back to a medium length

"internalUserId":"1234567890@foo","meetingId":"ea02e4418fd0709572417991578c281913f2085c296486c0c1d40f284fd33d9c-1487905970320","guest":"false","role":"MODERATOR","messageId":"UserJoinedEvent","fullname":"Doe, John"

This search works:

  • /meeting.d/

This search does NOT work:

  • /meetingId/

The analyzer lowercases everything!

This works:

  • meetingid

Hey Brandon,

Yeah, if you always applied the Standard analyzer (which includes the lowercase token filter), the text will be lowercased.

You can check whether the index has a message.keyword or message.raw field; that would contain the unanalyzed (and thus not lowercased) value of the field for you to regex on.
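A rough Python sketch of why the results above come out the way they do. It assumes two things: the analyzed field is split into lowercased tokens (approximated here with a simple alphanumeric split, which is not the real standard tokenizer), and a regexp query has to match an entire indexed term rather than a substring of it. The function names and the shortened sample line are made up for illustration.

```python
import re


def approx_standard_analyze(text):
    """Crude stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def regexp_hits_analyzed(pattern, text):
    """Analyzed field: the pattern must match one whole lowercased token."""
    return any(re.fullmatch(pattern, tok) for tok in approx_standard_analyze(text))


def regexp_hits_keyword(pattern, text):
    """Unanalyzed field: the whole original value is a single term."""
    return re.fullmatch(pattern, text) is not None


# Shortened, hypothetical version of the log line from this thread.
line = '"internalUserId":"1234567890@foo","meetingId":"ea02-1487905970320","guest":"false"'

for pattern in ["meetingId", "meeting.d", "meetingid", ".*meetingId.*"]:
    print(pattern,
          "analyzed:", regexp_hits_analyzed(pattern, line),
          "keyword:", regexp_hits_keyword(pattern, line))
```

Under these assumptions, /meetingId/ and /.*meetingId.*/ miss on the analyzed field because the indexed token is "meetingid", while /meeting.d/ hits because "." matches the lowercased "i". On an unanalyzed field the case is preserved, but the pattern then has to cover the whole value, so /.*meetingId.*/ is the form that would match there.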


