I am indexing about 520 GB of log files into Elasticsearch per day. At this stage I am only keeping 24 hours of data (the eventual goal is 7 days).
Searches can be very slow, especially when I need to search a large field like @message, where a query can take up to 45 seconds. Times improve if I do not need to use asterisks: the search drops from 45 seconds to 9 seconds. If I select which index to search, it drops to 0.51 seconds (no asterisks) or 12.9 seconds (with asterisks); times vary. Unfortunately, some users search for generic strings that require us to append asterisks to find results.
What can I do to improve these queries? I need to deal with the need for asterisks and route each user to the appropriate index. Should I try index routing? Are there any good example templates?
Here is an example @message:
A|aBCdef|Jan 22 08:32:26 2013|log.sample.app.call.SampleSvr|12345|node| 123456|bar |CodeName.cpp|123|***** START OF A LONG MESSAGE *****|12345.0123
Here is an example query:
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "@message",
            "query": "SampleSvr"
          }
        }
      ],
      "must_not": [ ],
      "should": [ ]
    }
  },
  "from": 0,
  "size": 50,
  "sort": [ ],
  "facets": { }
}
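
To make the "with asterisks" and "no asterisks" cases concrete, here is a rough Python sketch that builds both request bodies. The helper name `build_message_query` is made up for illustration, and wrapping the term as `*term*` is only an assumption about how the asterisks get appended; the rest mirrors the query above.

```python
import json


def build_message_query(term, wildcard=False, size=50):
    """Build the search body shown above for the @message field.

    When wildcard is True the term is wrapped in asterisks (*term*),
    which is an assumed form of the "append asterisks" case.
    """
    query = "*{0}*".format(term) if wildcard else term
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "query_string": {
                            "default_field": "@message",
                            "query": query,
                        }
                    }
                ],
                "must_not": [],
                "should": [],
            }
        },
        "from": 0,
        "size": size,
        "sort": [],
        "facets": {},
    }


if __name__ == "__main__":
    # Print both variants so they can be pasted into a search request.
    print(json.dumps(build_message_query("SampleSvr"), indent=2))
    print(json.dumps(build_message_query("SampleSvr", wildcard=True), indent=2))
```

The only difference between the two bodies is the `query` string, so any difference in timing comes from the wildcard expansion rather than from the rest of the request.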