We're starting to use Elastic for application log storage, and we created a template specifically for logs. To reduce disk usage as much as possible, this template disables the '_all' field, and the only analyzed field is 'message'; all other fields are numbers or keywords. For that reason the default field was changed to 'message', which is the only field where it still makes sense to search by typing plain text (without specifying a field).
The problem is that we also send logs made of complex JSON objects, so the 'message' field is not always present. Since it is the default field, right after you open Kibana you see no documents where 'message' is missing, which is confusing to users. To make them appear, you can't use just any filter; you must use a query-string-based filter, even one that is always true, like '_type:log'.
Commands to reproduce:
PUT /_template/my-template
{
  "template": "my-index",
  "settings": {
    "index.query.default_field": "message"
  },
  "mappings": {
    "log": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "tags": {
          "type": "keyword"
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
POST /my-index/log
{
  "message": "I have text",
  "@timestamp": "2017-07-10T00:00:00Z"
}
POST /my-index/log
{
  "message": null,
  "@timestamp": "2017-07-10T00:00:01Z"
}
POST /my-index/log
{
  "@timestamp": "2017-07-10T00:00:02Z"
}
POST /my-index/log
{
  "message": "",
  "@timestamp": "2017-07-10T00:00:03Z"
}
Now, if you query for everything, as Kibana does right after it is opened, you get only the first and fourth documents:
GET /my-index/_search?q=*
If you want to see them all, some filter is mandatory:
GET /my-index/_search?q=_type:log
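For comparison, a request-body search with 'match_all' does not go through the query string parser, so as far as I can tell it should not be affected by 'index.query.default_field'. This is only a sketch, and I'm assuming the query string parsing is the relevant difference from what Kibana sends:
GET /my-index/_search
{
  "query": {
    "match_all": {}
  }
}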
Of course I could use Logstash to force this field to never be null, but that feels hackish. Also, text (analyzed) fields don't support the 'null_value' setting the way keyword fields do. I'd prefer to keep 'message' null, because then '_exists_:message' would be a useful query, but that doesn't seem to be possible.
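To illustrate the kind of query I'd like to keep useful, here is a rough sketch that looks for logs without a 'message', using the 'exists' query inside a 'bool' / 'must_not' (I have not checked how it treats the empty-string document):
GET /my-index/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "message"
        }
      }
    }
  }
}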
Is there any other solution to this, something cleaner than forcing this field to a blank value (or enabling '_all')?