No data for query "exists": { "field": "event.original" }

I'm sending web access logs from an Nginx server to Elastic. Everything works.
I can see all the data in Kibana correctly.

However, when I try to add
"exists": { "field": "event.original" }
I'm getting no data back.

What am I missing?
It looks like an Elasticsearch bug.

However, the same query works for other fields.

Can you try "exists": { "field": "event.original.keyword" } and see if that works?
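
In case a concrete example helps, here's a minimal sketch of that variant (assuming it's run from Kibana Dev Tools against the same filebeat-* pattern):

GET filebeat-*/_search
{
  "query": {
    "exists": { "field": "event.original.keyword" }
  }
}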

If you are ingesting with Filebeat, the default mapping for event.original is:

"original": {
  "ignore_above": 1024,
  "index": false,
  "type": "keyword",
  "doc_values": false
}

The "index": false means that the field can't be used for a filter, and only can retrieved from _source.

You might want to look at the message field; it generally contains the same information and is indexed by default.


Tried it.
No, it doesn't work. I'm getting "No results match your search criteria"...

Let's back up a bit... what are you trying to accomplish?

The event.original is parsed into all the separate fields... those can be searched on, and event.original can be returned from _source if needed, as @BenB196 stated.

If you are trying to find docs that don't have event.original, you won't have any of the other fields either, so pick one of those other fields and test whether it does not exist (see the sketch below).
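
For example, something like this (a sketch; the filebeat-* pattern and source.ip are just stand-ins for whatever fields your pipeline actually produces):

GET filebeat-*/_search
{
  "query": {
    "bool": {
      "must_not": [
        { "exists": { "field": "source.ip" } }
      ]
    }
  }
}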

BTW, pretty sure that nginx logs do not have a message field.

List of fields...

So back to my first question...what are you trying to accomplish / figure out?

There is no message field. I see that the nginx module creates an ingest pipeline which actually renames message to event.original:
pipeline.yml

- rename:
    field: message
    target_field: event.original
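
For anyone who wants to double-check their own installed pipeline, it can be fetched with the get pipeline API (the wildcard below is only a guess at the module's pipeline name):

GET _ingest/pipeline/filebeat-*-nginx-access*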

Generally, what I'm trying to do is build a web analytics dashboard using a non-Elastic application.
I'm already using Filebeat to send all web access logs to Elastic.
I built a basic web analytics dashboard, but ES has big limitations for web analytics:

  • I can't get session counts, session length, returning users, etc.
  • There is no good set of filters to filter out all the bots in the logs. I can do a basic one, but it's a moving target.

So I decided to create a daily cron job that pulls all web access data from ES and feeds it to Matomo. I only pull the event.original field.
It generally works, but the shell script I wrote to pull that data sometimes fails, because some of the events don't have "event.source". So I tried to write a query to make sure that the events I pull have event.original, but such a query fails.

Here is the daily cron script:

NAME="meshumad.com-$DATE"
curl -XGET "https://***.westus2.azure.elastic-cloud.com:9243/filebeat-*/_search?size=10000" -u elastic:*** -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gte": "'$DATE'T00:00:00",
              "lte": "'$DATE'T23:59:59"
            }
          }
        },
        {
          "match_phrase": {
            "container.name": "meshumad.com"
          }
        },
        {
          "match_phrase": {
            "http.request.method": "GET"
          }
        },
        {
          "exists": {
            "field": "source.ip"
          }
        },
        {
          "exists": {
            "field": "event.original"
          }
        }
      ],
      "must_not": [
        {
          "match_phrase": {
            "url.original": "/csrftoken"
          }
        }
      ]
    }
  },
  "fields": [
    "event.original"
  ],
  "sort": [
    {
      "@timestamp": "asc"
    }
  ],
  "_source": false
}' > $NAME.raw
cat $NAME.raw | jq -r '.hits.hits[] | if has("fields") then .fields."event.original"[] else .ignored_field_values."event.original"[] end' > $NAME.log
sudo docker exec matomo-app python3 /var/www/html/misc/log-analytics/import_logs.py --url=http://localhost --login=slavik --password=*** --add-sites-new-hosts  --recorders=1 /import/*.log

The error I was getting:

jq: error (at <stdin>:0): Cannot iterate over null (null)

And when I debugged it, I found that jq fails because some of the entries have neither fields."event.original" nor ignored_field_values."event.original".

For example, I found this entry in the result:

{"_index":"filebeat-7.9.0-2022.01.08-000017","_type":"_doc","_id":"lYzMsn4BnBx-UCJhOtim","_score":null,"sort":[1643677228000]}

After some more troubleshooting, I found that one of the web servers had an older version of Filebeat with a customized pipeline that didn't have event.original.

When I excluded that host, everything works without issues.
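
In case it helps someone else, one way to do that exclusion directly in the query is an extra clause in the existing "must_not" array (a sketch; the host.name value is just a placeholder):

        {
          "match_phrase": {
            "host.name": "old-filebeat-host"
          }
        }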

Issue resolved for my case.

Even though I'm still puzzled why "exists": { "field": "event.original" } doesn't work, BenB196's explanation makes sense, so I guess that's why.

