[SOLVED] Pipeline: field [timestamp] not present as part of path [timestamp]

Hello,

I have a weird issue using an ingest pipeline. When I parse with a single grok processor, all the fields are parsed properly (including timestamp). However, if I add a subsequent "date" processor, I get "[timestamp] not present as part of path [timestamp]".

Here is a test pipeline, tested against docker.elastic.co/elasticsearch/elasticsearch:5.3.3, which reproduces what we experience in our 5.3.3 (non-Docker) cluster.

This simulation works perfectly:

curl -u elastic:changeme 'localhost:9200/_ingest/pipeline/_simulate?pretty&verbose' -d'
{
    "pipeline": {
        "description": "Grok ingestion pipeline nginx logs",
        "version": 0,
        "processors": [
            {
                "grok": {
                    "field": "message",
                    "trace_match": true,
                    "patterns": [
                        "%{COMBINEDAPACHELOG}%{GREEDYDATA:additional_fields}"
                    ]
                }
            }
        ]
    },
    "docs": [
        {
            "_index": "index",
            "_type": "nginx",
            "_id": "id",
            "_source": {
                "message": "1.2.3.4 - - [28/Mar/2018:18:21:44 +0200] \"GET / HTTP/1.1\" 302 213 \"-\" \"Mozilla\""
            }
        }
    ]
}'

and yields:

"processor_results" : [
{
    "doc" : {
    "_id" : "id",
    "_type" : "nginx",
    "_index" : "index",
    "_source" : {
        "ident" : "-",
        "verb" : "GET",
        "additional_fields" : "",
        "message" : "1.2.3.4 - - [28/Mar/2018:18:21:44 +0200] \"GET / HTTP/1.1\" 302 213 \"-\" \"Mozilla\"",
        "response" : "302",
        "httpversion" : "1.1",
        "timestamp" : "28/Mar/2018:18:21:44 +0200"
...

Now if I add a date processor, it fails, complaining that timestamp is not present (despite it being properly extracted above):

curl -u elastic:changeme 'localhost:9200/_ingest/pipeline/_simulate?pretty&verbose' -d'
{
    "pipeline": {
        "description": "Grok ingestion pipeline nginx logs",
        "version": 0,
        "processors": [
            {
                "grok": {
                    "field": "message",
                    "trace_match": true,
                    "patterns": [
                        "%{COMBINEDAPACHELOG}%{GREEDYDATA:additional_fields}"
                    ]
                },
                "date": {
                    "field": "timestamp",
                    "formats": [
                        "dd/MMM/yyyy:HH:mm:ss Z"
                    ],
                    "timezone": "Europe/Paris"
                }
            }
        ]
    },
    "docs": [
        {
            "_index": "index",
            "_type": "nginx",
            "_id": "id",
            "_source": {
                "message": "1.2.3.4 - - [28/Mar/2018:18:21:44 +0200] \"GET / HTTP/1.1\" 302 213 \"-\" \"Mozilla\""
            }
        }
    ]
}'

The response is:

{
"docs" : [
    {
    "processor_results" : [
        {
        "error" : {
            "root_cause" : [
            {
                "type" : "illegal_argument_exception",
                "reason" : "field [timestamp] not present as part of path [timestamp]"
            }
            ],
            "type" : "illegal_argument_exception",
            "reason" : "field [timestamp] not present as part of path [timestamp]"
        }
        }
    ]
    }
]
}

Any clue about what is going on?

Thanks

M

Ok, I figured it out. Each processor is its own JSON object and must be enclosed in its own pair of braces ({}).
Thus, this construct is wrong!

processors": [
        {
            "grok": {
                "field": "message",
                "trace_match": true,
                "patterns": [
                    "%{COMBINEDAPACHELOG}%{GREEDYDATA:additional_fields}"
                ]
            },
            "date": {

because date and grok end up in the same processor object. It should look like:

{ 
    "grok": ....
},{
    "date": ....
}

So the complete (working) simulate example shown in the original question should look like this:

curl -u elastic:changeme  -XPOST 'localhost:9200/_ingest/pipeline/_simulate' -H"Content-Type: application/json" -d'
{
    "pipeline": {
        "description": "Grok ingestion pipeline nginx logs",
        "version": 0,
        "processors": [
            {
                "grok": {
                    "field": "message",
                    "trace_match": true,
                    "patterns": [
                        "%{COMBINEDAPACHELOG}%{GREEDYDATA:additional_fields}"
                    ]
                }
            },{
                "date": {
                    "field": "timestamp",
                    "formats": [
                        "dd/MMM/yyyy:HH:mm:ss Z"
                    ],
                    "timezone": "Europe/Paris"
                }
            }
        ]
    },
    "docs": [
        {
            "_index": "index",
            "_type": "nginx",
            "_id": "id",
            "_source": {
                "message": "1.2.3.4 - - [28/Mar/2018:18:21:44 +0200] \"GET / HTTP/1.1\" 302 213 \"-\" \"Mozilla\""
            }
        }
    ]
}'
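
In case it helps anyone else landing here: once the pipeline behaves as expected in _simulate, it can be stored under an id and referenced at index time via the pipeline query parameter. A minimal sketch against the same 5.3.3 setup (the pipeline id nginx-access and the index/type/id below are just placeholders, not anything from the original post):

curl -u elastic:changeme -XPUT 'localhost:9200/_ingest/pipeline/nginx-access' -H "Content-Type: application/json" -d'
{
    "description": "Grok ingestion pipeline nginx logs",
    "version": 0,
    "processors": [
        {
            "grok": {
                "field": "message",
                "trace_match": true,
                "patterns": [
                    "%{COMBINEDAPACHELOG}%{GREEDYDATA:additional_fields}"
                ]
            }
        },{
            "date": {
                "field": "timestamp",
                "formats": [
                    "dd/MMM/yyyy:HH:mm:ss Z"
                ],
                "timezone": "Europe/Paris"
            }
        }
    ]
}'

and then index a document through it:

curl -u elastic:changeme -XPUT 'localhost:9200/index/nginx/1?pipeline=nginx-access' -H "Content-Type: application/json" -d'
{
    "message": "1.2.3.4 - - [28/Mar/2018:18:21:44 +0200] \"GET / HTTP/1.1\" 302 213 \"-\" \"Mozilla\""
}'

The stored document should then contain the grok-extracted fields plus @timestamp, which the date processor writes to by default.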
