Error while running transform `task encountered irrecoverable failure`

Hey team,

I'm running ELK 8.13.
I have a transform that occasionally fails to run.
If I restart it, the failure persists.
If I recreate it, the error goes away for a few days and then returns.

It fails with this error:

task encountered irrecoverable failure: 
[.ds-logs-stocklog-default-2024.06.05-000016/eYTuyXMETM-AvFH3-PTJyw] 
org.elasticsearch.index.query.QueryShardException: 
failed to create query: 
failed to parse date field [1672185600000] with format [strict_date_optional_time]: 
[failed to parse date field [1672185600000] with format [strict_date_optional_time]]; 
org.elasticsearch.ElasticsearchParseException: 
failed to parse date field [1672185600000] with format [strict_date_optional_time]: 
[failed to parse date field [1672185600000] with format [strict_date_optional_time]]; 
java.lang.IllegalArgumentException: 
failed to parse date field [1672185600000] with format [strict_date_optional_time]


The field that it fails on contains the values:

      "ref_date": "2024-04-03T00:00:00",

which is not a strict_date_optional_time field, but the transform does work if I delete and recreate the transform...
Can you please help me troubleshoot and understand this error?

Thanks!

Are there any dates or ranges included in the source query part of the Transform config? If it's always failing on "ref_date": "2024-04-03T00:00:00", it might just take time for the Transform to get to that doc once it's deleted and recreated, which would explain why it works for a bit and then fails.

What is the index mapping for the ref_date field, and can it be changed to strict_date_optional_time||epoch_millis as a quick fix?
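
If it helps, the mapping for just that field can be pulled with the get field mapping API. This is only a sketch: the data stream name is my guess from the backing index in the error message, and the leading wildcard is there in case the field sits under a parent object.

# Data stream name is an assumption derived from the backing index name in the error
# The wildcard also matches a nested path ending in ref_date
GET /logs-stocklog-default/_mapping/field/*ref_date

The "format" shown in the response (or its absence) is what the Transform's search has to parse against.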

Hey Patrick

The docs are never deleted or recreated, so that is probably not the issue.

I also thought about changing the mapping, but if I look in the index I can't find any documents where there's a value that looks like a timestamp.

Are you suggesting that 2024-04-03T00:00:00 is the timestamp value? :open_mouth:

Kinda? Maybe? Here's what I'm thinking.

The error is coming from the search query - it isn't coming from the documents. Here is how I was able to reproduce it:

# create index
PUT /test_date_parsing

# create index mapping
PUT /test_date_parsing/_mapping
{
  "properties": {
    "ref_date": {
      "type": "date",
      "format": "strict_date_optional_time"
    }
  }
}

POST /test_date_parsing/_doc
{
  "ref_date": "2024-04-03T00:00:00"
}

GET /test_date_parsing/_search
{
  "query": {
    "range": {
      "ref_date": {
        "gte": "1672185600000"
      }
    }
  }
}

This will throw the same error, because the search query's date is in the wrong format.
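
For comparison, the same query should go through if it tells Elasticsearch how to read the value, using the range query's "format" parameter, which overrides the mapping's format for the values in the query. This is a sketch against the test index above and only illustrates the parsing. It is not a fix for the Transform, since the Transform builds its own query.

# Same range query, but the value is declared as epoch_millis
GET /test_date_parsing/_search
{
  "query": {
    "range": {
      "ref_date": {
        "gte": "1672185600000",
        "format": "epoch_millis"
      }
    }
  }
}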

A Transform just runs search requests, and it adds a similar range clause to the query to slice the search into checkpoints. The date value is always in epoch_millis (as far as I can see, anyway).

If your Transform config has sync.time.field set to ref_date, then the Transform will add something like "range": { "ref_date": { "gte": "1672185600000" } } to the search request. If the source index's mapping for ref_date is strict_date_optional_time and not strict_date_optional_time||epoch_millis, then the Transform will fail to perform the search request with that error.

If your sync.time.field is not pointing to ref_date, then I would guess there is an additional search query term that is looking at ref_date and supplying an epoch value.
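
To check which case applies, the stored config can be pulled with the get transform API. The transform ID below is a placeholder.

# "my-transform" is a placeholder; use your own transform ID
GET _transform/my-transform

In the response, sync.time.field shows which field drives the checkpoints, and source.query shows any extra filters that touch ref_date.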

Thanks @Patrick_Whelan

I found the most random bug in the process.
The order in which the format is specified matters.

If I use "strict_date_optional_time||epoch_millis", it doesn't work.
But if I use "epoch_millis||strict_date_optional_time", it does.

To test this you can create a test component template:

PUT _component_template/test-logs-generic
{
  "template": {
    "mappings": {
      "properties": {
        "goods.ref_date": {
          "type": "date",
          # This doesn't work
          "format": "strict_date_optional_time||epoch_millis",
          # This works
          # "format": "epoch_millis||strict_date_optional_time",
          "ignore_malformed": false
        }
      }
    }
  }
}

Then create an index template that uses it:

PUT _index_template/test-logs-template
{
  "index_patterns": ["test-logs-*"],
  "priority": 1,
  "composed_of": ["test-logs-generic"]
}

Then test the output with the simulate API:

POST _index_template/_simulate/test-logs-template

You will see that the mapping differs between the two cases:

If strict_date_optional_time is first:

          "properties": {
            "ref_date": {
              "type": "date"
            }
          }

But if epoch_millis is first:

          "properties": {
            "ref_date": {
              "type": "date",
              "format": "epoch_millis||strict_date_optional_time"
            }
          }

Let me know if you agree this is an actual bug. I can open an issue (or maybe you want to pass it to the team).

Yeah, that looks like a bug, just from the fact that the simulate output changes depending on the order. That's a great catch. Can you open an issue, and I'll try to follow up with the correct team?