Cannot post document to TimeSeries DataStream after upgrade from 8.6.2 to 8.7.1

Hi everyone,
We had some time series data streams (TSDS) in version 8.6.2, and we have just upgraded our cluster to version 8.7.1. However, after the upgrade, new documents cannot be posted to those TSDS, and we get the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "null_pointer_exception",
        "reason": """Cannot invoke "org.apache.lucene.index.PointValues.getMinPackedValue()" because "tsPointValues" is null"""
      }
    ],
    "type": "null_pointer_exception",
    "reason": """Cannot invoke "org.apache.lucene.index.PointValues.getMinPackedValue()" because "tsPointValues" is null"""
  },
  "status": 500
}

even though the data is good and we definitely have a valid @timestamp.

Does anyone have the same issue and know how to fix it? It is probably a common error for anyone who had TSDS before GA (8.7.0) and wants to move to the latest version of ES.

Best regards,

Hey,

This is a bug and has been fixed via: Fix NPE when indexing a document that just has been deleted in a tsdb index by martijnvg · Pull Request #96461 · elastic/elasticsearch · GitHub
This fix will be available in the next bug fix release.

Were you performing deletes or updates of documents directly against the backing indices of the tsdb data stream?

There is no real workaround, but I think that executing a flush or force merge could resolve this error. You may, however, be at risk of running into it again.
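For example, against the affected backing index (the index name below is a placeholder):

POST .ds-my-tsds-backing-index/_flush
POST .ds-my-tsds-backing-index/_forcemerge?max_num_segments=1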

Martijn


Thank you, we will wait for the new release!

Hi,
We upgraded our cluster to 8.8.1, which contains the fix you mentioned. However, another error occurs when we try to post documents to the backing index that was created in 8.6.2.

POST micrometer-metrics/_doc
{
  "@timestamp": "2023-06-05T17:00:12.561Z",
  "area": "nonheap",
  "dimension1": "a",
  "dimension2": "b",
  "dimension3": "c",
  "dimension4": "d",
  "id": "JIT data cache",
  "name": "jvm_memory_max",
  "type": "gauge",
  "value": 402653184
}

and the error is:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": """cannot change field "value" from points dimensionCount=1, indexDimensionCount=1, numBytes=8 to inconsistent dimensionCount=0, indexDimensionCount=0, numBytes=0"""
      }
    ],
    "type": "illegal_argument_exception",
    "reason": """cannot change field "value" from points dimensionCount=1, indexDimensionCount=1, numBytes=8 to inconsistent dimensionCount=0, indexDimensionCount=0, numBytes=0"""
  },
  "status": 400
}

Do you have a way to overcome this issue?

Thanks for reporting this. This is unfortunately a different issue.
I will look into this and get back to you.

How is the value field mapped in your template, and was this changed?
Are you able to reproduce this issue in a minimalistic way?

Hi,
The value field in my mapping is quite simple and hasn't changed since 8.6.2.

It looks like this:

        "value": {
          "type": "double",
          "time_series_metric": "gauge"
        }

All my services are running on AWS, so unfortunately I am not able to reproduce it easily.
But the steps so far are:

  • Create TSDS in 8.6.2
  • Migrate to 8.7.1
  • Migrate to 8.8.1

I had to remove the write block on the old backing index, because it blocks writes:

PUT .ds-micrometer-.../_settings
{
  "blocks": {
    "write": null
  }
}

However, I can't imagine that this is the cause.
Settings of the index:

{
  ".ds-micrometer-metrics-2023.06.05-000079": {
    "settings": {
      "index": {
        "hidden": "true",
        "time_series": {
          "end_time": "2023-06-06T16:34:09.000Z",
          "start_time": "2023-06-05T11:08:49.851Z"
        },
        "blocks": {
          "read_only_allow_delete": "false"
        },
        "provided_name": ".ds-micrometer-metrics-2023.06.05-000079",
        "creation_date": "1685938054920",
        "priority": "50",
        "number_of_replicas": "0",
        "routing_path": [
          "a",
          "b",
          "c",
          "d"
        ],
        "uuid": "ZsTKxh0bRxuC_VvrMxjsKA",
        "version": {
          "created": "8060299"
        },
        "lifecycle": {
          "name": "metrics-timeseries-datastream",
          "indexing_complete": "true"
        },
        "mode": "time_series",
        "codec": "best_compression",
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_warm,data_hot"
            }
          }
        },
        "number_of_shards": "1",
        "look_ahead_time": "7h"
      }
    }
  }
}

I think the cause of the bug is that at the time you adopted tsdb it was in tech preview. One change that was made before tsdb GA-ed was changing the default index attribute of all gauge fields to false; it used to be set to true. Basically, gauge fields aren't indexed by default now. I think this contributed to the error that you're seeing.
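In other words, with the GA defaults a gauge mapping like yours is now effectively interpreted as (a sketch of the new default, not the actual stored mapping):

"value": {
  "type": "double",
  "time_series_metric": "gauge",
  "index": false
}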

I don't think that there is a workaround for this. Are you able to roll over your alias or data stream? In a new index this error shouldn't occur.
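For a data stream that is a single call, e.g.:

POST micrometer-metrics/_rollover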

Hi,
That would be a very big issue for us; many users already used the feature even before it was GA. It was recommended as a replacement for rollup, too.
Would it be possible to have another release that automatically fixes this when we migrate to it?

Best regards,

Yes, if we roll over the TSDS then the new backing index works, but because of the look-ahead time, documents will keep going to the old one for some hours before they go to the new one. And when they go to the old one, they will be dropped because of this issue.
Btw, the end_time of the old index should be set to the moment we do the rollover, but it isn't; it is still now + look_ahead_time.
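This can be checked with, e.g.:

GET .ds-micrometer-metrics-2023.06.05-000079/_settings?filter_path=*.settings.index.time_series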

This is unfortunate, and apologies that this happened. However, tech preview features should never be used in production. I don't think there was communication about using downsampling as a replacement for rollup before it GA-ed.

Correct, the old tsdb backing indices will keep having this issue. Unfortunately, the other workarounds I can come up with are not ideal:

  • Create a new data stream. The old and new data streams can be queried together via the Search API (see the example below). New data should be indexed into the new data stream.
  • Downgrade the current data stream (by removing the index.mode setting from the template and performing a rollover). Then all data will be indexed into the latest backing index. After some time (end_time of the last tsdb index + look_ahead_time), upgrade back to tsdb (by adding the index.mode setting back and performing another rollover).
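For the first option, both data streams can be queried in one request, e.g. (the new data stream name here is hypothetical):

GET micrometer-metrics,micrometer-metrics-new/_search
{
  "query": {
    "match_all": {}
  }
}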

Hi,
Does the 1st workaround imply that there will be some hours during which documents will be dropped?

The 2nd workaround means:

    1. downgrade from TSDS to a regular DS by removing index.mode from the template
    2. do a rollover => then there will be a new backing index in the DS (not TSDS)
    3. all new data will go to this new backing index after (end_time of the previous tsds index + look_ahead_time)
    4. then upgrade to 8.8.1
    5. then change index.mode back to TSDS
    6. do another rollover to have a "good" backing index

Do I understand it correctly? The downside of this one is that there are some hours during which the data in the DS cannot be downsampled?

How about another way (a rough sketch follows the list):

  • change the look_ahead_time to 10 min
  • do a rollover to get a new backing index with the 10-min look-ahead
  • wait until documents go to the new backing index
  • upgrade to 8.8.1 and do another rollover right after
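In console terms, roughly (the template name is a guess, and the full existing template body would have to be re-submitted; only the relevant parts are shown):

PUT _index_template/micrometer-metrics
{
  "index_patterns": ["micrometer-metrics"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.mode": "time_series",
      "index.routing_path": ["a", "b", "c", "d"],
      "index.look_ahead_time": "10m"
    }
  }
}

POST micrometer-metrics/_rollover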

Then we have everything in TSDS and have to accept that there is a 10-minute window in which data will be lost?
What do you think, is this possible?

I think that the upgrade step isn't needed (assuming that you use 8.7.1). But this should work.
And it is basically the rollover workaround that I mentioned earlier. Maybe you can reduce the look-ahead to an even lower value?

No, there will not be data loss, but your data is now in two data streams instead of one. The old data stream can optionally be removed later based on your data retention policy. You do need to configure your queries/dashboards/etc. to include both data streams, so depending on how much work that is, this may not be feasible.

Clear. Then for the 1st option, we have to change the application that generates the data so it sends to the new TSDS instead of the 1st one?

Yes, but also the application that queries the data.
