Apm-server keeps retrying to insert an incorrect data

Elasticsearch version: 7.17.5

APM Server version: 7.17.5

APM Agent language and version: open-telemetry .net,

Original install method (e.g. download page, yum, deb, from source, etc.) and version: deb

Fresh install or upgraded from other version? fresh

Description of the problem including expected versus actual behavior. Please include screenshots (if relevant):

We have our .net application that happened to send an invalid data (float val of -Infinity or +Infinity). The data caused the APM server to enter a retry loop, in which the apm always failed with message "failed to perform any bulk index operations: unsupported float value: -Inf". The loop caused the new data to be ignored by the apm (we were eventually getting "queue is full" error). After this error, no new data were inserted into the elastic, and the only thing that resolved this issue was the apm-server restart.

While we can fix this on the application side, I wonder if there is any settings on the apm-server (either ignore or discard the invalid data)?

Steps to reproduce:

  1. .NET app, with elastic apm and open telemetry library
  2. Start a new Activity
  3. Set a tag to the activity with float value of +-Infinity
  4. See the following error in apm logs and don't see any new data in the elastic

Provide logs and/or server output (if relevant):

Logs that keep repeating roughly three times per second

feb 23 05:22:48 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:48.618+0100","log.logger":"publisher","log.origin":{"file.name":"pipeline/retry.go","file.line":217},"message":"  done","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:48 <server-name> apm-server[91168]: {"log.level":"error","@timestamp":"2023-02-23T05:22:48.618+0100","log.logger":"elasticsearch","log.origin":{"file.name":"elasticsearch/client.go","file.line":226},"message":"failed to perform any bulk index operations: unsupported float value: -Inf","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"error","@timestamp":"2023-02-23T05:22:49.048+0100","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/output.go","file.line":180},"message":"failed to publish events: unsupported float value: +Inf","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.048+0100","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/output.go","file.line":143},"message":"Connecting to backoff(elasticsearch(<serverip>))","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.048+0100","log.logger":"publisher","log.origin":{"file.name":"pipeline/retry.go","file.line":213},"message":"retryer: send wait signal to consumer","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.048+0100","log.logger":"publisher","log.origin":{"file.name":"pipeline/retry.go","file.line":217},"message":"  done","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.049+0100","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":285},"message":"Attempting to connect to Elasticsearch version 7.17.5","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.095+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":93},"message":"Overwrite ILM setup is disabled.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.099+0100","log.origin":{"file.name":"template/load.go","file.line":110},"message":"Template \"apm-7.17.5\" already exists and will not be overwritten.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.099+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":212},"message":"Finished loading index template.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.100+0100","log.logger":"index-management.ilm","log.origin":{"file.name":"ilm/std.go","file.line":170},"message":"ILM policy apm-rollover-30-days exists already.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.100+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":248},"message":"ILM policy apm-rollover-30-days successfully loaded.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.101+0100","log.origin":{"file.name":"template/load.go","file.line":110},"message":"Template \"apm-7.17.5-span\" already exists and will not be overwritten.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.101+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":229},"message":"Finished template setup for apm-7.17.5-span.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.102+0100","log.logger":"index-management.ilm","log.origin":{"file.name":"ilm/std.go","file.line":126},"message":"Index Alias apm-7.17.5-span exists already.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.103+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":265},"message":"Write alias apm-7.17.5-span successfully generated.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.103+0100","log.origin":{"file.name":"template/load.go","file.line":110},"message":"Template \"apm-7.17.5-transaction\" already exists and will not be overwritten.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.103+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":229},"message":"Finished template setup for apm-7.17.5-transaction.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.106+0100","log.logger":"index-management.ilm","log.origin":{"file.name":"ilm/std.go","file.line":126},"message":"Index Alias apm-7.17.5-transaction exists already.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.106+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":265},"message":"Write alias apm-7.17.5-transaction successfully generated.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.107+0100","log.origin":{"file.name":"template/load.go","file.line":110},"message":"Template \"apm-7.17.5-error\" already exists and will not be overwritten.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.107+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":229},"message":"Finished template setup for apm-7.17.5-error.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.109+0100","log.logger":"index-management.ilm","log.origin":{"file.name":"ilm/std.go","file.line":126},"message":"Index Alias apm-7.17.5-error exists already.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.109+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":265},"message":"Write alias apm-7.17.5-error successfully generated.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.110+0100","log.origin":{"file.name":"template/load.go","file.line":110},"message":"Template \"apm-7.17.5-metric\" already exists and will not be overwritten.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.110+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":229},"message":"Finished template setup for apm-7.17.5-metric.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.112+0100","log.logger":"index-management.ilm","log.origin":{"file.name":"ilm/std.go","file.line":126},"message":"Index Alias apm-7.17.5-metric exists already.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.112+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":265},"message":"Write alias apm-7.17.5-metric successfully generated.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.113+0100","log.origin":{"file.name":"template/load.go","file.line":110},"message":"Template \"apm-7.17.5-profile\" already exists and will not be overwritten.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.113+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":229},"message":"Finished template setup for apm-7.17.5-profile.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.115+0100","log.logger":"index-management.ilm","log.origin":{"file.name":"ilm/std.go","file.line":126},"message":"Index Alias apm-7.17.5-profile exists already.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.115+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":265},"message":"Write alias apm-7.17.5-profile successfully generated.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.115+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/manager.go","file.line":130},"message":"Finished index management setup.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.117+0100","log.logger":"pipelines","log.origin":{"file.name":"pipeline/register.go","file.line":55},"message":"Pipeline already registered: apm","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.118+0100","log.logger":"pipelines","log.origin":{"file.name":"pipeline/register.go","file.line":55},"message":"Pipeline already registered: apm_data_stream_migration","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.119+0100","log.logger":"pipelines","log.origin":{"file.name":"pipeline/register.go","file.line":55},"message":"Pipeline already registered: apm_user_agent","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.120+0100","log.logger":"pipelines","log.origin":{"file.name":"pipeline/register.go","file.line":55},"message":"Pipeline already registered: apm_user_geo","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.120+0100","log.logger":"pipelines","log.origin":{"file.name":"pipeline/register.go","file.line":55},"message":"Pipeline already registered: apm_ingest_timestamp","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.121+0100","log.logger":"pipelines","log.origin":{"file.name":"pipeline/register.go","file.line":55},"message":"Pipeline already registered: apm_remove_span_metadata","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.122+0100","log.logger":"pipelines","log.origin":{"file.name":"pipeline/register.go","file.line":55},"message":"Pipeline already registered: apm_error_grouping_name","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.123+0100","log.logger":"pipelines","log.origin":{"file.name":"pipeline/register.go","file.line":55},"message":"Pipeline already registered: apm_metrics_dynamic_template","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.123+0100","log.logger":"pipelines","log.origin":{"file.name":"pipeline/register.go","file.line":58},"message":"Registered Ingest Pipelines successfully.","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.123+0100","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/output.go","file.line":151},"message":"Connection to backoff(elasticsearch(<serverip>)) established","service.name":"apm-server","ecs.version":"1.6.0"}
feb 23 05:22:49 <server-name> apm-server[91168]: {"log.level":"info","@timestamp":"2023-02-23T05:22:49.123+0100","log.logger":"publisher","log.origin":{"file.name":"pipeline/retry.go","file.line":213},"message":"retryer: send wait signal to consumer","service.name":"apm-server","ecs.version":"1.6.0"}

apm configuration:

apm-server:
  host: ":8200"
  max_header_size: 1048576
  idle_timeout: 90s
  read_timeout: 300s
  write_timeout: 300s
  max_event_size: 1457600
queue:
  mem:
    events: 40000
output.elasticsearch:
        hosts: ["<elastic-host-ip>"]
        worker: 5
        bulk_max_size: 0
        timeout: 300

Hi @blazek , I managed to reproduce your problem with a 7.17.5 APM server with OTel input. Since infinity is not a valid JSON float, there will be an encoding error right before sending the event off to Elasticsearch.

There is indeed a workaround to ignore/discard the invalid data for 7.17 APM server which will no longer be available from APM server 8.0+. It will be to add a processor to APM server config as described in filebeat docs through a top level processor key in apm-server.yml or passing a command line option.

I have confirmed that it works by starting standalone APM server with command line option -E processors=[{"drop_fields":{"fields":["labels.infinity"],"ignore_missing":true}}], which will drop the label field that contains an infinity float value and APM server will no longer throw an error.

Thank you for your response. I guess we will have to use open telemetry collector, to provide some "filter" gate before the data go to the elasticsearch.

This topic was automatically closed 20 days after the last reply. New replies are no longer allowed.