Adding traces-apm.rum@custom pipeline breaks RUM traces

Hi there,

RUM traces work fine in the UI until I create a traces-apm.rum@custom pipeline with a pipeline processor. The presence of the pipeline processor breaks RUM transaction traces, causing a perpetual loading animation under Trace sample in Kibana.

After extensive testing I found I can resolve this by adding an arbitrary set processor after the pipeline processor, plus another arbitrary set processor in the on_failure processors. I can't understand why this is necessary: the pipeline processor does not fail, and the conditional on the pipeline processor is working correctly.

If I remove the set processors, tracing breaks, yet the failure processor is not hit. Is this a bug, or am I missing something?
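For reference, this is roughly what my traces-apm.rum@custom pipeline looks like with the workaround in place. I've omitted the conditional on the pipeline processor, and custom.pipes is just a marker field I set for debugging; the values are arbitrary:

```
PUT _ingest/pipeline/traces-apm.rum@custom
{
  "processors": [
    {
      "pipeline": {
        "name": "traces-apm.rum-genericsearch"
      }
    },
    {
      "set": {
        "field": "custom.pipes",
        "value": "ok"
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "custom.pipes",
        "value": "failure"
      }
    }
  ]
}
```

With both set processors present, traces ingest fine; remove them and the problem comes back.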

Cheers,
Jamie

@jrhut welcome to the forum!

The presence of the pipeline processor breaks RUM transaction traces, causing a perpetual loading animation under Trace sample in Kibana.

This sounds strange, and not something I've observed before. Can you share your traces-apm.rum-genericsearch pipeline? Can you see any errors in the browser console when the UI goes into this perpetual loading state?

Hi @axw cheers!

Sure, here it is:

[
  {
    "urldecode": {
      "field": "url.original",
      "target_field": "search.query"
    }
  },
  {
    "gsub": {
      "field": "search.query",
      "pattern": ".*queryParamSearchFilters=",
      "replacement": ""
    }
  },
  {
    "gsub": {
      "field": "search.query",
      "pattern": "[\"\\(\\)\\{\\}\\[\\]]",
      "replacement": ""
    }
  },
  {
    "gsub": {
      "field": "search.query",
      "pattern": "(filters:)",
      "replacement": ""
    }
  },
  {
    "kv": {
      "field": "search.query",
      "field_split": ",",
      "value_split": ":",
      "include_keys": [
        "filterKey",
        "filterValue"
      ]
    }
  },
  {
    "rename": {
      "field": "filterKey",
      "target_field": "search.keys"
    }
  },
  {
    "rename": {
      "field": "filterValue",
      "target_field": "search.values"
    }
  },
  {
    "remove": {
      "field": "search.query"
    }
  }
]
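In case it helps to reproduce, the pipeline can be exercised directly with the simulate API. The URL below is made up but has the shape our app produces (%5B/%7B etc. are the URL-encoded brackets the second gsub strips):

```
POST _ingest/pipeline/traces-apm.rum-genericsearch/_simulate
{
  "docs": [
    {
      "_source": {
        "url": {
          "original": "https://example.com/search?queryParamSearchFilters=(filters:%5B%7BfilterKey:brand,filterValue:acme%7D%5D)"
        }
      }
    }
  ]
}
```

For a document like this, the pipeline ends up with search.keys and search.values extracted from the query string.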

I think, after a little more testing, the loading state occurred because there was simply no trace sample for the transaction. I can't reproduce the loading state now that I have some trace samples, but if you set the time span to, for example, 1 second with no transactions, it produces the same effect. This screenshot shows the disparity between total transactions and trace samples with a sample rate of 1.

[screenshot: transaction-ss]

Also, just to confirm that the data was definitely reaching the cluster, I checked the browser's POST requests: all the trace data was there and the requests were successful.

The failure processor is not hit.

I guess you verified that by checking whether any docs had custom.pipes: failure? It sounds like the ingest pipeline is failing, since if there are no processors defined there then it'll just throw an exception.
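That is, something along these lines (adjust the index pattern to your data stream):

```
GET traces-apm*/_search
{
  "query": {
    "match": {
      "custom.pipes": "failure"
    }
  }
}
```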

If you remove the set processors, but configure ignore_failure: true on the pipeline processor, does that allow trace events to be ingested?
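In other words, something like this in traces-apm.rum@custom, shown here without the set processors and without your conditional:

```
PUT _ingest/pipeline/traces-apm.rum@custom
{
  "processors": [
    {
      "pipeline": {
        "name": "traces-apm.rum-genericsearch",
        "ignore_failure": true
      }
    }
  ]
}
```

If events are ingested with that in place, it would point at the inner pipeline failing on some documents.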

Are there any errors reported in the APM Server log?

I think, after a little more testing, the loading state occurred because there was simply no trace sample for the transaction. I can't reproduce the loading state now that I have some trace samples, but if you set the time span to, for example, 1 second with no transactions, it produces the same effect. This screenshot shows the disparity between total transactions and trace samples with a sample rate of 1.

The latency distribution is based off pre-aggregated metrics (calculated by APM Server), so if individual trace events are being dropped due to a failing ingest pipeline then that may explain the disparity.

Hi @axw

I just tested with ignore_failure: true and got the same result. Actually, this time I don't get the trace, but something does appear in the trace sample. See:

I also set a failure set processor on the genericsearch pipeline but that was never hit.

I'm not sure how to access the APM Server log; my deployment is on Elastic Cloud.

Cheers,
Jamie

In the https://cloud.elastic.co console you can enable logging & metrics, sending them either to a dedicated deployment or to the same deployment (not recommended for production, but fine for this kind of debugging). See How to set up monitoring | Elasticsearch Service Documentation | Elastic

Alternatively you can DM me the deployment ID -- it'll be in the URL in the Elastic Cloud console -- and I can take a look at the server logs.

I just sent you a DM with the deployment ID if that's ok.


What do you mean by "I don't get the trace"? If there's a trace sample, what's missing?

This is what it looks like when it's working. The photo I posted earlier wasn't really a trace, just a page load.
