How to integrate the "metrics" and "trace" data

I want to integrate the "metrics" and "trace" data. The "metrics" data is scraped by Prometheus; at the moment I use the OpenTelemetry Collector's "prometheus_simple" receiver to collect it and send it to APM. The metrics sent to APM cannot be correlated with the "trace" data, because the "service.name" of the two is different.

Steps:
1. Run APM Server, Elasticsearch and Kibana:

docker network create -d bridge my-jaeger-net
docker run --name elasticsearch --network=my-jaeger-net -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -d elasticsearch:7.13.2
docker run --name kibana --network=my-jaeger-net -p 5601:5601 -d kibana:7.13.2
docker run -d -p 8200:8200 --name=apm-server --network=my-jaeger-net --user=apm-server elastic/apm-server:7.13.2 --strict.perms=false -e -E output.elasticsearch.hosts=["elasticsearch:9200"]
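
A few optional sanity checks, assuming the default ports mapped above are reachable from the host:

curl -s http://localhost:9200              # Elasticsearch cluster info
curl -s http://localhost:8200              # APM Server build info
curl -s http://localhost:5601/api/status   # Kibana status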
2. Run the OpenTelemetry Collector (0.29.0):
docker run --name collector --network my-jaeger-net \
  -v $(pwd)/otelcontribcol_config.yaml:/otelcontribcol_config.yaml \
  -d otel/opentelemetry-collector-contrib:0.27.0 --config=/otelcontribcol_config.yaml

The Collector config (otelcontribcol_config.yaml):

receivers:
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
  prometheus_simple:
    collection_interval: 10s
    endpoint: "main2:9102"

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 2000
  batch:
    
exporters:
  logging/detail:
    loglevel: debug

  otlp/elastic:
    endpoint: apm-server:8200
    insecure: true
    headers:
      Authorization: "Bearer my_secret_token"

service:
  pipelines:
    traces:
      receivers: [ jaeger ] # traces are received via the jaeger receiver
      exporters: [ otlp/elastic ] # and exported to Elastic APM via otlp/elastic
      processors: [ memory_limiter, batch ]
    metrics:
      receivers: [ prometheus_simple ]
      exporters: [ otlp/elastic ]
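
For debugging, note that the logging/detail exporter declared above is not referenced by any pipeline; a minimal sketch of how it could be wired in next to otlp/elastic, keeping everything else unchanged:

service:
  pipelines:
    metrics:
      receivers: [ prometheus_simple ]
      processors: [ memory_limiter, batch ]
      exporters: [ otlp/elastic, logging/detail ]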

3. Run the main2 service.
main2:9102/metrics response:

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 5.87e-05
go_gc_duration_seconds{quantile="0.25"} 0.0001382
go_gc_duration_seconds{quantile="0.5"} 0.0003104
go_gc_duration_seconds{quantile="0.75"} 0.000651
go_gc_duration_seconds{quantile="1"} 0.0023786
go_gc_duration_seconds_sum 0.0035369
go_gc_duration_seconds_count 5

go_goroutines 260
go_info{version="go1.16.6"} 1
go_memstats_alloc_bytes 5.843192e+06
go_memstats_alloc_bytes_total 1.7327112e+07
go_memstats_buck_hash_sys_bytes 1.449204e+06
go_memstats_frees_total 26693
go_memstats_gc_cpu_fraction 0.01836957988419095
go_memstats_gc_sys_bytes 5.213408e+06
go_memstats_heap_alloc_bytes 5.843192e+06
go_memstats_heap_idle_bytes 5.8195968e+07
go_memstats_heap_inuse_bytes 7.53664e+06
go_memstats_heap_objects 24186
go_memstats_heap_released_bytes 5.709824e+07
go_memstats_heap_sys_bytes 6.5732608e+07
go_memstats_last_gc_time_seconds 1.6263290546848054e+09
go_memstats_lookups_total 0
go_memstats_mallocs_total 50879
go_memstats_mcache_inuse_bytes 7200
go_memstats_mcache_sys_bytes 16384
go_memstats_mspan_inuse_bytes 124440
go_memstats_mspan_sys_bytes 131072
go_memstats_next_gc_bytes 7.063104e+06
go_memstats_other_sys_bytes 1.399348e+06
go_memstats_stack_inuse_bytes 1.376256e+06
go_memstats_stack_sys_bytes 1.376256e+06
go_memstats_sys_bytes 7.531828e+07
go_threads 12
gorm_dbstats_idle{db="jianjiu",micro_name="lb.example.test11"} 0
gorm_dbstats_in_use{db="jianjiu",micro_name="lb.example.test11"} 0
gorm_dbstats_max_idle_closed{db="jianjiu",micro_name="lb.example.test11"} 0
gorm_dbstats_max_lifetime_closed{db="jianjiu",micro_name="lb.example.test11"} 0
gorm_dbstats_max_open_connections{db="jianjiu",micro_name="lb.example.test11"} 0
gorm_dbstats_open_connections{db="jianjiu",micro_name="lb.example.test11"} 0
gorm_dbstats_wait_count{db="jianjiu",micro_name="lb.example.test11"} 0
gorm_dbstats_wait_duration{db="jianjiu",micro_name="lb.example.test11"} 0
process_cpu_seconds_total 1.86
process_max_fds 1.048576e+06
process_open_fds 22
process_resident_memory_bytes 4.3569152e+07
process_start_time_seconds 1.62632905214e+09
process_virtual_memory_bytes 7.70568192e+08
process_virtual_memory_max_bytes 1.8446744073709552e+19
promhttp_metric_handler_requests_in_flight{micro_name="lb.example.test11"} 1
promhttp_metric_handler_requests_total{code="200",micro_name="lb.example.test11"} 3
promhttp_metric_handler_requests_total{code="500",micro_name="lb.example.test11"} 0
promhttp_metric_handler_requests_total{code="503",micro_name="lb.example.test11"} 0

The resulting document in Elasticsearch is as follows (so far I have only scraped the "metrics" of "lb.example.test11"):

{
  "_index": "apm-7.13.3-metric-000001",
  "_type": "_doc",
  "_id": "xMzcqHoBcStLGm-E-rvj",
  "_version": 1,
  "_score": null,
  "fields": {
    "host.hostname": [
      "main2"
    ],
    "process_virtual_memory_bytes": [
      770830340
    ],
    "go_memstats_mspan_sys_bytes": [
      180224
    ],
    "go_memstats_last_gc_time_seconds": [
      1626330620
    ],
    "go_memstats_stack_sys_bytes": [
      1474560
    ],
    "service.language.name": [
      "unknown"
    ],
    "labels.instance": [
      "main2:9102"
    ],
    "scrape_samples_post_metric_relabeling": [
      92
    ],
    "labels.scheme": [
      "http"
    ],
    "processor.event": [
      "metric"
    ],
    "agent.name": [
      "otlp"
    ],
    "process_start_time_seconds": [
      1626330240
    ],
    "host.name": [
      "main2"
    ],
    "up": [
      1
    ],
    "go_memstats_heap_alloc_bytes": [
      9424888
    ],
    "go_memstats_mspan_inuse_bytes": [
      155448
    ],
    "go_memstats_lookups_total": [
      0
    ],
    "go_memstats_frees_total": [
      134718
    ],
    "go_memstats_alloc_bytes": [
      9424888
    ],
    "processor.name": [
      "metric"
    ],
    "go_memstats_sys_bytes": [
      75580424
    ],
    "go_memstats_buck_hash_sys_bytes": [
      1455916
    ],
    "go_memstats_mallocs_total": [
      174129
    ],
    "ecs.version": [
      "1.8.0"
    ],
    "observer.type": [
      "apm-server"
    ],
    "observer.version": [
      "7.13.3"
    ],
    "go_memstats_gc_cpu_fraction": [
      0.000038187492
    ],
    "agent.version": [
      "unknown"
    ],
    "scrape_duration_seconds": [
      0.0027382
    ],
    "process_resident_memory_bytes": [
      28372992
    ],
    "process_cpu_seconds_total": [
      70.78
    ],
    "service.node.name": [
      "main2"
    ],
    "go_memstats_heap_idle_bytes": [
      54247424
    ],
    "go_memstats_heap_sys_bytes": [
      65634304
    ],
    "go_memstats_gc_sys_bytes": [
      5462296
    ],
    "process_open_fds": [
      22
    ],
    "scrape_series_added": [
      92
    ],
    "go_memstats_stack_inuse_bytes": [
      1474560
    ],
    "process_max_fds": [
      1048576
    ],
    "go_memstats_heap_objects": [
      39411
    ],
    "go_memstats_mcache_inuse_bytes": [
      7200
    ],
    "go_memstats_other_sys_bytes": [
      1356740
    ],
    "go_threads": [
      12
    ],
    "go_memstats_heap_inuse_bytes": [
      11386880
    ],
    "go_memstats_mcache_sys_bytes": [
      16384
    ],
    "go_memstats_alloc_bytes_total": [
      37136680
    ],
    "labels.port": [
      "9102"
    ],
    "go_goroutines": [
      265
    ],
    "service.name": [
      "prometheus_simple_main2_9102"
    ],
    "labels.job": [
      "prometheus_simple/main2:9102"
    ],
    "go_memstats_heap_released_bytes": [
      49135616
    ],
    "observer.version_major": [
      7
    ],
    "process_virtual_memory_max_bytes": [
      18446744000000000000
    ],
    "observer.hostname": [
      "797c4982dd79"
    ],
    "scrape_samples_scraped": [
      92
    ],
    "metricset.name": [
      "app"
    ],
    "event.ingested": [
      "2021-07-15T06:31:32.316Z"
    ],
    "@timestamp": [
      "2021-07-15T06:31:31.343Z"
    ],
    "go_memstats_next_gc_bytes": [
      16465936
    ]
  },
  "sort": [
    1626330691343
  ]
}

Can you please explain to us:

  • Where the traces come from: application type, instrumentation type... The receiver seems to be Jaeger for traces, and your screenshot indicates an "lb", which probably stands for "load balancer".
  • Where the Prometheus metrics come from: what type of Prometheus exporter do you use for the metrics...

Note that some systems that do not yet support instrumentation with OpenTelemetry do support OpenCensus in addition to Jaeger and may, thanks to OpenCensus, export metrics in addition to traces.

For your Prometheus scraper configuration, did you consider defining the label service.name using the external_labels capability of the scrape configuration? See the docs: prometheus/configuration.md at v2.28.1 · prometheus/prometheus · GitHub
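
A minimal sketch of what that could look like with the Collector's full prometheus receiver (the job name and label value are assumptions; note that Prometheus label names cannot contain dots, so an underscore is used here):

receivers:
  prometheus:
    config:
      global:
        external_labels:
          service_name: lb.example.test11   # hypothetical value, to match the trace service name
      scrape_configs:
        - job_name: main2
          scrape_interval: 10s
          static_configs:
            - targets: ['main2:9102']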

The "trace" comes from "uber-jaeger-client -> jaeger-agent -> openTelemetry".
"trace" information:

The "metric" is crawling the service's port 9102 through the "prometheus_simple" of "openTelemetry". The service exposes "metrics" data via "prometheus/client_golang/prometheus".

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// expose the default Prometheus registry on /metrics
http.Handle("/metrics", promhttp.Handler())

The "lb" is our company abbreviation.

Thanks @tttoad,

Did you consider switching from the Jaeger client (traces) + Prometheus clients (metrics) to an OpenTelemetry client that would handle both traces and metrics?

If you don't want to or can't change the instrumentation library, I would evaluate switching, on the OTel Collector, from the prometheus_simple receiver to the more advanced prometheus receiver and adding a label in the scrape config using external_labels.
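
A rough sketch of the pipeline change, assuming a prometheus receiver configured as suggested above and keeping the rest of your config unchanged:

service:
  pipelines:
    metrics:
      receivers: [ prometheus ]   # instead of prometheus_simple
      processors: [ memory_limiter, batch ]
      exporters: [ otlp/elastic ]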

Would this pattern make sense to you?

Thank you for your answer!
Do you already support the "metrics" data reported by the OpenTelemetry client? We will consider switching to the OpenTelemetry client for reporting this information...

Yes, we have been supporting OTel metrics for a while; the last missing bit is support for histograms, and we should fix that soon. Note that you can instrument your applications with OpenTelemetry agents or SDKs and use an OpenTelemetry Collector next to your applications to push both traces and metrics to Elastic Observability, to Prometheus, and to any other destination.

See our documentation OpenTelemetry integration | APM User Guide [8.11] | Elastic
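
A rough sketch of that fan-out pattern in the Collector, assuming applications send OTLP to the Collector (the prometheus exporter endpoint below is an assumption):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlp/elastic:
    endpoint: apm-server:8200
    insecure: true
  prometheus:
    endpoint: 0.0.0.0:8889   # scrape target exposed for a Prometheus server

service:
  pipelines:
    metrics:
      receivers: [ otlp ]
      exporters: [ otlp/elastic, prometheus ]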
