Parse logfiles (haproxy/nginx/apache) and send to APM [continued]

[This continues the topic Parse logfiles (haproxy/nginx/apache) and send to APM which unfortunately does not allow me to reply anymore.]

Hi @axw,

thanks again for your pointers! I finally managed to add all the APM fields you mentioned. Unfortunately, the data is still not showing up in the APM UI. I've attached two examples: one from our Python agent, which is displayed in APM just fine, and one from haproxy, which isn't. I'm a bit confused because the data from the Python agent is in a different index (apm-7.7.0-transaction-000001) than the one from haproxy (which uses a date suffix, apm-7.7.0-transaction-2020.06.05). On the other hand, the APM settings page /app/apm#/settings/apm-indices simply shows apm-*, so maybe that's not the issue? Am I perhaps still missing some required APM fields?

I'm grateful for any ideas,
thanks,
Wolfgang

transaction-python.json
{
  "_index": "apm-7.7.0-transaction-000001",
  "_type": "_doc",
  "_id": "YqHng3IB6RMduFLNCUhD",
  "_version": 1,
  "_score": null,
  "_source": {
    "parent": {
      "id": "389448df-55e0-4033-a010-558e6d159989"
    },
    "observer": {
      "hostname": "5ba24827b391",
      "name": "instance-0000000011",
      "id": "7460172d-831f-4a17-a553-d15793619877",
      "ephemeral_id": "96034419-3656-45da-ae83-c159248faecb",
      "type": "apm-server",
      "version": "7.7.0",
      "version_major": 7
    },
    "process": {
      "args": [
        "/srv/friedbert/deployment/work/web/bin/gunicorn",
        "--paste",
        "web.ini",
        "--preload"
      ],
      "pid": 1109,
      "ppid": 1037
    },
    "agent": {
      "name": "python",
      "version": "5.6.0"
    },
    "trace": {
      "id": "937f9c0c-c3e6-411b-bfe3-8be0f2609c16"
    },
    "@timestamp": "2020-06-05T09:54:05.271Z",
    "ecs": {
      "version": "1.5.0"
    },
    "service": {
      "node": {
        "name": "friedbert01"
      },
      "environment": "staging",
      "name": "friedbert",
      "runtime": {
        "name": "CPython",
        "version": "3.7.5"
      },
      "language": {
        "name": "python",
        "version": "3.7.5"
      },
      "version": "3.573"
    },
    "host": {
      "hostname": "friedbert01",
      "os": {
        "platform": "linux"
      },
      "ip": "10.45.0.255",
      "name": "friedbert01",
      "architecture": "x86_64"
    },
    "processor": {
      "name": "transaction",
      "event": "transaction"
    },
    "transaction": {
      "duration": {
        "us": 1877286
      },
      "result": "200",
      "name": "centerpage",
      "span_count": {
        "dropped": 0,
        "started": 8
      },
      "id": "13948d1f57847bd9",
      "type": "request",
      "sampled": true
    },
    "timestamp": {
      "us": 1591350845271376
    }
  },
  "fields": {
    "@timestamp": [
      "2020-06-05T09:54:05.271Z"
    ]
  },
  "sort": [
    1591350845271
  ]
}

transaction-haproxy.json
{
  "_index": "apm-7.7.0-transaction-2020.06.05",
  "_type": "_doc",
  "_id": "jAPmg3IBfwri0xah93gV",
  "_version": 1,
  "_score": null,
  "_source": {
    "tt": 2012,
    "Fastly_Client_IP": "194.77.156.138",
    "request": "GET /wirtschaft/index HTTP/1.1",
    "parent": {
      "id": "b8624141-0689-4ae7-87da-f7813854478f"
    },
    "syslog_time": "05/Jun/2020:11:54:05 +0200",
    "c_port": "47738",
    "agent": {
      "name": "haproxy",
      "version": "-"
    },
    "ps": "haproxy",
    "tw": 0,
    "status_code": 200,
    "actconn": 95,
    "srv_conn": 0,
    "f_end": "fastly_frontend~",
    "pid": 5140,
    "User_Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:77.0) Gecko/20100101 Firefox/77.0",
    "trace": {
      "id": "937f9c0c-c3e6-411b-bfe3-8be0f2609c16"
    },
    "ecs": {
      "version": "1.5.0"
    },
    "beconn": 0,
    "host": {
      "hostname": "cdn-endpoint01-20190403.staging.zeit.de",
      "name": "cdn-endpoint01-20190403.staging.zeit.de"
    },
    "timestamp": {
      "us": 1591350847
    },
    "res_headers": "",
    "syslog_host": "cdn-endpoint01-20190403",
    "b_server": "livecrm-front01",
    "feconn": 95,
    "srv_queue": 0,
    "res_cookie": "-",
    "Referer": "https://www.staging.zeit.de/gesellschaft/index",
    "Host": "www.staging.zeit.de",
    "Fastly_SSL": "1",
    "req_cookie": "-",
    "c_ip": "23.235.43.38",
    "processor": {
      "name": "transaction",
      "event": "transaction"
    },
    "b_end": "c1-frontends",
    "tc": 0,
    "retries": 0,
    "backend_queue": 0,
    "@timestamp": "2020-06-05T09:54:05.229Z",
    "bytes": 62189,
    "service": {
      "environment": "staging",
      "name": "haproxy",
      "runtime": {
        "name": "-",
        "version": "-"
      },
      "language": {
        "name": "-"
      }
    },
    "t_state": "----",
    "tq": 0,
    "tr": 2007,
    "transaction": {
      "result": 200,
      "duration": {
        "us": 2012
      },
      "name": "cdn-endpoint",
      "id": "389448df-55e0-4033-a010-558e6d159989",
      "span_count": {
        "started": 0
      },
      "type": "request",
      "sampled": false
    }
  },
  "fields": {
    "@timestamp": [
      "2020-06-05T09:54:05.229Z"
    ]
  },
  "sort": [
    1591350845229
  ]
}

Hi @wosc,

I just indexed that document and found the same: it didn't show up. There appears to be an assumption in the UI that the observer.version_major field exists. This field indicates the major version of the APM Server that is assumed to have indexed the document. If you set it to 7, as in the document from your Python agent ("version_major": 7 under "observer"), it should work.
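A minimal sketch of that change, assuming the haproxy document is built as a Python dict before indexing. The helper name `with_observer_version` and the trimmed-down sample document are illustrative only, not part of any Elastic library:

```python
import copy


def with_observer_version(doc, major=7):
    """Return a copy of doc with observer.version_major set.

    Hypothetical helper: it only adds the one field discussed above and
    leaves any existing observer fields untouched.
    """
    patched = copy.deepcopy(doc)
    patched.setdefault("observer", {})["version_major"] = major
    return patched


# Trimmed-down stand-in for the haproxy document from the first post.
haproxy_doc = {
    "processor": {"name": "transaction", "event": "transaction"},
    "service": {"name": "haproxy", "environment": "staging"},
}

patched = with_observer_version(haproxy_doc)
print(patched["observer"])
```

The copy keeps the original parsed log record unchanged, which may be convenient if the same record is also sent somewhere else.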

Looking over the fields you're indexing, I have a few recommendations. In general, I'd suggest trying to fit your fields into ECS as much as possible. Here are some specific suggestions:

Finally, if you specify the "apm" ingest node pipeline while indexing, you'll get geoIP enrichment (e.g. for mapping your users), and user-agent will be parsed (e.g. for a breakdown by browser/device).
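As a rough illustration, the pipeline can be named through the `pipeline` query parameter of the Elasticsearch index API. The sketch below only builds the request URL; the base URL is a placeholder, and the helper `index_url` is a hypothetical name:

```python
from urllib.parse import urlencode


def index_url(base_url, index, pipeline="apm"):
    """Build a document-indexing URL that routes through an ingest pipeline.

    Hypothetical helper; base_url and index are supplied by the caller.
    """
    query = urlencode({"pipeline": pipeline})
    return f"{base_url.rstrip('/')}/{index}/_doc?{query}"


# Placeholder cluster address; the index name is the one used in this thread.
url = index_url("http://localhost:9200", "apm-7.7.0-transaction")
print(url)
```

A POST of the JSON document to that URL would then pass through the "apm" pipeline before being stored.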

I'm a bit confused because the data from the python agent is in a different index (apm-7.7.0-transaction-000001) than the one from haproxy (which uses a date suffix, apm-7.7.0-transaction-2020.06.05)

The -000001 suffix comes from using ILM (index lifecycle management). In a default setup, apm-7.7.0-transaction will be an alias which writes to the "hot" index (e.g. apm-7.7.0-transaction-000001). When the necessary preconditions are met (time passes, or index size grows sufficiently large) the index will be rolled over.

Presumably you were not writing to that alias. If you did, I would expect your documents to go in the same index as written to by APM Server.
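To make the naming difference concrete, here is a small sketch that derives the alias name from the version and event type and checks whether an index name looks like an ILM-managed one. Both helpers are hypothetical, and the six-digit-counter check is a heuristic based on the names seen in this thread, not an official rule:

```python
def write_alias(version, event):
    """Alias name following the pattern seen above, e.g. apm-7.7.0-transaction."""
    return f"apm-{version}-{event}"


def looks_like_ilm_index(index_name, alias):
    """True if index_name is the alias plus a six-digit rollover counter."""
    suffix = index_name[len(alias) + 1:]
    return (
        index_name.startswith(alias + "-")
        and len(suffix) == 6
        and suffix.isdigit()
    )


alias = write_alias("7.7.0", "transaction")
print(alias)
print(looks_like_ilm_index("apm-7.7.0-transaction-000001", alias))
print(looks_like_ilm_index("apm-7.7.0-transaction-2020.06.05", alias))
```

Indexing against the alias rather than a date-suffixed name should put the haproxy documents into the same index that APM Server writes to.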

Hi @axw,

it works! Thank you so much! After adding the observer.version_major field, the data shows up in APM now. :slight_smile:

I had previously scoured the JSON schemas in https://github.com/elastic/apm-server/tree/master/docs/spec (service.json, transaction.json, etc.) for required fields. I've just tried it, and they are all indeed required: the data does not show up in APM if I leave out, for example, the dummy value for runtime.name or the timestamp.us field, even though the document has @timestamp. The implementation detail about the observer version is of course not part of those schemas. :wink:
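For anyone doing the same, a pre-indexing check along these lines might catch such omissions early. This is only a sketch: the path list contains just the three fields named in this thread and is not the complete set of required fields from the schemas, and the helper names are made up for illustration:

```python
# Only the fields mentioned in this thread; NOT the full list from the schemas.
REQUIRED_PATHS = [
    "observer.version_major",
    "service.runtime.name",
    "timestamp.us",
]


def get_path(doc, dotted):
    """Follow a dotted path through nested dicts; return None if it is absent."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def missing_fields(doc, paths=REQUIRED_PATHS):
    """List the dotted paths that are absent (or null) in doc."""
    return [path for path in paths if get_path(doc, path) is None]


sample = {
    "service": {"name": "haproxy", "runtime": {"name": "-"}},
    "timestamp": {"us": 1591350845229000},
}
print(missing_fields(sample))
```

For the sample above, the check reports `observer.version_major` as the only missing path.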

Thank you as well for the explanation about ILM/aliases (while I'm very grateful for the Elastic Cloud-hosted Kibana we use, it tends to seem a little opaque to me at times, so thanks for straightening that out) and for the hints about the ECS schema. We'll go on to that next, to be sure.

Best regards,
Wolfgang

