Dec 16th, 2021: [en] Logging and monitoring HTTP requests reaching your Elasticsearch cluster

Have you ever wanted to take a look at the HTTP requests served by your Elasticsearch nodes?
This article is for you!

Unless you have a load balancer or reverse proxy in front of your Elasticsearch nodes, capturing HTTP requests can be done using the following 3 Elasticsearch features:

  • HTTP tracer
  • Audit logging (requires a Gold or higher subscription)
  • Slow log

The best way to get all the HTTP activity against the Elasticsearch cluster is using the HTTP tracer, but we'll go through all the options.

In this article, we will also show how to gather information and statistics about the HTTP connections and APIs used by clients connected to your Elasticsearch nodes.

We'll provide a few recommendations.

HTTP Tracer

The HTTP tracer is a feature introduced in Elasticsearch 7.7.
It can be enabled and disabled dynamically via the Cluster Settings API.

Please be aware that the volume of logged events will be proportional to the activity of the clients connected to Elasticsearch and to the APIs you choose to observe.

How to enable HTTP tracer

The following example shows how to enable the HTTP tracer for 2 specific request path patterns:

  • *myindex*, matching any request whose path contains myindex
  • *anotherindex/_search*, matching any search request against the index or alias named anotherindex

To enable it:

PUT _cluster/settings
{
   "transient" : {
      "logger.org.elasticsearch.http.HttpTracer" : "TRACE",
      "http.tracer.include" : [ "*myindex*", "*anotherindex/_search*" ]
   }
}

To disable it:

PUT _cluster/settings
{
   "transient" : {
      "logger.org.elasticsearch.http.HttpTracer" : null,
      "http.tracer.include" : null
   }
}
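You can verify the current values at any time by reading the transient cluster settings back:

```
GET _cluster/settings?filter_path=transient
```

An empty response means no transient settings are set, i.e. the tracer is disabled.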

On Elasticsearch Service (Elastic Cloud), you'll need to enable at least logs shipping, as described in our documentation.

Sample

[elasticsearch.server][TRACE] [null][POST][/myindex/_search?pretty=true] received request from [Netty4HttpChannel{localAddress=/172.17.0.19:18000, remoteAddress=/10.42.1.40:51468}]

Audit logging

As stated in the article overview, audit logging requires a Gold or higher subscription.

Audit logging is far more powerful than the HTTP tracer: it logs both HTTP and transport-layer requests.

How to enable audit logging

To enable audit logging, your Elasticsearch nodes require a valid license and you'll need to set xpack.security.audit.enabled to true in elasticsearch.yml, then restart Elasticsearch.
For more details, refer to our documentation.

Please be aware that from Elasticsearch 7.0 onwards it is only possible to write the audit logs to a file. If you need to ingest such logs, you have to install Filebeat with its Elasticsearch module. The audit logs are formatted as JSON.

In prior versions, there was an option to index the audit trail into the cluster itself or into a remote cluster, but it was deprecated in 6.x.

Checking the audit log output, you'll notice that each request is translated into one or more actions (e.g. indices:data/read/search).

To see the search requests in the logs, you'll also need to add xpack.security.audit.logfile.events.emit_request_body: true to elasticsearch.yml.
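Putting the two settings together, the relevant elasticsearch.yml fragment looks like this (a node restart is required for these static settings to take effect):

```
# Enable the audit trail (written to a file from 7.0 onwards)
xpack.security.audit.enabled: true
# Also log the body of search requests
xpack.security.audit.logfile.events.emit_request_body: true
```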

On Elasticsearch Service (ESS), you'll need to enable at least the Logs shipping as described in our documentation. We have a dedicated blog post about auditing on ESS you might find interesting.

Sample

{"type":"audit", "timestamp":"2021-12-13T16:02:22,000+0000", "node.id":"w2fFJqtfQZu4MX_vEQNMsg", "event.type":"transport", "event.action":"access_granted", "authentication.type":"REALM", "user.name":"kibana", "user.realm":"reserved", "user.roles":["kibana_system"], "origin.type":"rest", "origin.address":"172.19.0.5:37230", "request.id":"7KL7nS1xSB69NLK1ocLc5g", "action":"indices:data/read/search", "request.name":"SearchRequest", "indices":["*","-*"]}

Slow log

Slow log settings allow you to log search or indexing requests which take more than a specified amount of time.

How to enable slow log

Slow logs can be enabled per index.
The thresholds can be configured by the user and each one can be associated with a logging level.

The example below logs, at WARN level, searches on the index or alias myindex that take more than 10s in the query phase or more than 1s in the fetch phase.
Indexing requests on the index or alias myindex taking more than 2s will also be logged as WARN.

PUT /myindex/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "5s",
  "index.search.slowlog.threshold.query.debug": "2s",
  "index.search.slowlog.threshold.query.trace": "500ms",
  "index.search.slowlog.threshold.fetch.warn": "1s",
  "index.search.slowlog.threshold.fetch.info": "800ms",
  "index.search.slowlog.threshold.fetch.debug": "500ms",
  "index.search.slowlog.threshold.fetch.trace": "200ms",
  "index.indexing.slowlog.threshold.index.warn": "2s",
  "index.indexing.slowlog.threshold.index.info": "1s",
  "index.indexing.slowlog.threshold.index.debug": "500s",
  "index.indexing.slowlog.threshold.index.trace": "250ms"
}

As usual, you can set the thresholds to null to remove the settings from the index.
E.g.

PUT /myindex/_settings
{
  "index.search.slowlog.*": null,
  "index.indexing.slowlog.*": null
}
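To read back the thresholds currently set on an index, the get settings API accepts a setting-name filter; the wildcard pattern below is just one way to narrow the output:

```
GET /myindex/_settings/index.*slowlog*?flat_settings=true
```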

For more details, please refer to our documentation.

Sample

[WARN ][i.s.s.query              ] [node-0] [myindex][0] took[11.6s], took_millis[11600], total_hits[0 hits], stats[], search_type[QUERY_THEN_FETCH], total_shards[1], source[{"query":{"match_all":{"boost":1.0}}}], id[MY_USER_ID]

HTTP stats via the Nodes Stats API

It is possible to obtain the HTTP stats using the following request:

GET _nodes/stats?filter_path=**.http*

You will get statistics for each node, e.g.

{
  "nodes" : {
    "5C7nuryMT_CJS9OeJJG7-A" : {
      "http" : {
        "current_open" : 581,
        "total_opened" : 723750,
        "clients" : [
          {
            "id" : 642320133,
            "agent" : "go-elasticsearch/7.13.0 (linux amd64; Go go1.17)",
            "local_address" : "172.17.0.19:18000",
            "remote_address" : "172.17.42.1:39224",
            "last_uri" : "/_nodes/instance-0000000025/_repositories_metering",
            "opened_time_millis" : 1639028276246,
            "last_request_time_millis" : 1639028276246,
            "request_count" : 2,
            "request_size_bytes" : 0
          },
          {
            "id" : 467673723,
            "agent" : "go-elasticsearch/7.13.0 (linux amd64; Go go1.17)",
            "local_address" : "172.17.0.19:18000",
            "remote_address" : "172.17.42.1:38272",
            "last_uri" : "/_xpack/usage",
            "opened_time_millis" : 1638780021149,
            "last_request_time_millis" : 1638780021149,
            "request_count" : 3,
            "request_size_bytes" : 0
          }, ...

Useful information you can get from such stats:

  • Currently open HTTP connections
  • Total opened HTTP connections
  • List of clients connected to the cluster with several metrics and metadata such as:
    • The user agent
    • Local and remote address
    • Last request path seen
    • Number of requests from the client
    • Request size
    • Last request timestamp and last connection opened timestamp
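If you're only after the connection counters, the same stats can be narrowed down further; for example:

```
GET _nodes/stats/http?filter_path=nodes.*.http.current_open,nodes.*.http.total_opened
```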

Connecting clients to Elasticsearch clusters

Most of the official Elasticsearch clients support native client-side load balancing.
The hosts setting accepts a list of hosts, which are used in round-robin fashion.
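Conceptually, the round-robin behaviour of the hosts list can be sketched in a few lines of Python (an illustration only, not the clients' actual implementation; the host names are made up):

```python
from itertools import cycle

# Hypothetical host list, as you would pass to a client's `hosts` setting.
hosts = ["es-node-1:9200", "es-node-2:9200", "es-node-3:9200"]

# Each request goes to the next host in the list, wrapping around at the end.
pool = cycle(hosts)

def next_host():
    """Return the host the next request should be sent to."""
    return next(pool)

# Six consecutive requests hit each node twice, in order.
print([next_host() for _ in range(6)])
```

The real clients add failure handling on top of this: a node that stops responding is temporarily taken out of the rotation.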

Most of the official Elasticsearch clients also support node sniffing.
Sniffing means the client uses the nodes provided in the hosts list only to discover the cluster topology.
It is possible to customize which nodes will be used once the cluster has been discovered.

If your cluster is behind a load balancer and/or the nodes cannot be directly reached by the client, disable sniffing/discovery.
In the case of Elasticsearch Service (ESS), the official Elasticsearch clients offer the ability to provide the cloud ID. When clients connect to a cluster hosted on ESS, sniffing must be disabled, as the clients have no direct access to the individual nodes of your cluster; it is disabled automatically when using the ESS cloud ID.
Check out our documentation for a few examples of how to connect to Elasticsearch Service.

In general:

  • avoid connecting clients to nodes with the master role (especially if they're dedicated master nodes)
  • if your clients send indexing requests which require an ingest pipeline, provide the list of nodes with the ingest role (dedicated ingest nodes, or nodes with both ingest and data roles). If your cluster uses data tiers or a hot/warm architecture based on node attributes, target the hot nodes
  • if your clients send search and aggregation requests, prefer dedicated coordinating nodes. If you do not have dedicated coordinating nodes, target the data nodes directly.