Fluentd cannot connect to Elasticsearch

Hello everybody,
First of all, happy new year to everybody :slight_smile:
I need your help with the ELK stack.
I have installed Elasticsearch, Kibana, and Fluentd with a Helm chart.

I have a healthy Elasticsearch cluster (version 8.5.1), which I can confirm via curl.

{
  "name" : "elasticsearch-master-0",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "uXctgao6QEqbfKhzw5TLuA",
  "version" : {
    "number" : "8.5.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "c1310c45fc534583afe2c1c03046491efba2bba2",
    "build_date" : "2022-11-09T21:02:20.169855900Z",
    "build_snapshot" : false,
    "lucene_version" : "9.4.1",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}

{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 11,
  "active_shards" : 22,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

I have modified the following details in the fluentd-forwarder-cm ConfigMap and restarted the daemon.

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-forwarder-cm
  namespace: monitoring
  uid: 9ee610ce-eb53-4faa-b47a-fe53da264892
  resourceVersion: '30569'
  creationTimestamp: '2023-01-01T15:06:22Z'
  labels:
    app.kubernetes.io/component: forwarder
    app.kubernetes.io/instance: fluentd
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: fluentd
    helm.sh/chart: fluentd-5.5.11
  annotations:
    meta.helm.sh/release-name: fluentd
    meta.helm.sh/release-namespace: monitoring
data:
  fluentd-inputs.conf: |
    # HTTP input for the liveness and readiness probes
    <source>
      @type http
      port 9880
    </source>
    # Get the logs from the containers running in the node
    <source>
      @type tail
      path /var/log/containers/*-app*.log
      pos_file /opt/bitnami/fluentd/logs/buffers/fluentd-docker.pos
      tag kubernetes.*
      read_from_head true
      format json
    </source>
    # enrich with kubernetes metadata
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
  fluentd-output.conf: |
    # Throw the healthcheck to the standard output instead of forwarding it
    <match fluentd.healthcheck>
      @type null
    </match>
    # Forward all logs to the aggregators

    <match kubernetes.var.log.containers.**java-app**.log>
      @type elasticsearch
      include_tag_key true
      host "https://elasticsearch-master.monitoring.svc.cluster.local:443"
      port "9200"
      index_name "java-app-logs"
      scheme https
      ssl_verify false
      <buffer>
        @type file
        path /opt/bitnami/fluentd/logs/buffers/java-logs.buffer
        flush_thread_count 2
        flush_interval 5s
      </buffer>
    </match>
    # <match **>
    #   @type forward
    #   <server>
    #     host fluentd-0.fluentd-headless.monitoring.svc.cluster.local
    #     port 24224
    #   </server>
    #   <buffer>
    #     @type file
    #     path /opt/bitnami/fluentd/logs/buffers/logs.buffer
    #     flush_thread_count 2
    #     flush_interval 5s
    #   </buffer>
    # </match>
  fluentd.conf: |
    # Ignore fluentd own events
    <match fluent.**>
      @type null
    </match>

    @include fluentd-inputs.conf
    @include fluentd-output.conf
  metrics.conf: |
    # Prometheus Exporter Plugin
    # input plugin that exports metrics
    <source>
      @type prometheus
      port 24231
    </source>
    # input plugin that collects metrics from MonitorAgent
    <source>
      @type prometheus_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
    # input plugin that collects metrics for output plugin
    <source>
      @type prometheus_output_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
    # input plugin that collects metrics for in_tail plugin
    <source>
      @type prometheus_tail_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
binaryData: {}

After the restart, I see that my Fluentd pods are failing with the following logs.

2023-01-01 16:59:38 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. no address for https (Resolv::ResolvError)
2023-01-01 16:59:38 +0000 [warn]: #0 Remaining retry: 10. Retry to communicate after 32 second(s).
The client is unable to verify that the server is Elasticsearch. Some functionality may not be compatible if the server is running an unsupported product.
2023-01-01 17:00:19 +0000 [info]: Received graceful stop
2023-01-01 17:00:42 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. no address for https (Resolv::ResolvError)
2023-01-01 17:00:42 +0000 [warn]: #0 Remaining retry: 9. Retry to communicate after 64 second(s).
The client is unable to verify that the server is Elasticsearch. Some functionality may not be compatible if the server is running an unsupported product.

I changed the line host "https://elasticsearch-master.monitoring.svc.cluster.local:443" to host "elasticsearch-master.monitoring.svc.cluster.local", and now I see the following logs:

2023-01-01 17:05:52 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. [401] {"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
2023-01-01 17:05:52 +0000 [warn]: #0 Remaining retry: 12. Retry to communicate after 8 second(s).
The client is unable to verify that the server is Elasticsearch due to security privileges on the server side. Some functionality may not be compatible if the server is running an unsupported product.

Can you please help me?

Hi @Zeynal_Hajili, I am not a fluentd expert, but it looks like you are missing the authentication credentials (user/password) for Elasticsearch in the fluentd config.

What did you use for that first curl command?

It always helps if you show the actual command, not just the output.

I would set up fluentd with the same credentials and port.

Hi Stephen,
To be honest, I ran this curl command before changing the ConfigMap, so I assume it is normal to see this output.
Could you please let me know where I should specify the credentials? And does my ConfigMap seem correct?

Please show the entire curl command you ran to check Elasticsearch. Does it still work?

Then refer to the fluentd docs and add:

user elastic 
password mysecret
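For orientation, a minimal sketch of where those two directives sit inside the elasticsearch match block — the password here is a placeholder, and the block is trimmed to the connection settings, not a complete config:

<match kubernetes.var.log.containers.**java-app**.log>
  @type elasticsearch
  host elasticsearch-master.monitoring.svc.cluster.local
  port 9200
  scheme https
  ssl_verify false
  user elastic
  password mysecret
  index_name "java-app-logs"
</match>

Note that host takes a bare hostname; the scheme directive supplies the https part.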

Hi Stephen,

I will definitely add those lines to the fluentd config, but what about Elasticsearch? Where should I add this credentials part?

The curl output actually shows the expected output:

zhajili$ curl -k -XGET 'https://localhost:9200'
{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}zhajili$ 

It looks like you should add it to this section... This is more of a fluentd question than an Elasticsearch one, so perhaps ask in that forum.

You can try this... This is basic curl syntax

$ curl -k -XGET -u elastic:password 'https://localhost:9200'

Get curl to work, then use the same credentials and port in the fluentd ConfigMap.

Thanks Stephen, I will spin up my EKS cluster again and check.

My previous question was actually: where should I update these credentials on the Elasticsearch side? I assume that fluentd will try to authenticate with this username/password, but what about Elasticsearch? How will it check? Where should these credentials be stored/updated?

Apologies, you lost me... I no longer know what you are trying to accomplish... The ConfigMap above is about fluentd, with a small section on connecting to Elasticsearch... I gave you my suggestion...
If you have questions on a fluentd ConfigMap, perhaps you should visit the fluentd forum.

When you "spin up Elasticsearch", you either enabled security or you didn't... As part of that process, basic elastic credentials are created... You need to get those and then use them...

Technically, you would create specific users/roles for fluentd that are publishers, but at this point that seems a little more advanced/complex than just getting this up and running.

Hello Stephen,
Yes, you are totally right.

In my values.yaml I have the following:

createCert: true 

# Disable it to use your own elastic-credential Secret.
secret:
  enabled: true
  password: "Krakow123" 

So I assume that I should use the elastic/Krakow123 combination?

Because with these credentials I can log in to the Kibana GUI.

Seems like a good idea...

Hi Stephen,

Thank you, I can confirm that I can now run curl and see the following output:

zhajili$ curl -k -XGET -u elastic:Krakow123 'https://localhost:9200'
{
  "name" : "elasticsearch-master-2",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "xog4MgwdRHyioFjiFqsdxA",
  "version" : {
    "number" : "8.5.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "c1310c45fc534583afe2c1c03046491efba2bba2",
    "build_date" : "2022-11-09T21:02:20.169855900Z",
    "build_snapshot" : false,
    "lucene_version" : "9.4.1",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}
zhajili$ 

However, I see the following logs in fluentd; any tips on what I can check?

The client is unable to verify that the server is Elasticsearch. Some functionality may not be compatible if the server is running an unsupported product.
2023-01-01 20:41:45 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. no address for https (Resolv::ResolvError)
2023-01-01 20:41:45 +0000 [warn]: #0 Remaining retry: 13. Retry to communicate after 4 second(s).
The client is unable to verify that the server is Elasticsearch. Some functionality may not be compatible if the server is running an unsupported product.
2023-01-01 20:41:53 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. no address for https (Resolv::ResolvError)
2023-01-01 20:41:53 +0000 [warn]: #0 Remaining retry: 12. Retry to communicate after 8 second(s).
The client is unable to verify that the server is Elasticsearch. Some functionality may not be compatible if the server is running an unsupported product.
2023-01-01 20:42:09 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. no address for https (Resolv::ResolvError)
2023-01-01 20:42:09 +0000 [warn]: #0 Remaining retry: 11. Retry to communicate after 16 second(s).
The client is unable to verify that the server is Elasticsearch. Some functionality may not be compatible if the server is running an unsupported product.

The current ConfigMap is as below:

    <match kubernetes.var.log.containers.**java-app**.log>
      @type elasticsearch
      include_tag_key true
      host "https://elasticsearch-master.monitoring.svc.cluster.local"
      port "9200"
      user elastic 
      password Krakow123
      scheme https
      ssl_verify false
      index_name "java-app-logs"
      <buffer>
        @type file
        path /opt/bitnami/fluentd/logs/buffers/java-logs.buffer
        flush_thread_count 2
        flush_interval 5s
      </buffer>
    </match>

    <match kubernetes.var.log.containers.**node-app**.log>
      @type elasticsearch
      include_tag_key true
      host "https://elasticsearch-master.monitoring.svc.cluster.local"
      port "9200"
      user elastic 
      password Krakow123
      scheme https
      ssl_verify false
      index_name "node-app-logs"
      <buffer>
        @type file
        path /opt/bitnami/fluentd/logs/buffers/node-logs.buffer
        flush_thread_count 2
        flush_interval 5s
      </buffer>
    </match>

Thank you in advance!

Turn up the debug logging on fluentd.

But no, I'm not an expert on fluentd, and now you're having issues connecting from fluentd. Perhaps take a look at the docs.

You can exec into the containers and see if curl still works with the same information you're providing fluentd.

Perhaps try without the https? I'm just looking at the docs... The error above doesn't show the actual host it's trying to connect to...

host "elasticsearch-master.monitoring.svc.cluster.local"

If you exec into the container can you do the curl with the same host port etc

That's how I would debug

It seems that I had a wrong configuration in the ConfigMap: I removed https from host, since I already have scheme https.

Now I no longer see those warnings, but I see the following warnings about pattern matching.

2023-01-01 20:51:22 +0000 [warn]: #0 pattern not matched: "2023-01-01T20:21:28.798454633Z stdout F {\"level\":30,\"time\":\"2023-01-01T20:21:28.797Z\",\"pid\":1,\"hostname\":\"node-app-96b85dc67-x7gsw\",\"msg\":\"hello elastic world\"}"
2023-01-01 20:51:22 +0000 [warn]: #0 pattern not matched: "2023-01-01T20:21:28.798641753Z stdout F {\"level\":30,\"time\":\"2023-01-01T20:21:28.798Z\",\"pid\":1,\"hostname\":\"node-app-96b85dc67-x7gsw\",\"msg\":\"This is some great stuff\"}"
2023-01-01 20:51:22 +0000 [warn]: #0 pattern not matched: "2023-01-01T20:21:28.798653641Z stdout F {\"level\":30,\"time\":\"2023-01-01T20:21:28.798Z\",\"pid\":1,\"hostname\":\"node-app-96b85dc67-x7gsw\",\"msg\":\"Some more entries for our logging\"}"
2023-01-01 20:51:22 +0000 [warn]: #0 pattern not matched: "2023-01-01T20:21:28.798657185Z stdout F {\"level\":30,\"time\":\"2023-01-01T20:21:28.798Z\",\"pid\":1,\"hostname\":\"node-app-96b85dc67-x7gsw\",\"msg\":\"another line\"}"
2023-01-01 20:51:22 +0000 [warn]: #0 pattern not matched: "2023-01-01T20:21:28.798708113Z stdout F {\"level\":30,\"time\":\"2023-01-01T20:21:28.798Z\",\"pid\":1,\"hostname\":\"node-app-96b85dc67-x7gsw\",\"msg\":\"This never stops\"}"
2023-01-01 20:51:22 +0000 [warn]: #0 pattern not matched: "2023-01-01T20:21:28.798712257Z stdout F {\"level\":30,\"time\":\"2023-01-01T20:21:28.798Z\",\"pid\":1,\"hostname\":\"node-app-96b85dc67-x7gsw\",\"msg\":\"Logging logging all the way\"}"
2023-01-01 20:51:22 +0000 [warn]: #0 pattern not matched: "2023-01-01T20:21:28.798742171Z stdout F {\"level\":30,\"time\":\"2023-01-01T20:21:28.798Z\",\"pid\":1,\"hostname\":\"node-app-96b85dc67-x7gsw\",\"msg\":\"I think this is enough\"}"
2023-01-01 20:51:22 +0000 [warn]: #0 pattern not matched: "2023-01-01T20:21:28.798754372Z stdout F {\"level\":30,\"time\":\"2023-01-01T20:21:28.798Z\",\"pid\":1,\"hostname\":\"node-app-96b85dc67-x7gsw\",\"msg\":\"nope, one more!\"}"
2023-01-01 20:51:22 +0000 [warn]: #0 pattern not matched: "2023-01-01T20:21:28.802429993Z stdout F {\"level\":30,\"time\":\"2023-01-01T20:21:28.802Z\",\"pid\":1,\"hostname\":\"node-app-96b85dc67-x7gsw\",\"msg\":\"app listening on port 3000!\"}

Can anybody help?
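For reference, those lines look like containerd CRI-format logs (a timestamp, the stream, a P/F flag, then the message), so a tail source configured with format json will not match them. A hedged sketch of a tail source that parses this wrapper first — paths, pos_file, and tag are taken from the ConfigMap above, and the regexp is a commonly used CRI pattern, not verified against these exact logs:

<source>
  @type tail
  path /var/log/containers/*-app*.log
  pos_file /opt/bitnami/fluentd/logs/buffers/fluentd-docker.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type regexp
    expression /^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[FP]) (?<log>.*)$/
    time_format %Y-%m-%dT%H:%M:%S.%N%z
  </parse>
</source>

The JSON payload then lands in the log field; a parser filter on that field can expand it further if needed.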

Stephen,

Thank you very much for all your help ,finally it works :slight_smile:

I added the correct pattern to the source and deleted https from host as you suggested. I really appreciate your help :slight_smile:
