Red Hat OpenShift fluentd to ECE connection configuration

Does anyone have experience getting OpenShift, which uses fluentd as its ingest pipeline processor, to connect to ECE? Since fluentd is very similar to logstash, I have a working logstash config: it can ship Beats data from my computer to logstash and on to the ECE cluster. But we can't get the equivalent fluentd settings to work. Here's the logstash config we're trying to mirror in fluentd.

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["https://xxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxx.xx.gov:443"]
    manage_template => false
    index => "%{[@metadata][beat]}-%{[@metadata][version]}"
    user => "logstash"
    password => "<redacted>"
    ssl => true
    #ssl_certificate_verification => false
    cacert => "/etc/logstash/globalcert/root_ca_pub.pem"
    }
}

We've manually tested our cert and it works, so we're either missing something in our fluentd config or have an extra setting.
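For the curious, a cert check along these lines is what we mean by manually testing it (a sketch; the hostname is masked as above and the CA path is the one from our logstash config):

# verify the server cert chains to our root CA
openssl s_client -connect xxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxx.xx.gov:443 -CAfile /etc/logstash/globalcert/root_ca_pub.pem < /dev/null
# or via curl; an HTTP auth error (rather than a TLS failure) means the chain is fine
curl --cacert /etc/logstash/globalcert/root_ca_pub.pem https://xxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxx.xx.gov:443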

What does your fluentd config look like? Do you have information on how it fails (e.g. logged errors)?

<store>
  @type elasticsearch
  @id elasticsearch-apps
  @log_level debug
  host "#{ENV['ES_HOST']}"
  port "#{ENV['ES_PORT']}"
  scheme https
  ssl_version TLSv1_2
  target_index_key viaq_index_name
  id_key viaq_msg_id
  remove_keys viaq_index_name
  user "#{ENV['FLUENT_ELASTICSEARCH_USER']}"
  password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD']}"

  client_key "#{ENV['ES_CLIENT_KEY']}"
  client_cert "#{ENV['ES_CLIENT_CERT']}"
  ca_file "#{ENV['ES_CA']}"
</store>

Error:
If you don't want to do mutual TLS, then don't set ES_CLIENT_KEY or ES_CLIENT_CERT (they're not set).

What happens if you remove client_key and client_cert?
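i.e. roughly your block from above with only the client lines dropped (a sketch, keeping the CA so the server cert is still verified):

<store>
  @type elasticsearch
  @id elasticsearch-apps
  @log_level debug
  host "#{ENV['ES_HOST']}"
  port "#{ENV['ES_PORT']}"
  scheme https
  ssl_version TLSv1_2
  target_index_key viaq_index_name
  id_key viaq_msg_id
  remove_keys viaq_index_name
  user "#{ENV['FLUENT_ELASTICSEARCH_USER']}"
  password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD']}"
  # server-only TLS: keep the CA, no client_key/client_cert
  ca_file "#{ENV['ES_CA']}"
</store>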

I had a look at the fluentd docs (https://www.rubydoc.info/gems/fluent-plugin-elasticsearch/1.10.0 ?) and I couldn't see any reason why it wouldn't be able to connect to ECE. (One thing I couldn't confirm: we only support HTTP/1.1, and that's a common gotcha when using nginx.) But since the error seems to be specific to the client TLS settings, let's remove them and see if we at least get a different error.

(The other thing I wasn't sure about was whether we support ssl_version TLSv1_2; if you still get SSL errors after removing the client settings, it might be worth changing that to one of SSLv23, TLSv1, or TLSv1_1.)
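If you want to check the HTTP/1.1 and TLS-version points outside fluentd first, something like this curl call should do it (the hostname, user, and CA path are placeholders):

# force HTTP/1.1 and TLS 1.2; a JSON auth response back means the transport layer is fine
curl --http1.1 --tlsv1.2 --cacert /path/to/root_ca_pub.pem -u <user> https://your-ece-endpoint.example.gov:443/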


We were able to get logs in! We changed some settings on our end as well as on the fluentd side, so we're trying to fine-tune things down to exactly what made it work. I'll post a cleaned-up config for others soon, hopefully. Appreciate the insight, Alex!


To mirror the above logstash config for our OpenShift/fluentd connection, we used the following. The OpenShift/fluentd config was put together by my colleague Adam.

Find these lines in the fluentd daemonset and change them accordingly. These are custom URLs that point at a logical cluster within ECE; you'll need a new URL for each new environment. (A CLI alternative is sketched after this block.)
- name: ES_HOST
  value: xxxxxxxxxxxxxxxxxxxxxxxxxx.gov
- name: ES_PORT
  value: "9243"
- name: OPS_HOST
  value: xxxxxxxxxxxxxxxxxxxxxxxxxx.gov
- name: OPS_PORT
  value: "9243" 
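If you'd rather not hand-edit the daemonset YAML, a sketch of setting the same values via the CLI (assuming the default daemonset name logging-fluentd; adjust to your deployment):

oc set env daemonset/logging-fluentd ES_HOST=xxxxxxxxxxxxxxxxxxxxxxxxxx.gov ES_PORT=9243 OPS_HOST=xxxxxxxxxxxxxxxxxxxxxxxxxx.gov OPS_PORT=9243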
 
Add these variables. Get the credentials from the ECE team (Ryan):
- name: FLUENT_ELASTICSEARCH_USER
  value: logstash_send_data
- name: FLUENT_ELASTICSEARCH_PASSWORD
  value: <redacted>
 
Comment out these variables (we're going to do this without mutual TLS):
#- name: ES_CLIENT_CERT
#  value: /etc/fluent/keys/cert
#- name: ES_CLIENT_KEY
#  value: /etc/fluent/keys/key
#- name: OPS_CLIENT_CERT
#  value: /etc/fluent/keys/ops-cert
#- name: OPS_CLIENT_KEY
#  value: /etc/fluent/keys/ops-key
 
Edit these variables:
- name: ES_CA
  value: /etc/fluent/elasticsearch/keys/our_cert.pem
- name: OPS_CA
  value: /etc/fluent/elasticsearch/keys/our_cert.pem

Add this block under "volumes". We're going to pass in an edited /etc/fluent/configs.d/openshift/output-es-config.conf and a file for the certificate without disrupting the current key set. (See the secret-creation note after this block.)
- configMap:
    defaultMode: 420
    name: logging-output-elasticsearch
  name: output-es-config
- name: <your-cert>
  secret:
    defaultMode: 420
    secretName: <your-cert>
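The <your-cert> secret referenced above has to exist first; one way to create it from the PEM file (the key name matches the ES_CA path we set earlier, and the local path is a placeholder):

oc create secret generic <your-cert> --from-file=our_cert.pem=/local/path/to/our_cert.pem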
 
Add this block under "volumeMounts"
- mountPath: /etc/fluent/configs.d/openshift/output-es-config.conf
  name: output-es-config
  readOnly: true
  subPath: output-es-config.conf
- mountPath: /etc/fluent/elasticsearch/keys
  name: <your-cert>
  readOnly: true
 
Create the modified output-es-config.conf

Create a local file (we'll turn it into a config map shortly) named output-es-config.conf. In the future this config may no longer match up well with the default shipped in the package, so you may want to copy the original out and make manual updates. **The principal problem I found was the hardcoded username and password. The other problem was the lines for ES_CLIENT_CERT / ES_CLIENT_KEY / OPS_CLIENT_CERT / OPS_CLIENT_KEY. The documentation would lead you to believe you can leave the environment variables blank in the daemonset and the resulting config would skip them. Instead, we had to remove those lines completely.**
 
<store>
  @type elasticsearch
  @id elasticsearch-apps
  @log_level trace
  with_transporter_log true
  host "#{ENV['ES_HOST']}"
  port "#{ENV['ES_PORT']}"
  scheme https
  ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
  ssl_version "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERSION'] || 'TLSv1_2'}"
  target_index_key viaq_index_name
  id_key viaq_msg_id
  remove_keys viaq_index_name
  user "#{ENV['FLUENT_ELASTICSEARCH_USER']}"
  password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD']}"

  ca_file "#{ENV['ES_CA']}"
 
  type_name com.redhat.viaq.common
  retry_tag "retry_es"
 
  reload_connections "#{ENV['ES_RELOAD_CONNECTIONS'] || 'true'}"
  reload_after "#{ENV['ES_RELOAD_AFTER'] || '200'}"
  sniffer_class_name "#{ENV['ES_SNIFFER_CLASS_NAME'] || 'Fluent::ElasticsearchSimpleSniffer'}"
  reload_on_failure false
  flush_interval "#{ENV['ES_FLUSH_INTERVAL'] || '1s'}"
  max_retry_wait "#{ENV['ES_RETRY_WAIT'] || '300'}"
  disable_retry_limit true
  buffer_type file
  buffer_path '/var/lib/fluentd/buffer-output-es-config'
  buffer_queue_limit "#{ENV['BUFFER_QUEUE_LIMIT'] || '32' }"
  buffer_chunk_limit "#{ENV['BUFFER_SIZE_LIMIT'] || '8m' }"
  buffer_queue_full_action "#{ENV['BUFFER_QUEUE_FULL_ACTION'] || 'block'}"
  flush_at_shutdown "#{ENV['FLUSH_AT_SHUTDOWN'] || 'false'}"
 
  write_operation 'create'
 
  request_timeout 2147483648
</store>
 
oc create configmap logging-output-elasticsearch --from-file=output-es-config.conf
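The fluentd pods won't pick up the new config map until they restart; deleting them and letting the daemonset recreate them is one way (the label selector is an assumption, so check yours with oc get pods --show-labels):

oc delete pods -l component=fluentd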

As Alex correctly pointed out, we needed to remove the client_key and client_cert lines from our fluentd config. We had originally left them in, not because we were using them, but because the OpenShift/fluentd documentation linked below said the configuration should work if the fields were left blank. For clarification: is TLS to ECE mutual or not? My understanding is that it's not mutual, but feel free to correct me.

Configure your externally hosted Elasticsearch instance for TLS:
* If your externally hosted Elasticsearch instance does not use TLS, update the _CLIENT_CERT, _CLIENT_KEY, and _CA variables to be empty.
* **If your externally hosted Elasticsearch instance uses TLS, but not mutual TLS, update the _CLIENT_CERT and _CLIENT_KEY variables to be empty. Then patch or recreate the fluentd secret with the appropriate _CA value for communicating with your Elasticsearch instance.**
* If your externally hosted Elasticsearch instance uses Mutual TLS, patch or recreate the fluentd secret with your client key, client cert, and CA. The provided Elasticsearch instance uses mutual TLS.
https://docs.openshift.com/container-platform/4.1/logging/config/efk-logging-external.html

Thanks for that information, that's greatly appreciated.

To answer your question: ECE does not support mutual TLS externally (some of the internal comms between instances within clusters are mutual).

Alex


You're welcome! I appreciate the info on TLS; I couldn't find a specific answer, so I figured I'd ask. Hopefully this saves other people some time if they hit this issue.

Ryan
