Getting "empty text" and "invalid version format" exceptions

Hello,

I am using Elasticsearch version 2.3.0 and Logstash version 2.2.2. At the moment, I am getting "empty text" and "invalid version format" Java exceptions on my Elasticsearch node, which are shown in the logs below.

Is there some way to add a filter in my logstash.conf file so that it only passes data that has a valid format and is not empty on to Elasticsearch? Is there anything else I can do to fix this problem?
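
For what it's worth, here is a sketch of the kind of filter I had in mind for dropping empty events (the `message` field name is an assumption based on what Filebeat normally sends; I have not confirmed this would prevent the exceptions):

```
filter {
  # Drop events whose message field is empty or whitespace-only
  if [message] =~ /^\s*$/ {
    drop { }
  }
}
```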

Any help is greatly appreciated.

Here are the logs:

/var/log/elasticsearch/elasticsearch.log

[2016-06-10 10:15:50,295][WARN ][http.netty               ] [John Doe] Caught exception while handling client http traffic, closing connection [id: 0x24afc07
f, /10.13.37.35:46630 => /10.13.37.99:9200]
java.lang.IllegalArgumentException: invalid version format: ヘ^V{ᄀESCᆬ;A○(Bᅣᅤ^ᄍナ^Gᄁヘ%ᄚNᅰ<^Pᄍ!チ   ᅤᄍ^ZᅫYBYF?ᄈᆲモ↑<U+FFD0>H.�ᆭ<U+FFFE>ロᆴ7�I<U+FFC8>ᅢ←ンA)ヨᅤN_ᄌᄡワPᅨ
Zᄆ%U<U+FFFF>^N^@^@<U+FFFF><U+FFFF>B(Fノ2W^@^@^@^A2C^@^@^@○X^|マᄏJ↓0^Pニ}<U+FFDE>ᅣLMᅨメᄉ<U+FFDE>=ᆱJロ^T  ᄂKユ&(<U+FFF2>Xᅲᅠヒ<U+FFD1>%ᄚ^X<U+FFBF>{ᅥI^U^Hチ)$<U+FFFE>_゚ᄒ
←゚ᆰᆰ<U+FFFA>G<U+FFF3>ᄚᅡᅤAᅱᆪᅫ^Zᅯ
        at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:94)
        at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:62)
        at org.jboss.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:75)
        at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:191)
        at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:102)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:485)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
       ...
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
[2016-06-10 10:14:59,454][WARN ][http.netty               ] [John Doe] Caught exception while handling client http traffic, closing connection [id: 0xdbd98da3, /10.13.37.28:34468 => /10.13.37.99:9200]
java.lang.IllegalArgumentException: invalid version format: <U+FFC8>ᄌXᅫ9ヨᅵノBEᆬᅧE*ハ¬7ᆴZ^WG31ᄆモ^^^Yᅭ1^O&ホタ<U+FFD0>3       ̄ᄁEᅠ
        at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:94)
        at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:62)
        at org.jboss.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:75)
        at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:191)
        at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:102)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500)
 ...
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Below are my Logstash and Elasticsearch configuration files. In my Elasticsearch config file, the only setting I changed is network.host.

/etc/logstash/logstash.conf

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["10.13.37.99:9200"]
    manage_template => false
    protocol=>http
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
    stdout {
      codec => rubydebug
    }
  }
}

/etc/elasticsearch/elasticsearch.yml (for brevity I only include the Network section)

# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 10.13.37.99
#
# Set a custom port for HTTP:
#
# http.port: 9200

This is incorrect:

output {
  elasticsearch {
    hosts => ["10.13.37.99:9200"]
    manage_template => false
    protocol=>http
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
    stdout {
      codec => rubydebug
    }
  }
}

Should be something like:

output {
  elasticsearch {
    hosts => ["10.13.37.99:9200"]
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }

  stdout {
      codec => rubydebug
  }
}

Thank you @dadoonet. I just made that change to the Logstash config file, but I am still getting the same "empty text" and "invalid version format" Java exceptions.

Could you please tell me if there is anything else I should try?

Are you sure you are using exactly this config?
I mean: can you share your exact elasticsearch.yml and logstash config files?

It looks to me like you are not using the HTTP protocol but some other client, or you are using something like SSL (HTTPS)?

Yes, definitely. Here is my elasticsearch.yml file:

$ cat /etc/elasticsearch/elasticsearch.yml
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please see the documentation for further information on configuration options:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration.html>
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
# cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
# node.name: node-1
#
# Add custom attributes to the node:
#
# node.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
# path.data: /path/to/data
#
# Path to log files:
#
# path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
# bootstrap.mlockall: true
#
# Make sure that the `ES_HEAP_SIZE` environment variable is set to about half the memory
# available on the system and that the owner of the process is allowed to use this limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 10.13.37.99
#
# Set a custom port for HTTP:
#
# http.port: 9200
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html>
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
# discovery.zen.ping.unicast.hosts: ["host1", "host2"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):
#
# discovery.zen.minimum_master_nodes: 3
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html>
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
# gateway.recover_after_nodes: 3
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-gateway.html>
#
# ---------------------------------- Various -----------------------------------
#
# Disable starting multiple nodes on a single system:
#
# node.max_local_storage_nodes: 1
#
# Require explicit names when deleting indices:
#
# action.destructive_requires_name: true

When I run systemctl status logstash, here is what I get:

$ systemctl status logstash -l
● logstash.service - LSB: Starts Logstash as a daemon.
   Loaded: loaded (/etc/rc.d/init.d/logstash)
   Active: active (running) since Fri 2016-06-10 12:07:49 EDT; 17min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1055 ExecStart=/etc/rc.d/init.d/logstash start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/logstash.service
           └─1101 java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Djava.io.tmpdir=/var/lib/logstash -Xmx1g -Xss2048k -Djffi.boot.library.path=/opt/logstash/vendor/jruby/lib/jni -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Djava.io.tmpdir=/var/lib/logstash -XX:HeapDumpPath=/opt/logstash/heapdump.hprof -Xbootclasspath/a:/opt/logstash/vendor/jruby/lib/jruby.jar -classpath : -Djruby.home=/opt/logstash/vendor/jruby -Djruby.lib=/opt/logstash/vendor/jruby/lib -Djruby.script=jruby -Djruby.shell=/bin/sh org.jruby.Main --1.9 /opt/logstash/lib/bootstrap/environment.rb logstash/runner.rb agent -f /etc/logstash/conf.d -l /var/log/logstash/logstash.log

This shows that Logstash is using the /etc/logstash/conf.d directory. There is only one config file in /etc/logstash/conf.d, as shown below:

$ ls /etc/logstash/conf.d
logstash.conf

Here is what's inside /etc/logstash/conf.d/logstash.conf:

$ cat /etc/logstash/conf.d/logstash.conf
input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["10.13.37.99:9200"]
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }

  stdout {
      codec => rubydebug
  }
}

I guess you restarted the logstash service, right?

BTW don't you need to add this?

filter { }

Yes, I restarted the logstash service. I didn't have the empty filter before and it seemed to work, but I have added it now.

Is it possible that shipping zipped log files with Filebeat results in the "invalid version format" exceptions?

I think that totally explains what you see.

I did not look at the docs/code, but does Filebeat uncompress on the fly and stream the content? I don't think so.

Awesome! I'm glad you think so. Thank you for all of your help. For anyone else who has this problem: I reconfigured Filebeat to exclude files that end in .gz, using the exclude_files: [".gz$"] option in the /etc/filebeat/filebeat.yml file. More on this can be found here: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html
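
For anyone who wants the exact snippet: in Filebeat 1.x the option goes under the prospector definition in /etc/filebeat/filebeat.yml. A minimal sketch (the paths value is just an example; use whatever paths you already ship):

```yaml
filebeat:
  prospectors:
    -
      # Example path; adjust to the logs you actually ship
      paths:
        - /var/log/*.log
      # Skip rotated, compressed logs so raw gzip bytes are
      # never sent downstream to Logstash/Elasticsearch
      exclude_files: [".gz$"]
```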