Data Import to Elasticsearch


(Geri Ruccolo) #1

I'm trying to post data to Elasticsearch with a curl command. It works fine if the data is JSON, but I'm trying to load plain text, specifically log4j logs. I keep getting {"error":"ElasticsearchParseException[Failed to derive xcontent]","status":400}
which is so unhelpful.
I hope someone can help me correct the problem and steer me in the right direction.

Thanks

Geri


(Rohit) #2

Can you add the content? Also, this link might be helpful.


(Mark Harwood) #3

Elasticsearch needs valid JSON.

To parse log files into Elasticsearch, take a look at Logstash: https://www.elastic.co/products/logstash


(Geri Ruccolo) #4

Thanks guys for the quick reply. Unfortunately I have to use log4j log files, which of course are not JSON. I've also read both links over and over, and I know that the Logstash documentation briefly mentions plain text but gives no examples. I've tried many of the ideas I found by googling and they don't work. It's beginning to look as if I've hit a brick wall.


(Geri Ruccolo) #5

I can add anything you'd like to see


(Geri Ruccolo) #6

Maybe this will clarify things. I have a working .conf file and all 3 products (Logstash, Elasticsearch and Kibana) start successfully. The following command is the one causing the problem:
curl -s -XPOST localhost:9200/_bulk --data @catalina.2015-10-13.log

If I use a json file instead of the log, it works.

The reason I'm doing this: Logstash runs fine, and if I query Elasticsearch it seems to have some data, but when I try to view it in Kibana I can only see the JSON data (which is straight sample data from the Logstash site) and nothing else. I even tried converting part of the log to JSON and saving it as a JSON file. The XPOST works and Elasticsearch acknowledges the file, but I cannot see it in Kibana.


(Rohit) #7

I'm not familiar with Logstash, but from what I have read you need 3 things:

  1. input file (this can be the log file).
  2. Filter (to parse content and make events).
  3. Output for elasticsearch.

Not sure if you have had a chance to browse this: http://blog.sematext.com/2013/12/19/getting-started-with-logstash/

Sorry I was not much help with Logstash.


(Sarwar Bhuiyan) #8

Geri,

If you have Logstash running successfully, you will have seen that the actual indexing is meant to be done by Logstash, not by you running a curl command that uploads a log file. As Mark mentioned, Elasticsearch accepts data in JSON form only, so people usually use scripts or a tool like Logstash or fluentd to read their data (logs, in your case) and transform it into a format that reflects what they want to do with it in Elasticsearch.

With Logstash, the simplest config would just take the log file, parse each line (assuming you have one log record per line) and wrap it in a JSON document with a timestamp and a message field. You mention you are using log4j, so it won't always be a single line per log record (e.g. when errors occur, the stack traces go over multiple lines). The Logstash file input can be configured to use the multiline codec and separate log records based on a regex pattern.
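To make the simplest case above concrete (one record per line, wrapped in JSON), here is a rough Python sketch of what a hand-written script would have to do before posting to _bulk. It is only an illustration, not what Logstash does internally. The index name "logs" and the function names are made up for the example, the field names simply mirror the timestamp and message fields mentioned above, and older Elasticsearch releases also expect a `_type` in the action line.

```python
import json


def wrap_line(line, timestamp):
    """Wrap one plain-text log line in a JSON-serialisable document."""
    return {"@timestamp": timestamp, "message": line.rstrip("\n")}


def to_bulk_body(lines, timestamp, index_name="logs"):
    """Build a newline-delimited _bulk body from plain-text lines.

    Each log line becomes two JSON lines: an action line saying
    "index this" and the document itself. index_name is a hypothetical
    default chosen for this example.
    """
    out = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines, they carry no log record
        out.append(json.dumps({"index": {"_index": index_name}}))
        out.append(json.dumps(wrap_line(line, timestamp)))
    # The bulk endpoint expects the body to end with a newline.
    return ("\n".join(out) + "\n") if out else ""


if __name__ == "__main__":
    sample = ["INFO: Server startup\n", "WARNING: Disk space low\n"]
    print(to_bulk_body(sample, "2015-10-13T00:00:00Z"), end="")
```

The output of a script like this is what the bulk endpoint can parse; the raw log file is not, which is why the "Failed to derive xcontent" error shows up. If you post such a body with curl, use --data-binary rather than --data, because --data strips the newlines that the bulk format depends on.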

To see how the multiline codec can be used, see https://www.elastic.co/guide/en/logstash/current/plugins-codecs-multiline.html. The example shows stdin input but will probably work with the file input as well.
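If it helps to see the idea behind multiline handling outside of Logstash, here is a small Python sketch of the same grouping logic: any line that does not look like the start of a new record is appended to the previous one. This is not the codec's implementation. The timestamp pattern is an assumption and has to be adjusted to whatever your log4j layout actually prints, and the function name is made up for the example.

```python
import re

# Assumed record start: an ISO-like timestamp such as "2015-10-13 09:15:02".
# Change this to match the layout of your own log files.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")


def group_records(lines):
    """Merge continuation lines (e.g. stack traces) into the preceding record."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if RECORD_START.match(line) or not records:
            # A new record starts here (or this is the very first line).
            records.append(line)
        else:
            # Continuation line: attach it to the previous record.
            records[-1] += "\n" + line
    return records


if __name__ == "__main__":
    sample = [
        "2015-10-13 09:15:02 ERROR Request failed\n",
        "java.lang.NullPointerException\n",
        "\tat com.example.Foo.bar(Foo.java:42)\n",
        "2015-10-13 09:15:03 INFO Recovered\n",
    ]
    for record in group_records(sample):
        print(repr(record))
```

Each grouped record can then be wrapped in a single JSON document, so a stack trace ends up in one event instead of being split across several.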

Once you've tested a Logstash config, all you really need is to have Logstash running and monitoring a directory of log files, and it should take care of indexing JSON objects into Elasticsearch.

You might be interested in the getting started with logstash page in the docs. https://www.elastic.co/guide/en/logstash/current/advanced-pipeline.html

Long term, you're welcome to sign up for the training although a lot of the information is present on the site: https://www.elastic.co/guide/index.html

Hope this helps.

Sarwar

