Loading basic data from a file with curl for testing. Do I need to use bulk?

Hi all,

I've been creating dashboards in Kibana for my customers for quite a while now, but I've never been involved in importing the data. I've just got a new customer looking for a solution to their logging issues, and I suggested they take a look at ELK, but they have zero skills in the area and have asked if I could help them. I've got some sample data from the customer:

{
  "service": "ui-svc",
  "datestamp": "28-08-2013",
  "timestamp": "00:11:55",
  "level": "DEBUG",
  "uuid": "08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"
},
{
  "service": "shipping-svc",
  "datestamp": "27-08-2013",
  "timestamp": "22:22:31",
  "level": "INFO",
  "uuid": "08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"
},
{
  "service": "catalog-svc",
  "datestamp": "28-08-2013",
  "timestamp": "14:23:37",
  "level": "INFO",
  "uuid": "8467ebcd-f586-441a-9257-7caa70ba9dd8"
},

I've created an index with very basic mappings, setting every field to string, as Elasticsearch doesn't like the existing format of the 'datestamp' field:

curl -XPUT 'http://localhost:9200/microservices' -d '
{
  "mappings" : {
    "default" : {
      "properties" : {
        "service"   : { "type" : "string", "index" : "not_analyzed" },
        "datestamp" : { "type" : "string" },
        "level"     : { "type" : "string", "index" : "not_analyzed" },
        "timestamp" : { "type" : "string" },
        "uuid"      : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}'
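In case it helps, I also read the mapping back to check it was applied; that's just the standard _mapping endpoint:

curl -XGET 'http://localhost:9200/microservices/_mapping?pretty'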

Now I would like to import the data. There are around 5000 entries like the ones above, and I'm unsure of the best way to approach this. I've tried importing a small subset using the following curl command:

curl -XPOST 'http://localhost:9200/microservices/transactions' -d @smallRandomJSON.json

I then check what was loaded and see there is a single document, even though my source file contains around 5000 records, formatted as I showed above:

curl -XGET 'http://localhost:9200/microservices/transactions/_count?pretty=true'
{
  "count" : 1,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  }
}

Why is Elasticsearch treating all these individual records as a single document? Is that just what happens when you use this method, or have I screwed up the creation of the index? Should I be using the 'bulk' command instead so they're all treated as individual documents? If I need to use bulk, then I guess I'll need to write something to reformat the raw log files appropriately, right?

Sorry for the really basic questions, but I've been experimenting for several hours and thought I should ensure I'm heading down the right track before proceeding much further!

Please format your code.

When you POST a file like that, the index API treats the entire request body as a single document, which is why your count is 1. You should use the bulk API instead, but it requires modifying your source file: add an action header line before each record and flatten each JSON doc onto a single line.
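For example, assuming you can wrap your records in one top-level JSON array first, a jq one-liner can generate the bulk file (the file names here are just placeholders):

# assumes sample.json contains a JSON array of records like the ones you posted
# emits one action line plus one single-line document per record
jq -c '.[] | {index: {_index: "microservices", _type: "transaction"}}, .' sample.json > bulk.json

If you leave _id out of the action line like this, Elasticsearch will generate IDs for you.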

Alternatively, you can split your file into one document per file and then look at the FSCrawler project to load them.
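By the way, you don't have to fall back to string for the date: Elasticsearch can parse your datestamp if you declare its format in the mapping, something like:

"datestamp" : { "type" : "date", "format" : "dd-MM-yyyy" }

That would let Kibana treat it as a real date field later on.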

Thank you for your reply. I've re-written the data based on the bulk documentation (one action line per record, each document on a single line, and a final newline at the end of the file):

{"index":{"_index":"microservices","_type":"transaction","_id":"1"}}
{"service":"ui-svc","datestamp":"28-08-2013","timestamp":"00:11:55", "level":"DEBUG", "uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"}
{"index":{"_index":"microservices","_type":"transaction","_id":"2"}}
{"service": "shipping-svc", "datestamp": "27-08-2013", "timestamp": "22:22:31", "level": "INFO", "uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"3"}}
{"service": "catalog-svc", "datestamp": "28-08-2013", "timestamp": "14:23:37", "level": "INFO", "uuid":"8467ebcd-f586-441a-9257-7caa70ba9dd8"}
{"index":{"_index":"microservices","_type":"transaction","_id":"4"}}
{"service": "checkout-svc","datestamp": "27-08-2013","timestamp": "09:43:10","level": "WARNING","uuid":"8f62e470-a058-48ed-b681-447849776363"},
{"index":{"_index":"microservices","_type":"transaction","_id":"5"}}
{"service": "ui-svc","datestamp": "28-08-2013","timestamp": "10:50:44","level": "ERROR","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"6"}}
{"service": "payment-svc","datestamp": "28-08-2013","timestamp": "18:31:21","level": "WARNING","uuid":"8f62e470-a058-48ed-b681-447849776363"},
{"index":{"_index":"microservices","_type":"transaction","_id":"7"}}
{"service": "payment-svc","datestamp": "27-08-2013","timestamp": "10:28:28","level": "ERROR","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"8"}}
{"service": "checkout-svc","datestamp": "28-08-2013","timestamp": "12:40:21","level": "WARNING","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"9"}}
{"service": "payment-svc","datestamp": "28-08-2013","timestamp": "07:59:38","level": "ERROR","uuid":"8467ebcd-f586-441a-9257-7caa70ba9dd8"},
{"index":{"_index":"microservices","_type":"transaction","_id":"10"}}
{"service": "ui-svc","datestamp": "27-08-2013","timestamp": "02:02:34","level": "ERROR","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"11"}}
{"service": "checkout-svc","datestamp": "28-08-2013","timestamp": "16:41:49","level": "DEBUG","uuid":"8467ebcd-f586-441a-9257-7caa70ba9dd8"},
{"index":{"_index":"microservices","_type":"transaction","_id":"12"}}
{"service": "shipping-svc","datestamp": "27-08-2013","timestamp": "17:37:51","level": "ERROR","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"13"}}
{"service": "payment-svc","datestamp": "28-08-2013","timestamp": "14:21:02","level": "WARNING","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"}
\n

curl -s -XPOST "http://localhost:9200/_bulk" --data-binary @bulkJSON.json
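To double-check that each record became its own document this time, I ran the count again against the 'transaction' type used in the bulk file; for the snippet above it should report 13:

curl -XGET 'http://localhost:9200/microservices/transaction/_count?pretty'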

In it goes! I'll get my client to tweak their application so it outputs this format natively as part of its logging, which should make things easy. Now it's back to Kibana :slight_smile: