Loading basic data from a file with curl for testing. Do I need to use bulk?

Hi all,

I've been creating dashboards in Kibana for my customers for quite a while now, but I've never been involved in importing the data. I've just got a new customer looking for a solution to their logging issues, and I suggested they take a look at ELK, but they have zero skills in the area and have asked if I could help them. I've got some sample data from the customer:

{
  "service": "ui-svc",
  "datestamp": "28-08-2013",
  "timestamp": "00:11:55",
  "level": "DEBUG",
  "uuid": "08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"
},
{
  "service": "shipping-svc",
  "datestamp": "27-08-2013",
  "timestamp": "22:22:31",
  "level": "INFO",
  "uuid": "08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"
},
{
  "service": "catalog-svc",
  "datestamp": "28-08-2013",
  "timestamp": "14:23:37",
  "level": "INFO",
  "uuid": "8467ebcd-f586-441a-9257-7caa70ba9dd8"
},

I've created an index with very basic mappings, setting every field to string, as Elasticsearch doesn't like the existing format of the 'datestamp' field:

curl -XPUT 'http://localhost:9200/microservices' -d '
{
  "mappings" : {
    "default" : {
      "properties" : {
        "service"   : { "type" : "string", "index" : "not_analyzed" },
        "datestamp" : { "type" : "string" },
        "level"     : { "type" : "string", "index" : "not_analyzed" },
        "timestamp" : { "type" : "string" },
        "uuid"      : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}'
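In case it helps, I also read the mapping back to check it was applied; that's just the standard _mapping endpoint:

curl -XGET 'http://localhost:9200/microservices/_mapping?pretty'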

Now I would like to import the data. There are around 5000 entries like the ones above, and I'm unsure of the best way to approach this. I've tried importing a small subset using the following curl command:

curl -XPOST 'http://localhost:9200/microservices/transactions' -d @smallRandomJSON.json

I then check what was loaded and see there is a single document, even though my source file contains around 5000 records, formatted as I showed above:

curl -XGET 'http://localhost:9200/microservices/transactions/_count?pretty=true'
{
  "count" : 1,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  }
}

Why is Elasticsearch treating all these individual records as a single document? Is that just what happens when you use this method, or have I screwed up the creation of the index? Should I be using the 'bulk' command instead so they're all treated as individual documents? If I need to use bulk, then I guess I'll need to write something to reformat the raw log files appropriately, right?

Sorry for the really basic questions, but I've been experimenting for several hours and thought I should ensure I'm heading down the right track before proceeding much further!

Please format your code.

When you POST a file like that, the index API treats the entire request body as a single document, which is why your count is 1. You should use the bulk API instead, but it requires modifying your source file: add an action header line before each record and flatten each JSON doc onto a single line.
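For example, assuming you can wrap your records in one top-level JSON array first, a jq one-liner can generate the bulk file (the file names here are just placeholders):

# assumes sample.json contains a JSON array of records like the ones you posted
# emits one action line plus one single-line document per record
jq -c '.[] | {index: {_index: "microservices", _type: "transaction"}}, .' sample.json > bulk.json

If you leave _id out of the action line like this, Elasticsearch will generate IDs for you.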

Alternatively, you can split your file into one document per file and then look at the FSCrawler project to load them.
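By the way, you don't have to fall back to string for the date: Elasticsearch can parse your datestamp if you declare its format in the mapping, something like:

"datestamp" : { "type" : "date", "format" : "dd-MM-yyyy" }

That would let Kibana treat it as a real date field later on.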

Thank you for your reply. I've re-written the data based on the bulk documentation (one action line per record, each document on a single line, and a final newline at the end of the file):

{"index":{"_index":"microservices","_type":"transaction","_id":"1"}}
{"service":"ui-svc","datestamp":"28-08-2013","timestamp":"00:11:55", "level":"DEBUG", "uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"}
{"index":{"_index":"microservices","_type":"transaction","_id":"2"}}
{"service": "shipping-svc", "datestamp": "27-08-2013", "timestamp": "22:22:31", "level": "INFO", "uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"3"}}
{"service": "catalog-svc", "datestamp": "28-08-2013", "timestamp": "14:23:37", "level": "INFO", "uuid":"8467ebcd-f586-441a-9257-7caa70ba9dd8"}
{"index":{"_index":"microservices","_type":"transaction","_id":"4"}}
{"service": "checkout-svc","datestamp": "27-08-2013","timestamp": "09:43:10","level": "WARNING","uuid":"8f62e470-a058-48ed-b681-447849776363"},
{"index":{"_index":"microservices","_type":"transaction","_id":"5"}}
{"service": "ui-svc","datestamp": "28-08-2013","timestamp": "10:50:44","level": "ERROR","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"6"}}
{"service": "payment-svc","datestamp": "28-08-2013","timestamp": "18:31:21","level": "WARNING","uuid":"8f62e470-a058-48ed-b681-447849776363"},
{"index":{"_index":"microservices","_type":"transaction","_id":"7"}}
{"service": "payment-svc","datestamp": "27-08-2013","timestamp": "10:28:28","level": "ERROR","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"8"}}
{"service": "checkout-svc","datestamp": "28-08-2013","timestamp": "12:40:21","level": "WARNING","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"9"}}
{"service": "payment-svc","datestamp": "28-08-2013","timestamp": "07:59:38","level": "ERROR","uuid":"8467ebcd-f586-441a-9257-7caa70ba9dd8"},
{"index":{"_index":"microservices","_type":"transaction","_id":"10"}}
{"service": "ui-svc","datestamp": "27-08-2013","timestamp": "02:02:34","level": "ERROR","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"11"}}
{"service": "checkout-svc","datestamp": "28-08-2013","timestamp": "16:41:49","level": "DEBUG","uuid":"8467ebcd-f586-441a-9257-7caa70ba9dd8"},
{"index":{"_index":"microservices","_type":"transaction","_id":"12"}}
{"service": "shipping-svc","datestamp": "27-08-2013","timestamp": "17:37:51","level": "ERROR","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"},
{"index":{"_index":"microservices","_type":"transaction","_id":"13"}}
{"service": "payment-svc","datestamp": "28-08-2013","timestamp": "14:21:02","level": "WARNING","uuid":"08ef39d7-dd14-4c72-8d2d-7e9074f3e2ba"}
\n

curl -s -XPOST "http://localhost:9200/_bulk" --data-binary @bulkJSON.json
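To double-check that each record became its own document this time, I ran the count again against the 'transaction' type used in the bulk file; for the snippet above it should report 13:

curl -XGET 'http://localhost:9200/microservices/transaction/_count?pretty'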

In it goes! I'll get my client to tweak their application so it outputs this format natively as part of its logging, which should make things easy. Now it's back to Kibana :slight_smile: