How to bulk index documents all at once?

Daniel_Guo · November 27, 2013, 1:10pm

I have some documents. They all belong to the same index and type and have
the same fields, for example:
{"nation" : "China", "city" : "Tianjin", "year" : ["2011", "2012", "2013"]}
{"nation" : "USA", "city" : "California", "year" : ["2012", "2014", "2015"]}
{"nation" : "China", "city" : "Beijing", "year" : ["2012", "2014", "2015"]}

To index them into my ES server all at once, I use the bulk
interface like this:
curl -s -XPOST localhost:9200/_bulk --data-binary @data_file

The data_file looks like:
{"index" : {"_index" : "country", "_type" : "city"}}
{"nation" : "China", "city" : "Tianjin", "year" : ["2011", "2012", "2013"]}
{"nation" : "USA", "city" : "California", "year" : ["2012", "2014", "2015"]}
{"nation" : "China", "city" : "Beijing", "year" : ["2012", "2014", "2015"]}

But it only indexes the first document. If I change the data_file to the
following:
{"index" : {"_index" : "country", "_type" : "city"}}
{"nation" : "China", "city" : "Tianjin", "year" : ["2011", "2012", "2013"]}
{"index" : {"_index" : "country", "_type" : "city"}}
{"nation" : "USA", "city" : "California", "year" : ["2012", "2014", "2015"]}
{"index" : {"_index" : "country", "_type" : "city"}}
{"nation" : "China", "city" : "Beijing", "year" : ["2012", "2014", "2015"]}

it works, but the data_file becomes much bigger.

Is there a better way to import JSON documents into ES? Thanks very much.
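
On why the first data_file indexes only one document: the bulk format alternates an action line with a source line, so the third line of that file sits where an action is expected. The sketch below checks a data_file for lines in that position before it is posted. It is only an illustration of the documented line layout, not of the server's internal parsing, and the name find_bad_action_lines is hypothetical.

```python
import json

# Bulk actions that are followed by a source line, and those that are not.
ACTIONS_WITH_SOURCE = {"index", "create", "update"}
ACTIONS_WITHOUT_SOURCE = {"delete"}


def find_bad_action_lines(lines):
    """Return 1-based numbers of lines that sit in an action position
    but are not a single-key bulk action object."""
    bad = []
    expect_source = False
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines for this check
        if expect_source:
            expect_source = False  # this line is the document itself
            continue
        obj = json.loads(line)
        keys = set(obj) if isinstance(obj, dict) else set()
        if len(keys) == 1 and keys <= ACTIONS_WITH_SOURCE:
            expect_source = True
        elif len(keys) == 1 and keys <= ACTIONS_WITHOUT_SOURCE:
            pass  # delete has no source line
        else:
            bad.append(number)
    return bad
```

Run against the four-line data_file above, this flags lines 3 and 4. The six-line version, with one action line per document, passes cleanly.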

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/adf0a1f7-bfab-4065-9ce6-c49da1286500%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

dadoonet · November 27, 2013, 2:29pm

I guess that the smallest data_file you can have is:
{"index" : { }}
{"nation" : "China", "city" : "Tianjin", "year" : ["2011", "2012", "2013"]}
{"index" : { }}
{"nation" : "USA", "city" : "California", "year" : ["2012", "2014", "2015"]}
{"index" : { }}
{"nation" : "China", "city" : "Beijing", "year" : ["2012", "2014", "2015"]}

curl -s -XPOST localhost:9200/country/city/_bulk --data-binary @data_file
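
If the documents already exist as plain JSON, one per line, the action lines do not have to be written by hand. The following is a minimal sketch that inserts an empty index action before each document, assuming the index and type come from the URL as in the curl command above. The function name to_bulk_body is hypothetical.

```python
import json


def to_bulk_body(doc_lines):
    """Build a bulk request body from an iterable of JSON document lines.

    Each non-blank line is parsed and re-serialized so that every document
    occupies exactly one line, preceded by an empty index action.
    """
    out = []
    for line in doc_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between documents
        doc = json.loads(line)  # raises ValueError on malformed JSON
        out.append('{"index":{}}')
        out.append(json.dumps(doc, separators=(",", ":")))
    # Every line, including the last one, ends with a newline.
    return "".join(entry + "\n" for entry in out)
```

The returned string can be written to data_file and posted with the same curl command. Compact separators also trim a few bytes per document, which helps a little with the file size concern.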

HTH

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Daniel_Guo · November 28, 2013, 1:46am

David, that is a great improvement. Thank you!
Are there other ways to load data into the ES server?


dadoonet · November 28, 2013, 5:55am

I think Bulk is the best practice.
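
For a large set of documents, a single huge request can be split into several smaller bulk requests. The sketch below is one way to do that, assuming one JSON document per input line. The name bulk_batches and the default batch size of 1000 are arbitrary choices for illustration, not recommendations from this thread.

```python
import json
from itertools import islice


def bulk_batches(doc_lines, batch_size=1000):
    """Yield bulk request bodies holding at most batch_size documents each."""
    stripped = (line.strip() for line in doc_lines)
    docs = (line for line in stripped if line)  # drop blank lines
    while True:
        batch = list(islice(docs, batch_size))
        if not batch:
            return
        # One empty index action per document; every line ends with a newline.
        yield "".join(
            '{"index":{}}\n' + json.dumps(json.loads(doc)) + "\n"
            for doc in batch
        )
```

Each yielded body can be posted to localhost:9200/country/city/_bulk in the same way as data_file. Newer Elasticsearch versions may expect a Content-Type header and a URL without a type, so check the documentation for the version in use.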

--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

