Logstash uses the native API when you choose the elasticsearch output: http://logstash.net/docs/1.3.3/outputs/elasticsearch
As for separating the machines, I would say that you should test it. If your nodes are not heavily loaded (CPU / IO), you can probably use the same machine for extracting content and producing the JSON docs.
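One rough way to "test it" is to measure how much CPU time each stage takes on a sample batch before deciding on separate machines. The sketch below is only an illustration: `extract_to_json` and `index_document` are hypothetical stand-ins for your real extraction and indexing steps, and it measures only the local Python process, so the load on the Elasticsearch node itself has to be observed separately.

```python
import json
import time


def extract_to_json(raw_text):
    """Hypothetical stand-in for the CPU-heavy html/pdf extraction step."""
    words = raw_text.split()
    return json.dumps({"content": raw_text, "word_count": len(words)})


def index_document(doc_json, sink):
    """Hypothetical stand-in for the indexing call; here it only collects docs."""
    sink.append(doc_json)


def profile_pipeline(raw_documents):
    """Return the CPU seconds spent in each stage, plus the produced docs."""
    timings = {"extract": 0.0, "index": 0.0}
    sink = []
    for raw in raw_documents:
        # CPU time used by this process while extracting one document.
        start = time.process_time()
        doc = extract_to_json(raw)
        timings["extract"] += time.process_time() - start

        # CPU time used by this process while handing the document over.
        start = time.process_time()
        index_document(doc, sink)
        timings["index"] += time.process_time() - start
    return timings, sink


if __name__ == "__main__":
    stage_timings, docs = profile_pipeline(["some sample text", "another document"])
    print(stage_timings, len(docs))
```

If the extraction share dominates and the machine still has idle CPU, keeping both steps on one box may be fine; if both stages saturate the CPU, that is an argument for splitting them.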
HTH
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 28 January 2014 at 20:14:07, ZenMaster80 (sabdalla80@gmail.com) wrote:
Thanks David, I will certainly look into logstash. Do you think it is a good idea to separate data analysis and indexing onto 2 different machines, since both require a lot of CPU time?
If I use logstash to send files over to ES, will I be able to use the native Java API or HTTP, and is there any preference between the two? I have noticed there are some things that aren't very easy, and maybe don't even work, in the native API.
Thanks again
On Tuesday, January 28, 2014 1:05:32 PM UTC-5, David Pilato wrote:
Did you try https://github.com/dadoonet/fsriver?
I have never tested it with so many docs, but maybe it could help you here.
If you have already generated JSON files on a server, then I would recommend trying logstash to send them into elasticsearch.
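To give an idea of what shipping already generated JSON files involves, here is a minimal sketch that builds a request body for Elasticsearch's HTTP `_bulk` endpoint from a directory of `*.json` files. It is not what logstash does internally, just a hand-rolled illustration. The function name, the `doc_type` default and the choice of the file name as `_id` are assumptions made for the example; `_type` matches Elasticsearch releases from the time of this thread, and newer releases drop it, so pass `doc_type=None` there.

```python
import json
from pathlib import Path


def build_bulk_body(json_dir, index_name, doc_type="doc"):
    """Build a newline-delimited _bulk body from the *.json files in json_dir."""
    lines = []
    for path in sorted(Path(json_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            source = json.load(handle)

        # Action line: tells the bulk API where the next document goes.
        meta = {"_index": index_name}
        if doc_type is not None:
            meta["_type"] = doc_type
        meta["_id"] = path.stem  # assumption: file name (without .json) is the id

        lines.append(json.dumps({"index": meta}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(source))

    # The bulk format requires every line, including the last, to end with "\n".
    return "\n".join(lines) + "\n" if lines else ""
```

The resulting string would be sent with an HTTP POST to the `_bulk` endpoint of the indexing server (port 9200 by default), in batches rather than as one request for all 15 million documents.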
My 2 cents
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 28 January 2014 at 16:46:06, ZenMaster80 (sabda...@gmail.com) wrote:
I would like to get your perspective on how to load JSON into the index server in my scenario.
We have about 15 million documents in html/pdf/... on server 1.
I would like to process the data and convert it to JSON on server 2.
I would like the indexer to index the JSON on a separate machine, server 3.
Ideally, I thought that on server 2, as I prepare the JSON and have it ready in memory, I could feed it to the indexer. But since data processing is CPU intensive, I want indexing to be done on a separate machine.
How do you guys deal with this, since I can no longer feed in-memory JSON to the indexer on a separate machine? Do I just grab the files from server 2 and index them then?
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f536d58c-89ab-4609-b5ca-cef44e2b879a%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.