I have a project that used an old search engine, and I would like to move
it to Elasticsearch. I have been doing some reading, and I would like some
perspective on how to approach the problem.
- I have bundles (folders) of text/html/pdf/img documents. Each folder has
an average of 50-100 documents, and each document is about 100K in size.
- The number of folders and documents can increase and decrease, mostly
increasing, but only slightly.
I understand that the txt/html files will need to be turned into JSON, and
that I will somehow have to create an index and add these documents to it
for indexing. There are still some things I don't fully understand:
1- How do I know how many indices I need?
2- How do I know how many shards to allocate when creating the index?
3- How do I know how many nodes are needed, and how do I make things scale up
and down? Is there a way to idle things when no indexing is happening?
4- How do I add documents to the index for indexing? I always see examples
with JSON snippets, but in reality I have something like
folder1{doc1,doc2,..doc100}, folder2{docA...docN} ...
5- This is probably a dumb question... Is there a preferred language to use
for the indexing calls? If I were to build an app to call the REST API,
which language would I need to use, if any in particular?
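To make question 4 concrete, here is a rough sketch of how the folder layout
might be flattened into the newline-delimited JSON that the bulk endpoint
accepts (one action line followed by one document line per file). The index
name "documents", the field names folder/filename/content, and the use of the
relative path as the document id are placeholders chosen for illustration,
not anything Elasticsearch requires. Only txt/html files are handled; pdf/img
files would need a text-extraction step first.

```python
import json
from pathlib import Path


def folder_to_bulk_lines(root, index_name="documents"):
    """Yield bulk-API lines: an action line, then a source line, per file.

    index_name and the source field names are illustrative placeholders.
    """
    root = Path(root)
    for path in sorted(root.rglob("*")):
        # Skip directories and anything that is not plain text or HTML.
        if not path.is_file() or path.suffix.lower() not in {".txt", ".html"}:
            continue
        rel = path.relative_to(root)
        # Assumption: the relative path is unique, so it can serve as the id.
        action = {"index": {"_index": index_name, "_id": rel.as_posix()}}
        source = {
            "folder": rel.parts[0] if len(rel.parts) > 1 else "",
            "filename": path.name,
            "content": path.read_text(encoding="utf-8", errors="replace"),
        }
        yield json.dumps(action)
        yield json.dumps(source)


def build_bulk_body(root, index_name="documents"):
    """Join the lines into one request body; the bulk API expects a trailing newline."""
    return "\n".join(folder_to_bulk_lines(root, index_name)) + "\n"
```

The resulting string could then be sent as the body of a POST to /_bulk with
the Content-Type header set to application/x-ndjson, from any language that
can make HTTP requests.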
Thanks again for the help.