Hi everyone,
I work as a sysadmin and I discovered Elasticsearch about two months ago.
As I'm new to this technology, I have many questions, especially regarding
cluster architecture.
We have a 3-node cluster running ES 0.90.3. Our customer's application
uses ES (with the mapper-attachments plugin) as a search engine for a wide
variety of documents (PDF, OpenXML documents, images, etc.). Each user can
then search their own set of documents. Our customer, who is also new to
this technology, decided to design their architecture as follows:
1 user -> 1 index containing their documents. Each index is split into
6 primary shards with 1 replica each (i.e. 12 shards per index).
As I'm gathering more and more information each day, I came across this
very interesting video: Big Data, Search and Analytics
(http://www.elasticsearch.org/videos/big-data-search-and-analytics/).
After watching it, I think it's time to reconsider our cluster design.
Indeed, as the data grows, we are currently in the following situation:
- 1477 indices
- about 17000 shards o_O
- about 1700000 documents
- about 25 GB of data
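
To put these figures in perspective, here is a rough back-of-the-envelope
calculation in Python. It is only a sketch: it assumes shards are allocated
evenly across the 3 nodes and that the 25 GB counts primary data only (both
are my assumptions, not something I have measured).

```python
# Back-of-the-envelope figures for the cluster described above.
indices = 1477
primaries_per_index = 6
replicas = 1            # one replica copy of every primary shard
nodes = 3
documents = 1700000     # approximate
data_gb = 25            # approximate; ASSUMED to be primary data only

shards_per_index = primaries_per_index * (1 + replicas)
total_shards = indices * shards_per_index
total_primaries = indices * primaries_per_index

# ASSUMPTION: shards are spread evenly over the nodes.
shards_per_node = total_shards / nodes

# Averages only; real users will have very different volumes.
docs_per_primary = documents / total_primaries
mb_per_primary = data_gb * 1024 / total_primaries

print("shards per index :", shards_per_index)
print("total shards     :", total_shards)
print("shards per node  :", shards_per_node)
print("docs per primary : %.1f" % docs_per_primary)
print("MB per primary   : %.2f" % mb_per_primary)
```

With the numbers above this comes to 17724 shards in total (about 5900 per
node), with on average fewer than 200 documents and around 3 MB of data per
primary shard.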
I think that's WAY TOO MANY, especially regarding shards, given that the
number of documents is not that high. One of our nodes eventually crashed
and it took hours to fully recover. I think the current design is clearly
the main culprit. That's why I'd like to suggest a better approach to our
customer: use ES "routing" capabilities, with fewer indices and fewer
shards per index. Maybe one index with 3 or 4 shards (to match the number
of nodes), with routing based on user ID as the video suggests.
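
To make the routing idea concrete, here is a minimal Python sketch of what I
have in mind. It only builds the requests, it does not send anything. The
index name "documents", the type "doc" and the field "user_id" are
hypothetical names I made up; the "routing" URL parameter and the filtered
query are what I understand ES 0.90 to offer. The pick_shard function only
illustrates the principle (hash of the routing value modulo the number of
primary shards); crc32 is a stand-in, not the hash ES really uses.

```python
import json
import zlib
from urllib.parse import quote

INDEX = "documents"      # hypothetical single shared index
DOC_TYPE = "doc"         # hypothetical mapping type
NUM_PRIMARY_SHARDS = 3   # e.g. one primary shard per node


def pick_shard(routing, num_shards=NUM_PRIMARY_SHARDS):
    """Illustrate shard = hash(routing) % number_of_primary_shards.

    crc32 is only a stand-in; Elasticsearch uses its own internal hash.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def index_request(user_id, doc_id, body):
    """Build (method, path, json_body) to index one document for a user."""
    doc = dict(body, user_id=user_id)  # keep the owner in the document
    path = "/%s/%s/%s?routing=%s" % (
        INDEX, DOC_TYPE, quote(doc_id), quote(user_id))
    return "PUT", path, json.dumps(doc)


def search_request(user_id, text):
    """Build (method, path, json_body) to search one user's documents."""
    query = {
        "query": {
            "filtered": {
                "query": {"query_string": {"query": text}},
                # Routing narrows the search to one shard, but that shard
                # also holds other users' documents, so a filter on the
                # owner is still needed (user_id should be not_analyzed).
                "filter": {"term": {"user_id": user_id}},
            }
        }
    }
    path = "/%s/%s/_search?routing=%s" % (INDEX, DOC_TYPE, quote(user_id))
    return "POST", path, json.dumps(query)


if __name__ == "__main__":
    print(pick_shard("user42"))
    print(index_request("user42", "1", {"title": "report.pdf"}))
    print(search_request("user42", "invoice"))
```

The point is that every document of a given user ends up on the same shard,
so a search for that user hits one shard instead of all of them.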
So, I'd like to know whether this is the right approach, since our
customer's data flow seems to match the "users data flow" definition
mentioned in the video, or am I heading the wrong way (again ;) ? What
would be the "best practices" in this situation?
Thanks for your time