Why use multiple ElasticSearch indices for one web application?

(bdb) #1

Being new to ES, I would like some advice on the following, please...

When I have asked questions about using ES for web applications, people have suggested having one index for things like user profiles, another index for data, etc., and several others for logs.

With all of these on a cluster shared by several web applications, it seems like things could get messy or disorganized.

In that case, are people using one cluster per application? I am a bit confused because when I read articles about indexing logs, they seem to refer to storing the data in multiple indices, rather than types within an index.

Secondly, why not have one index per app, with types for logs, user profiles, data, etc.?

Is there some benefit to using multiple indices rather than many types within an index for a web application? If so, what kinds of naming conventions are typical?

(Mark Walkom) #2

If you were using a more traditional RDBMS, would you put all that data into the same table (irrespective of type)?

(bdb) #3

Traditionally, all in different tables, which translate to types for ES.

I believe I found the answer to this when looking up pagination: https://www.elastic.co/guide/en/elasticsearch/guide/current/pagination.html

Deep Paging in Distributed Systems

To understand why deep paging is problematic, let’s imagine that we are searching within a single index with five primary shards. When we request the first page of results (results 1 to 10), each shard produces its own top 10 results and returns them to the requesting node, which then sorts all 50 results in order to select the overall top 10.

Now imagine that we ask for page 1,000—results 10,001 to 10,010. Everything works in the same way except that each shard has to produce its top 10,010 results. The requesting node then sorts through all 50,050 results and discards 50,040 of them!

You can see that, in a distributed system, the cost of sorting results grows exponentially the deeper we page. There is a good reason that web search engines don’t return more than 1,000 results for any query.
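To make the arithmetic in the quoted passage concrete, here is a rough sketch in Python. The function name `deep_paging_cost` and its parameters are made up for illustration and are not part of any Elasticsearch API. It also assumes every shard has at least `offset + size` matching documents, which is the worst case the guide describes.

```python
def deep_paging_cost(num_shards: int, offset: int, size: int = 10) -> dict:
    """Estimate the work done to serve one page of results.

    offset: number of results skipped before the page starts
            (0 for results 1-10, 10000 for results 10,001-10,010).
    size:   number of results on the page.
    """
    # Each shard cannot know which of its hits make the global page,
    # so it must produce its own top (offset + size) results.
    per_shard = offset + size
    # The requesting node receives that many results from every shard.
    gathered = num_shards * per_shard
    # Only `size` results are returned; the rest are thrown away.
    discarded = gathered - size
    return {"per_shard": per_shard, "gathered": gathered, "discarded": discarded}


if __name__ == "__main__":
    # First page on an index with five primary shards.
    print(deep_paging_cost(num_shards=5, offset=0))
    # The deep page from the quoted example (results 10,001 to 10,010).
    print(deep_paging_cost(num_shards=5, offset=10000))
```

With five shards, the second call reproduces the figures from the guide: 10,010 results per shard, 50,050 gathered, and 50,040 discarded.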

(Mark Walkom) #4

Depends who you talk to.

I wouldn't be putting all that into the same index.

(Loren Siebert) #5

This one came up before, and both @warkolm and I chimed in here: "Types and Indices. One to one?"
