Several questions on ES in production environment

Han_JU · December 24, 2013, 1:47pm

Hi,

We're approaching the first release of our product and we use Elasticsearch
as a key component in our system. There are still some open questions and
doubts, so I'd like to hear from the more experienced users and
Elasticsearch folks here.

We use Elasticsearch as a search tool but also as the storage for all
documents. This means the front-end retrieves fields from ES just as if
it were a database. We've already disabled indexing (index: no) on the fields
that don't need to be searched (lists of ids etc.), but is this a good use
of Elasticsearch, given that we expect to have ~1 billion documents
(~1.4 KB each) in a single index within our first 3 months?

We will use Thrift to push documents in production because we've seen a
performance gain. Is there any downside to using Thrift over plain JSON?

Some of our queries use the regexp filter. As I understand it, this needs
to load the target field of every document to see if it matches, so is it
pretty costly for an index of 1 billion docs?

We're also benchmarking our ES setup, but your advice and experience would
be much appreciated! Thanks in advance!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/563d3c8f-cb3c-4c8c-a13d-2f8fcf6c42ce%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

otisg · December 25, 2013, 3:44am

Hi,

On Tuesday, December 24, 2013 8:47:54 AM UTC-5, Han JU wrote:

> Hi,
>
> We're approaching the first release of our product and we use
> Elasticsearch as a key component in our system. There are still some
> open questions and doubts, so I'd like to hear from the more experienced
> users and Elasticsearch folks here.
>
> We use Elasticsearch as a search tool but also as the storage for all
> documents. This means the front-end retrieves fields from ES just as if
> it were a database. We've already disabled indexing (index: no) on the
> fields that don't need to be searched (lists of ids etc.), but is this a
> good use of Elasticsearch, given that we expect to have ~1 billion
> documents (~1.4 KB each) in a single index within our first 3 months?

1.4 KB is pretty small, so that's fine. Often keeping it all in ES is
simpler: it doesn't require another hop to another server (e.g. a DB) to
retrieve display data, and there is one fewer moving piece, which makes
everything simpler. I'd keep your display data in ES and worry about
changing it later only if you have issues.
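
To make the setup under discussion concrete, here is a minimal sketch of a mapping that keeps some fields searchable and marks others as `index: no`, plus a back-of-the-envelope size estimate from the numbers in the question. The type name `document` and the field names are hypothetical, the `"index": "no"` spelling is the older string-field syntax the thread refers to (newer Elasticsearch versions spell it `"index": false`), and the estimate covers raw document bytes only, ignoring replicas and index overhead, which the thread does not state.

```python
import json


def build_mapping(doc_type, searchable_fields, stored_only_fields):
    """Build a mapping body in which stored-only fields are not indexed.

    Fields with "index": "no" cannot be searched, but they are still
    returned to the front-end as part of the document's _source.
    """
    properties = {}
    for name in searchable_fields:
        properties[name] = {"type": "string"}
    for name in stored_only_fields:
        properties[name] = {"type": "string", "index": "no"}
    return {doc_type: {"properties": properties}}


# Hypothetical type and field names, for illustration only.
mapping = build_mapping("document", ["title", "body"], ["related_ids"])
print(json.dumps(mapping, indent=2))

# Rough raw data volume: ~1 billion documents at ~1.4 KB each.
doc_count = 1_000_000_000
bytes_per_doc = 1400
raw_bytes = doc_count * bytes_per_doc
print("raw size: %.1f TB" % (raw_bytes / 1e12))
```

So the raw documents alone come to roughly 1.4 TB; actual disk usage depends on replicas, compression, and how many fields are indexed.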

> We will use Thrift to push documents in production because we've seen a
> performance gain. Is there any downside to using Thrift over plain JSON?
>
> Some of our queries use the regexp filter. As I understand it, this needs
> to load the target field of every document to see if it matches, so is it
> pretty costly for an index of 1 billion docs?

Yes, regexps are not the fastest. What are you trying to do that requires
the regexp filter?
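
On the cost question: in Lucene, which Elasticsearch builds on, a regexp is compiled to an automaton and matched against the field's term dictionary rather than against every document's stored value, so the cost tends to follow the number of unique terms and how much of the pattern is a fixed prefix. The toy model below only illustrates that idea; it is not Lucene's implementation, and the function name, the explicit `prefix` argument, and the sample terms are assumptions made for the example.

```python
import bisect
import re


def match_terms(sorted_terms, pattern, prefix=""):
    """Match a regex against a sorted list of unique terms.

    If the caller knows the pattern starts with a fixed prefix, only the
    slice of terms sharing that prefix is examined. Returns the matching
    terms and the number of terms that had to be examined.
    """
    regex = re.compile(pattern)
    if prefix:
        lo = bisect.bisect_left(sorted_terms, prefix)
        # Terms starting with the prefix sort before prefix + max code point.
        hi = bisect.bisect_left(sorted_terms, prefix + "\U0010ffff")
    else:
        lo, hi = 0, len(sorted_terms)
    candidates = sorted_terms[lo:hi]
    matches = [term for term in candidates if regex.fullmatch(term)]
    return matches, len(candidates)


# A tiny stand-in for a field's term dictionary.
terms = sorted(["error", "errno", "info", "inform", "warn", "warning"])

anchored, anchored_examined = match_terms(terms, "err.*", prefix="err")
unanchored, unanchored_examined = match_terms(terms, ".*ing")
print(anchored, anchored_examined)
print(unanchored, unanchored_examined)
```

In this model a pattern with a fixed prefix touches only a small slice of the terms, while a pattern starting with a wildcard has to walk all of them, which is the case to watch for on a large index.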

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

Han_JU · December 26, 2013, 9:40am

Thanks Otis.

The reason we use the regexp filter is also simplicity. We have a
layer that translates queries from our front-end format into Elasticsearch
query DSL. We definitely need to use something like phrase_query and a
custom analyser instead.
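
As a rough sketch of that direction, the translation layer could emit a `match_phrase` query against a field analysed with a custom analyser instead of a regexp filter. The analyser name `frontend_text`, the field name `title`, and the choice of tokenizer and token filter are assumptions for illustration; the thread does not say what the real analysis chain should be.

```python
import json


def index_settings(analyzer_name):
    """Index settings that define a simple custom analyser."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    }
                }
            }
        }
    }


def phrase_query(field, text, analyzer_name):
    """Query DSL body that matches the words of `text` as a phrase."""
    return {
        "query": {
            "match_phrase": {
                field: {"query": text, "analyzer": analyzer_name}
            }
        }
    }


# Hypothetical names, for illustration only.
settings = index_settings("frontend_text")
query = phrase_query("title", "production environment", "frontend_text")
print(json.dumps(settings, indent=2))
print(json.dumps(query, indent=2))
```

The same analyser would normally also be set on the field in the mapping, so that index-time and query-time tokens line up.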
