I currently implement all my production application client queries directly
in Java, and use a BoolQueryBuilder to wrap all of my indexed field
queries. So far I use a filter only for geo distance queries. The
toString method creates a very nice pretty-printed JSON form of the search
that the Java API can accept for testing and demonstration purposes.
http://jontai.me/blog/2012/10/using-elasticsearch-to-speed-up-filtering/ is
an interesting article. I'm not using MongoDB; ElasticSearch is my one and
only DB, so keeping _source enabled is necessary. I've already disabled the
_all field and seen the same greatly improved build results the author
reports.
But the migration from queries to filters is what caught my attention. I
had already been looking at this, and have some questions:
Instead of using the static QueryBuilders.boolQuery method to create a
BoolQueryBuilder, I was considering using a FilterBuilders.boolFilter
method to create a BoolFilterBuilder. It seems to have the andFilter,
notFilter, and orFilter counterparts to the BoolQueryBuilder's must,
mustNot, and should methods. Is the only difference between queries and
filters really just scoring?
Do I really need to create a QueryBuilders.matchAllQuery query builder and
then add filters to it?
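For concreteness, the shape in question is a filtered query whose query part is match_all and whose filter part is a bool filter. Below is a minimal sketch that only assembles that JSON body as a string, without any Elasticsearch classes. The class and helper names and the fields "status" and "tag" are made up for illustration, and values are assumed to need no JSON escaping:

```java
import java.util.Arrays;
import java.util.List;

// Builds the JSON body of a "filtered" search: match_all as the query,
// with a bool filter carrying the must / must_not clauses.
// Plain string building only; no Elasticsearch classes are used.
class FilteredSearchSketch {

    // One term filter clause, e.g. {"term":{"status":"active"}}.
    // Assumption: field and value contain nothing that needs JSON escaping.
    static String termFilter(String field, String value) {
        return "{\"term\":{\"" + field + "\":\"" + value + "\"}}";
    }

    // Joins already-serialized clauses into a JSON array.
    static String array(List<String> clauses) {
        return "[" + String.join(",", clauses) + "]";
    }

    // Wraps the clauses in query -> filtered -> (match_all query, bool filter).
    static String filteredMatchAll(List<String> must, List<String> mustNot) {
        return "{\"query\":{\"filtered\":{"
             + "\"query\":{\"match_all\":{}},"
             + "\"filter\":{\"bool\":{"
             + "\"must\":" + array(must) + ","
             + "\"must_not\":" + array(mustNot)
             + "}}}}}";
    }

    public static void main(String[] args) {
        String body = filteredMatchAll(
            Arrays.asList(termFilter("status", "active")),
            Arrays.asList(termFilter("tag", "draft")));
        System.out.println(body);
    }
}
```

This is only the request shape as I understand it, not a claim about what the builders emit byte for byte.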
Of course, there doesn't seem to be a counterpart for phrase matching in
the filter query world. So when I detect a blank inside a term string, I
create a phrase match query as follows:
MatchQueryBuilder mqb = matchPhraseQuery(field, qterm.getValue());
mqb.slop(qterm.getSlop());
return mqb;
But by default, I use the FieldQueryBuilder, since it automatically
recognizes strings such as A+B as a phrase, and it also recognizes certain
Chinese characters as individual words of a phrase. Very nice, and fully
compatible with values of one term or a phrase.
FieldQueryBuilder fqb = fieldQuery(field, qterm.getValue());
fqb.defaultOperator(FieldQueryBuilder.Operator.AND);
fqb.autoGeneratePhraseQueries(true);
fqb.enablePositionIncrements(true);
fqb.phraseSlop(qterm.getSlop());
return fqb;
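Put together, the choice between the two builders comes down to one check on the term text. Here is a minimal sketch of just that decision. The QueryKind enum is a hypothetical stand-in for the two builders, and trimming surrounding whitespace before the check is an assumption for the sketch, not something my real code necessarily does:

```java
// Chooses between the two query styles based on the term text alone.
class QueryKindSketch {

    // Hypothetical stand-ins for the phrase MatchQueryBuilder and the FieldQueryBuilder.
    enum QueryKind { PHRASE_MATCH, FIELD }

    // A blank inside the (trimmed) value selects the phrase match query;
    // everything else falls through to the default field query.
    static QueryKind kindFor(String value) {
        return value.trim().indexOf(' ') >= 0 ? QueryKind.PHRASE_MATCH : QueryKind.FIELD;
    }

    public static void main(String[] args) {
        System.out.println(kindFor("quick brown fox")); // prints PHRASE_MATCH
        System.out.println(kindFor("A+B"));             // prints FIELD
    }
}
```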
Is there some requirement or benefit to constructing a search using a
top-level QueryBuilders.matchAllQuery and adding the complex tree of filter
builders to it? Or can I bypass the query builders? Or does phrase matching
make it impossible to generically throw either a single term or a phrase
into the query (as I can easily do with query builders)?
Caching isn't all that interesting here: complex ad-hoc queries vary widely
and are rarely the same from call to call or across clients. So once the
search engine is warmed up, the non-cached steady-state response times are
the most interesting. (For example, a cached query can return in a few
milliseconds, but the first run of that same query took 35 seconds and
returned only 2 matches across 78M documents.)
Or do I really need to wait until I throw enough machines at this to wring
out the best performance from ad-hoc complex searches?
In the meantime, my application's most commonly used query is get-by-ID
(index.type.id) and that performs brilliantly fast when not cached, even
for databases that approach 100M documents. So I have some time to research
and experiment.