I want to enable search on my website. Features would be full-text, field-based, wildcard-based, supporting AND/OR/NOT, etc.
Using the Elasticsearch query DSL to implement these features is doable. However, I will need to parse the user's query and create the appropriate clauses for the final DSL-based query.
On the other hand, if I pass the user's query to a 'query_string' query, then I can bypass the parsing.
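To make the parsing approach concrete, here is a minimal sketch of turning user input into a bool query. The mini-syntax (whitespace-separated terms, `field:value`, a leading `-` for NOT, `*`/`?` for wildcards), the function name `build_query`, and the default field `content` are all assumptions made for illustration. A real parser would also need quoting, OR, and grouping.

```python
def build_query(user_input, default_field="content"):
    """Translate a deliberately tiny, hypothetical user syntax into a bool query body."""
    must, must_not = [], []
    for token in user_input.split():
        # A leading "-" negates the term (our own convention, not Elasticsearch's).
        negated = token.startswith("-")
        if negated:
            token = token[1:]
        if not token:
            continue
        # "field:value" targets a specific field; anything else uses the default field.
        field, sep, value = token.partition(":")
        if not sep or not field or not value:
            field, value = default_field, token
        # Terms containing "*" or "?" become wildcard queries, the rest match queries.
        kind = "wildcard" if ("*" in value or "?" in value) else "match"
        clause = {kind: {field: value}}
        (must_not if negated else must).append(clause)
    # All positive terms are ANDed together via "must".
    return {"query": {"bool": {"must": must, "must_not": must_not}}}


body = build_query("title:elastic -draft qu*")
```

The resulting dict can be sent as the search request body. Because the syntax is yours, you decide which characters are special and what happens with malformed input.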
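For comparison, a sketch of the pass-through approach: the user's text goes into the request body unchanged and Elasticsearch does the parsing. The function name and the field names `title` and `content` are hypothetical.

```python
def query_string_body(user_input, fields=("title", "content")):
    """Wrap raw user input in a query_string query without any parsing of our own."""
    return {
        "query": {
            "query_string": {
                "query": user_input,          # passed through verbatim
                "fields": list(fields),       # hypothetical field names
                "default_operator": "AND",
            }
        }
    }


body = query_string_body("title:elastic AND NOT draft")
```

Note that any syntax error in the user's input (an unbalanced quote or parenthesis, for example) is rejected by Elasticsearch and comes back to the caller as an error.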
The 'Elasticsearch: The Definitive Guide' book warns against using query_string in production. If I still go ahead with using query_string, are there any caveats I should be aware of, or anything to guard against?
I know this is a very broad question, but some insight from experienced users will be very helpful.
As is often the case, I think it depends. You have to consider your requirements and weigh the tradeoffs of the solutions you have in mind.
If you go with the query string query, it is easier to get started. But you are tied to Elasticsearch's query string syntax, and if it changes, this directly affects your users. You also cannot (easily) change or tune queries (say, if you want to customize boosting).
Depending on your requirements, it can make sense to use the query string query, for example if you develop a company-internal application with a small user base of advanced users (in the sense that they understand and want the complex query syntax). However, if you have a (potentially large) public user base or casual users, you will be better off in the long run decoupling yourself from the Elasticsearch query string syntax and going with the query DSL. That also allows you to customize the query syntax you offer your users to better fit your domain.
I've made the mistake of exposing query_string to users. It gets you into a funky local minimum of work where you keep thinking "just one more regex and they won't be able to crash it". Eventually you end up with two dozen regexes that make query_string "safe" to expose. That isn't a pleasant avenue to walk.
Can I interest you in simple query string? It may not have everything that you want but it is much less explode-y. It'll try to execute all kinds of garbage input that query_string will just barf back to the user.
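A minimal sketch of what a simple_query_string request body might look like. The function name and the field names `title` and `content` are assumptions. The `flags` value restricts which operators are enabled, and the set chosen here is only an example.

```python
def simple_query_string_body(user_input, fields=("title", "content")):
    """Wrap raw user input in a simple_query_string query with a limited operator set."""
    return {
        "query": {
            "simple_query_string": {
                "query": user_input,
                "fields": list(fields),            # hypothetical field names
                "default_operator": "AND",
                # Enable only +, |, -, "phrases" and prefix*; other operators are ignored.
                "flags": "AND|OR|NOT|PHRASE|PREFIX",
            }
        }
    }


body = simple_query_string_body('"full text" +search -draft')
```

In this syntax `+` means AND, `|` means OR, `-` negates a term, quotes mark a phrase, and a trailing `*` is a prefix match. Invalid parts of the input are dropped instead of raising an error.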