I have stumbled across a requirement to let users search with full boolean operators. The users should be allowed to specify which fields to search in, but at the same time some fields should not be available.
Is there any way to secure the query_string with a set of legal field names?
And in general, is there any way to make sure the query string isn't "hacked" to override field names? I would like to avoid writing my own query parser.
Elastic has a product that has support for field level security, essentially making it look like fields you don't have access to don't exist. So that is an option.
query_string is fairly problematic: when you give it invalid syntax, it can fail with errors that require you to read the stack trace to figure out what is wrong with the syntax. It is also very easy for a user to run expensive queries such as wildcard and fuzzy through query_string. So it generally isn't something you should expose to users.
Simple query string is better if it is expressive enough for you. But I expect it isn't because you can't change the fields, iirc.
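For what it's worth, here is a rough sketch of what the simple_query_string route could look like if it turns out to be expressive enough. The application, not the user, picks the fields, and the flags parameter leaves out PREFIX, FUZZY, SLOP and NEAR so the expensive operators are not parsed. The function name and the field names are made up for illustration; check the flags against the docs for your version.

```python
def simple_query(text, fields):
    """Build a simple_query_string body with a fixed field list.

    `fields` comes from the application, never from user input.
    """
    return {
        "simple_query_string": {
            "query": text,
            "fields": list(fields),
            # Only boolean operators, phrases and grouping are enabled;
            # PREFIX, FUZZY, SLOP and NEAR are deliberately left out.
            "flags": "AND|OR|NOT|PHRASE|PRECEDENCE|WHITESPACE",
            "default_operator": "and",
        }
    }


# Hypothetical field names, for illustration only.
body = {"query": simple_query('"quick fox" | lazy', ["title", "body"])}
```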
The real situation is a little more complicated, since there are two sets of queries, each with its own set of allowed fields: one for a recent date range, the other for historic data.
I suspect Shield wouldn't help in that regard.
Guess we have to roll up our sleeves....
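For anyone landing here later, this is roughly the kind of pre-check I have in mind. It is only a sketch, not a real parser: it scans the raw query text for anything that looks like a field prefix (a name followed by an unescaped colon outside a quoted phrase) and rejects the query if a name is not on the whitelist. It errs on the side of rejecting, so values with unescaped colons and wildcard field patterns are refused too. The field names, the two whitelists and the helper names are all invented for the example; the allowed list is also passed as fields so that terms without a prefix only hit allowed fields.

```python
import re

# Hypothetical whitelists for the two query families (recent vs. historic).
RECENT_FIELDS = {"title", "body", "author"}
HISTORIC_FIELDS = {"title", "year"}

# A double-quoted phrase, honouring backslash escapes inside it.
_QUOTED = re.compile(r'"(?:\\.|[^"\\])*"')
# A run of name characters or backslash-escaped characters, then a colon.
_FIELD = re.compile(r'((?:\\.|[\w.*-])+):')


def referenced_fields(query):
    """Return the set of names that look like field prefixes in `query`."""
    # Blank out quoted phrases so colons inside them are ignored.
    unquoted = _QUOTED.sub('""', query)
    fields = set()
    for raw in _FIELD.findall(unquoted):
        name = re.sub(r'\\(.)', r'\1', raw)  # drop escape backslashes
        fields.add(name.lstrip('-'))         # "-field:x" negates "field"
    return fields


def build_query(query, allowed):
    """Reject queries naming non-whitelisted fields, else build the body."""
    illegal = referenced_fields(query) - set(allowed)
    if illegal:
        raise ValueError("fields not allowed: " + ", ".join(sorted(illegal)))
    return {
        "query_string": {
            "query": query,
            # Unprefixed terms are searched only in the allowed fields.
            "fields": sorted(allowed),
            "allow_leading_wildcard": False,
        }
    }
```

The same function serves both cases by passing RECENT_FIELDS or HISTORIC_FIELDS, depending on which date range the request targets.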