User-friendly/Google-like queries


(Adam Warski) #1

I am wondering if ElasticSearch provides any utilities or if there are
some guidelines for implementing user-friendly queries?

I suppose the query would need to be somehow translated into a form
understandable by Lucene/ElasticSearch.
This means probably:

  • escaping special characters
  • treating +word, -word appropriately
  • treating "word word" as a phrase
  • being tolerant of unclosed " and '

--
Regards,
Adam Warski


(Joaquin Cuenca Abela) #2

Hi Adam,

Maybe there is a better way to do it, but what I'm doing is sending
the user's query to Elasticsearch as-is, and if I get back a 500, I
"sanitize" the query as follows (in Python) and send it again:

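import re
# Escape each character or operator that is special in the Lucene query syntax.
q = re.sub(r'([-+!(){}\[\]^"~*?:\\]|&&|\|\|)', r'\\\1', q.strip())
# Lowercase standalone AND, OR and NOT so they are read as plain words.
q = re.sub(r'(^|(?<=\W))AND($|(?=\W))', 'and', q)
q = re.sub(r'(^|(?<=\W))OR($|(?=\W))', 'or', q)
q = re.sub(r'(^|(?<=\W))NOT($|(?=\W))', 'not', q)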

The first line escapes the "dangerous" characters with a backslash, and
the next 3 lines lowercase AND, OR and NOT wherever they appear as
standalone words, so that they are no longer treated as operators.
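
The try-first, sanitize-on-error flow described above could be sketched roughly like this. It is only an illustration under assumptions: `search` is a hypothetical callable standing in for whatever client sends the query to Elasticsearch and raises when the server answers with an error, and `sanitize` and `search_with_fallback` are names made up for this sketch.

```python
import re

# Characters and operators that are special in the Lucene query syntax.
_SPECIAL = re.compile(r'([-+!(){}\[\]^"~*?:\\]|&&|\|\|)')
# Upper-case boolean operators appearing as standalone words.
_BOOLEAN = re.compile(r'\b(AND|OR|NOT)\b')


def sanitize(q):
    """Backslash-escape special characters and lowercase AND/OR/NOT."""
    q = _SPECIAL.sub(r'\\\1', q.strip())
    return _BOOLEAN.sub(lambda m: m.group(1).lower(), q)


def search_with_fallback(search, q):
    """Try the raw query first; if it is rejected, retry sanitized.

    `search` is a hypothetical callable (an assumption of this sketch):
    it takes a query string and raises when the server reports an error.
    """
    try:
        return search(q)
    except Exception:
        # A real client would catch its own, more specific error type here.
        return search(sanitize(q))
```

With this shape, well-formed queries keep their full operator syntax, and only queries the server rejects are downgraded to a plain escaped search.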

I'm also interested in more graceful ways to deal with failures, but
so far this has been good enough for me.
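
One possibly more graceful direction is to translate the user's input up front instead of waiting for an error. The sketch below is an assumption-laden illustration, not an Elasticsearch utility: `translate` and `escape` are hypothetical names. It keeps a leading + or - on each term, keeps "quoted phrases" (closing an unclosed one), escapes every other special character, and lowercases AND/OR/NOT. Single quotes are passed through unchanged, on the assumption that they have no special meaning in the Lucene query syntax.

```python
import re

# Characters and operators that are special in the Lucene query syntax.
_SPECIAL = re.compile(r'([-+!(){}\[\]^"~*?:\\]|&&|\|\|)')

# One token: an optional +/- prefix, then either a quoted phrase (the
# closing quote is optional, so an unclosed phrase runs to the end of
# the input) or a run of non-space characters.
_TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"?|(\S+))')


def escape(text):
    """Backslash-escape special characters inside a term or phrase."""
    return _SPECIAL.sub(r'\\\1', text)


def translate(user_query):
    """Turn free-form user input into a query string that should parse."""
    parts = []
    for prefix, phrase, word in _TOKEN.findall(user_query):
        if word:
            # Lowercase bare AND/OR/NOT so they are not read as operators.
            term = word.lower() if word in ('AND', 'OR', 'NOT') else escape(word)
        elif phrase.strip():
            # Re-quote the phrase, which also closes an unclosed quote.
            term = '"%s"' % escape(phrase.strip())
        else:
            continue  # empty phrase such as "" : nothing to search for
        parts.append(prefix + term)
    return ' '.join(parts)
```

For example, `translate('+foo -bar "hello world')` yields `+foo -bar "hello world"`, with the missing closing quote added.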

Cheers,

On Mon, Jan 24, 2011 at 4:38 PM, Adam Warski adamwtw@gmail.com wrote:

I am wondering if ElasticSearch provides any utilities or if there are
some guidelines for implementing user-friendly queries?

I suppose the query would need to be somehow translated into a form
understandable by Lucene/ElasticSearch.
This means probably:

  • escaping special characters
  • treating +word, -word appropriately
  • treating "word word" as a phrase
  • being tolerant of unclosed " and '

--
Regards,
Adam Warski

--
Joaquin Cuenca Abela


(Karussell) #3

I would also be interested :)

On 24 Jan., 22:18, Joaquin Cuenca Abela joaq...@cuencaabela.com
wrote:


