Query pre-processing before execution?


(Otis Gospodnetić) #1

Hi,

What is the best way to pre-process a query a bit before ES executes it?
(e.g. I want to shingle the query string a bit and expand/rewrite a query
before letting ES execute it)

I can create a custom Rest Action and a new API endpoint, but I'd prefer to
hide custom query pre-processing behind the standard ES query API.

Is there any way to do that?

Thanks,
Otis

Elasticsearch Performance Monitoring * Log Management * Search Analytics

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e99854af-65e1-42c0-9dab-e384c8c281e6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Jörg Prante) #2

I would rather use the analyzer/token filter machinery of Lucene for
search/index extensions; plugging this into ES is a breeze.

If you want field-specific mangling, I would use the field mapper to create
a new field type. There, you have read access to the whole (immutable)
document source and you can pre-process the field input data in the given
document context before indexing.

Jörg
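
As an illustration of the analyzer route for the shingling case from the first post, here is a minimal sketch in Python that only builds the JSON body for index settings. The "shingle" token filter type and its parameters are standard Elasticsearch analysis settings; the names query_shingles and shingle_analyzer and the helper functions are made up for this example. The shingles() helper is a plain word n-gram function for illustration, not a reimplementation of the Lucene filter.

```python
import json


def shingle_index_settings(min_size=2, max_size=3):
    """Build index settings that define an analyzer emitting word shingles.

    "query_shingles" and "shingle_analyzer" are hypothetical names chosen
    for this sketch.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "query_shingles": {
                        "type": "shingle",
                        "min_shingle_size": min_size,
                        "max_shingle_size": max_size,
                        "output_unigrams": True,
                    }
                },
                "analyzer": {
                    "shingle_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "query_shingles"],
                    }
                },
            }
        }
    }


def shingles(text, min_size=2, max_size=3):
    """Illustrative word n-grams: lowercased, split on whitespace,
    sizes min_size..max_size, no single words."""
    words = text.lower().split()
    out = []
    for start in range(len(words)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(words):
                out.append(" ".join(words[start:start + size]))
    return out


if __name__ == "__main__":
    # Body that could be sent when creating an index.
    print(json.dumps(shingle_index_settings(), indent=2))
    print(shingles("quick brown fox"))
```

With an analyzer like this assigned as the search analyzer of a field, the shingling happens inside ES and the client keeps using the standard query API.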


(Pawel-2) #3

Hi Joerg,
You are right about the analyzer. I have a slightly different case, or maybe
I missed something (and the analyzer approach can handle my case too).

I'd like to process a query and add an additional filter to every query.
To build this filter, an external service has to be queried to fetch
additional data, and that data is then used to build the proper filter. The
filter can be built over a few fields, for example field1:foo AND field2:bar
AND field3:test. Do you have any suggestions?

--
Paweł


(Jörg Prante) #4

I do not fully understand what an external filter service is, but I remember
such a question before. It does not matter where the filter terms come
from: you can set up your application and add the filter terms at the ES
language level from there. This is the most flexible and scalable approach.

It is not feasible to build a long string of field1:foo AND field2:bar AND
field3:test. You should really use the ES Java API or the DSL to build
filters, not the Lucene query language.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-terms-filter.html

A special case is where a service already provides Lucene bitsets that
should be processed in ES. I cannot quite imagine how this would work,
given the distributed nature of ES, but you can plug Lucene filters
(bitsets) into ES queries via the filter API; see
org.elasticsearch.index.query.TermsFilterParser
for an example of how ES transforms JSON filter terms into Lucene filters.

Jörg
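
To make the "build the filter with the DSL in your application" advice concrete, here is a minimal Python sketch that only constructs the request body. It assumes the ES 1.x query DSL that the linked terms filter page belongs to, where a query is combined with a filter through the "filtered" query; later ES versions dropped "filtered" in favor of a "bool" query with a "filter" clause, so adjust accordingly. The function names and the example field names are hypothetical.

```python
import json


def terms_filter(field, values):
    """One terms filter: the field must contain at least one of the values."""
    return {"terms": {field: list(values)}}


def with_filters(query, field_values):
    """Wrap `query` so that every field in `field_values` must also match.

    `field_values` maps a field name to the allowed values, e.g. data
    fetched from an external service (hypothetical in this sketch).
    """
    must = [terms_filter(f, v) for f, v in sorted(field_values.items())]
    return {
        "query": {
            "filtered": {
                "query": query,
                "filter": {"bool": {"must": must}},
            }
        }
    }


if __name__ == "__main__":
    # Equivalent of field1:foo AND field2:bar AND field3:test as a filter.
    body = with_filters(
        {"match": {"title": "search"}},
        {"field1": ["foo"], "field2": ["bar"], "field3": ["test"]},
    )
    print(json.dumps(body, indent=2))
```

Building the filter as a structure rather than as a concatenated query string avoids escaping problems and lets each field carry many values.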


(Otis Gospodnetić) #5

Hi,

On Monday, August 25, 2014 11:40:53 AM UTC+2, Jörg Prante wrote:

> I do not fully understand what an external filter service is but I
> remember such a question before. It does not matter where the filter terms
> come from, you can set up your application, and add filter terms at ES
> language level from there. This is the most flexible and scalable approach.

I think by "your application" you mean the client making the call to ES to
execute a query, right?
If yes, I agree. But that requires this client application to do all this
work. What if one wants to alter the query on the server/ES side without
the client having to do the work?

> It is not feasible to build a long string of field1:foo AND field2:bar AND
> field3:test. You should really use the ES Java API or the DSL to build
> filters, not Lucene Query language.
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-terms-filter.html

I think the Lucene query-like string was just a "pseudo-query" to avoid
typing in the JSON variant of that.

> A special case is where a service already provides Lucene bitsets and this
> should be processed in ES. I have not enough imagination how this can work
> at all, regarding the distributed nature of ES, but you can plug Lucene
> filters (bitsets) via the filter API into ES queries, see
> org.elasticsearch.index.query.TermsFilterParser
> for an example how ES transforms JSON filter terms to Lucene filters.

Thanks Jörg, will look into that!

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


(paolociccarese) #6

Hi Otis,
did you find a good way to do this? I have a similar need, and I was
wondering what the best way would be to automatically add a filter to
each query.
In my case, that filter would enforce access restrictions at the
document level according to the access model provided by Apache Accumulo.

Best,
Paolo
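
For the "add a filter to every query automatically" case, one possible shape of the rewrite step is sketched below in Python. It takes an incoming search request body and wraps whatever query it contains in a mandatory filter, leaving the other keys alone. Where this code would run (a proxy in front of ES, a plugin, a client library) is left open; the thread does not settle that. The function name, the "visibility" field and the label values are hypothetical, the "filtered" query assumes the ES 1.x DSL, and a flat list of labels is a simplification of Accumulo's visibility expressions.

```python
import copy
import json


def inject_filter(body, extra_filter):
    """Return a copy of a search body whose query is wrapped in extra_filter.

    A body without a "query" key is treated as match_all. The input is
    not modified.
    """
    rewritten = copy.deepcopy(body) if body else {}
    original_query = rewritten.get("query", {"match_all": {}})
    rewritten["query"] = {
        "filtered": {"query": original_query, "filter": extra_filter}
    }
    return rewritten


if __name__ == "__main__":
    # Hypothetical: labels the current user is allowed to see.
    access_filter = {"terms": {"visibility": ["public", "team-a"]}}
    incoming = {"query": {"match": {"body": "accumulo"}}, "size": 20}
    print(json.dumps(inject_filter(incoming, access_filter), indent=2))
```

Because the wrapper keeps the original query intact inside the "filtered" clause, clients can keep sending ordinary search bodies without knowing about the restriction.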

--
Please update your bookmarks! We have moved to https://discuss.elastic.co/


(system) #7