Analyzing queries on the client side of Elasticsearch but not on the server


(ohw) #1

Hi folks

I just asked a question on StackOverflow. Please have a look if you have
encountered a similar problem or have some input on it.

Thanks in advance!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1f4b1575-f050-46db-853f-511bc24e6392%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Jörg Prante) #2

Please ask your question here. Thanks.

Jörg



(ohw) #3

Sure, here it is:


We are migrating our Lucene-based search codebase to Elasticsearch. The
major problem we have encountered is how to migrate our QueryParsers.

In our old solution, the QueryParsers take a human-entered query string
and transform it into a Lucene Query object, which is then fed into
Lucene's IndexSearcher. In Elasticsearch, however, we don't interact with
IndexSearcher directly. Instead, we can only build queries on the
client side using the Query DSL
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html and
send the JSON to the Elasticsearch server. The server then (possibly)
rewrites/analyzes the JSON query to build a Lucene query.

To make use of the existing, sophisticated logic in our QueryParsers, we
decided to stick to our old approach by:

  1. Explicitly telling Elasticsearch NOT to analyze any query at
    search time.
  2. Doing ALL the query-related analysis (tokenizing, synonyms, etc.) in the
    Java client.
  3. Assuming that Elasticsearch's Query DSL is roughly a one-to-one
    mapping to Lucene's Query classes.
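
As a rough illustration of steps 1 and 2, here is a minimal sketch that runs a
stand-in analysis step in the client and emits only term queries, which
Elasticsearch looks up as-is instead of analyzing them. The class name, the
method names and the "body" field are hypothetical, and the whitespace/lowercase
analysis merely stands in for the real QueryParser logic. It also assumes the
field was indexed with analysis that produces the same terms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: every name here is illustrative, not part of any library.
final class ClientSideTermQuery {

    // Stand-in for the existing QueryParser analysis chain:
    // split on whitespace and lowercase each token.
    static List<String> analyze(String input) {
        List<String> tokens = new ArrayList<>();
        for (String raw : input.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Minimal JSON string escaping (backslashes and double quotes only).
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Build a bool query whose "must" clauses are term queries, one per token.
    static String toBoolQuery(String field, List<String> tokens) {
        StringBuilder sb = new StringBuilder("{\"query\":{\"bool\":{\"must\":[");
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"term\":{\"").append(escape(field)).append("\":\"")
              .append(escape(tokens.get(i))).append("\"}}");
        }
        return sb.append("]}}}").toString();
    }

    public static void main(String[] args) {
        // "body" is an assumed field name.
        System.out.println(toBoolQuery("body", analyze("Quick  Brown fox")));
    }
}
```

The JSON is assembled by hand only to keep the sketch dependency-free; in a real
client the Java API's query builders would be the more natural choice.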

The questions are:

  1. Is this approach feasible?
  2. What are the potential problems in doing so?
  3. What is the best practice?

By the way, don't worry about the scoring process; we are writing our
scorer scripts as an Elasticsearch plugin.


Thank you!

Odin



(Jörg Prante) #4

The Query DSL is not equivalent to Lucene's Query classes, but it is close,
with enhancements.

Since you want to make use of Lucene queries and have already decided to
write a plugin for scoring, why don't you just add your query parsers to the
plugin?

Jörg



(ohw) #5

Thank you, Jörg. I didn't realize that I could plug the query parsers into
Elasticsearch. Would you please elaborate on this?



(Jörg Prante) #6

The idea is:

  • The basic entry point for how search works is
    org.elasticsearch.rest.action.search.RestSearchAction; start there if you
    want to expose an enhanced search over REST.

  • Queries are built with
    org.elasticsearch.search.builder.SearchSourceBuilder, which has a convenient
    query(queryBuilder) method for the Java API.

  • org.elasticsearch.indices.query.IndicesQueriesModule is responsible for
    managing the query parsers. It has addQuery() and addFilter() methods,
    which must be invoked at plugin initialization time.

  • So you can write a pair of My...QueryBuilder and My...QueryParser classes
    for each of your queries in your plugin.

  • Copy/paste RestSearchAction to something like My...RestSearchAction with
    a custom endpoint, for example _mysearch. You can then use your query
    implementation, wrapped in JSON, just as you would with the _search REST
    action. The new REST endpoint must be registered during plugin
    initialization.

  • For implementation details, study the existing standard query
    parser/builder implementations in org.elasticsearch.index.query.
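
To make the registration idea concrete, here is a plain-Java model of it. It
deliberately does not use the Elasticsearch classes: NamedQueryParser, Registry,
LegacyQueryParser and the legacy_query name are all hypothetical stand-ins, and
the real signatures of IndicesQueriesModule and the query parser interface
should be checked against the Elasticsearch source for your version.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Plain-Java model, NOT the Elasticsearch API: all types and names are hypothetical.
final class QueryParserRegistrySketch {

    // Stand-in for a server-side query parser: maps the body of a named
    // query to a query object (represented here as a String).
    interface NamedQueryParser {
        String name();
        String parse(String body);
    }

    // Stand-in for the module that collects parsers at plugin initialization.
    static final class Registry {
        private final Map<String, NamedQueryParser> parsers = new HashMap<>();

        void addQuery(NamedQueryParser parser) {
            parsers.put(parser.name(), parser);
        }

        // Dispatch on the query name found in the request.
        String parse(String queryName, String body) {
            NamedQueryParser parser = parsers.get(queryName);
            if (parser == null) {
                throw new IllegalArgumentException("No query registered for [" + queryName + "]");
            }
            return parser.parse(body);
        }
    }

    // Example parser that would wrap the legacy human-input parsing logic;
    // here it only lowercases the input.
    static final class LegacyQueryParser implements NamedQueryParser {
        public String name() {
            return "legacy_query";
        }

        public String parse(String body) {
            return "LuceneQuery(" + body.toLowerCase(Locale.ROOT) + ")";
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.addQuery(new LegacyQueryParser()); // done once, at "plugin init"
        System.out.println(registry.parse("legacy_query", "Hello World"));
    }
}
```

The point of the model is only the shape: parsers are registered once by name,
and each incoming query is routed to the parser that owns that name, so the
legacy parsing logic runs on the server instead of in the client.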

Jörg


