Question about highlight query

Hi guys,
I have a question about the highlight query in ES.
Below is my query:
{
  "_source": [
    .....
  ],
  "highlight": {
    "fields": {
      "FDS_ATTACHMENTS": { "type": "plain" },
      "FDS_ATTACHMENTS.no_stem": { "type": "plain" },
      "FDS_ATTACHMENTS.with_case": { "type": "plain" },
      "headline": { "type": "plain" },
      "headline.no_stem": { "type": "plain" },
      "headline.with_case": { "type": "plain" }
    },
    "fragment_size": 500,
    "highlight_query": {
      "bool": {
        "must": [
          {
            "bool": {
              "minimum_should_match": 1,
              "should": [
                {
                  "span_near": {
                    "clauses": [
                      { "span_term": { "FDS_ATTACHMENTS.no_stem": "rights" } },
                      { "span_term": { "FDS_ATTACHMENTS.no_stem": "agreement" } }
                    ],
                    "in_order": true,
                    "slop": 0
                  }
                }
              ]
            }
          },
          {
            "bool": {
              "minimum_should_match": 1,
              "should": [
                {
                  "span_near": {
                    "clauses": [
                      { "span_term": { "FDS_ATTACHMENTS.no_stem": "rights" } },
                      { "span_term": { "FDS_ATTACHMENTS.no_stem": "agreement" } },
                      { "span_term": { "FDS_ATTACHMENTS.no_stem": "merger" } }
                    ],
                    "in_order": false,
                    "slop": 5
                  }
                }
              ]
            }
          }
        ]
      }
    },
    "number_of_fragments": 50,
    "post_tags": [ "" ],
    "pre_tags": [ "" ],
    "require_field_match": true
  },
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "story_datetime": {
            "gte": "20141221t000000",
            "lte": "20141222t235959"
          }
        }
      },
      "query": {
        "bool": {
          "must": [
            {
              "bool": {
                "minimum_should_match": 1,
                "should": [
                  {
                    "span_near": {
                      "clauses": [
                        { "span_term": { "FDS_ATTACHMENTS.no_stem": "rights" } },
                        { "span_term": { "FDS_ATTACHMENTS.no_stem": "agreement" } }
                      ],
                      "in_order": true,
                      "slop": 0
                    }
                  },
                  {
                    "span_near": {
                      "clauses": [
                        { "span_term": { "headline.no_stem": "rights" } },
                        { "span_term": { "headline.no_stem": "agreement" } }
                      ],
                      "in_order": true,
                      "slop": 0
                    }
                  },
                  {
                    "span_near": {
                      "clauses": [
                        { "span_term": { "headline2.no_stem": "rights" } },
                        { "span_term": { "headline2.no_stem": "agreement" } }
                      ],
                      "in_order": true,
                      "slop": 0
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "minimum_should_match": 1,
                "should": [
                  {
                    "span_near": {
                      "clauses": [
                        { "span_term": { "FDS_ATTACHMENTS.no_stem": "rights" } },
                        { "span_term": { "FDS_ATTACHMENTS.no_stem": "agreement" } },
                        { "span_term": { "FDS_ATTACHMENTS.no_stem": "merger" } }
                      ],
                      "in_order": false,
                      "slop": 5
                    }
                  },
                  {
                    "span_near": {
                      "clauses": [
                        { "span_term": { "headline.no_stem": "rights" } },
                        { "span_term": { "headline.no_stem": "agreement" } },
                        { "span_term": { "headline.no_stem": "merger" } }
                      ],
                      "in_order": false,
                      "slop": 5
                    }
                  },
                  {
                    "span_near": {
                      "clauses": [
                        { "span_term": { "headline2.no_stem": "rights" } },
                        { "span_term": { "headline2.no_stem": "agreement" } },
                        { "span_term": { "headline2.no_stem": "merger" } }
                      ],
                      "in_order": false,
                      "slop": 5
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  },
  "size": 50,
  "sort": [
    {
      "_score": {
        "ignore_unmapped": true,
        "order": "desc"
      }
    },
    {
      "story_datetime": {
        "order": "desc"
      }
    }
  ]
}

And here is a response I got:

  • of the Transactions set forth in the Offering Memorandum, and
    redeeming the Notes, if applicable and (d) conducting such other activities
    as are necessary or appropriate to carry out the activities described
    above. Prior to the Merger Date, the Company shall not own, hold or
    otherwise have any interest in any material assets other than cash and cash
    equivalents and its rights and obligations under the
    Merger Agreement.
    ARTICLE 5. SUCCESSORS Section 5.01

You can see that the distance between "rights" and "Agreement" is definitely more than 0; they are not adjacent at all!
Could someone suggest how I can change the query to make sure that
"rights" and "agreement" are adjacent in every fragment?
I have set the slop to 0 in the highlight query, and I don't know why ES
does not skip this fragment, since it does not match the criteria.
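
A rough way to sanity-check a fragment like the one above is to work out the positional gap by hand. The sketch below is a simplified model of span_near matching, not Lucene's actual implementation. It assumes lowercased word tokens with no stop-word removal (the real no_stem analyzer may behave differently), and `tokenize` and `span_near_matches` are hypothetical helper names. Under those assumptions the fragment fails the slop-0 clause but satisfies the unordered slop-5 clause on rights/agreement/merger, which is also part of the highlight_query, so that second clause may be what caused the fragment to be returned.

```python
import itertools
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def span_near_matches(tokens, terms, slop, in_order):
    """Simplified span_near check over single-term clauses.

    The gap is the number of positions inside the matched window that are
    not occupied by one of the query terms.
    """
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    for combo in itertools.product(*positions):
        if len(set(combo)) != len(combo):
            continue  # each clause needs its own position
        if in_order and list(combo) != sorted(combo):
            continue  # clauses must appear in the order given
        gap = (max(combo) - min(combo) + 1) - len(combo)
        if gap <= slop:
            return True
    return False


fragment = "equivalents and its rights and obligations under the Merger Agreement."
tokens = tokenize(fragment)

# First highlight_query clause: adjacent and in order.
adjacent = span_near_matches(tokens, ["rights", "agreement"], slop=0, in_order=True)
# Second highlight_query clause: any order, up to 5 positions of slack.
loose = span_near_matches(tokens, ["rights", "agreement", "merger"], slop=5, in_order=False)

print(adjacent, loose)
```

If the goal is to highlight only the adjacent phrase, one option to try is a highlight_query that contains only the slop-0 clause.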

Thank you very much!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/691af78c-5f9a-46f3-a54a-895421c1e28e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

No one knows anything about this? I would really appreciate any help you can offer.

On Monday, December 22, 2014 5:27:57 PM UTC-5, Yang Liu wrote:

[quoted message trimmed]

Highlighting isn't a nice, pretty thing - it's kind of hacky. There are
three highlighters built in to Elasticsearch
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html)
and they all work differently. You should try all of them
and see if they do what you want. They all come at the problem from a
different perspective and have their own idiosyncrasies. I maintain a
highlighter plugin (https://github.com/wikimedia/search-highlighter) as
well that you can use as a fourth option. It merges many of the
implementation strategies that the other ones use and attempts to give you
more options, so it might do what you need.
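
Trying each highlighter mostly comes down to changing the `type` of the highlighted fields. The sketch below assumes the Elasticsearch 1.x type names `plain`, `postings` and `fvh`, plus `experimental` for the plugin; `with_highlighter_type` is a hypothetical helper, and the field list is cut down from the original request. Note that `postings` and `fvh` generally need offsets or term vectors enabled in the mapping, and `experimental` only works with the plugin installed.

```python
import copy

# Built-in highlighter types in Elasticsearch 1.x, plus the plugin's type name.
HIGHLIGHTER_TYPES = ["plain", "postings", "fvh", "experimental"]


def with_highlighter_type(highlight, highlighter_type):
    """Return a copy of a highlight section with every field set to one type."""
    updated = copy.deepcopy(highlight)
    for field_options in updated["fields"].values():
        field_options["type"] = highlighter_type
    return updated


highlight = {
    "fields": {
        "FDS_ATTACHMENTS.no_stem": {"type": "plain"},
        "headline.no_stem": {"type": "plain"},
    },
    "fragment_size": 500,
    "number_of_fragments": 50,
}

# One highlight section per highlighter, ready to drop into otherwise
# identical search requests so the returned fragments can be compared.
variants = {name: with_highlighter_type(highlight, name) for name in HIGHLIGHTER_TYPES}
```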

Nik

On Tue, Dec 23, 2014 at 12:44 PM, Yang Liu yl916@nyu.edu wrote:

[quoted message trimmed]

A bit off-topic, but what I'd really like to see is the ability to perform
highlighting asynchronously, that is - first get the search results from
Elasticsearch, process them, and get the highlighted snippets in a second
wave, asynchronously.

The main problem with highlighting currently is that it is slow - because
of hackish recursive algorithms and mandatory I/O access. I'd like to avoid
doing 2-step searches (one search for the results, the other to
artificially propagate the highlights to the UI in a "second wave"). I
wonder if we can come up with a way to have ES propagate them
asynchronously for us?

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Wed, Dec 31, 2014 at 5:38 PM, Nikolas Everett nik9000@gmail.com wrote:

[quoted message trimmed]

I imagine you could repeat the search, replacing the query with an ID
search and providing the original query in the highlight query. I don't
know how efficient it'd be.
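
A minimal sketch of that second request, assuming the document IDs from the first search are already in hand. `build_highlight_request` and the `doc-1`/`doc-2` IDs are hypothetical; the `ids` query and `highlight_query` are standard request features, but whether this is fast enough is untested here.

```python
import json


def build_highlight_request(doc_ids, original_query, fields):
    """Build a second-wave request that only highlights already-found documents."""
    ids = list(doc_ids)
    return {
        # Match exactly the documents returned by the first search.
        "query": {"ids": {"values": ids}},
        # The caller already has the documents, so skip returning _source.
        "_source": False,
        "size": len(ids),
        "highlight": {
            "fields": {name: {"type": "plain"} for name in fields},
            # The original query drives highlighting, not document matching.
            "highlight_query": original_query,
        },
    }


original_query = {
    "span_near": {
        "clauses": [
            {"span_term": {"FDS_ATTACHMENTS.no_stem": "rights"}},
            {"span_term": {"FDS_ATTACHMENTS.no_stem": "agreement"}},
        ],
        "in_order": True,
        "slop": 0,
    }
}

request_body = build_highlight_request(
    ["doc-1", "doc-2"],  # hypothetical IDs taken from the first search
    original_query,
    ["FDS_ATTACHMENTS.no_stem"],
)
print(json.dumps(request_body, indent=2))
```

The first search would then be sent without a highlight section at all, and this body would be sent afterwards for the hits that actually need snippets.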

You could give the experimental highlighter a try for speed. It's generally
the fastest option I've tried, and I don't believe it has any funky
recursive algorithms. I think phrase detection isn't too quick, but at
least it's optional and less prone to crazy than the fvh. It also
minimizes I/O by giving you the option of doing hit detection using any of
the three methods used by the other highlighters, so you can reanalyze
short fields, use postings for some, and term vectors for others. You can
choose which one to use on the fly if you like, too.
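
As a rough illustration of picking the hit detection method per field, the sketch below builds a highlight section for the plugin. It assumes the plugin is installed, that its highlighter type is `experimental`, and that it accepts an `options.hit_source` setting with the values `postings`, `vectors` and `analyze`; those names should be checked against the plugin's README for the version in use. `experimental_fields` and the field-to-source assignments are hypothetical.

```python
# Assumed valid values for the plugin's "hit_source" option.
HIT_SOURCES = {"postings", "vectors", "analyze"}


def experimental_fields(hit_source_by_field):
    """Map each field to an experimental-highlighter config with its hit source."""
    fields = {}
    for field, source in hit_source_by_field.items():
        if source not in HIT_SOURCES:
            raise ValueError("unknown hit_source: %r" % source)
        fields[field] = {
            "type": "experimental",
            "options": {"hit_source": source},
        }
    return fields


highlight = {
    "fields": experimental_fields({
        "headline.no_stem": "analyze",          # short field: reanalyze it
        "FDS_ATTACHMENTS.no_stem": "postings",  # assumes offsets are in the postings
        "FDS_ATTACHMENTS": "vectors",           # assumes term vectors are stored
    }),
    "fragment_size": 500,
}
```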

All in all, I bet there are use cases where it'd make more sense to delay
highlighting - like fetching a chunk of results and only doing the
highlighting when the user scrolls to make the result visible. Or
something.

Nik
On Jan 1, 2015 6:03 PM, "Itamar Syn-Hershko" itamar@code972.com wrote:

[quoted message trimmed]