Sourceless Highlighting

gelmajjouti · February 18, 2013, 1:10pm

Hello,

I'm trying to make search highlighting work without storing sources in
Elasticsearch.

In order to do this, I intend to get the positions of search hits from Lucene's
*TermFreqVector* class. I gather from an older thread that I need to write a plugin for
Elasticsearch.

I have two questions:

Is it feasible? Is there a way to extend that part of Elasticsearch?
If so, where should I hook my plugin?

Cheers,
Greg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

mvg · February 18, 2013, 3:45pm

Before you start writing a plugin or forking ES: you can store just a
specific field instead of the whole source and
highlight on that. Is that sufficient in your case?

Martijn

--
Kind regards,

Martijn van Groningen
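
A minimal sketch of the setup suggested above: disable `_source` and store only the field you want highlighted. The syntax follows the mapping format of the releases current at the time of this thread (`"type": "string"`, `"store": "yes"`); newer versions spell these differently. The type name `doc`, the field name `body`, and the two helper functions are hypothetical names used only for illustration.

```python
import json


def sourceless_mapping(doc_type, field):
    """Mapping that drops _source but keeps one field stored for highlighting."""
    return {
        doc_type: {
            "_source": {"enabled": False},  # do not keep the original JSON
            "properties": {
                # Only this field is stored, so the highlighter can read it back.
                field: {"type": "string", "store": "yes"},
            },
        }
    }


def highlight_query(field, text):
    """Search body that matches on `field` and asks for highlights on it."""
    return {
        "query": {"match": {field: text}},
        "highlight": {"fields": {field: {}}},
    }


if __name__ == "__main__":
    print(json.dumps(sourceless_mapping("doc", "body"), indent=2))
    print(json.dumps(highlight_query("body", "lucene"), indent=2))
```

This still costs the disk space of the stored field, so it only helps when the field to highlight is a small part of each document.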

gelmajjouti · February 18, 2013, 3:55pm

Thanks for your reply.

Most of my data is searchable fulltext. I can't afford to store it in ES.

Greg

simonw_2 · February 18, 2013, 6:51pm

It might be a useful feature to return a stream of offset tuples
rather than highlighted strings. Even if we expose term vectors, they might
be too expensive to transfer, while a "snippet descriptor" might be what some
people need.

simon
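
To make the idea concrete, here is a rough sketch of what a client could do with such offset tuples if it holds the original text itself. Nothing here is an existing Elasticsearch API: the `(start, end)` tuple shape (character offsets, end exclusive) and both function names are assumptions made for the example.

```python
from typing import Iterable, List, Tuple

Offset = Tuple[int, int]  # (start, end) character offsets, end exclusive


def merge_offsets(offsets: Iterable[Offset]) -> List[Offset]:
    """Sort offsets and merge overlapping or touching ranges."""
    merged: List[Offset] = []
    for start, end in sorted(offsets):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def apply_highlight(text: str, offsets: Iterable[Offset],
                    pre: str = "<em>", post: str = "</em>") -> str:
    """Wrap every offset range of `text` in pre/post tags."""
    out = []
    cursor = 0
    for start, end in merge_offsets(offsets):
        out.append(text[cursor:start])             # untouched text before the hit
        out.append(pre + text[start:end] + post)   # the hit itself
        cursor = end
    out.append(text[cursor:])                      # tail after the last hit
    return "".join(out)


if __name__ == "__main__":
    print(apply_highlight("the quick brown fox", [(4, 9), (16, 19)]))
```

The server would only need to ship a few integers per hit, and the text to highlight can come from wherever the client keeps it.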

roytmana · February 18, 2013, 10:43pm

+100. To me, getting data about the highlighting is more useful than the
highlighted text.
When searching _all (or any other non-stored multi_field), I would
like to get, say, a list of matched tokens (ideally including any "fuzzy"
tokens), and I would do client-side highlighting of my text based on that.
When we search against a set of stored/source fields, the name of the
field(s) with an array of offset tuples (or matched tokens) would be great!
We often end up highlighting data from our business objects
(JPA/JDO/Hibernate) rather than data stored in Elasticsearch, so for this case
the most useful result would be a highlight consisting of a list of matched
fields, each with an array of matched tokens from the source (case does not matter).
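
The client-side half of that wish could look roughly like the sketch below. It assumes the server returned, per field, a plain list of matched tokens; that response shape and the names `highlight_tokens` and `highlight_fields` are hypothetical. Matching is case-insensitive and on whole words only.

```python
import re
from typing import Dict, Iterable


def highlight_tokens(text: str, tokens: Iterable[str],
                     pre: str = "<em>", post: str = "</em>") -> str:
    """Wrap whole-word, case-insensitive occurrences of `tokens` in tags."""
    # Longest tokens first so the alternation prefers the longer match.
    words = sorted({t for t in tokens if t}, key=len, reverse=True)
    if not words:
        return text
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: pre + m.group(0) + post, text)


def highlight_fields(record: Dict[str, str],
                     matches: Dict[str, Iterable[str]]) -> Dict[str, str]:
    """Highlight each field of a business object with its own matched tokens."""
    return {
        field: highlight_tokens(value, matches.get(field, ()))
        for field, value in record.items()
    }


if __name__ == "__main__":
    record = {"title": "Elastic search", "note": "nothing here"}
    print(highlight_fields(record, {"title": ["elastic"]}))
```

Because the text comes from the caller, this works the same whether the record was loaded from Elasticsearch or from a JPA/Hibernate entity.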

Ivan · February 18, 2013, 10:56pm

Not the cleanest solution, but you can extract some of this information
from the explain object.

--
Ivan
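
A rough sketch of that workaround, assuming the search was sent with `"explain": true` so that each hit carries an `_explanation` tree of `value` / `description` / `details` nodes. The `weight(field:term in N)` description text comes from Lucene and its exact wording varies between versions, so the regular expression below is an assumption and will miss phrase or multi-term clauses. The sample tree in the usage part is hand-written, not real server output.

```python
import re
from typing import Any, Dict, Iterator, Set, Tuple

# Assumed description shape for a term clause: "weight(body:lucene in 0) ..."
_WEIGHT = re.compile(r"weight\((?P<field>[^:()\s]+):(?P<term>\S+) in \d+\)")


def walk(node: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield an explanation node and all of its descendants, depth first."""
    yield node
    for child in node.get("details") or []:
        yield from walk(child)


def matched_terms(explanation: Dict[str, Any]) -> Set[Tuple[str, str]]:
    """Collect (field, term) pairs mentioned in term-weight descriptions."""
    found: Set[Tuple[str, str]] = set()
    for node in walk(explanation):
        m = _WEIGHT.search(node.get("description", ""))
        if m:
            found.add((m.group("field"), m.group("term")))
    return found


if __name__ == "__main__":
    sample = {  # hand-written stand-in for hit["_explanation"]
        "value": 0.5,
        "description": "sum of:",
        "details": [
            {"value": 0.3, "description": "weight(body:lucene in 0), result of:",
             "details": []},
            {"value": 0.2, "description": "weight(title:search in 0), result of:",
             "details": []},
        ],
    }
    print(sorted(matched_terms(sample)))
```

This yields analyzed terms per field but no positions, so it only covers the "list of matched tokens" part of the request.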

ghoumard · February 19, 2013, 2:04am

+1, exactly the solution I might need. An array of matched fields with positions
instead of highlighted text would be great!

roytmana · February 19, 2013, 3:20am

I do not know if it is possible, but it would be fantastic to store the
start/end of the individual fields in _all (or similar
composite fields) and translate highlight positions in _all into which source
fields were matched and the positions within those fields in the source. That would
take care of virtually all highlighting needs!
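
As a toy model of that translation (not how Elasticsearch actually builds or stores _all), suppose the composite text is a simple concatenation of field values and the start/end of each field is recorded alongside it. The functions `build_all` and `locate` and the single-space separator are assumptions for the sketch.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end, field name) inside the composite text


def build_all(fields: Dict[str, str], sep: str = " ") -> Tuple[str, List[Span]]:
    """Concatenate field values and remember where each field starts and ends."""
    pieces: List[str] = []
    spans: List[Span] = []
    pos = 0
    for name, value in fields.items():
        if pieces:
            pos += len(sep)  # account for the separator before this field
        spans.append((pos, pos + len(value), name))
        pieces.append(value)
        pos += len(value)
    return sep.join(pieces), spans


def locate(spans: List[Span], start: int, end: int) -> Optional[Tuple[str, int, int]]:
    """Map a composite (start, end) to (field, start, end) within that field.

    Returns None if the range falls in a separator or crosses a field boundary.
    """
    starts = [s for s, _, _ in spans]
    i = bisect_right(starts, start) - 1
    if i < 0:
        return None
    field_start, field_end, name = spans[i]
    if end > field_end:
        return None
    return name, start - field_start, end - field_start


if __name__ == "__main__":
    fields = {"title": "Sourceless highlighting",
              "body": "term vectors store offsets"}
    all_text, spans = build_all(fields)
    hit = all_text.index("vectors")
    print(locate(spans, hit, hit + len("vectors")))
```

The lookup is a binary search over field start positions, so the per-document overhead is one small table of boundaries rather than a second copy of the text.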
