Boost parts of text (content of HTML tags)


(anrikun) #1

Hi,
I know that it is possible to boost documents and fields at indexing time.
But is it possible to boost parts of the text inside a field?
I want to index HTML content containing Hn and STRONG tags, and I would like
the text inside those tags to be boosted accordingly.
I could index a structure like:
[
  { "value": "Content of a H1 tag",     "boost": 5.0 },
  { "value": "Content of a STRONG tag", "boost": 2.0 },
  { "value": "Some normal text",        "boost": 1.0 }
]

But is there a way to let ES know how to handle this boost parameter?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Alexander Reelsen) #2

Hey,

Off the top of my head, I can't think of a good way to do this inside
Elasticsearch. Maybe you could split the content by tag in your application
at index time, index each part into its own field, and boost those fields at
query time?
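
A minimal sketch of that idea in Python, using only the standard library. The field names ("headings", "strong", "body") and the tag-to-field mapping are assumptions made up for illustration; the boost values are the ones from the question. It splits the HTML into one text field per tag group and builds a multi_match query body that boosts those fields with the "field^boost" syntax.

```python
from html.parser import HTMLParser

# Hypothetical field names; boost values are taken from the original question.
FIELD_BOOSTS = {"headings": 5.0, "strong": 2.0, "body": 1.0}

# Assumed mapping from HTML tags to the field that should receive their text.
TAG_TO_FIELD = {
    "h1": "headings", "h2": "headings", "h3": "headings",
    "h4": "headings", "h5": "headings", "h6": "headings",
    "strong": "strong",
}


class FieldSplitter(HTMLParser):
    """Collects text into per-field buckets based on the innermost boosted tag."""

    def __init__(self):
        super().__init__()
        self.stack = []  # fields of the currently open boosted tags
        self.fields = {name: [] for name in FIELD_BOOSTS}

    def handle_starttag(self, tag, attrs):
        if tag in TAG_TO_FIELD:
            self.stack.append(TAG_TO_FIELD[tag])

    def handle_endtag(self, tag):
        if tag in TAG_TO_FIELD and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            # Text outside any boosted tag goes to the plain "body" field.
            field = self.stack[-1] if self.stack else "body"
            self.fields[field].append(text)


def split_html(html):
    """Return a document dict with one plain-text string per field."""
    parser = FieldSplitter()
    parser.feed(html)
    parser.close()
    return {name: " ".join(parts) for name, parts in parser.fields.items()}


def build_query(text):
    """Build a multi_match query body that boosts each field at query time."""
    fields = [f"{name}^{boost}" for name, boost in FIELD_BOOSTS.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}
```

The dict returned by split_html would be indexed as the document, and the dict returned by build_query sent as the search body. Because the boosts live in the query rather than in the index, they can be tuned later without reindexing.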

Also, when you want to index HTML content, you should strip the HTML before
indexing your data; otherwise a <strong> tag gets indexed as the word
'strong' in your search index.
See the html_strip char filter:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-htmlstrip-charfilter.html
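
As a rough illustration, index settings that apply the html_strip char filter could look like the sketch below, written as a Python dict so it can be serialized to JSON. The analyzer name "html_text" and the choice of the standard tokenizer with a lowercase filter are assumptions; only the html_strip char filter comes from the advice above.

```python
import json


def html_strip_settings(analyzer_name="html_text"):
    """Build index settings for a custom analyzer that strips HTML markup.

    The analyzer name is a hypothetical placeholder; pick any name and
    reference it from the mapping of the field that holds the HTML.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        # Remove tags and decode entities before tokenizing.
                        "char_filter": ["html_strip"],
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the index.
    print(json.dumps(html_strip_settings(), indent=2))
```

The char filter only affects what gets tokenized and indexed; the stored source of the document still contains the original HTML.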

Hope this helps.

--Alex

(In reply to anrikun's message of Fri, Oct 11, 2013, 11:11 AM.)


(system) #3