Two questions about Elasticsearch usage


(lyes zaiko) #1

Hello all

I have to index a set of HTML pages, and I have two questions:

  1. Is there a built-in way in ES to escape HTML tags/special characters
    at index time or after a search? Or do I have to escape them on my
    side?

  2. Since my HTML pages are in different languages, is it possible to use a
    different stemmer at index time according to the language of each page?

I am using the HTTP API.

Thank you all


(Clinton Gormley) #2

Hiya

  1. Is there a built-in way in ES to escape HTML tags/special
    characters at index time or after a search? Or do I have to
    escape them on my side?

What do you mean by escape them? Do you mean strip them?

If so, then yes:

http://www.elasticsearch.org/guide/reference/index-modules/analysis/htmlstrip-charfilter.html
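To make the pointer above concrete, here is a minimal sketch of index settings that wire the built-in `html_strip` char filter into a custom analyzer. The analyzer name `html_text` and the choice of the `standard` tokenizer with a `lowercase` filter are assumptions for illustration, not something stated in this thread. The script only builds and prints the JSON body; sending it over the HTTP API when creating the index is left to the reader.

```python
import json

# Hypothetical analyzer name, chosen for this example only.
ANALYZER_NAME = "html_text"


def build_index_settings(analyzer_name=ANALYZER_NAME):
    """Return an index-creation body whose custom analyzer strips HTML."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        # Built-in char filter: removes HTML markup
                        # before the text reaches the tokenizer.
                        "char_filter": ["html_strip"],
                        # Assumed tokenizer and token filter.
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the index.
    print(json.dumps(build_index_settings(), indent=2))
```

Note that this only affects what gets indexed. The stored source still contains the original HTML, so anything displayed after a search would still need escaping on the client side.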

  2. Since my HTML pages are in different languages, is it possible to
    use a different stemmer at index time according to the language
    of each page?

Yes:

http://www.elasticsearch.org/guide/reference/mapping/analyzer-field.html
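As a rough sketch of how the `_analyzer` mapping field from that page can be used: the mapping names a document field via `path`, and each document then carries the name of the analyzer to apply in that field. The type name `page`, the field names `lang_analyzer` and `body`, and the use of the built-in `french` and `english` language analyzers are assumptions for illustration. This reflects the Elasticsearch version current when this thread was written; `_analyzer` was removed in later releases, so check the documentation for your version.

```python
import json

# Hypothetical names, chosen for this example only.
DOC_TYPE = "page"
ANALYZER_FIELD = "lang_analyzer"


def build_mapping(doc_type=DOC_TYPE, analyzer_field=ANALYZER_FIELD):
    """Mapping that reads the index-time analyzer name from a document field."""
    return {doc_type: {"_analyzer": {"path": analyzer_field}}}


def build_document(body, analyzer_name, analyzer_field=ANALYZER_FIELD):
    """Document carrying its text plus the name of the analyzer to use."""
    return {"body": body, analyzer_field: analyzer_name}


if __name__ == "__main__":
    print(json.dumps(build_mapping(), indent=2))
    # One document per language, each naming a built-in language analyzer.
    print(json.dumps(build_document("Les chevaux courent", "french")))
    print(json.dumps(build_document("The horses are running", "english")))
```

The same analyzer should be used at search time for the stemmed terms to match, so the query side needs to know which language it is targeting.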

clint

