Analyzer doesn't seem to be working on special characters


(Anand kumar) #1

Hi all,

When indexing special characters such as the copyright symbol, the
analyzers do not seem to be working.

For instance, the copyright symbol © is indexed as a question mark
inside a black diamond (�).

Any suggestion would be greatly appreciated. Thanks in advance.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e143c15c-bf31-49da-91ec-16c542bb51d7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Ivan Brusic) #2

The problem might be with your encoding, not the analyzer. Your content is
in one encoding and either your output or your viewer (terminal, browser)
is using another. Make sure everything is consistent (UTF-8 for most
people).
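
To illustrate the kind of mismatch I mean, here is a minimal Python sketch. It assumes the text was written as Latin-1 and read back as UTF-8, which may or may not be what happens in your setup; the variable names are only for illustration.

```python
# The copyright sign is U+00A9. Its byte form depends on the encoding.
text = "\u00a9 2014"

latin1_bytes = text.encode("latin-1")  # single byte 0xA9 for the symbol
utf8_bytes = text.encode("utf-8")      # two bytes 0xC2 0xA9 for the symbol

# Latin-1 bytes read as UTF-8: 0xA9 alone is not valid UTF-8, so the
# decoder substitutes U+FFFD, the "question mark in a black diamond".
garbled = latin1_bytes.decode("utf-8", errors="replace")

# UTF-8 bytes read as Latin-1: each byte becomes its own character.
mojibake = utf8_bytes.decode("latin-1")

# The same encoding on both sides round-trips cleanly.
roundtrip = utf8_bytes.decode("utf-8")

print(repr(garbled), repr(mojibake), repr(roundtrip))
```

Once a byte has been replaced by U+FFFD the original character is gone, so the fix has to happen at the point where the wrong encoding is applied, not afterwards.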

Where are you seeing the � character?
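
One way to answer that is to look at the actual code points at each stage (source file, stored document, search response) instead of the rendered glyph. A small sketch, where `describe` is a hypothetical helper name and not part of any Elasticsearch API:

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' entry per character in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"
        for ch in text
    ]


# A value that still holds the real copyright sign.
print(describe("\u00a9"))
# A value where the character was already replaced during decoding.
print(describe("\ufffd"))
```

If the stored value already contains U+FFFD, the character was lost before or during indexing. If it still contains U+00A9, the data is fine and the mismatch is on the display side.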

--
Ivan




(Anand kumar) #3

I'm seeing that symbol only in my ES indices. Yes, there does seem to be a
problem with the encoding itself. I've tried setting
setHighlighterEncoding to "HTML", but the problem persists.

There is no problem with indexing and searching; the issue only shows up
when the results are displayed in the browser.

The original text contains the copyright symbol, but after indexing it has
been changed to the � symbol.

I just want to know how the character is being changed into this form, and
whether there is another way of encoding it that would resolve the issue.

Thanks a lot for your reply.





(system) #4