What do you mean when you say the content field still has HTML in it? Are
you saying that when you search for HTML, the document/field matches, or
simply that the response contains HTML? If it is the latter, then the
behavior is expected, since the document source is preserved as submitted.
The analysis chain only modifies what is actually indexed.
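
To make the distinction concrete, here is a toy sketch in Python. It is not
Elasticsearch code; `ToyIndex` and `analyze` are hypothetical names, and the
regex is only a crude stand-in for a real analysis chain. It just shows the
idea: the stored source stays untouched while the indexed terms have the
markup removed.

```python
import re


def analyze(text):
    """Toy stand-in for an analysis chain: drop tags, lowercase, split into words."""
    without_tags = re.sub(r"<[^>]+>", " ", text)
    return re.findall(r"[a-z0-9]+", without_tags.lower())


class ToyIndex:
    """Hypothetical miniature index, for illustration only."""

    def __init__(self):
        self.sources = {}   # doc id -> original document, stored untouched
        self.postings = {}  # analyzed term -> set of doc ids

    def index(self, doc_id, doc):
        self.sources[doc_id] = doc  # the source is kept exactly as submitted
        for term in analyze(doc["content"]):
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, term):
        # Matching uses the analyzed terms; the hits return the stored source.
        doc_ids = sorted(self.postings.get(term.lower(), set()))
        return [self.sources[doc_id] for doc_id in doc_ids]


idx = ToyIndex()
idx.index(1, {"content": "<p>Hello <b>world</b></p>"})
hits = idx.search("world")
```

Searching for "world" matches, and the hit still carries the original HTML.
Searching for a tag name such as "b" finds nothing, because the tags never
reached the index.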
Since you are doing a match all, which means there are no query terms, you
are probably looking for the response itself to be modified. There is no
good way to get the analyzed content back. Highlighting is the most
commonly used workaround.
I have never used highlighting to return analyzed text myself, but that is
what I would suggest trying. I did not respond earlier because I was hoping
others would chime in. An easy way to find out is to try it yourself!
I find it easier to index exactly what I want and not have Elasticsearch do
any data munging.
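
As a rough sketch of that approach, the HTML can be stripped on the client
before the document is sent for indexing, so the stored source is already
clean. This uses only the Python standard library; `strip_html` and
`_TextExtractor` are names made up for this example, and the actual indexing
call is left out.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self.chunks = []

    def handle_data(self, data):
        # Note: text inside <script> or <style> is also collected here.
        self.chunks.append(data)


def strip_html(markup):
    """Return the text content of `markup` with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    # Joining chunks with a space keeps words on either side of a tag apart,
    # at the cost of splitting a word that has a tag in the middle of it.
    return " ".join(" ".join(parser.chunks).split())


# Clean the field first, then index `doc` with whatever client you use.
doc = {"content": strip_html("<p>Hello <b>world</b> &amp; friends</p>")}
```

With this in place the response contains plain text, because the source
itself no longer has any markup in it.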