Exclude unwanted text from index

Dear reader,

There may be Web pages that you want to suppress from search results when users search on certain words or phrases. For example, if a Web page consists of the text “the user conference page will be completed as soon as Jim returns from medical leave,” you might not want this page to appear in the results of a search on the terms “user conference.”

Does Elasticsearch have such a feature?

Best regards,

You can filter them out easily at query time; see, e.g., https://www.elastic.co/guide/en/elasticsearch/reference/5.5/query-filter-context.html

Great! Thanks for your reply.

Best regards,

Henk

Hi Mark,

I've read the documentation about query/filter context you pointed out to me, and I'm not sure whether it is what I'm looking for.

What I'm looking for is a way to keep certain content from being indexed at all. GSA (Google Search Appliance) has such a feature: content placed within certain tags (googleon/googleoff) is not indexed. So, looking at the example web content I mentioned in my question, one would put "user conference" in these tags to exclude it from being indexed. That way, every search on "user conference" would disregard the web page about Jim finishing the user conference page when he returns from medical leave.
If I want to achieve the same in Elasticsearch using the query/filter context I would have to specify in the query clause not to match, for example, "medical leave" in order to exclude the web page about Jim finishing the user conference page. At least that is my understanding of it. If so, it would not be the same feature as the googleon/googleoff feature provided by the GSA search engine. Would you agree with that?
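
For illustration, here is a minimal sketch of the query-time approach described above. It only builds the request body as a Python dict and prints it as JSON; it does not contact a cluster. The field name "content" and the helper build_exclusion_query are hypothetical names chosen for this example, not anything from the thread or from Elasticsearch itself. The bool query with must and must_not clauses is standard query DSL, and must_not clauses run in filter context.

```python
import json


def build_exclusion_query(search_terms, excluded_phrase, field="content"):
    """Build a bool query body: match search_terms, but drop any document
    whose field contains excluded_phrase.

    "content" is an assumed field name; substitute the real one.
    """
    return {
        "query": {
            "bool": {
                # Scored clause: the terms the user actually searched for.
                "must": [{"match": {field: search_terms}}],
                # Filter-context clause: exclude pages containing this phrase.
                "must_not": [{"match_phrase": {field: excluded_phrase}}],
            }
        }
    }


body = build_exclusion_query("user conference", "medical leave")
print(json.dumps(body, indent=2))
```

Note that this excludes the whole page whenever the phrase occurs, and the exclusion has to be repeated in every query, which is why it differs from marking a passage as not to be indexed.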

Best regards,

Ahh ok.

For Elasticsearch you need to create a specific mapping that marks the relevant fields of a document as not indexed, by setting the "index" mapping parameter to false; see the "index" page in the Elasticsearch Reference [5.5].
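
A rough sketch of what such a mapping might look like for a 5.x index, again built as a plain Python dict so it can be inspected without a cluster. The type name "page", the field names "content" and "internal_notes", and the helper build_page_mapping are all assumptions made for this example. The sketch also assumes the application splits each page into separate fields before indexing, since the setting applies to a whole field and not to a passage inside one.

```python
import json


def build_page_mapping(type_name="page"):
    """Build an index-creation body with one searchable field and one
    field that is stored in _source but not indexed.

    Type and field names are hypothetical placeholders.
    """
    return {
        "mappings": {
            type_name: {
                "properties": {
                    # Searchable page text.
                    "content": {"type": "text"},
                    # Kept in the document, but not searchable.
                    "internal_notes": {"type": "text", "index": False},
                }
            }
        }
    }


mapping = build_page_mapping()
print(json.dumps(mapping, indent=2))
```

With a mapping along these lines, the sentence about Jim's medical leave would go into the non-indexed field, so a search on "user conference" could not match it.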

OK, thanks. Now I have a clear grasp of what is and is not possible regarding indexed content.

Best regards,
