Functional filter on ES v 9.0.0, not working on v 8.18.0

Hello everyone,

After checking this post, I implemented a filter (in a v 9.0.0 Elasticsearch infrastructure) which looks as follows:

[
    {
        "bool": {
            "should": [
                {"term": {"metadata.provider.keyword": "SkillShare"}},
                {"term": {"metadata.provider.keyword": "EDX"}}
            ]
        }
    }
]

The previous filter works OK.

However, I was asked to adapt it to a v 8.18.0 Elasticsearch infrastructure, and there the filter no longer works, i.e., I get no matches in the output.

Why?

BTW, I am using that filter in the following Python script:

import time
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from langchain_elasticsearch import ElasticsearchStore

index_name = "my_index_name"
number_of_recommendations = 5  # placeholder value; set this to whatever your application needs
myfilter = [
    {
        "bool": {
            "should": [
                {"term": {"metadata.provider.keyword": "SkillShare"}},
                {"term": {"metadata.provider.keyword": "EDX"}}
            ]
        }
    }
]

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

vector_db = ElasticsearchStore(
    es_url="https://4a15e4224aca4e99bac99a6582a4abab.us-central1.gcp.cloud.es.io:443",
    es_api_key="get_your_own_api_key",
    index_name=index_name,
    embedding=embeddings
)

# ---------------------------- SIMILARITY SEARCH -----------------------------------------
print("\n\nSIMILARITY SEARCH")

t1 = time.time()
results = vector_db.similarity_search(
    query="Skill: Typography Fundamentals; Job Title: Graphic Designer",
    k=2*number_of_recommendations,
    filter=myfilter
)
t2 = time.time()
for res in results:
    print(f"* {res.page_content} [{res.metadata}]")
    print(80*"-")
print("\n\nCompleted in {:.2f} seconds".format(t2-t1))

Why? What am I doing wrong? What would be the equivalent filter for Elasticsearch v 8.18.0?

Thanks

Using some free GenAI providers plus my own troubleshooting, I think I got it:

For ES v 9.0.0, the filter was:

myfilter = [
    {
        "bool": {
            "should": [
                {"term": {"metadata.provider.keyword": "SkillShare"}},
                {"term": {"metadata.provider.keyword": "Seneca"}}
            ]
        }
    }
]

But for ES v 8.18.0, the filter is:

myfilter = [
    {
        "bool": {
            "should": [
                {"term": {"metadata.provider.enum": "SkillShare"}},
                {"term": {"metadata.provider.enum": "Seneca"}}
            ],
            "minimum_should_match": 1
        }
    }
]
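
Since the two filters differ only in the subfield suffix, a small helper can build either one. This is just a minimal sketch: build_provider_filter is a hypothetical name of my own, not part of LangChain or Elasticsearch, and it always sets minimum_should_match explicitly, which I assume is harmless on the v 9.0.0 index as well.

```python
def build_provider_filter(providers, keyword_subfield):
    """Build the bool/should filter for the given providers.

    keyword_subfield is the suffix of the keyword subfield in the index
    mapping, e.g. "keyword" or "enum" (hypothetical helper, for illustration).
    """
    field = f"metadata.provider.{keyword_subfield}"
    return [
        {
            "bool": {
                # One exact-match term clause per provider.
                "should": [{"term": {field: provider}} for provider in providers],
                # Require at least one of the clauses to match.
                "minimum_should_match": 1,
            }
        }
    ]


# Same providers, different subfield depending on the index mapping.
filter_keyword = build_provider_filter(["SkillShare", "Seneca"], "keyword")
filter_enum = build_provider_filter(["SkillShare", "Seneca"], "enum")
print(filter_enum)
```

The result can be passed as filter=... to similarity_search exactly like myfilter in the script above.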

As mentioned by the AI itself:

"For metadata.provider, the mapping shows:

"provider": {
    "type": "text",
    "fields": {
        "enum": {  // This is your keyword field
            "type": "keyword",
            "ignore_above": 2048
        },
        // ... other subfields
    }
}

Instead of the standard .keyword naming convention, your index uses .enum for the keyword subfield."
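
To avoid guessing the suffix next time, the mapping can be scanned for keyword-typed subfields. The following is only a sketch under assumptions: find_keyword_subfields is a hypothetical helper of mine, and sample_properties is a hand-written dict that mirrors the mapping fragment quoted above. With a real cluster, the "properties" dict would come from the index mapping (for example via the Python client's indices.get_mapping), which I have not included here.

```python
def find_keyword_subfields(mapping_properties, prefix=""):
    """Return the dotted paths of every keyword-typed field or subfield."""
    found = []
    for name, spec in mapping_properties.items():
        path = f"{prefix}{name}"
        if spec.get("type") == "keyword":
            found.append(path)
        # Multi-fields live under "fields"; object fields nest under "properties".
        for child_key in ("fields", "properties"):
            if child_key in spec:
                found.extend(
                    find_keyword_subfields(spec[child_key], prefix=f"{path}.")
                )
    return found


# Hand-written sample that mirrors the mapping fragment quoted above.
sample_properties = {
    "metadata": {
        "properties": {
            "provider": {
                "type": "text",
                "fields": {
                    "enum": {"type": "keyword", "ignore_above": 2048},
                },
            }
        }
    }
}

keyword_fields = find_keyword_subfields(sample_properties)
print(keyword_fields)
```

Any path it returns is a candidate field name for a term filter on that index.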

So the difference seems to come from how each index is mapped, not from the filter syntax itself. Hence, this case is SOLVED now.