Term query is case sensitive for some but not for others

What could be the reason that a term query behaves case-sensitively in some installations and case-insensitively in others? In the mapping, the affected field is of type keyword. Both installations use the same Docker Compose setup with ES 8.10.2, and both use exactly the same mapping for the index (an excerpt is included below).

The only obvious difference I see is that one runs on an Intel Mac while the other runs on an Arm Mac.

Mapping excerpt

{
  "settings": {
    "analysis": {
      "analyzer": {
        "standard_html": {
          "char_filter": [
            "html_strip"
          ],
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "standard_html"
      },
      "title": {
        "type": "text",
        "analyzer": "standard_html"
      },
      "meta": {
        "properties": {
          "createdAt": {
            "type": "date",
            "format": "strict_date_time"
          },
          "project": {
            "properties": {
              "type": {
                "type": "keyword"
              },
              "name": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}

Example query

GET /my-index/_search
{
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                      "meta.project.type": {
                        "value": "COMPONENT"
                      }
                    }
                }
            ],
            "must": [
                {
                    "multi_match": {
                        "fields": [
                            "content",
                            "title"
                        ],
                        "query": "graph",
                        "tie_breaker": 0.3
                    }
                }
            ]
        }
    }
}

The actual value in the indexed documents for meta.project.type is "COMPONENT". The query works for me. However, a co-worker needs to query for "component" to get any hits (or use "case_insensitive": true).

Hi @paulmuller

Your custom analyzer is applied only to the text fields.
The meta.project.type field is a keyword field, so a filter on it only matches if the search term is exactly the same as the indexed value.

Note the difference here:

POST idx_test/_analyze
{
 "field": "meta.project.type",
 "text": ["component", "COMPONENT"]
}

You can set case_insensitive in the search query, but if you want to index lowercase data in a keyword field, I recommend using a normalizer.
See the normalizer page of the Elasticsearch Guide [8.14] for more details.
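
As a rough sketch of the normalizer suggestion, the request below defines a custom lowercase normalizer and attaches it to the keyword field. The index name my-index-normalized and the normalizer name lowercase_normalizer are made up for illustration; only the field path comes from the mapping excerpt in the question.

```
PUT /my-index-normalized
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "meta": {
        "properties": {
          "project": {
            "properties": {
              "type": {
                "type": "keyword",
                "normalizer": "lowercase_normalizer"
              }
            }
          }
        }
      }
    }
  }
}
```

With a normalizer in place, the value is lowercased before it is indexed, and the term in a term query is normalized the same way at search time, so "COMPONENT" and "component" should both match. Adding a normalizer to a field that already exists would typically mean creating a new index and reindexing.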

Sorry, maybe a misunderstanding. I am aware of the text vs. keyword differences (I think) but that's not the point here.

What I am seeing is this

  • I filter for "meta.project.type": {"value": "COMPONENT"} and get hits that have "project": {"type": "COMPONENT"} - as expected.
  • A colleague who has the same ES setup through Docker Compose, the same mapping, and the same source documents to index needs to filter for "meta.project.type": {"value": "component"} to get any hits.

I have no idea what might cause this behavior. We can work around it by using a case insensitive term query but I'd much rather understand the root cause.
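
For reference, the workaround mentioned above would look roughly like this: the filter clause from the example query, reduced to a standalone term query with case_insensitive enabled. This is only a sketch of the workaround, not a fix for the underlying difference.

```
GET /my-index/_search
{
  "query": {
    "term": {
      "meta.project.type": {
        "value": "component",
        "case_insensitive": true
      }
    }
  }
}
```

With this flag, the same request should return hits whether the indexed value is "COMPONENT" or "component".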

Oh bummer, we found the issue. :slight_smile: :frowning:

I asked the colleague to send me their GET /my-index/_mapping output. It turns out they were NOT using the latest mapping definition as checked into Git when they created the index, sigh.
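
For anyone who runs into the same symptom, a quick check along these lines can be run on each installation. It uses the get field mapping API to show only the live mapping of the affected field, which can then be compared against the definition in version control.

```
GET /my-index/_mapping/field/meta.project.type
```

If the reported type is not keyword (for example, a text field with a lowercasing analyzer), the index was presumably created from a different mapping, and a term query for "COMPONENT" would not be expected to match there.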