I have two fields in my index: one contains content analyzed with the standard analyzer, while the other uses the German analyzer. However, when I insert documents, the content doesn't seem to be stemmed at all, even though the language-specific analyzer should stem it. I've created a minimal example that reproduces the problem:
Create an index:
PUT test-index
{
  "settings": {
    "index": {
      "mapping": {
        "total_fields": {
          "limit": 1500
        }
      }
    },
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "refresh_interval": "1s"
  },
  "mappings": {
    "_source": {
      "enabled": true
    },
    "dynamic_templates": [
      {
        "default_string": {
          "match": "*",
          "match_mapping_type": "string",
          "mapping": {
            "type": "keyword",
            "index": true,
            "store": false
          }
        }
      }
    ],
    "properties": {
      "standard_content": {
        "type": "text",
        "index": true,
        "analyzer": "standard",
        "search_analyzer": "standard"
      },
      "stemmed_content": {
        "type": "text",
        "index": true,
        "analyzer": "german",
        "search_analyzer": "german"
      }
    }
  },
  "aliases": {}
}
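As a sanity check, the _analyze API can be run against the index to see which tokens the field's configured analyzer actually emits (this only inspects analysis and doesn't touch stored documents):
GET test-index/_analyze
{
  "field": "stemmed_content",
  "text": "Das Wort Orangen sollte nach dem Stemming zu Orange werden."
}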
Add a document:
POST test-index/_doc/test-doc
{
  "standard_content": "Das Wort Orangen sollte nach dem Stemming zu Orange werden.",
  "stemmed_content": "Das Wort Orangen sollte nach dem Stemming zu Orange werden."
}
Essentially, in this test the word "Orangen" should be stemmed to "Orange", "sollte" to "soll", and so on. However, when I run a match_all query, the result is identical to what I inserted.
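Concretely, the search I run is just this:
GET test-index/_search
{
  "query": {
    "match_all": {}
  }
}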
I already had a similar setup working for stemming/preprocessing language-specific content, but for some reason it doesn't work now, and I don't understand why. Hopefully someone can spot where I'm making a mistake. Thanks!