Search analyser + preserve special characters

Hi,
I have an issue that I cannot seem to solve after several different attempts and days' worth of googling.

I am using the synonym filter, and read that it is better to apply it at search time instead of at index time.
So I did that, following this guide: https://www.elastic.co/blog/boosting-the-power-of-elasticsearch-with-synonyms

I have the following mapping:

{
    "company": {
        "mappings": {
            "properties": {
                "p_name": {
                    "type": "text",
                    "analyzer": "custom_index",
                    "search_analyzer": "rebuilt_danish"
                },
                "slug": {
                    "type": "text",
                    "analyzer": "custom_index",
                    "search_analyzer": "rebuilt_danish"
                }
            }
        }
    }
}

And the settings:

{
    "company": {
        "settings": {
            "index": {
                "analysis": {
                    "filter": {
                        "danish_synonyms": {
                            "type": "synonym",
                            "synonyms": [
                                "brdr, brdr., brødrene",
                                "&, og",
                                "adm, adm., administration, administrativ",
                                "v, v., v/, /v, ved",
                                "adv, adv., advokat",
                                "as, a/s, as., aktieselskab",
                                "aps, aps., anpartsselskab",
                                "afd, afd., afdeling",
                                "alm, alm., almindelig",
                                "a-kasse, a kasse, akasse, arbejdsløshedskasse",
                                "amu, arbejdsmarkedsuddannelse",
                                "att, att., attention"
                            ]
                        }
                    },
                    "analyzer": {
                        "custom_index": {
                            "filter": [
                                "lowercase"
                            ],
                            "type": "custom",
                            "preserve_original": "true",
                            "tokenizer": "standard"
                        },
                        "rebuilt_danish": {
                            "filter": [
                                "lowercase",
                                "danish_synonyms"
                            ],
                            "type": "custom",
                            "tokenizer": "whitespace"
                        }
                    }
                }
            }
        }
    }
}

I have tried a few different things to make it work, but here is what I am having issues with.
We have a company called "brdr. madsen & søn".
I have created synonyms so that you should also be able to search for "brødrene madsen og søn".
But the index analyzer strips off the &, so the indexed tokens are "brdr madsen søn", which is not ideal.
I have tried creating a custom index analyzer that should keep it, tried preserving the original, and tried a few other things.
And yes, I also tried adding the Danish language filters, but they made it even worse and will not work in this situation.
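
To see where the & gets lost, the index-time chain can be replayed through the `_analyze` API (for example `POST /company/_analyze`). The body below is a minimal sketch that spells out the tokenizer and filter of `custom_index` instead of referring to the analyzer by name, so it can be run without touching the index settings:

```json
{
  "tokenizer": "standard",
  "filter": ["lowercase"],
  "text": "brdr. madsen & søn"
}
```

With the `standard` tokenizer this should come back as the tokens `brdr`, `madsen` and `søn`. Changing `"tokenizer"` to `"whitespace"` should return `brdr.`, `madsen`, `&` and `søn` instead, which is what the search-time synonyms for "brdr." and "&" need to match against.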

The other issue I have is with updating the synonyms used by the search_analyzer.
I use Elastic.co as host, and followed the guide by uploading a plugin with my synonym text file. But once I update the file, the change is not reflected, so I still need to restart the node for it to take effect. That makes _reload_search_analyzers of little use, since it does not pick up the updates. So how can I make that work as well, together with Elastic.co?
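
For reference, the synonym filter documentation describes an `updateable` flag that has to be set before `_reload_search_analyzers` will re-read a synonyms file. The settings shown earlier list the synonyms inline, so a reload would have nothing to re-read. A minimal sketch of a file-based filter definition follows; the file name `synonyms.txt` is a placeholder and not taken from the settings above:

```json
{
  "analysis": {
    "filter": {
      "danish_synonyms": {
        "type": "synonym",
        "synonyms_path": "synonyms.txt",
        "updateable": true
      }
    }
  }
}
```

A filter marked `updateable` can only be used in a search analyzer, which fits `rebuilt_danish` here. Whether an updated bundle on Elastic.co reaches the nodes without a restart is a separate question that this sketch does not answer.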
