Search DSL Query value with '(single quote) and space in elastic 6.8

Hi. I am trying to run a Search DSL query whose value contains a ' (single quote), as in query (1) below, and sometimes a trailing space, as in query (2) below, but I am not getting the expected results. Could you please advise? The values do exist in my content field. I am using Elasticsearch 6.8.
Thank You.

(1)
"match_phrase": {
"content": {
"query": "Workers'"
}
}

(2)
"match": {
"content": {
"query": "Workers "
}
}

Hello, this is an Elasticsearch question and it landed in the Kibana channel. How you can query depends heavily on your field mapping, e.g. whether the field was (also) mapped as a keyword, and whether the entire space-separated text was indexed as one integral keyword rather than two keywords. This is a good starting point for text vs. keyword; again, it is worth going to Elasticsearch itself and asking there if questions remain.
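For illustration, a minimal sketch (the index name my_index is hypothetical): mapping content as text with a keyword sub-field keeps one exact, untokenized copy of the value, single quote and spaces included, that a term query can match verbatim.

PUT my_index
{
  "mappings": {
    "doc": {
      "properties": {
        "content": {
          "type": "text",
          "fields": {
            "raw": { "type": "keyword" }
          }
        }
      }
    }
  }
}

GET my_index/_search
{
  "query": {
    "term": { "content.raw": "Workers'" }
  }
}

The term query bypasses analysis entirely, so it only matches if the stored value is exactly "Workers'".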

There are also options with individual tokenization, and with proximity search, where scoring depends on how close the words are to each other.
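A proximity search can be sketched with match_phrase and a slop, which allows the queried words to be up to that many positions apart (index and field names assumed from the thread):

GET my_index/_search
{
  "query": {
    "match_phrase": {
      "content": {
        "query": "workers compensation",
        "slop": 2
      }
    }
  }
}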

Hi, to resolve the above issue I am trying to add an analyzer to my index, which needs to support searches containing special characters and whitespace, so I am using an n-gram analyzer. My index is below.

My requirement is to ingest PDF files (almost 2000 files). With these min_gram and max_gram settings the index takes huge disk space, and search responses are slow. Could you please suggest how to improve performance, and whether it is reasonable to increase max_gram to 100 or 200?
Thank You in Advance.
Joseph

PUT formslettersandtemplates4_50
{
  "settings": {
    "number_of_replicas": "0",
    "refresh_interval": null,
    "analysis": {
      "analyzer": {
        "form_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 4,
          "max_gram": 50,
          "token_chars": [
            "letter",
            "digit",
            "whitespace",
            "punctuation",
            "symbol"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "content": {
          "type": "text",
          "analyzer": "form_analyzer"
        }
      }
    }
  }
}
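For reference, the _analyze API shows every token a tokenizer emits, which gives a feel for the cost: with min_gram 4 and max_gram 50, each starting position in the text produces up to 47 overlapping grams, so index size grows roughly with max_gram, and raising it to 100 or 200 multiplies that further. Note also that a max_gram minus min_gram difference this large exceeds the index.max_ngram_diff limit (a deprecation warning in 6.x, a hard error in 7.x). A sketch against the index above:

POST formslettersandtemplates4_50/_analyze
{
  "analyzer": "form_analyzer",
  "text": "Workers' compensation"
}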

Hi, this is really more of an Elasticsearch question and I'm not equipped to give performance advice on it. It might be worth considering a training course, where there is room to ask about performance characteristics (besides this being taught on the courses), or some kind of consulting, or a support contract with our side. These are in addition to asking the question or browsing in the Elasticsearch discuss channel.

Sure, thank you Robert for your time.

I will refer that.

I moved the question to #elastic-stack:elasticsearch

Could you provide a full recreation script, as described in About the Elasticsearch category? It will help us better understand what you are doing. Please try to keep the example as simple as possible.

A full reproduction script is something anyone can copy and paste into the Kibana dev console and run to reproduce your use case. It will help readers understand, reproduce and, if needed, fix your problem. It will also most likely get you a faster answer.


Thank you David. I am now using the analyzer and tokenizer below, and I am able to get the expected results.

With this, text containing special characters and whitespace alongside alphabetic characters is tokenized properly.

PUT
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "char_group",
          "tokenize_on_chars": [
            "whitespace",
            "symbol"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "": {
          "type": "text",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}
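The behavior can be checked without creating an index by passing the char_group tokenizer inline to _analyze. A sketch: note that the apostrophe is classified as punctuation, not a symbol, so with tokenize_on_chars of whitespace and symbol the token Workers' stays intact, which matches the original match_phrase use case.

POST _analyze
{
  "tokenizer": {
    "type": "char_group",
    "tokenize_on_chars": ["whitespace", "symbol"]
  },
  "text": "Workers' compensation"
}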