I want to implement Google-like search-as-you-type suggestions.
For example, my documents look like this:
{
"fir_number": "12345",
"fir_id": "123",
"accused_first_name": "spider man",
"accused_relative_name": "super man",
"accused_dob": "2019-05-18T10:20:03Z",
"fir_reg_date": "2024-05-18T10:20:03Z",
"ipc_section": "123",
"ps_name": "police station",
"dist_name": "district",
"fir_status": "y",
"fir_content": "fir content goes here..."
}
{
"fir_number": "12345",
"fir_id": "123",
"accused_first_name": "bat man",
"accused_relative_name": "super woman",
"accused_dob": "2019-05-18T10:20:03Z",
"fir_reg_date": "2024-05-18T10:20:03Z",
"ipc_section": "123",
"ps_name": "police station",
"dist_name": "district",
"fir_status": "y",
"fir_content": "fir content goes here..."
}
For now, I want to search on both of these fields: accused_first_name and accused_relative_name.
Expectations are -
- If the user starts typing, they should get suggestions.
  If the user types spi, they should get spider man.
- If the user makes a typo, they should still get the correct suggestions.
  If the user types siper, they should get spider man.
- If the user types man, they should get every suggestion that contains man, such as spider man and bat man.
- If the user types super wo, they should get only super woman as a suggestion, not both super man and super woman.
So, I have achieved the 1st and 2nd points by -
- Using completion mapping fields while creating the index
CreateIndexRequest request = new CreateIndexRequest.Builder().index("my_index")
        .settings(settings -> settings.numberOfShards("1").numberOfReplicas("1"))
        .mappings(
                mappings -> mappings
                        .properties("accused_first_name",
                                new Property.Builder().completion(new CompletionProperty.Builder()
                                        .analyzer("simple").preserveSeparators(true)
                                        .preservePositionIncrements(true).maxInputLength(50).build()).build())
                        .properties("accused_relative_name",
                                new Property.Builder().completion(new CompletionProperty.Builder()
                                        .analyzer("simple").preserveSeparators(true)
                                        .preservePositionIncrements(true).maxInputLength(50).build()).build()))
        .build();
CreateIndexResponse createIndexResponse = esJavaClient.indices().create(request);
- While searching I have used the suggest query with fuzziness
SearchRequest searchRequest = new SearchRequest.Builder()
        .index("my_index")
        .suggest(suggest -> suggest
                .suggesters("first_name_suggestions", suggester -> suggester
                        .text("man")
                        .completion(completion -> completion
                                .field("accused_first_name")
                                .fuzzy(fuzzy -> fuzzy
                                        .fuzziness("AUTO")
                                        .minLength(3)
                                        .prefixLength(1)
                                        .transpositions(true)
                                )
                        )
                )
                .suggesters("relative_name_suggestions", suggester -> suggester
                        .text("man")
                        .completion(completion -> completion
                                .field("accused_relative_name")
                                .fuzzy(fuzzy -> fuzzy
                                        .fuzziness("AUTO")
                                        .minLength(3)
                                        .prefixLength(1)
                                        .transpositions(true)
                                )
                        )
                )
        )
        .build();
But when I type only man, I do not get any results, whereas I expect to get the results that contain man in accused_first_name and accused_relative_name. I went through multiple articles and learned that the completion suggester cannot perform full-text queries, which means that it cannot return suggestions based on words in the middle of a multi-word field (ref link).
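To make the limitation concrete, here is a minimal plain-Java sketch (no Elasticsearch calls) of prefix-only matching, which is roughly how I understand the completion suggester to behave. The class `CompletionInputsSketch` and its helpers are hypothetical names of mine. It also shows one workaround I have seen mentioned, which I have not tried at scale: indexing several inputs per name, one for each word-boundary suffix, so that a middle word becomes a prefix of some input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: simulates prefix matching in memory.
class CompletionInputsSketch {

    // Returns the full name plus every suffix that starts at a word boundary,
    // e.g. "spider man" -> ["spider man", "man"].
    static List<String> wordSuffixInputs(String name) {
        String[] words = name.trim().toLowerCase(Locale.ROOT).split("\\s+");
        List<String> inputs = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            inputs.add(String.join(" ", Arrays.copyOfRange(words, i, words.length)));
        }
        return inputs;
    }

    // True if the typed text is a prefix of at least one input
    // (assumption: this approximates a non-fuzzy completion lookup).
    static boolean prefixMatches(List<String> inputs, String typed) {
        String text = typed.toLowerCase(Locale.ROOT);
        for (String input : inputs) {
            if (input.startsWith(text)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> single = Arrays.asList("spider man");
        List<String> expanded = wordSuffixInputs("spider man");
        System.out.println(prefixMatches(single, "spi"));   // prefix of the only input
        System.out.println(prefixMatches(single, "man"));   // middle word, not a prefix
        System.out.println(prefixMatches(expanded, "man")); // matches the extra "man" input
    }
}
```

With a single input per name, man is never a prefix, which matches what I observe. The expanded inputs would make it match, at the cost of storing more entries per document.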
- So I came across the edge n-gram tokenizer, but I am a bit concerned about performance, since it will create a lot of tokens for each document, which will gradually increase the index size.
- I have a huge data set of more than 20 million records, and it grows day by day. What would be the best way to implement this search in this scenario? Kindly suggest.
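To put a rough number on the token concern, here is a small plain-Java sketch that mimics edge n-gram tokenization on each whitespace-separated word. The class `EdgeNGramSketch`, the gram bounds of 2 and 10, and the "every typed word must be one of the grams" rule are my own assumptions for illustration, not Elasticsearch API calls. The matching rule is meant to approximate an edge n-gram index analyzer combined with a plain search analyzer and an AND operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: in-memory approximation of edge n-grams.
class EdgeNGramSketch {

    // Emits the prefixes of each word whose length lies in [minGram, maxGram].
    static List<String> edgeNGrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            int upper = Math.min(maxGram, word.length());
            for (int len = minGram; len <= upper; len++) {
                grams.add(word.substring(0, len));
            }
        }
        return grams;
    }

    // True only if every typed word equals one of the indexed grams.
    // Typed words shorter than 2 or longer than 10 characters never match here.
    static boolean matchesAll(String indexedName, String typed) {
        List<String> grams = edgeNGrams(indexedName, 2, 10);
        for (String word : typed.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!grams.contains(word)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("spider man", 2, 10));       // 7 grams for a 2-word name
        System.out.println(matchesAll("spider man", "man"));        // middle word matches
        System.out.println(matchesAll("super woman", "super wo"));  // both typed words match
        System.out.println(matchesAll("super man", "super wo"));    // "wo" is not a gram here
    }
}
```

Under these assumptions, a two-word name yields fewer than ten grams per field, and this scheme would cover the 3rd and 4th expectations. It does not handle typos, and I do not know how it behaves on an index of more than 20 million documents.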
Thanks