Search Like Google

I want to implement Google-like search suggestions that appear as the user starts typing.

For example, my documents look like this -

{
            "fir_number": "12345",
            "fir_id": "123",
            "accused_first_name": "spider man",
            "accused_relative_name": "super man",
            "accused_dob": "2019-05-18T10:20:03Z",
            "fir_reg_date": "2024-05-18T10:20:03Z",
            "ipc_section": "123",
            "ps_name": "police station",
            "dist_name": "district",
            "fir_status": "y",
            "fir_content": "fir content goes here..."
        }

{
            "fir_number": "12345",
            "fir_id": "123",
            "accused_first_name": "bat man",
            "accused_relative_name": "super woman",
            "accused_dob": "2019-05-18T10:20:03Z",
            "fir_reg_date": "2024-05-18T10:20:03Z",
            "ipc_section": "123",
            "ps_name": "police station",
            "dist_name": "district",
            "fir_status": "y",
            "fir_content": "fir content goes here..."
        }

For now I want to search on both the accused_first_name and accused_relative_name fields.

Expectations are -

  1. If the user starts typing, they should get suggestions.
    If the user types spi, they should get spider man.
  2. If the user makes a typo, they should still get the correct suggestions.
    If the user types siper, they should get spider man.
  3. If the user types man, they should get every suggestion that contains man, such as spider man and bat man.
  4. If the user types super wo, they should get only super woman, not both super man and super woman.

So, I have achieved the 1st and 2nd points by -

  • Using completion mapping fields while creating the index
CreateIndexRequest request = new CreateIndexRequest.Builder().index("my_index")
				.settings(settings -> settings.numberOfShards("1").numberOfReplicas("1"))
				.mappings(
						mappings -> mappings
								.properties("accused_first_name",
										new Property.Builder().completion(new CompletionProperty.Builder()
												.analyzer("simple").preserveSeparators(true)
												.preservePositionIncrements(true).maxInputLength(50).build()).build())
								.properties("accused_relative_name",
										new Property.Builder().completion(new CompletionProperty.Builder()
												.analyzer("simple").preserveSeparators(true)
												.preservePositionIncrements(true).maxInputLength(50).build()).build()))
				.build();

		CreateIndexResponse createIndexResponse = esJavaClient.indices().create(request);
  • While searching I have used the suggest query with fuzziness
SearchRequest searchRequest = new SearchRequest.Builder()
                .index("my_index")
                .suggest(suggest -> suggest
                        .suggesters("first_name_suggestions", suggester -> suggester
                                .text("man") 
                                .completion(completion -> completion
                                        .field("accused_first_name") 
                                        .fuzzy(fuzzy -> fuzzy
                                                .fuzziness("AUTO")
                                                .minLength(3)
                                                .prefixLength(1)
                                                .transpositions(true)
                                        )
                                )
                        )
                        .suggesters("relative_name_suggestions", suggester -> suggester
                                .text("man") 
                                .completion(completion -> completion
                                        .field("accused_relative_name") 
                                        .fuzzy(fuzzy -> fuzzy
                                                .fuzziness("AUTO")
                                                .minLength(3)
                                                .prefixLength(1)
                                                .transpositions(true)
                                        )
                                )
                        )
                )
                .build();

But when I type only man, I get no results, whereas I expect every result that contains man in first_name or relative_name. After reading several articles I learned that the completion suggester cannot perform full-text queries, which means it cannot return suggestions based on words in the middle of a multi-word field (ref link).

  • So I came across edge n-gram tokens, but I am a bit concerned about performance with this approach: it creates many tokens for each document, which will steadily increase the index size.
  • I have a large data set of more than 20 million records, and it grows every day. Which would be the best way to implement this search in that scenario? Kindly suggest.
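
To put a rough number on the token-count concern, here is a minimal sketch in plain Java that mimics what an edge n-gram filter would emit for one field value. The class name EdgeNgramEstimate and the min_gram = 1 / max_gram = 10 settings are assumptions chosen for illustration, not values from my index, and the real analyzer output may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Elasticsearch: it only imitates the
// tokens an edge n-gram filter would produce, to estimate index growth.
final class EdgeNgramEstimate {

    // Returns the edge n-grams of every whitespace-separated word in value,
    // with lengths from minGram up to maxGram (capped at the word length).
    static List<String> edgeNgrams(String value, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (String word : value.trim().toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank input yields no tokens
            }
            int upper = Math.min(maxGram, word.length());
            for (int len = minGram; len <= upper; len++) {
                grams.add(word.substring(0, len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Assumed settings: min_gram = 1, max_gram = 10.
        List<String> grams = edgeNgrams("spider man", 1, 10);
        // prints: 9 tokens: [s, sp, spi, spid, spide, spider, m, ma, man]
        System.out.println(grams.size() + " tokens: " + grams);
    }
}
```

So a two-word name of 9 letters becomes 9 tokens instead of 2. Raising min_gram (for example to 2 or 3) trims the shortest and least selective tokens, at the cost of no suggestions for the first keystrokes.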

Thanks

Hi @Aswini_Kumar_Rout

Did you try the search-as-you-type field type (Elasticsearch Guide [8.14])?


Thanks for your suggestion.
I was not aware of the search-as-you-type field type earlier.

I went through the official docs and tried to implement it, but I am not getting the desired output.
Let me post what I have tried. Kindly let me know what I am doing wrong here -

Index Creation -

PUT /fir_index
{
  "mappings": {
    "properties": {
      "fir_number": { "type": "keyword" },
      "fir_id": { "type": "keyword" },
      "accused_first_name": {
        "type": "search_as_you_type"
      },
      "accused_relative_name": {
        "type": "search_as_you_type"
      },
      "accused_dob": { "type": "date" },
      "fir_reg_date": { "type": "date" },
      "ipc_section": { "type": "keyword" },
      "ps_name": { "type": "keyword" },
      "dist_name": { "type": "keyword" },
      "fir_status": { "type": "keyword" },
      "fir_content": { "type": "text" }
    }
  }
}

Docs Insertion -

POST /fir_index/_doc/1
{
            "fir_number": "12345",
            "fir_id": "123",
            "accused_first_name": "spider man",
            "accused_relative_name": "super man",
            "accused_dob": "2019-05-18T10:20:03Z",
            "fir_reg_date": "2024-05-18T10:20:03Z",
            "ipc_section": "123",
            "ps_name": "police station",
            "dist_name": "district",
            "fir_status": "y",
            "fir_content": "fir content goes here..."
} etc...

Search API using Java Client -

String query = "man";
		SearchResponse<Map> searchResponse = esJavaClient.search(
				s -> s.index("fir_index")
						.suggest(su -> su
								.suggesters("first_name_suggestions",
										sug -> sug.prefix(query)
												.completion(c -> c.field("accused_first_name").skipDuplicates(true)))
								.suggesters("relative_name_suggestions",
										sug -> sug.prefix(query).completion(
												c -> c.field("accused_relative_name").skipDuplicates(true)))),
				Map.class);

		List<Suggestion<Map>> firstNameSuggestionsList = searchResponse.suggest().get("first_name_suggestions");
		System.out.println("First Name Suggestions:");
		for (Suggestion<Map> suggestion : firstNameSuggestionsList) {
			for (CompletionSuggestOption<Map> option : suggestion.completion().options()) {
				System.out.println(option.text());
			}
		}

But here all shards fail.

The query below works fine, meaning it returns all the related docs, but I don't see how to turn its results into suggestions for the search bar as the user types.

GET fir_index/_search
{
  "query": {
    "multi_match": {
      "query": "asw",
      "type": "bool_prefix",
      "fields": [
        "accused_first_name",
        "accused_first_name._2gram",
        "accused_first_name._3gram",
        "accused_relative_name",
        "accused_relative_name._2gram",
        "accused_relative_name._3gram"
      ]
    }
  }
}

In your Java code you are still using the completion suggester, which only works on fields mapped as completion, not search_as_you_type.
You must create a multi_match query instead and retrieve the fields that contain the name data.

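// Imports needed by this snippet (Elasticsearch Java API client and Jackson)
import co.elastic.clients.elasticsearch._types.query_dsl.MultiMatchQuery;
import co.elastic.clients.elasticsearch._types.query_dsl.TextQueryType;
import co.elastic.clients.elasticsearch.core.SearchRequest;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.util.List;

var searchRequest = new SearchRequest.Builder()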
    .index("idx_name")
    .query(MultiMatchQuery.of(m -> m
        .query("asw")
        .type(TextQueryType.BoolPrefix)
        .fuzziness("1")
        .fields(List.of(
            "accused_first_name",
            "accused_first_name._2gram",
            "accused_first_name._3gram",
            "accused_relative_name",
            "accused_relative_name._2gram",
            "accused_relative_name._3gram")
        )
    )._toQuery())
    .build();

var response = client.search(searchRequest, ObjectNode.class);

// extract accused_first_name and accused_relative_name

Thanks for the reply @RabBit_BR,
I got this working, but it returns the whole document instead of individual suggestions like the suggest query does. That makes it difficult to show suggestions, because we don't know which field matched.

This method should return a list of suggestions to show in the search bar as the user types, whether the text matches accused_first_name or accused_relative_name. If we get the whole doc back, I think it is difficult to extract that particular suggestion.
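
One possible way to bridge that gap is to post-process the hits on the client. The sketch below is plain Java with no Elasticsearch dependency; SuggestionExtractor, looksMatched and suggestions are hypothetical names, and each hit's _source is assumed to be available as a Map. It re-checks the two name fields against the typed text (earlier terms as whole words, the last term as a prefix) and collects the distinct values that match. It does not reproduce fuzzy matching, so a hit that matched only through fuzziness would be dropped.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical client-side helper: turns full hits into a flat suggestion list.
final class SuggestionExtractor {

    static final List<String> NAME_FIELDS =
            List.of("accused_first_name", "accused_relative_name");

    // True when any typed term matches a word of value: earlier terms must
    // equal a word, the last (still being typed) term only needs to be a prefix.
    static boolean looksMatched(String value, String typed) {
        String[] words = value.toLowerCase().split("\\s+");
        String[] terms = typed.trim().toLowerCase().split("\\s+");
        for (int i = 0; i < terms.length; i++) {
            boolean isLast = (i == terms.length - 1);
            for (String word : words) {
                if (isLast ? word.startsWith(terms[i]) : word.equals(terms[i])) {
                    return true;
                }
            }
        }
        return false;
    }

    // Collects the distinct name values that match, in hit order.
    static List<String> suggestions(List<Map<String, Object>> sources, String typed) {
        if (typed == null || typed.isBlank()) {
            return List.of();
        }
        Set<String> result = new LinkedHashSet<>();
        for (Map<String, Object> source : sources) {
            for (String field : NAME_FIELDS) {
                Object value = source.get(field);
                if (value instanceof String && looksMatched((String) value, typed)) {
                    result.add((String) value);
                }
            }
        }
        return List.copyOf(result);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = List.of(
                Map.of("accused_first_name", "spider man", "accused_relative_name", "super man"),
                Map.of("accused_first_name", "bat man", "accused_relative_name", "super woman"));
        // prints: [spider man, super man, bat man]
        System.out.println(suggestions(docs, "man"));
    }
}
```

The LinkedHashSet keeps the relevance order of the hits while removing duplicates, which plays the role skipDuplicates had in the suggest query.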

  • One more point: I tested inputs with minor typos, but fuzziness is not working (e.g. mna instead of man).

  • The 4th point also does not seem to work with this -
    If the user types super wo, they should get only super woman, not both super man and super woman.
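
Both observations seem consistent with how the documentation describes bool_prefix: every term except the last becomes a term query, the last term becomes a prefix query, and the clauses are combined with OR by default. The plain-Java model below is a simplification for reasoning only, not Elasticsearch's implementation; BoolPrefixModel and its requireAll flag are hypothetical names. Fuzziness is left out of the model on purpose: as far as I understand the docs, it is not applied to the prefix query built from the final term, and a single-term input like mna is exactly that final term.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of bool_prefix matching on one field value.
final class BoolPrefixModel {

    // Earlier terms must equal a word; the last term must be a prefix of a word.
    // requireAll = false models the default OR, true models operator AND.
    static boolean matches(String value, String typed, boolean requireAll) {
        List<String> words = Arrays.asList(value.toLowerCase().split("\\s+"));
        String[] terms = typed.trim().toLowerCase().split("\\s+");
        int matched = 0;
        for (int i = 0; i < terms.length; i++) {
            String term = terms[i];
            boolean isLast = (i == terms.length - 1);
            boolean hit = isLast
                    ? words.stream().anyMatch(w -> w.startsWith(term))
                    : words.contains(term);
            if (hit) {
                matched++;
            }
        }
        return requireAll ? matched == terms.length : matched > 0;
    }

    public static void main(String[] args) {
        // Default OR: "super" alone is enough, so both names match.
        System.out.println(matches("super man", "super wo", false));   // true
        System.out.println(matches("super woman", "super wo", false)); // true
        // AND: "wo" must also prefix-match a word, so only super woman stays.
        System.out.println(matches("super man", "super wo", true));    // false
        System.out.println(matches("super woman", "super wo", true));  // true
        // A lone "mna" is only a prefix clause; no word starts with it.
        System.out.println(matches("spider man", "mna", false));       // false
    }
}
```

If this model is right, adding "operator": "and" to the multi_match query should address the 4th point (in the Java client I believe this is .operator(Operator.And) on the MultiMatchQuery builder), while typo tolerance on the term still being typed would need a different mechanism.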

@RabBit_BR, could you please suggest whether we can meet our expectations with this approach, whether there is another way, or whether I am doing something wrong here?

Thanks