How to match a field and a sentence with capital letter and special character?

Hi,
I'm trying to build an index of documents with two fields, "name" and "description", and then search those documents.
I'm using the Java RestHighLevelClient 6.7.0.
I'm building a document like this:

{
  "name": "Mü",
  "description": "My name is Mü"
}

and I want to match it when searching for the word "mu".

So far I have managed to:

  1. build an index
  2. set an analyzer for indexing
  3. set an analyzer in the mapping
  4. insert a document
  5. search for a document

The problem is that I always get 0 hits in the search response. Which analyzer should I use? And is a compound query suited to this kind of search?

The main method:

  public static void main(String[] args) throws IOException
  {
    String indexName = "debug-index";
    String token = "mu";

    ESClient es = new ESClient();
    es.createIndex(indexName);

    es.setIndexationAnalyzer(indexName);
    es.setMappingAnalyzer(indexName);

    es.insertDocument(indexName);

    es.searchDocument(indexName, token);

    es.deleteIndex(indexName);

    es.close();
  }

The ESClient class:

public class ESClient {
  private RestHighLevelClient client;

  public ESClient() {
    client = new RestHighLevelClient(RestClient.builder(HttpHost.create("http://localhost:9200")));
  }

  public void createIndex(String name) throws IOException {
    CreateIndexRequest request = new CreateIndexRequest(name);
    CreateIndexResponse createIndexResponse = client.indices().create(request, RequestOptions.DEFAULT);
    System.out.print("index is created : ");
    System.out.println(createIndexResponse.isAcknowledged());
  }

  public void createIndexWithMapping(String indexName) throws IOException {
    CreateIndexRequest request = new CreateIndexRequest(indexName);
    CreateIndexResponse createIndexResponse = client.indices().create(request, RequestOptions.DEFAULT);
    System.out.print("index is created : ");
    System.out.println(createIndexResponse.isAcknowledged());
  }

  public void deleteIndex(String indexName) throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest(indexName); 
    AcknowledgedResponse deleteIndexResponse = 
    client.indices().delete(request, RequestOptions.DEFAULT);
    System.out.print("index is deleted : ");
    System.out.println(deleteIndexResponse.isAcknowledged());
  }

  public void setIndexationAnalyzer(String indexName) throws IOException {
    CloseIndexRequest closeRequest = new CloseIndexRequest(indexName);
    AcknowledgedResponse closeIndexResponse = 
    client.indices().close(closeRequest, RequestOptions.DEFAULT);
    System.out.print("index is closed : ");
    System.out.println(closeIndexResponse.isAcknowledged());

    //Set Indexing analyzer
    String tokenizer = "standard";
    List<String> filter = new ArrayList<String>();
    filter.add("lowercase");

    Map<String, Object> rebuilt_standard = new HashMap<>();
    rebuilt_standard.put("tokenizer", tokenizer);
    rebuilt_standard.put("filter", filter);
    
    Map<String, Object> analyzer = new HashMap<>();
    analyzer.put("rebuilt_standard", rebuilt_standard);

    Map<String, Object> analysis = new HashMap<>();
    analysis.put("analyzer", analyzer);

    Map<String, Object> settings = new HashMap<>();
    settings.put("analysis", analysis);

    UpdateSettingsRequest requestSettings = new UpdateSettingsRequest(indexName);
    requestSettings.settings(settings);
    AcknowledgedResponse updateSettingsResponse =
    client.indices().putSettings(requestSettings, RequestOptions.DEFAULT);
    System.out.print("index settings are updated : ");
    System.out.println(updateSettingsResponse.isAcknowledged());

    // Open index 
    OpenIndexRequest openRequest = new OpenIndexRequest(indexName); 
    OpenIndexResponse openIndexResponse = client.indices().open(openRequest, RequestOptions.DEFAULT);
    System.out.print("index is open : ");
    System.out.println(openIndexResponse.isAcknowledged());
  }


  public void setMappingAnalyzer(String indexName) throws IOException {
    //Set Search analyzer (mapping)
    Map<String, Object> name = new HashMap<>();
    name.put("type", "text");
    name.put("analyzer", "rebuilt_standard");
    name.put("search_analyzer", "rebuilt_standard");

    Map<String, Object> description = new HashMap<>();
    description.put("type", "text");
    description.put("analyzer", "rebuilt_standard");
    description.put("search_analyzer", "rebuilt_standard");

    Map<String, Object> properties = new HashMap<>();
    properties.put("name", name);
    properties.put("description", description);

    Map<String, Object> mappingSource = new HashMap<>();
    mappingSource.put("properties", properties);
  
    PutMappingRequest requestMapping = new PutMappingRequest(indexName);
    requestMapping.source(mappingSource);

    AcknowledgedResponse putMappingResponse = 
    client.indices().putMapping(requestMapping, RequestOptions.DEFAULT);
    System.out.print("index mapping is updated : ");
    System.out.println(putMappingResponse.isAcknowledged());
  }

  public void insertDocument(String indexName) throws IOException {
    Map<String, Object> document = new HashMap<>();
    document.put("name", "Mü");
    document.put("description", "My name is Mü");

    IndexRequest indexRequest = new IndexRequest(indexName, "_doc")
    .id("1").source(document);
    IndexResponse indexResponse = client.index(indexRequest, RequestOptions.DEFAULT);
    System.out.print("document is indexed : ");
    System.out.println(indexResponse.status());
  }

  public void searchDocument(String indexName, String token) throws IOException {
    DisMaxQueryBuilder query = new DisMaxQueryBuilder();
    query.add(QueryBuilders.termQuery("name", token)); 
    query.add(QueryBuilders.termQuery("description", token)); 
    query.add(QueryBuilders.matchPhraseQuery("description", token)); 

    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(query);
    

    SearchRequest searchRequest = new SearchRequest(); 
    searchRequest.indices(indexName);
    searchRequest.source(searchSourceBuilder);
    
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    RestStatus status = searchResponse.status();
    System.out.println("Request status : ");
    System.out.println(status);

    SearchHits hits = searchResponse.getHits();
    long totalHits = hits.getTotalHits();
    System.out.println("totalHits : ");
    System.out.println(totalHits);


  }

  public void close() throws IOException {
    client.close();
  }
}

I'd use an analyzer based on the asciifolding token filter.
The French analyzer uses that, for example.

See https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-asciifolding-tokenfilter.html and https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html#french-analyzer
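
For reference, here is a minimal sketch of how the analysis settings from the question might be extended with the asciifolding filter. The class AnalyzerSettings and the method foldingAnalyzerSettings are hypothetical names made up for this example; only the analyzer name "rebuilt_standard" and the "standard" tokenizer come from the question. The sketch only builds the settings map, it does not talk to Elasticsearch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Elasticsearch client.
class AnalyzerSettings {

  // Builds index settings that define a custom analyzer which lowercases
  // tokens and then folds accented characters to ASCII (e.g. "ü" -> "u").
  static Map<String, Object> foldingAnalyzerSettings(String analyzerName) {
    // Filter order matters: lowercase first, then asciifolding.
    List<String> filter = Arrays.asList("lowercase", "asciifolding");

    Map<String, Object> analyzerDef = new HashMap<>();
    analyzerDef.put("type", "custom");
    analyzerDef.put("tokenizer", "standard");
    analyzerDef.put("filter", filter);

    Map<String, Object> analyzer = new HashMap<>();
    analyzer.put(analyzerName, analyzerDef);

    Map<String, Object> analysis = new HashMap<>();
    analysis.put("analyzer", analyzer);

    Map<String, Object> settings = new HashMap<>();
    settings.put("analysis", analysis);
    return settings;
  }
}
```

In the code from the question, the returned map could presumably replace the hand-built one in setIndexationAnalyzer, e.g. requestSettings.settings(AnalyzerSettings.foldingAnalyzerSettings("rebuilt_standard")). With this analyzer on both fields, "Mü" should be indexed as the token "mu".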

OK, thanks. I guess the problem is elsewhere, since even with both the lowercase and the asciifolding filters I still get no hits.

EDIT: my bad, I had to wait a few seconds for the document to become searchable before querying the index. I got a match, thank you!

You need to call the refresh API if you don't want to wait.
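
As a rough sketch of what that could look like, the snippet below calls the REST refresh endpoint (POST /<index>/_refresh) directly with the JDK's HttpURLConnection. The class RefreshExample and its method names are hypothetical, and the host is assumed to be the same http://localhost:9200 used in the question. With the RestHighLevelClient, the equivalent should be client.indices().refresh(new RefreshRequest(indexName), RequestOptions.DEFAULT), or a refresh policy set on the IndexRequest; please check the 6.7 Javadoc before relying on either.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for illustration; not part of the Elasticsearch client.
class RefreshExample {

  // Builds the URL of the refresh endpoint for one index.
  static String refreshUrl(String host, String indexName) {
    return host + "/" + indexName + "/_refresh";
  }

  // Sends POST /<index>/_refresh and returns the HTTP status code.
  // After a successful refresh, documents indexed so far are searchable.
  static int refresh(String host, String indexName) throws IOException {
    HttpURLConnection conn = (HttpURLConnection)
        URI.create(refreshUrl(host, indexName)).toURL().openConnection();
    try {
      conn.setRequestMethod("POST");
      return conn.getResponseCode();
    } finally {
      conn.disconnect();
    }
  }
}
```

In the main method from the question, the call would go between es.insertDocument(indexName) and es.searchDocument(indexName, token), for example RefreshExample.refresh("http://localhost:9200", indexName).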
