Full Text, Fuzzy, Partial Search


(xeraa) #1

Assume the following scenario:

I have three documents, containing the text:

  • "Nagelstudio Werner"
  • "Nägellabor Meier"
  • "Naegeldesign Hoferer"

When searching for "nagel" I'd like to find all three documents, while
when searching for "nagelstudio" I'd like to find only the first
document.
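
To make the expected behaviour concrete, here is a minimal sketch in plain Java, with no Elasticsearch calls at all. It assumes the goal is "fold umlauts and their ae/oe/ue spellings, then match on token prefixes". The class name PrefixFoldingSketch and the methods fold and search are hypothetical names made up for this illustration, and the folding rules are deliberately crude.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: plain Java, not the Elasticsearch API.
class PrefixFoldingSketch {

    // The three example documents ("\u00e4" is the umlaut in "Nägellabor").
    static final List<String> DOCS = Arrays.asList(
            "Nagelstudio Werner", "N\u00e4gellabor Meier", "Naegeldesign Hoferer");

    // Lowercase, collapse "ae/oe/ue" transliterations, then strip diacritics.
    // Assumption: this crude folding is acceptable; it would mangle words like "neue".
    static String fold(String text) {
        String lower = text.toLowerCase(Locale.GERMAN)
                .replace("ae", "a").replace("oe", "o").replace("ue", "u");
        String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // A document is a hit if any of its folded tokens starts with the folded query.
    static List<String> search(String query) {
        String prefix = fold(query);
        List<String> hits = new ArrayList<>();
        for (String doc : DOCS) {
            for (String token : fold(doc).split("\\s+")) {
                if (token.startsWith(prefix)) {
                    hits.add(doc);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(search("nagel"));
        System.out.println(search("nagelstudio"));
    }
}
```

With this sketch, "nagel" hits all three documents and "nagelstudio" hits only the first, which is the result the query should reproduce.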

What would be the right query for that task (using the Java API)?

I've already tried the following, only having limited success -
searchTerm holding the text I'm searching for:

  • queryBuilder = fuzzyQuery("_all", searchTerm).minSimilarity(0.1f); ->
    Searching for "nagel", only finding "Nägellabor Meier" - matching
    "nagel" on "meier" with a low factor
  • queryBuilder = termQuery("_all", searchTerm); -> Not finding "nagel"
    at all, only "nagelstudio"
  • queryBuilder =
    fuzzyLikeThisQuery("_all").likeText(searchTerm).minSimilarity(0.1f);
    -> Only finding "meier" again
  • queryBuilder = moreLikeThisQuery("_all").likeText(searchTerm); ->
    Not finding the expected result for "nagel"
  • queryBuilder =
    queryString(searchTerm).field("_all").fuzzyMinSim(0.1f); -> Not
    finding the expected result for "nagel"
  • queryBuilder = fieldQuery("_all",
    searchTerm).phraseSlop(1000).fuzzyMinSim(0.1f).lowercaseExpandedTerms(true);
    -> Not finding the expected result for "nagel"
  • queryBuilder =
    fuzzyLikeThisFieldQuery("_all").likeText(searchTerm).minSimilarity(0.1f);
    -> Only finding "meier" again
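
The "meier" hits from the fuzzy variants may be explained by how the similarity is scored. The sketch below assumes the score is 1 - editDistance / min(queryLength, termLength), which is how older Lucene fuzzy matching is commonly described; that formula is an assumption here, not something confirmed in this thread. The class FuzzySimilaritySketch and its methods are hypothetical names, and the terms are the lowercased tokens of the three documents.

```java
// Hypothetical illustration only: plain Java, not the Elasticsearch or Lucene API.
class FuzzySimilaritySketch {

    // Standard Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed scoring: 1 - distance / length of the shorter string.
    static float similarity(String query, String term) {
        int distance = editDistance(query, term);
        return 1.0f - (float) distance / Math.min(query.length(), term.length());
    }

    public static void main(String[] args) {
        String[] terms = {"nagelstudio", "werner", "n\u00e4gellabor", "meier",
                          "naegeldesign", "hoferer"};
        for (String term : terms) {
            boolean match = similarity("nagel", term) >= 0.1f;
            System.out.println(term + " matches at 0.1: " + match);
        }
    }
}
```

Under that assumption, "meier" is four substitutions away from "nagel" over a length of five, giving 0.2, which clears the 0.1 threshold. "nagelstudio" is six edits away over the same length of five, giving a negative score, so the long compound words are rejected even though they contain the search term.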

Am I just missing the right method / option for it or is this a
problem of deeper misunderstanding?


(lowang) #2

I'm also interested in this subject. Can anyone point out how to do
error-tolerant search?


(system) #3