Hey everyone,
I'm running into an issue where a document isn't matched by a multi_match query when I append a wildcard to the search string. I'm trying to search case-insensitively across multiple fields, including nested names and email addresses. Even though the index uses a custom analyzer with the standard tokenizer and the lowercase filter, searching for "Nicole*" (which I expect to match both "Nicole L***" and "nicole@***.org" in the document below) does not match the document.
---document
{"_index"=>"appointments",
"_id"=>"22676738",
"_version"=>1,
"_seq_no"=>4185,
"_primary_term"=>1,
"found"=>true,
"_source"=>
{"starts_at"=>"2024-03-18T02:00:00.000-07:00",
"ends_at"=>"2024-03-18T04:00:00.000-07:00",
"status"=>"Active",
"planned_duration"=>7200,
"users"=>[],
"contact"=>{"id"=>222, "full_name"=>"Nicole L***", "emails"=>["nicole@***.org"], "phone_numbers"=>["+****"]},
"address"=>{"line1"=>"***", "city"=>"**", "state"=>"**", "postal_code"=>"**"},
"account_id"=>2}}
---query
client.explain(index: 'appointments', id: 22676738, body: {
  query: {
    bool: {
      should: [
        {
          bool: {
            must: {
              multi_match: {
                query: "Nicole*",
                fields: ["contact.full_name^10", "contact.emails^10", "contact.phone_numbers", "users.name", "status", "address.line1", "address.line2", "address.city", "address.state", "address.postal_code"]
              }
            },
            filter: {
              term: {
                account_id: 2
              }
            }
          }
        }
      ]
    }
  }
})
---explanation
{"_index"=>"appointments",
"_id"=>"22676738",
"matched"=>false,
"explanation"=>
{"value"=>0.0,
"description"=>"Failure to meet condition(s) of required/prohibited clause(s)",
"details"=>
[{"value"=>0.0,
"description"=>
"no match on required clause (((contact.full_name:nicole)^10.0 | address.line2:nicole | address.postal_code:nicole | address.line1:nicole | (contact.emails:nicole)^10.0 | users.name:nicole | address.city:nicole | address.state:nicole | contact.phone_numbers:nicole | status:nicole))",
"details"=>[{"value"=>0.0, "description"=>"No matching clause", "details"=>[]}]},
{"value"=>0.0,
"description"=>"match on required clause, product of:",
"details"=>[{"value"=>0.0, "description"=>"# clause", "details"=>[]}, {"value"=>1.0, "description"=>"account_id:[2 TO 2]", "details"=>[]}]}]}}
---index mapping
INDEX_MAPPING = {
  settings: {
    analysis: {
      analyzer: {
        default: {
          type: "custom",
          tokenizer: "standard",
          filter: ["lowercase"]
        }
      }
    }
  },
  mappings: {
    properties: {
      starts_at: {
        type: "date"
      },
      ends_at: {
        type: "date"
      },
      contact_name: {
        type: "text"
      },
      status: {
        type: "text"
      },
      planned_duration: {
        type: "long"
      },
      users: {
        type: "nested",
        properties: {
          id: {type: "long"},
          initials: {type: "text"},
          color: {type: "text"},
          name: {type: "text"}
        }
      },
      contact: {
        type: "nested",
        properties: {
          id: {type: "long"},
          full_name: {type: "text"},
          emails: {type: "text"},
          phone_numbers: {type: "text"}
        }
      },
      address: {
        type: "nested",
        properties: {
          line1: {type: "text"},
          line2: {type: "text"},
          city: {type: "text"},
          state: {type: "text"},
          postal_code: {type: "text"}
        }
      },
      account_id: {type: "long"},
    }
  }
}
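One thing I noticed while writing this up: `contact`, `users` and `address` are all mapped as `nested`, and as far as I understand, fields under a nested mapping only match from inside a `nested` query. The explanation above also shows the term as plain `nicole`, so the `*` seems to be dropped by the analyzer rather than treated as a wildcard. Below is a minimal sketch of the query body I think I would need, using `multi_match` with `type: "phrase_prefix"` in place of the wildcard. `nested_clause` and `build_search_body` are hypothetical helpers of my own (not part of the Ruby client), and I have not verified this against my cluster yet.

```ruby
# Hypothetical helper (not part of the elasticsearch-ruby client): wraps a
# multi_match in a nested query so fields under a `nested` mapping can match.
def nested_clause(path, fields, text)
  {
    nested: {
      path: path,
      query: {
        multi_match: {
          query: text,
          type: "phrase_prefix", # prefix-matches the last term, so no "*" is needed
          fields: fields
        }
      }
    }
  }
end

# Hypothetical helper: builds the whole search body for one account.
def build_search_body(text, account_id)
  {
    query: {
      bool: {
        should: [
          nested_clause("contact", ["contact.full_name^10", "contact.emails^10", "contact.phone_numbers"], text),
          nested_clause("users", ["users.name"], text),
          nested_clause("address", ["address.line1", "address.line2", "address.city", "address.state", "address.postal_code"], text),
          # `status` is a top-level text field, so it needs no nested wrapper
          { match_phrase_prefix: { status: { query: text } } }
        ],
        minimum_should_match: 1, # at least one of the clauses above must match
        filter: { term: { account_id: account_id } }
      }
    }
  }
end

body = build_search_body("Nicole", 2)
```

The idea would be to pass `body` as the `body:` argument to `client.search` or `client.explain` with the same index and id as in my query above. Is that the right direction, or is there a simpler way to do this?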
Sorry if this is blatantly obvious. I guess I'm struggling with some of the underlying concepts, which is why I couldn't figure this out on my own.
Thank you in advance.