Matching a document against a keyword - exact "substring" match between document and keywords


(Raul, Jr. Martinez) #1

Hello,

My goal is to return only those documents whose value exactly matches
(including word position) some part/substring of the keywords.

To illustrate, I have an index/type that contains the following documents:

    -----           -----           ---------------------
    concept         entity          value
    -----           -----           ---------------------
    part            part            Mirror Motor           
    part            brand           Honda Collection       
    part            brand           Honda Apparel          
    vehicle         make            Honda                  
    part            brand           Honda by Moose         
    part            part            Mirror                 
    part            part            Telescoping Mirror     
    part            part            Mirror Back            
    part            part            Mirror Bra             
    part            part            Mirror Hardware        
    part            part            Mirror Mount           
    part            part            Spot Mirror            

I want to ensure that when the keywords are "Honda Civic Mirror", only the
following documents are returned (or at least get the highest _score
value):

    vehicle         make            Honda                   
    part            part            Mirror                  

If the keywords are "Mirror Mount for Honda", only the following documents
are to be returned (or at least get the highest _score value):

    vehicle         make            Honda                   
    part            part            Mirror Mount                

Is there any index and search analyzer that will allow me to achieve this
goal?

Thanks,

Raul


(Karussell) #2

Exact matching is a different thing ... doesn't the normal text query
achieve what you want?

Peter.



(Raul, Jr. Martinez) #3

Hello Peter.

Normal text query works; I actually use query_string. However, when
executing a query based on the supplied keywords, the engine tries to match
the whole or parts of the keywords against the searchable document field.

Right now, with the current ES index that I have, searching for "Honda
Civic Mirror" produces the following results:


    -----           -----           ---------------------   --------
    concept         entity          value                   _score
    -----           -----           ---------------------   --------
    vehicle         model           Civic                   0.638050
    vehicle         make            Honda                   0.579641
    part            brand           Honda Collection        0.526310
    part            brand           Honda Apparel           0.524273
    part            brand           Honda by Moose          0.507186
    part            part            Mirror                  0.476971
    vehicle         model           Civic del Sol           0.449377
    part            part            Mirror Arm              0.417349
    part            part            Mirror Switch           0.417349

I was hoping to find a way to exclude "part/brand = Honda Collection" from
the list, because my keywords do not contain the word "Collection".

I am looking for the reverse of search (but not using percolation), where
document fields are checked to make sure that their value is a substring
(or exact match) of the keywords. This should ensure that "Honda
Collection" or "Mirror Arm" will not appear, because neither "Collection"
nor "Arm" is part of my example keywords. The desired result would be:


    -----           -----           ---------------------
    concept         entity          value
    -----           -----           ---------------------
    vehicle         model           Civic
    vehicle         make            Honda
    part            part            Mirror
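
To make the "reverse of search" idea concrete, here is a minimal sketch in plain Python, outside of Elasticsearch. It treats the stored values as a dictionary and scans the keywords for the longest run of consecutive words that equals a stored value exactly. The function names, the lowercase/alphanumeric tokenization, and the greedy longest-match rule are all assumptions made for illustration; how this would map onto an index or search analyzer is left open.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match_values(keywords, values):
    """Return stored values that equal a contiguous run of keyword words.

    Scans left to right and prefers the longest run at each position, so
    "Mirror Mount" wins over "Mirror" when both are stored values.
    """
    # Normalized value -> original value, for exact whole-value lookups.
    lookup = {" ".join(tokenize(v)): v for v in values}
    tokens = tokenize(keywords)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for size in range(len(tokens) - i, 0, -1):
            phrase = " ".join(tokens[i:i + size])
            if phrase in lookup:
                matches.append(lookup[phrase])
                i += size  # skip the words consumed by this match
                break
        else:
            i += 1  # no stored value starts with this word
    return matches

# Values taken from the tables in this thread.
VALUES = [
    "Mirror Motor", "Honda Collection", "Honda Apparel", "Honda",
    "Honda by Moose", "Mirror", "Telescoping Mirror", "Mirror Back",
    "Mirror Bra", "Mirror Hardware", "Mirror Mount", "Spot Mirror",
    "Civic", "Civic del Sol", "Mirror Arm", "Mirror Switch",
]

print(match_values("Honda Civic Mirror", VALUES))      # ['Honda', 'Civic', 'Mirror']
print(match_values("Mirror Mount for Honda", VALUES))  # ['Mirror Mount', 'Honda']
```

With this rule "Honda Collection" and "Mirror Arm" are never returned, because no run of words in the keywords equals the whole stored value.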

What I am looking for here is hard to explain; I apologize. I'm interested
to hear whether someone out there has done something similar.

Regards,
Raul




(rqueen) #4

Hi Raul,
I am looking for the same scenario, with the results you expected. Can you
suggest how you solved this search issue? It's very urgent for me.

