Interesting highlighter issue

(Imran Azad) #1

I am using a multi-field and I'm having an interesting issue with highlighting that I'm surprised no one else has posted yet.

I have a field called title, and within it I have a sub-field called merged_title, which has an analyzer defined on it that strips hyphens and merges the word. So if I search for "anti-emetic", it highlights only on title, because the original document text contained "anti-emetic" and was therefore tokenized individually. However, when you search for "antiemetic" and the original document text contains "antiemetic" without a hyphen, it highlights on both the title field and the merged_title field. Is it possible to have it highlight on only one field in this instance?

(Nik Everett) #2

Have a look at matched_fields. I made it to solve problems like that a couple of years ago. I'm not 100% sure it covers the use case you have, but it's pretty close.

(Imran Azad) #3

Thanks Nik, wow, that looks awesome. I just had a read of the docs and tried it out; however, I'm not getting any highlights whatsoever coming back:

    "highlight" : {
        "fields" : {
            "*title" : {
                "matched_fields" : ["*title", "*merge_title"],
                "fragment_size" : 200,
                "pre_tags" : ["<mark>"],
                "post_tags" : ["</mark>"]
            }
        }
    }

I've also set "store" : "yes" and "term_vector" : "with_positions_offsets" for both fields, as I wasn't sure which field should have these options. Any ideas what I'm doing wrong? Btw, my fields are heavily nested, hence the wildcard characters.


(Nik Everett) #4

Setting "store" typically isn't required and can actually make things worse. Only do it after testing without it first.

As for term_vector, setting it on both fields or on neither is OK. Setting it on just one is likely to fail.

I've never tried using wildcard characters with matched_fields. Try without the wildcards and if that works then we'll know. It just wasn't something I needed or thought about.
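
For reference, a minimal sketch of what a request without wildcards might look like, written as a small Python helper that builds the highlight section. The field paths `title` and `title.merged_title` are assumptions for illustration only (the real mapping in this thread is nested more deeply), and `build_highlight` is a hypothetical helper, not part of any client library. The `"type": "fvh"` setting reflects the documented pairing of matched_fields with the fast vector highlighter, which relies on term vectors.

```python
import json

# Hypothetical full field paths; replace with the real nested paths.
TITLE_FIELD = "title"
MERGED_FIELD = "title.merged_title"


def build_highlight(field, matched_fields):
    """Build a highlight section that folds matches from several
    fields into the highlighting of a single field."""
    return {
        "fields": {
            field: {
                # Assumption: fast vector highlighter, which needs
                # term_vector = with_positions_offsets on the fields.
                "type": "fvh",
                "matched_fields": list(matched_fields),
                "fragment_size": 200,
                "pre_tags": ["<mark>"],
                "post_tags": ["</mark>"],
            }
        }
    }


highlight = build_highlight(TITLE_FIELD, [TITLE_FIELD, MERGED_FIELD])
print(json.dumps({"highlight": highlight}, indent=2))
```

If this returns highlights while the wildcard version does not, that would point at the wildcards rather than at the mapping.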

(Imran Azad) #5

Hey Nik, thanks for the swift response. You're right, the issue does seem to be with the wildcards. Do I have any options for moving forward at all? I'm working on a project and I'm so close to finishing it. This is one of the last requirements; it's almost there! On a side note, is it necessary to specify "*title" twice, once under "fields" and then again in "matched_fields"? I removed it from "matched_fields" and it still worked, returning the highlighted text for *title even though I searched explicitly in "*merge_title".
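
One possible way forward, sketched under assumptions rather than confirmed in this thread: expand the wildcard on the client side and send explicit field names, so that matched_fields never sees a pattern. The sketch assumes the full field paths are already known (for example, collected from the index mapping); `expand_highlight_fields`, the sample paths, and the sub-field name `merged_title` are all hypothetical.

```python
from fnmatch import fnmatchcase


def expand_highlight_fields(known_fields, pattern="*title", sub_field="merged_title"):
    """Turn a wildcard pattern into explicit highlight entries,
    pairing each matching field with its merged sub-field."""
    known = set(known_fields)
    suffix = "." + sub_field
    fields = {}
    for path in known_fields:
        # Skip the sub-fields themselves; they are only used as matched_fields.
        if path.endswith(suffix):
            continue
        if not fnmatchcase(path, pattern):
            continue
        matched = [path]
        merged = path + suffix
        if merged in known:
            matched.append(merged)
        fields[path] = {
            "type": "fvh",  # assumption: fast vector highlighter
            "matched_fields": matched,
            "fragment_size": 200,
            "pre_tags": ["<mark>"],
            "post_tags": ["</mark>"],
        }
    return {"fields": fields}


# Hypothetical field paths standing in for a heavily nested mapping.
sample_fields = [
    "book.chapter.title",
    "book.chapter.title.merged_title",
    "book.author",
]
expanded = expand_highlight_fields(sample_fields)
```

Each explicit entry lists the field itself plus its merged sub-field, so matches from either should be folded into a single highlighted field.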
