Real-time match analysis

Hello all, I was wondering if anyone could offer some feedback on whether
there is a way to determine, in real time, how a document matched. I
currently use custom analyzers at index time to allow a broad array of
matches for a given text field. I try to match on phrases, synonyms,
substrings, stemming, etc. of a given phrase, and I would like to be able to
figure out at search time which analyzer was responsible for the
match.

Currently, I've gotten around this by creating child documents where the
fields are fanned out to their respective analyzer types. So I have a child
document whose field applies only stemming, another that uses only
synonyms, etc. However, given the growing number of fields that require
analysis and the growth of my data set, I'd much prefer to have fewer
(and less complex) documents. I was hoping there would be a way to tag
tokens at the analysis phase that could be used at the search phase to
quickly determine my match level, but I was not able to find anything like
this.

Having said that, has anyone else ever tried to figure this out, or have any
thoughts on how to leverage ES at a lower level to determine the match?
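
One possibility, sketched here under assumptions and not something proposed in the thread: keep a single document, index the text into one sub-field per analysis strategy, and give each query clause a `_name` so that each hit reports which clauses matched in its `matched_queries` list. The sub-field names (`title.stemmed`, `title.synonym`, `title.ngram`) and the helper functions below are hypothetical; only the request and response shapes are built, and no cluster is contacted.

```python
import json

# Hypothetical sub-field names: one per analysis strategy, all stored on the
# same document (for example, declared under "fields" in the "title" mapping).
SUBFIELDS = {
    "exact": "title",
    "stemmed": "title.stemmed",
    "synonym": "title.synonym",
    "ngram": "title.ngram",
}


def build_named_query(text, subfields=SUBFIELDS):
    """Build a bool query with one named match clause per sub-field."""
    should = [
        {"match": {field: {"query": text, "_name": label}}}
        for label, field in subfields.items()
    ]
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


def match_levels(hit):
    """Return the sorted labels of the named clauses that matched one hit."""
    return sorted(hit.get("matched_queries", []))


if __name__ == "__main__":
    # Print the request body that would be sent to the search endpoint.
    print(json.dumps(build_named_query("running shoes"), indent=2))
```

The label on each clause stands in for the "match level", so the trade-off is one extra sub-field per analyzer instead of one extra child document per analyzer. Whether that is cheaper for a given data set would need to be measured.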

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4222f994-d448-4b61-a71e-3dca03a5a0fc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Just a friendly bump to see if anyone has any feedback. :)

(Replying to the original post by Ed Kim, Saturday, January 10, 2015 at 10:38:34 PM UTC-8.)

What about explain?

(Replying to Ed Kim's bump of Wed, Jan 14, 2015 at 3:24 PM.)

I was able to identify which field matched via explain, but couldn't see
any information on which token filter was the reason for the match. I've
tried both with and without specifying the analyzer name that the field
uses. If explain is supposed to provide this data, I will give it
another go and set up a test index with simpler analyzer setups.
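
To illustrate what "identifying the field via explain" can look like, here is a minimal sketch that walks an explanation tree and collects the field and indexed term named in each `weight(field:term in doc)` description. The sample tree is hand-written for illustration (it is not real output), and the field names are hypothetical. Consistent with the observation above, this yields the field and the term that matched, but nothing about which token filter produced that term.

```python
import re

# Lucene-style explanations describe each scoring term as
# "weight(<field>:<term> in <doc>) ...". Capture the field and the term.
WEIGHT_RE = re.compile(r"weight\(([^:()]+):(.+?) in \d+\)")


def matched_terms(explanation):
    """Collect (field, term) pairs from a nested explanation dict."""
    found = []
    match = WEIGHT_RE.search(explanation.get("description", ""))
    if match:
        found.append((match.group(1), match.group(2)))
    # Recurse into child nodes; "details" may be missing or empty.
    for child in explanation.get("details") or []:
        found.extend(matched_terms(child))
    return found


# Hand-written sample shaped like an "_explanation" entry (assumption).
SAMPLE = {
    "value": 0.5,
    "description": "sum of:",
    "details": [
        {
            "value": 0.3,
            "description": "weight(title.stemmed:run in 0) [PerFieldSimilarity], result of:",
            "details": [],
        },
        {
            "value": 0.2,
            "description": "weight(title.synonym:jog in 0) [PerFieldSimilarity], result of:",
            "details": [],
        },
    ],
}

if __name__ == "__main__":
    print(matched_terms(SAMPLE))
```

Parsing description strings is brittle, since their wording is not a stable contract, so this is better suited to debugging than to a production code path.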

Also, in order to do this, I will need to run the explain separately from the
search itself. My ultimate goal is to be able to do this in under 10
milliseconds. Is this feasible with explain?
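
On the point about a separate call: the search request body accepts an `explain` flag, in which case each hit carries an `_explanation` tree in the same response. The sketch below only builds such a body and reads explanations out of a response-shaped dict; the field name, query text, and helper names are hypothetical, and it makes no claim about whether the 10 ms target is reachable, which would have to be measured.

```python
import json


def search_body_with_explain(field, text, size=10):
    """Build a search request body that asks for per-hit explanations."""
    return {
        "size": size,
        "explain": True,  # each hit should then include an "_explanation" tree
        "query": {"match": {field: text}},
    }


def explanations_by_id(response):
    """Map each hit's _id to its explanation tree (None if absent)."""
    hits = response.get("hits", {}).get("hits", [])
    return {hit["_id"]: hit.get("_explanation") for hit in hits}


if __name__ == "__main__":
    # Print the request body that would be sent to the search endpoint.
    print(json.dumps(search_body_with_explain("title", "running shoes")))
```

Computing explanations adds work per hit, so keeping `size` small is one way to bound the extra cost.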

(Replying to Nikolas Everett's message of Wednesday, January 14, 2015 at 12:51:15 PM UTC-8: "What about explain?")