Heisenbug with Percolator


(Adam Georgiou) #1

Disclaimer/Naivety Hedge

I'm not really sure how I'd research the history of this issue, or whether
it is in fact an issue or just ignorance on my part, but its nature is
elusive as far as I can tell, so I'm elaborating here...

Description

I have a query in my percolator index that I expect to match a given
document.
I percolate the document and the query is not returned.
I retrieve the query via a GET request and dump the body of the query
into a file.
I then index the contents of that file, without modifying it, into the same
index's '.percolator' type, with a different id.
(In other words, I've re-indexed the afflicted query without modifying it.)
Re-percolating the same document now returns the newly indexed query, while
still excluding the original identical query.
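
For anyone who wants to retry the steps above, here is a minimal sketch in Python. It only builds the REST calls (method, path, body) and does not send them. The index name "news", the document type "article" and both query ids are hypothetical placeholders, not values from my setup; the paths follow the 1.x convention of storing queries under the '.percolator' type and percolating through the '_percolate' endpoint.

```python
import json

INDEX = "news"        # hypothetical index name
DOC_TYPE = "article"  # hypothetical type of the document being percolated


def get_query(query_id):
    """Build the request that fetches a registered percolator query.

    The stored query body comes back under "_source" in the response.
    """
    return ("GET", "/%s/.percolator/%s" % (INDEX, query_id), None)


def register_query(query_id, source):
    """Build the request that indexes `source`, unmodified, under `query_id`."""
    path = "/%s/.percolator/%s" % (INDEX, query_id)
    return ("PUT", path, json.dumps(source, sort_keys=True))


def percolate(doc):
    """Build the request that percolates `doc` against the registered queries."""
    path = "/%s/%s/_percolate" % (INDEX, DOC_TYPE)
    return ("GET", path, json.dumps({"doc": doc}, sort_keys=True))


def same_source(a, b):
    """True if two query bodies are identical once key order is normalised."""
    return json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)


def reproduction_steps(original_id, copy_id, source, doc):
    """Requests in the order described: percolate, get, re-index, re-percolate."""
    return [
        percolate(doc),
        get_query(original_id),
        register_query(copy_id, source),
        percolate(doc),
    ]
```

Comparing the "_source" of the original and of the copy with same_source is a quick way to confirm that the two registered queries really are identical before suspecting anything deeper.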

The only thing I can think of is that the mapping for .percolator was
somehow different at the time the original query was indexed, so that what
is stored in Lucene for that query differs from what is stored for the new
query. But I don't have a good enough understanding of how mappings and
storage work for the percolator, and as far as I've read this isn't
covered in the documentation. Is the above scenario possible?

Metadata

  • elasticsearch version 1.1.0
  • 2 nodes, 1 shard, 0 replicas (it's a testing environment)
  • // Query
    {
      "news_id": "0000000075-nid",
      "query": {
        "filtered": {
          "filter": {
            "term": {
              "product": "some_product"
            }
          },
          "query": {
            "multi_match": {
              "fields": [
                "random field", "random_field_2", "random_field_3"
              ],
              "query": "gasoline"
            }
          }
        }
      }
    }
  • The document percolated includes the word "gasoline".

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/44a1e697-02ee-42fa-b715-14d832a3cd8c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Martijn Van Groningen) #2

Hi,

Can you also share the document being percolated? I would expect the query
to match if gasoline occurs in it and 'product' is equal to 'some_product'.

You may have run into a bug regarding the percolator and mappings:
https://github.com/elasticsearch/elasticsearch/pull/5776
That has been fixed in 1.1.1, so could you check whether this issue also
occurs with ES 1.1.1?

Martijn


--
Kind regards,

Martijn van Groningen



(Adam Georgiou) #3

{
  "doc": {
    "random field": [
      "\n\nMay 04--The 49ers employ an All-Pro linebacker whose college career appeared to foreshadow off-the-field trouble in the NFL. He was involved in an on-campus fight, suspended by his head coach and admitted to gasolining"
    ],
    "product": "KRT"
  }
}

The above is similar to the document I was using, with the value of
"random field" modified by hand and some extra key:value pairs removed.
(Note the implied stemming here: the re-indexed identical query referred to
above matched, as stemming was applied correctly.)

On Friday, May 9, 2014 6:16:14 AM UTC-4, Martijn v Groningen wrote:

Hi,

Can you also share the document being percolated? I would expect the query
to match if gasoline occurs in it and 'product' is equal to 'some_product'.

You may have run into a bug regarding the percolator and mappings:
https://github.com/elasticsearch/elasticsearch/pull/5776
That has been fixed in 1.1.1, so could you check whether this issue also
occurs with ES 1.1.1?

Martijn


