Incorrect offset/length values in phrase suggester result


(Ryan Tanner) #1

I'm trying to use phrase suggesters against multiple fields. I thought I
could just combine the results of each per-field suggestion but the offset
and length values are all incorrect. The offset is always 0 and the length
is always the length of the original string. This makes it impossible to
combine the results.

Any idea what's wrong with this query that would cause that?

{
"suggest": {
"text": "consipre busines devlopment",
"person_name": {
"phrase": {
"field": "name.suggestion",
"size": 1,
"real_word_error_likelihood": 0.95,
"max_errors": 0.5,
"gram_size": 2,
"direct_generator": [
{
"field": "name.suggestion",
"suggest_mode": "always",
"min_word_length": 1
}
],
"highlight": {
"pre_tag": "<em>",
"post_tag": "</em>"
}
}
},
"person_organization": {
"phrase": {
"field": "organizations.suggestion",
"size": 3,
"real_word_error_likelihood": 0.95,
"max_errors": 0.5,
"gram_size": 2,
"direct_generator": [
{
"field": "organizations.suggestion",
"suggest_mode": "always",
"min_word_length": 1
}
],
"highlight": {
"pre_tag": "<em>",
"post_tag": "</em>"
}
}
},
"person_role": {
"phrase": {
"field": "roles.suggestion",
"size": 3,
"real_word_error_likelihood": 0.95,
"max_errors": 0.5,
"gram_size": 2,
"direct_generator": [
{
"field": "roles.suggestion",
"suggest_mode": "always",
"min_word_length": 1
}
]
}
},
"domain_title_suggestion": {
"phrase": {
"field": "domain_title_suggestion",
"size": 1,
"real_word_error_likelihood": 0.95,
"max_errors": 0.5,
"gram_size": 2,
"direct_generator": [
{
"field": "domain_title_suggestion",
"suggest_mode": "always",
"min_word_length": 1
}
],
"highlight": {
"pre_tag": "<em>",
"post_tag": "</em>"
}
}
},
"domain_suggestion": {
"phrase": {
"field": "domain_suggestion",
"size": 1,
"real_word_error_likelihood": 0.95,
"max_errors": 0.5,
"gram_size": 2,
"direct_generator": [
{
"field": "domain_suggestion",
"suggest_mode": "always",
"min_word_length": 1
}
],
"highlight": {
"pre_tag": "<em>",
"post_tag": "</em>"
}
}
}
}
}

This is the result:

{
"took": 99,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 9,
"max_score": 1,
"hits": [
{
"_index": "conspire_v3",
"_type": "network",
"_id": "1865",
"_score": 1,
"_source": {
"network_id": 12345,
"name": "Battlestar Galactica",
"connectedness": 0,
"node_type": "network",
"node_id": 1865
}
},
{
"_index": "conspire_v3",
"_type": "domain",
"_id": "1874",
"_score": 1,
"_source": {
"count": 1,
"domain": "gmail.com",
"node_type": "domain",
"node_id": 1874
}
},
{
"_index": "conspire_v3",
"_type": "domain",
"_id": "1871",
"_score": 1,
"_source": {
"count": 1,
"domain": "caprica.org",
"node_type": "domain",
"node_id": 1871
}
},
{
"_index": "conspire_v3",
"_type": "person",
"_id": "1876",
"_score": 1,
"_source": {
"name": "Lee Adama",
"connectedness": 0,
"addresses": [
"lee.adama@pegasus.org",
"apollo@galactica.com"
],
"node_type": "person",
"node_id": 1876,
"domains": [
"pegasus.org",
"galactica.com"
],
"domain_suggestion": [
"pegasus.org",
"galactica.com"
]
}
},
{
"_index": "conspire_v3",
"_type": "domain",
"_id": "1879",
"_score": 1,
"_source": {
"count": 1,
"domain": "pegasus.org",
"node_type": "domain",
"node_id": 1879
}
},
{
"_index": "conspire_v3",
"_type": "person",
"_id": "1881",
"_score": 1,
"_source": {
"name": "Bill Adama",
"connectedness": 0,
"addresses": [
"bill.adama@pegasus.org",
"william@galactica.com"
],
"node_type": "person",
"node_id": 1881,
"domains": [
"pegasus.org",
"galactica.com"
],
"domain_suggestion": [
"pegasus.org",
"galactica.com"
]
}
},
{
"_index": "conspire_v3",
"_type": "domain",
"_id": "1868",
"_score": 1,
"_source": {
"count": 4,
"domain": "galactica.com",
"node_type": "domain",
"node_id": 1868
}
},
{
"_index": "conspire_v3",
"_type": "person",
"_id": "1885",
"_score": 1,
"_source": {
"name": "Felix Geta",
"connectedness": 0,
"addresses": [
"geta@galactica.com"
],
"node_type": "person",
"node_id": 1885,
"domains": [
"galactica.com"
],
"domain_suggestion": [
"galactica.com"
]
}
},
{
"_index": "conspire_v3",
"_type": "person",
"_id": "1867",
"_score": 1,
"_source": {
"name": "Kara Thrace",
"connectedness": 1,
"addresses": [
"grumpygirl@gmail.com",
"kara.thrace@caprica.org",
"starbuck@galactica.com"
],
"node_type": "person",
"node_id": 1867,
"domains": [
"gmail.com",
"caprica.org",
"galactica.com"
],
"domain_suggestion": [
"galactica.com"
],
"domain_title_suggestion": [
"Battlestar Galactica"
],
"organizations": [
"Conspire",
"Fake Organization"
],
"roles": [
"Business Development"
]
}
}
]
},
"suggest": {
  "domain_title_suggestion": [
    {
      "text": "consipre busines devlopment",
      "offset": 0,
      "length": 27,
      "options": []
    }
  ],
  "person_name": [
    {
      "text": "consipre busines devlopment",
      "offset": 0,
      "length": 27,
      "options": []
    }
  ],
  "person_organization": [
    {
      "text": "consipre busines devlopment",
      "offset": 0,
      "length": 27,
      "options": [
        {
          "text": "conspire busines devlopment",
          "highlighted": "<em>conspire<\/em> busines devlopment",
          "score": 0.13149868
        }
      ]
    }
  ],
  "person_role": [
    {
      "text": "consipre busines devlopment",
      "offset": 0,
      "length": 27,
      "options": [
        {
          "text": "consipre business development",
          "score": 0.43433
        },
        {
          "text": "consipre busines development",
          "score": 0.22576733
        },
        {
          "text": "consipre business devlopment",
          "score": 0.22103381
        }
      ]
    }
  ],
  "domain_suggestion": [
    {
      "text": "consipre busines devlopment",
      "offset": 0,
      "length": 27,
      "options": []
    }
  ]
}
}
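
For what it's worth, the phrase suggester appears to return a single entry covering the whole input (hence offset 0 and length 27), with each option being a full corrected phrase, so per-token offsets are not available in this response. Below is a minimal sketch of one way to merge the per-field options on the client side. It assumes whitespace tokenization and that each option keeps the same number of words as the input. `merge_phrase_suggestions` is a hypothetical helper, not part of Elasticsearch, and scores from different fields may not be strictly comparable.

```python
def merge_phrase_suggestions(original, suggest):
    """For each whitespace token, keep the highest-scoring replacement.

    `suggest` is the "suggest" object of the search response.
    """
    tokens = original.split()
    # (score, token) per position; a score of 0.0 means "keep the original".
    best = [(0.0, tok) for tok in tokens]
    for entries in suggest.values():
        for entry in entries:
            for option in entry["options"]:
                candidate = option["text"].split()
                if len(candidate) != len(tokens):
                    continue  # assumption: skip options that split or merge words
                for i, (old, new) in enumerate(zip(tokens, candidate)):
                    if new != old and option["score"] > best[i][0]:
                        best[i] = (option["score"], new)
    return " ".join(tok for _, tok in best)


# Options copied from the response above (empty suggesters omitted).
suggest = {
    "person_organization": [{"options": [
        {"text": "conspire busines devlopment", "score": 0.13149868},
    ]}],
    "person_role": [{"options": [
        {"text": "consipre business development", "score": 0.43433},
        {"text": "consipre busines development", "score": 0.22576733},
        {"text": "consipre business devlopment", "score": 0.22103381},
    ]}],
}

merged = merge_phrase_suggestions("consipre busines devlopment", suggest)
print(merged)  # conspire business development
```

The comparison is done per token position rather than per character offset, which sidesteps the offset/length values entirely.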

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c3527bed-da36-4123-8809-7d22800b10a7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Ryan Tanner) #2

Hate to bump this but I'd really appreciate some advice on this issue.

On Friday, July 11, 2014 1:25:32 PM UTC-6, Ryan Tanner wrote:

I'm trying to use phrase suggesters against multiple fields. I thought I
could just combine the results of each per-field suggestion but the offset
and length values are all incorrect. The offset is always 0 and the length
is always the length of the original string. This makes it impossible to
combine the results.

Any idea what's wrong with this query that would cause that?



(Itamar Syn-Hershko) #3

This looks like a bug, so you should probably create a repro and open a
GitHub ticket for it. Before doing that, though, I'd test with the
latest ES version and also make sure there aren't any custom analyzers or
token filters involved.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Sun, Jul 13, 2014 at 10:39 PM, Ryan Tanner ryan.tanner@gmail.com wrote:

Hate to bump this but I'd really appreciate some advice on this issue.



(Ryan Tanner) #4

I have the same result on both 1.1.1 and 1.2.2. All the index-time
analyzers on these fields are identical:

"index.analysis.filter.shingle_filter.type" : "shingle",
"index.analysis.filter.shingle_filter.min_shingle_size" : 2,
"index.analysis.filter.shingle_filter.max_shingle_size" : 5,
"index.analysis.analyzer.shingle_analyzer.type" : "custom",
"index.analysis.analyzer.shingle_analyzer.tokenizer" : "standard",
"index.analysis.analyzer.shingle_analyzer.filter" : [ "lowercase", "shingle_filter" ]
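
To make the settings above concrete, here is a rough Python approximation of the terms such an analyzer would index. It is only a sketch: the standard tokenizer is approximated with a `\w+` regex, unigrams are emitted on the assumption that the shingle filter's `output_unigrams` option is left at its default, and the output order is not meant to match Lucene's. `shingle_analyze` is a hypothetical name.

```python
import re


def shingle_analyze(text, min_size=2, max_size=5, output_unigrams=True):
    """Approximate lowercase + shingle analysis of `text`."""
    # Crude stand-in for the standard tokenizer plus the lowercase filter.
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    terms = list(tokens) if output_unigrams else []
    # Add every run of min_size..max_size consecutive tokens as one term.
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            terms.append(" ".join(tokens[start:start + size]))
    return terms


print(shingle_analyze("Business Development"))
# ['business', 'development', 'business development']
```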

On Sunday, July 13, 2014 4:23:33 PM UTC-6, Itamar Syn-Hershko wrote:

This looks like a bug, so you should probably create a repro and open a
github ticket for that. Before doing this tho I'd try to test with the
latest ES version, and also make sure there aren't custom analyzers / token
filters involved



(Ryan Tanner) #5

I must be doing something silly.

Looking at the documents returned alongside the suggestion result (response >
hits > hits), there's absolutely no correlation between those documents and
the query. For some reason it only returns documents associated with domain
names from South Africa, which makes no sense in the context of our use
case.

For instance, the text "davis cohen" returns documents like:

    "_index": "conspire_v3",
    "_type": "person",
    "_id": "5544649",
    "_score": 1,
    "_source": {
      "name": "**name redacted** (**hidden**@absa.co.za)",
      "connectedness": 0,
      "addresses": [
        "**hidden**@absa.co.za"
      ],
      "node_type": "person",
      "node_id": 5544649,
      "domains": [
        "absa.co.za"
      ],
      "last_nickname_analysis": "20140708T225812.918+0000",
      "last_neo_upgrade": "20140708T231707.160+0000",
      "last_suggestion_crawl": "20140712T155002.491+0000",
      "domain_suggestion": [
        "absa.co.za"
      ]
    }

(The redacted name appears to be Dutch in origin and is nowhere close to the
original search phrase.)

We're not using the combo query/suggester option; this is a standalone
suggestion. Is there supposed to be any significance to the documents
included in the response?
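
The hits are probably unrelated to the suggestion: a search request whose body holds only a `suggest` section still runs the default match-all query, which would explain arbitrary documents with a `_score` of 1. Below is a minimal sketch of a request body that asks for no hits, assuming the request goes to the `_search` endpoint. `suggest_only_body` is a hypothetical helper, and the field name is taken from the query earlier in the thread.

```python
import json


def suggest_only_body(text, name, field, size=3):
    """Build a search body that returns suggestions but no hits."""
    return {
        "size": 0,  # ask for zero documents; only the suggest section matters
        "suggest": {
            "text": text,
            name: {"phrase": {"field": field, "size": size}},
        },
    }


body = suggest_only_body("davis cohen", "person_name", "name.suggestion")
print(json.dumps(body, indent=2))
```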

On Sunday, July 13, 2014 11:10:08 PM UTC-6, Ryan Tanner wrote:

I have the same result on both 1.1.1 and 1.2.2. All the index-time
analyzers on these fields are identical:

"index.analysis.filter.shingle_filter.type" : "shingle",
"index.analysis.filter.shingle_filter.min_shingle_size" : 2,
"index.analysis.filter.shingle_filter.max_shingle_size" : 5,
"index.analysis.analyzer.shingle_analyzer.type" : "custom",
"index.analysis.analyzer.shingle_analyzer.tokenizer" : "standard",
"index.analysis.analyzer.shingle_analyzer.filter" : [ "lowercase", "shingle_filter" ]


