How to find a field with more than one value per doc?

dbenson · November 16, 2010, 9:08pm

We have a field we're trying to sort on, which is configured to be analyzed
as a single token.

Field mapping:
provider: {
    omit_norms: true
    analyzer: "lowercase_keyword"
    type: "string"
}

Analyzer definition from the yml file:
lowercase_keyword :
    type : custom
    filter : [lowercase]
    tokenizer : keyword

When we attempt to sort on this field we get the following error:
SearchPhaseExecutionException[Failed to execute phase [query], total failure;
shardFailures {[mmq2E8f3QUyBXbDrAgLINA][maestro-report1_20101115173210][0]:
QueryPhaseExecutionException[[maestro-report1_20101115173210][0]:
query[ConstantScore(:)],from[0],size[10],sort[<custom:"provider":
org.elasticsearch.index.field.data.strings.StringFieldDataType$1@5b9537da>]:
Query Failed [Failed to execute main query]]; nested:
IOException[Can't sort on string types with more than one value per doc,
or more than one token per field]; }{[mmq2E8f3QUyBXbDrAgLINA]...

How do I find the documents that have multiple values in this field?

If we unintentionally indexed the same field twice could it end up as
an array in the JSON?

ES 0.13 Snapshot, using the Java API for indexing. The error is returned
by both the REST API and the Java API.

Thanks,

David

ppearcy · November 17, 2010, 12:45am

I work with David and wanted to mention that this may be a regression.
We still have a setup on 0.12 that doesn't show this issue with the same
analyzer, although the data may be slightly different, so I can't be
100% sure.

We do have this tokenizer working great on a headline field that would
contain more varied data than the provider field.

While it will be helpful to track down the field causing the problem,
I don't understand how using a keyword tokenizer would ever result in
a multi-term field.

As always, thanks a ton for any guidance.

Best Regards,
Paul

ppearcy · November 17, 2010, 6:07am

Actually, I take that back. I was able to reproduce it in 0.12.

dbenson · November 22, 2010, 4:25pm

We never found a way to query for this, but it appears that we were
indexing the same field twice through the Java API. In the source fields
returned via the REST API there was just a single value, but if you
faceted on that field for a single doc, you got back more than one value.
We put a small check in our indexing code to permit only a single value
per field.

David
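
For anyone hitting the same error, a check like the one described above
might look roughly like the sketch below. It is only a guess at the shape
of such a guard, not the code actually used here: SingleValueDocument and
its methods are hypothetical names, and nothing in it calls the
Elasticsearch API. It collects the fields of one document in a map and
refuses a second value for a field that is already set.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Elasticsearch): collects the fields of a
// single document and permits at most one value per field.
final class SingleValueDocument {
    private final Map<String, Object> fields = new LinkedHashMap<String, Object>();

    // Adds a field, failing fast if it was already set or is multi-valued.
    SingleValueDocument put(String field, Object value) {
        boolean multiValued = value instanceof Collection
                || (value != null && value.getClass().isArray());
        if (multiValued) {
            throw new IllegalArgumentException(
                    "Field '" + field + "' must hold a single value");
        }
        if (fields.containsKey(field)) {
            throw new IllegalStateException(
                    "Field '" + field + "' was already added to this document");
        }
        fields.put(field, value);
        return this;
    }

    // Read-only view of the collected fields, ready to be turned into the
    // document source by whatever indexing code is in use.
    Map<String, Object> asMap() {
        return Collections.unmodifiableMap(fields);
    }
}
```

With this in place, calling put("provider", ...) a second time on the same
document throws instead of silently adding a second value for the field,
so the problem shows up at indexing time rather than at sort time.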
