Different results, with the same query (local, single node). Strange!


(Cody) #1

Hey guys,

Just came across this strange thing. What I have is the simplest case, nothing fancy: local, single node, one index with 6 docs in it.
The settings and mapping I used are:
{
  "settings": {
    "analysis": {
      "filter": {
        "cody_stop": {
          "type": "stop",
          "stopwords": "english"
        },
        "cody_stemmer": {
          "type": "stemmer",
          "language": "light_english"
        }
      },
      "analyzer": {
        "cody": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "cody_stop",
            "cody_stemmer"
          ]
        }
      }
    }
  },
  "mappings": {
    "default": {
      "properties": {
        "fileName": {
          "type": "multi_field",
          "fields": {
            "indexed": {
              "type": "string",
              "analyzer": "cody"
            },
            "original": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        "content": {
          "type": "string",
          "analyzer": "cody"
        },
        "metaAuthor": {
          "type": "string",
          "analyzer": "cody"
        },
        "metaContent": {
          "type": "string",
          "analyzer": "cody"
        }
      }
    }
  }
}

The first query got 2 hits, both of which contain the term "tradition(s)".
{
  "query": {
    "query_string": {
      "query": "traditions"
    }
  }
}

The second query got only 1 hit (one of the two results from the first query; it has "traditions" in its fileName). Based on my understanding, these two queries should be essentially the same after being analyzed with my custom analyzer.
{
  "query": {
    "query_string": {
      "query": "tradition"
    }
  }
}

Can someone help me out? This is driving me crazy.

Thanks,
Cody
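
One way to test the assumption that both terms end up as the same token is to ask the index how the "cody" analyzer actually tokenizes them, via the _analyze API. The following is only a minimal sketch: the host and port (localhost:9200) are assumptions, the helper names are made up for illustration, and it assumes a version that accepts a JSON body for _analyze (Elasticsearch 2.x or later).

```python
import json
import urllib.request

# Assumption (not from the thread): the node listens on localhost:9200.
BASE_URL = "http://localhost:9200"


def build_analyze_request(index, analyzer, text):
    """Return (url, body) for an _analyze call against a named analyzer."""
    url = "{}/{}/_analyze".format(BASE_URL, index)
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return url, body


def analyzed_tokens(index, analyzer, text):
    """Send the request and return the token strings the analyzer produced."""
    url, body = build_analyze_request(index, analyzer, text)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return [tok["token"] for tok in payload.get("tokens", [])]
```

Calling analyzed_tokens("files", "cody", "traditions") and analyzed_tokens("files", "cody", "tradition") against a running node would show whether both reduce to the same token. It may also be worth keeping in mind that a query_string query with no field specified runs against the default field (_all in Elasticsearch versions of this era, unless configured otherwise), so the field being searched is not necessarily one that uses the "cody" analyzer.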


(Mark Walkom) #2

How many shards?


(Cody) #3

Forgot to mention: I have the default 5 shards. Here are parts of the results.

Result of query 1:
"hits": {
  "total": 2,
  "max_score": 0.08048013,
  "hits": [
    {
      "_shard": 2,
      "_node": "TcVGhp8jSDiBezCjtjs_fg",
      "_index": "files",
      "_type": "contentAndMeta",
      "_id": "AVOLshPh7ne-n0RYZxN9",
      "_score": 0.08048013,
      "_source": {
        "fileName": "the four traditions of geography.pdf",
        ....
    {
      "_shard": 2,
      "_node": "TcVGhp8jSDiBezCjtjs_fg",
      "_index": "files",
      "_type": "contentAndMeta",
      "_id": "AVOLspsD7ne-n0RYZxOF",
      "_score": 0.069697835,
      "_source": {
        "fileName": "the spatial view in context.pdf",

Result of query 2:
"hits": {
  "total": 1,
  "max_score": 0.09164428,
  "hits": [
    {
      "_shard": 2,
      "_node": "TcVGhp8jSDiBezCjtjs_fg",
      "_index": "files",
      "_type": "contentAndMeta",
      "_id": "AVOLshPh7ne-n0RYZxN9",
      "_score": 0.09164428,
      "_source": {
        "fileName": "the four traditions of geography.pdf",


(Mark Walkom) #4

What if you reindex with a single shard and try your query again?


(Cody) #5

Still the same result. May I ask what this has to do with the number of shards? I know it may affect the relevance score, but I don't see why it would result in a different number of hits.

Thanks
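
The distinction drawn above (shard count can change scores but not which documents match) can be illustrated with a toy model. This is only a sketch under stated assumptions: it uses the IDF formula from Lucene's classic TF/IDF similarity, assumes each shard scores with its own local statistics (as with the default query_then_fetch search type), and the six-document corpus and shard layouts are entirely hypothetical.

```python
import math


def classic_idf(doc_freq, num_docs):
    """IDF from Lucene's classic TF/IDF similarity: 1 + ln(N / (df + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1.0))


def search(shards, term):
    """Return (total_hits, per-shard IDF values) for a term.

    Each shard is a list of documents; each document is a set of tokens.
    Matching is decided per document; IDF uses only that shard's statistics.
    """
    total_hits = 0
    idfs = []
    for docs in shards:
        doc_freq = sum(1 for doc in docs if term in doc)
        total_hits += doc_freq
        if doc_freq:
            idfs.append(classic_idf(doc_freq, len(docs)))
    return total_hits, idfs


# Hypothetical toy corpus: 6 documents, 2 of which contain "tradition".
docs = [{"tradition", "geography"}, {"tradition", "spatial"},
        {"map"}, {"city"}, {"river"}, {"climate"}]

one_shard = [docs]
three_shards = [docs[0:2], docs[2:4], docs[4:6]]

hits_1, idf_1 = search(one_shard, "tradition")
hits_3, idf_3 = search(three_shards, "tradition")
```

In this model hits_1 and hits_3 are both 2, while the IDF values differ between the two layouts, so only the scores move when the documents are spread across shards.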

