String query with the Java API doesn't find some documents by name

Hello!

First, sorry for my poor English, but let me try to explain my problem.

I'm working on an application that uses the Elasticsearch Java API to manage my documents.
Everything works fine: I'm able to search in the DB and save to my index, I can count my documents aggregated by field, and a lot of other cool things. But I'm stuck on a weird problem.

When I try to search my documents by a field called name, some documents are not returned by the search.

Let me give an example:

My documents look like this (just an example):

id: 1
name: book
type: pdf

id: 2
name: Test of my search service
type: zip

When I search by name and send the value "book" as the parameter, it works fine. But when I send the value "service" as the parameter, the result is empty.

Here is my search code:

SearchRequestBuilder src1 = client.prepareSearch()
    .setQuery(QueryBuilders.queryStringQuery(parameter).field("name"));

Does anyone know why this search doesn't find my parameter value "service" in the name field of the document with id 2?

Thanks!

What is your mapping?

Hey!

Well, I don't set a mapping for my index. The index is created automatically through the Elasticsearch Java API.

How can I see that mapping property?

Oh! I'm sorry.

I found this configuration, which I use to set my index to not analyzed. There is a mappings section in it:

curl -XPUT localhost:9200/_template/template_1 -d '{
    "template": "*",
    "settings": {
        "index.refresh_interval": "5s"
    },
    "mappings": {
        "_default_": {
            "_all": {
                "enabled": true
            },
            "dynamic_templates": [
                {
                    "string_fields": {
                        "match": "*",
                        "match_mapping_type": "string",
                        "mapping": {
                            "index": "not_analyzed",
                            "omit_norms": true,
                            "type": "string"
                        }
                    }
                }
            ],
            "properties": {
                "@version": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}'

That's the reason it only matches the exact full string: with "index": "not_analyzed", the whole value is indexed as one single term.
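To make the difference concrete, here is a minimal sketch in plain Java. It is not Elasticsearch code: the class and method names are made up for illustration, and the lowercase-and-split step is only a rough stand-in for what the default analyzer does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model of how one string value becomes index terms. Not Elasticsearch code.
class AnalysisSketch {

    // "not_analyzed": the whole value is stored as one single, unchanged term.
    static Set<String> notAnalyzedTerms(String value) {
        Set<String> terms = new HashSet<>();
        terms.add(value);
        return terms;
    }

    // Rough stand-in for the default analyzer (an assumption for illustration):
    // lowercase the value, then split it on anything that is not a letter or digit.
    static Set<String> analyzedTerms(String value) {
        String[] tokens = value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        Set<String> terms = new HashSet<>(Arrays.asList(tokens));
        terms.remove(""); // split can yield an empty first token
        return terms;
    }

    public static void main(String[] args) {
        String name = "Test of my search service";
        System.out.println(notAnalyzedTerms(name).contains("service")); // false
        System.out.println(analyzedTerms(name).contains("service"));    // true
        System.out.println(notAnalyzedTerms("book").contains("book"));  // true
    }
}
```

This mirrors the behaviour in the question: "book" is the full value of document 1, so it matches, while "service" is only one word inside the single term stored for document 2.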

Sorry for my ignorance...

But how can I change that so it doesn't match only the full string?
I did a test here using the wildcard method, but that doesn't solve my problem 100%.

I read the documentation, but I didn't understand (or didn't find) that information.

If you have time, could you give me a short explanation of this problem and its solution?

Thanks!

The question is more "why did you add this template in the first place?"

Why did you set it to not analyzed?

Elasticsearch comes with good defaults so unless you know what you are doing, keep the defaults.

I encourage you to read the guide. This section will probably help here: https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-analysis.html

Thanks again for your friendly help.

About your question: I added this config to my index because without it, Elasticsearch splits my string values at index time and also splits my string values in the search response.

For example, if I work without this config, I get a response from my service like this:

{
Id: 2,
name: my

id: 2,
name: service

id: 2,
name: documents
}
(Just an example)

When I searched for how to "fix it", a guy who had the same problem sent me this "trick".

Hey @dadoonet, thanks for all the support and the tips for learning more about mappings in Elasticsearch.

Fortunately, I solved my problem using this code with the Java API:

SearchRequestBuilder sr = client.prepareSearch()
    .setIndices("indexName")
    .setTypes("typeName")
    .setSearchType(SearchType.DFS_QUERY_AND_FETCH)
    .setQuery(QueryBuilders.wildcardQuery("fieldName", "*" + parameter + "*"));

I wouldn't mark this as a solution. It is a very bad solution: response time will get worse as the number of terms for fieldName in the inverted index grows.
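A rough way to picture the cost, as a plain Java sketch and not Elasticsearch or Lucene code (the class, the method names and the generated "document N" values are hypothetical): a pattern that starts with a wildcard gives the engine no prefix to seek to, so in this toy model every term in the dictionary has to be tested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model of a sorted term dictionary. Not Elasticsearch or Lucene code.
class WildcardCostSketch {

    // Exact term lookup: the sorted structure finds the term without a full scan.
    static boolean hasTerm(TreeSet<String> dictionary, String term) {
        return dictionary.contains(term);
    }

    // "*needle*" has no fixed prefix, so every term is tested.
    // Returns how many terms were examined; matching terms are added to `matches`.
    static int containsScan(TreeSet<String> dictionary, String needle, List<String> matches) {
        int examined = 0;
        for (String term : dictionary) {
            examined++;
            if (term.contains(needle)) {
                matches.add(term);
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        TreeSet<String> dictionary = new TreeSet<>();
        dictionary.add("book");
        dictionary.add("Test of my search service");
        for (int i = 0; i < 1000; i++) {
            dictionary.add("document " + i); // made-up filler values
        }

        List<String> matches = new ArrayList<>();
        int examined = containsScan(dictionary, "service", matches);
        // prints: 1002 terms examined, matches: [Test of my search service]
        System.out.println(examined + " terms examined, matches: " + matches);
    }
}
```

The number of examined terms grows with the dictionary, whatever the number of matches is, which is the scaling problem described above.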

You have to understand how a search engine works and what you need to do to make it work as you wish.

The fact that your document 2 has been indexed like:

{
  "name": [ "my", "service", "documents" ]
}

is exactly what serves the kind of search you are asking for.

What is wrong with that? I mean why do you care about this?

I feel like you are also trying to run aggregations and you don't like the fact the result is split.
If this is the use case you have, just tell and we will help you to fix that the right way.

First: make your search work with a standard search.
Then fix the other things. I'll be happy to help.

Best

@dadoonet you're right!

Taking a closer look at this, it is far, far away from the quality that I want.

And you're right again, my critical problem is exactly the aggregation. When I run the aggregation, these split values ruin my results.

Could you point me to a better way to solve that?

I would like to apply the best solution in my application.

You can index the same field in two different ways, to serve multiple purposes:

  • analyzed for search
  • not analyzed for aggregations

Have a look at this example: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/multi-fields.html
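As a rough illustration of the idea, here is a plain Java sketch, not Elasticsearch code, with hypothetical class and method names. In a 2.x mapping this would correspond to a string field with a not_analyzed sub-field declared under "fields" (often named something like name.raw, which is an assumed name here). The sketch keeps two views of the same value: tokens for search, and the whole value for aggregation buckets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model of one field indexed two ways. Not Elasticsearch code.
class MultiFieldSketch {

    // Analyzed view: lowercase word tokens, used for full-text search.
    static Set<String> tokens(String value) {
        Set<String> result = new LinkedHashSet<>(
                Arrays.asList(value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")));
        result.remove(""); // split can yield an empty first token
        return result;
    }

    // Search against the analyzed view: a single word can match a longer name.
    static List<String> search(List<String> names, String word) {
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (tokens(name).contains(word.toLowerCase(Locale.ROOT))) {
                hits.add(name);
            }
        }
        return hits;
    }

    // Aggregation against the raw view: one bucket per whole, unsplit value.
    static Map<String, Integer> countByRawValue(List<String> names) {
        Map<String, Integer> buckets = new TreeMap<>();
        for (String name : names) {
            buckets.merge(name, 1, Integer::sum);
        }
        return buckets;
    }

    // Aggregation against the analyzed view: one bucket per token (the unwanted split).
    static Map<String, Integer> countByToken(List<String> names) {
        Map<String, Integer> buckets = new TreeMap<>();
        for (String name : names) {
            for (String token : tokens(name)) {
                buckets.merge(token, 1, Integer::sum);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("book", "Test of my search service", "book");
        System.out.println(search(names, "service"));  // [Test of my search service]
        System.out.println(countByRawValue(names));    // {Test of my search service=1, book=2}
        System.out.println(countByToken(names));       // {book=2, my=1, of=1, search=1, service=1, test=1}
    }
}
```

Searching uses the analyzed view, so "service" finds document 2, while aggregating on the raw view keeps each name in one bucket instead of one bucket per word.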

Thanks @dadoonet!

You helped me a lot!
Everything works better now, after a lot of changes.