Elastic Search Result

Hi Team,

I have a sample user document that I am indexing by userName. When querying, I need only userName and address2 rather than all fields, so I added those fields to my query. However, it returns only userName.

My question: can we use non-indexed fields in a search query?

{
userName :'test',
address:'test123',
address2:'test456'
}

POST:

URL : http://localhost:9300/user-2015/_search?q=test
search query : {
    "fields" : ["userName","address2"]
    }

The result contains only userName. Can someone help me with this?

Hi,
do you mean that out of the _source, you would only like to get back userName and address2? In that case source filtering should be what you are looking for: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-source-filtering.html . Use _source: ["userName", "address2"].
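A minimal sketch of the request body that source filtering implies, written in Python only to build and print the JSON. The helper name build_search_body is made up for illustration, and the query_string clause stands in for the ?q=test parameter from the original request.

```python
import json


def build_search_body(query_text, source_fields):
    """Build a search body that returns only the listed _source fields.

    Hypothetical helper: it only assembles the JSON, it sends nothing.
    """
    return {
        # Equivalent of the ?q=... URL parameter in the original request.
        "query": {"query_string": {"query": query_text}},
        # Source filtering: only these fields come back in each hit.
        "_source": list(source_fields),
    }


body = build_search_body("test", ["userName", "address2"])
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body of the _search request in place of the "fields" list.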

In case I misunderstood, could you please post a complete recreation, with expected response and the actual response that you are getting when sending your request?

Cheers
Luca

Thanks Luca. This is what I was expecting. Thanks for your help.

I have one more question. I have nearly 0.5 million documents indexed in my system. Now I want to change the mappings of the index (adding new fields and changing the data type of some fields).

In this case, how can I reindex my existing index?

For now, yes. You have to rebuild the index using scan/scroll. There is an
old blog post called something like "no downtime mapping updates" which
describes a good process for this.
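A rough sketch of the middle step of that process, assuming the usual loop: open a scroll on the old index, turn each page of hits into a _bulk body for the new index, send it, and repeat until a page comes back empty. Only the conversion is shown; the HTTP calls are left out, and the helper name and sample hits are invented for illustration.

```python
import json


def hits_to_bulk_lines(hits, target_index):
    """Convert one page of scroll hits into a newline-delimited _bulk body.

    Hypothetical helper. Each hit becomes an action line followed by
    the document source, keeping the original _type and _id.
    """
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    # The bulk body must end with a newline; an empty page yields "".
    return "\n".join(lines) + "\n" if lines else ""


# Made-up page of hits, shaped like the "hits.hits" array of a scroll response.
page = [
    {"_type": "user", "_id": "1", "_source": {"userName": "test"}},
    {"_type": "user", "_id": "2", "_source": {"userName": "other"}},
]
payload = hits_to_bulk_lines(page, "user-2015-v2")
print(payload)
```

The target index name user-2015-v2 is an assumption; the new index has to be created with the new mappings before the first bulk request is sent.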

Thanks Nik.

Hi Nik,

I am able to reindex using scan/scroll, and I now have a new index with a different name. But my customers are still using my old index. I am trying to create an alias for my old index based on the following tutorial.

https://www.elastic.co/guide/en/elasticsearch/guide/current/index-aliases.html

I am getting an error.

POST:

POST /_aliases
{
    "actions": [
        { "remove":    
            { "index": "customer1", 
              "alias": "customer"
            }
        }
    ]
}

Response:
{
   "error": {
      "root_cause": [
         {
            "type": "invalid_alias_name_exception",
            "reason": "Invalid alias name [customer], an index exists with the same name as the alias",
            "index": "customer1"
         }
      ],
      "type": "invalid_alias_name_exception",
      "reason": "Invalid alias name [customer], an index exists with the same name as the alias",
      "index": "customer1"
   },
   "status": 400
}

customer1 is my new index, customer is my old index.

Based on the exception, we have to give a different alias name. But I get an error while indexing if I give the same alias name to both of my indices. The alias name is allcustomer.

PUT allcustomer/external/3?pretty
{
  "firstname": "abcd",
  "lastname":"efgh"
}

Response:
{
   "error": {
      "root_cause": [
         {
            "type": "illegal_argument_exception",
            "reason": "Alias [allcustomer] has more than one indices associated with it [[customer, customer1]], can't execute a single index op"
         }
      ],
      "type": "illegal_argument_exception",
      "reason": "Alias [allcustomer] has more than one indices associated with it [[customer, customer1]], can't execute a single index op"
   },
   "status": 400
}

Please help me to fix this.

I don't think you can atomically delete an index and replace it with an
alias. I believe you'll have to delete the index and then create the
alias.
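A sketch of that two-step sequence, assuming the data has already been copied into customer1. It only lists the REST calls as (method, path, body) tuples; the helper name is made up and nothing is sent.

```python
def alias_swap_steps(old_index, new_index):
    """Return the ordered REST calls for replacing an index with an alias.

    Hypothetical helper. The alias takes over the old index's name, so
    the old index has to be deleted first or the name is still taken.
    """
    return [
        # Step 1: free the name by deleting the old index.
        ("DELETE", "/" + old_index, None),
        # Step 2: point an alias with the old name at the new index only.
        (
            "POST",
            "/_aliases",
            {"actions": [{"add": {"index": new_index, "alias": old_index}}]},
        ),
    ]


for method, path, body in alias_swap_steps("customer", "customer1"):
    print(method, path, body)
```

Because the alias then points at a single index, index requests through it should not hit the "more than one indices" error. Between the two steps the name customer resolves to nothing, and a write arriving in that window may auto-create a fresh index if automatic index creation is enabled, so it is probably worth pausing writes first.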

Thanks Nik. I have one more doubt.

Say, for example, I have multiple indices (e.g. 10000) for the customer, one per company.
e.g. customer-abc, customer-aaa, etc.

I need to change the template mapping for the customer* indices. After I change the template mapping, do I need to reindex all the customer* indices one by one?

Generally it's a bad idea to have an index per customer. It just doesn't scale to the number of customers people want - 10,000 indices in a cluster is a lot! You shouldn't have an index per customer.

I'm really curious now, why were you thinking of using an index per customer?

Warning: Mapping has a specific meaning in elasticsearch and templates aren't part of it.

If you are doing this delete index and replace with alias thing then yes, you should do it one by one. If you were doing a template swap you could do them all at once if you wanted to. But you may not want to for lots of reasons - it might be better to do it for some indexes at a time to lower the impact of serving from cold data.

Hi Nik,

The reason I created an index per company is that my requirement is to serve customer information by company. Each request gives me customer information and company information to query ES with.

Based on current indices, customer template will have settings and mappings.

Current customer template is

GET /_template/template_customer

{
   "template_customer": {
      "order": 0,
      "template": "customer*",
      "settings": {
         "index": {
            "number_of_shards": "1"
         }
      },
      "mappings": {
         "cus": {
            "dynamic": "false",
            "properties": {
               "firstname": {
                  "type": "string"
               }
            }
         }
      },
      "aliases": {}
   }
}

I need to add one more field to the template mapping (because I want it reflected in all customer indices) and add a custom analyzer for the firstname field. So I feel that reindexing all customer* indices one by one is painful.

So, as you said, I need to rethink my index design.
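For the "add one more field" part alone, a put-mapping call against the existing indices may be enough, since a new field can be added to an existing mapping in place. Changing the analyzer of the existing firstname field is different and still needs a reindex. The sketch below only builds the request; the helper name and the lastname field are assumptions, and the path shape assumes the typed mappings used in this thread.

```python
def add_field_request(index_pattern, doc_type, field_name, field_type):
    """Build a put-mapping request that adds one new field.

    Hypothetical helper: returns (method, path, body) and sends nothing.
    """
    body = {"properties": {field_name: {"type": field_type}}}
    path = "/{}/_mapping/{}".format(index_pattern, doc_type)
    return ("PUT", path, body)


method, path, body = add_field_request("customer*", "cus", "lastname", "string")
print(method, path, body)
```

The template would still need the same field added separately, because templates are only applied when an index is created.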

Yeah, but you can do that with a simple filter and a field per document.
It's how you'd do it if you were using a relational database. I'm just
curious if there is another system that you are used to that supports index
per customer kinds of things.

I don't fault you at all for thinking that index per customer is ok. Lots
of people think that way. I was talking to someone who thought similar
stuff yesterday. It's a really common error, I'm just super curious why!
You've been quick to respond so I thought I'd ask you.
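A minimal sketch of the "filter and a field per document" idea: one shared customer index, with every document carrying its company, and each search restricted by a term filter. The companyId field name and the helper are hypothetical, and the bool query's filter clause is assumed to be available in the version in use.

```python
import json


def customers_for_company(company_id, name_text):
    """Build a search body limited to one company's customers.

    Hypothetical helper. companyId is an assumed field that would need
    to be indexed as an exact (not analyzed) value for the term filter
    to match reliably.
    """
    return {
        "query": {
            "bool": {
                # Scored part: what the user is actually searching for.
                "must": {"match": {"firstname": name_text}},
                # Unscored part: restrict results to one company.
                "filter": {"term": {"companyId": company_id}},
            }
        }
    }


query = customers_for_company("abc", "abcd")
print(json.dumps(query, indent=2))
```

With this layout a mapping change touches one index instead of one index per company.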

Thanks Nik, thanks for clarifying this error. We are also using a relational database, and company id is one of the fields there.
We don't have any other system for indexing.

I have a requirement as follows.

We have a field called customer desc; it should work for both partial and exact keyword search.

E.g.: If customer desc is shorter than 8 characters it should work as an exact keyword search; if it is longer than 8 characters it should be a substring search.

My current mapping is:

PUT /test
{  
   "mappings":{  
      "test":{  
         "properties":{  
            "desc":{  
               "type":"string",
               "analyzer":"index_ngram",
               "search_analyzer":"search_ngram"
            }
         }
      }
   },
   "settings":{  
      "analysis":{  
         "filter":{  
            "desc_ngram":{  
               "type":"ngram",
               "min_gram":8,
               "max_gram":20
            }
         },
         "analyzer":{  
            "index_ngram":{  
               "type":"custom",
               "tokenizer":"standard",
               "filter":[  
                   "lowercase",
                   "desc_ngram"
               ]
            },
            "search_ngram":{  
               "type":"custom",
               "tokenizer":"keyword",
               "filter":["lowercase","standard"]
            }
         }
      }
   }
}


GET /test/_analyze?analyzer=index_ngram&source=34FG

Response:
{
   "tokens": []
}

Expected output is
{
   "tokens": [
    {
         "token": "34fg"
      }
   ]
}

If the length is greater than 8 then the nGram filter should trigger.

Can someone help me fix this?

This should probably be another topic for searchability.

That feels backwards - if the user types more characters you want to search for the documents that have any substring match? But if the search term is short you want to search for the exact string? I think the other way makes sense - if the user types 8 or fewer characters then they don't have a full id, so do a substring search.

Either way, your plan to have different search and index analyzers is the right idea I think.

Some examples would help here, I think.
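One example that may explain the empty _analyze response: with min_gram set to 8, a 4-character token such as "34fg" has no substrings of length 8 or more, so the ngram filter emits nothing for it. The function below is a simplified model of an ngram token filter, not Lucene's implementation, and it does not try to reproduce the order in which real tokens are emitted.

```python
def ngrams(token, min_gram, max_gram):
    """Return every substring of token with length min_gram..max_gram.

    Simplified stand-in for an ngram token filter, for illustration only.
    """
    grams = []
    for size in range(min_gram, max_gram + 1):
        # No start positions exist when the token is shorter than size.
        for start in range(0, len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


short_grams = ngrams("34fg", 8, 20)        # token shorter than min_gram
long_grams = ngrams("abcdefghij", 8, 20)   # 10-character token

print(short_grams)
print(long_grams)
```

If short values also need to be searchable, options worth exploring might be a lower min_gram or a second, un-ngrammed copy of the field for exact matches; neither is tested here.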