"_id" and mappings

Hi,

I'm creating an index like this:

try:
	es_client = Elasticsearch([EShost], timeout=1000)
	if not es_client.indices.exists(index):
		settings = '''
		{  
		  "settings" : {
				"analysis": {
					"filter": {
						"bigrams_filter": {
							"type":     "nGram",
							"min_gram": 12,
							"max_gram": 12
						}
					},
					"analyzer": {
						"bigrams": {
							"type":      "custom",
							"tokenizer": "standard",
							"filter":   [
								"lowercase",
								"bigrams_filter"
							]
						}
					}
				}
			},"mappings": {
				"'''+estype+'''": {
					"properties": {
						"_id": {
							"type":     "string",
							"analyzer": "bigrams" 
						}
					}
				}
			}
		}'''
		print "\nCreating index..!!"
		es_client.indices.create(index=index, ignore=400, body=settings)
except:
	traceback.print_exc()

The goal is to retrieve all the ids that contain a specified string as a substring.
But I'm not able to retrieve any id that contains the input string.
Am I missing anything?

Below is the _mapping output:

{
  "string_database" : {
	"mappings" : {
	  "strings" : {
		"properties" : {
		  "occurence" : {
			"type" : "long"
		  },
		  "timestamp" : {
			"type" : "date",
			"format" : "strict_date_optional_time||epoch_millis"
		  }
		}
	  }
	}
  }
}

Here "_id" is not present.

The "_id" field is an internal field that Elasticsearch manages... you cannot configure analyzers for it. If you need that behavior, you'll have to configure your own field in the document (e.g. "my_id") which has the desired analyzers.
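As a rough sketch of that suggestion, the index body from the question could map a regular field instead of "_id". The field name "my_id" and the helper build_index_body are made up for illustration. The analysis settings are copied unchanged from the question, and the "string" type matches the Elasticsearch version used there (newer versions use "text" instead).

```python
import json


def build_index_body(doc_type, id_field="my_id"):
    """Return index settings plus a mapping that analyzes a normal field.

    `id_field` is a hypothetical name; any field other than "_id" will do.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Same n-gram filter as in the question.
                    "bigrams_filter": {
                        "type": "nGram",
                        "min_gram": 12,
                        "max_gram": 12,
                    }
                },
                "analyzer": {
                    "bigrams": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "bigrams_filter"],
                    }
                },
            }
        },
        "mappings": {
            doc_type: {
                "properties": {
                    # The analyzer is attached to our own field, not to "_id".
                    id_field: {"type": "string", "analyzer": "bigrams"}
                }
            }
        },
    }


body = build_index_body("strings")
print(json.dumps(body, indent=2))

# Assumed usage with the client from the question (not run here;
# `doc_id` is a placeholder for whatever id you already use):
#   es_client.indices.create(index=index, body=body)
#   es_client.index(index=index, doc_type="strings", id=doc_id,
#                   body={"my_id": doc_id, "occurence": 1})
```

With this approach each document carries the id twice: once as the internal "_id" and once as "my_id" in the document body, which is the copy that gets analyzed and searched.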

Hello Polyfractal,

Thanks for your reply. I added a new field inside the "_source" field and it worked.

I have one more question: is it possible to create a new field outside the "_source" field and apply an analyzer to it? (I tried this, but it didn't work.) Or does ES only allow fields to be created inside the _source field?

And if it is possible (including applying an analyzer), will it impact search performance? I assume searching inside the _source is costlier than searching outside it. (I'm new to ES, so I might be thinking in the wrong direction.)

Regards

Hmm, I'm not quite sure I know what you mean.

The _source field represents the original JSON that you send to Elasticsearch. So it will contain basically the exact document that you sent.

It's possible to add fields that are indexed/searchable but not in the original _source document using a feature called multi-fields. This allows you to analyze a single field in different ways. The contents of the field are only sent once in the original document, but multiple analysis processes are used to create "virtual fields" which are queryable.
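For illustration, a multi-field mapping might look roughly like the sketch below. The field name "title", the sub-field name "ngrams", and the helper function are assumptions made up for this example. The "bigrams" analyzer is assumed to be defined in the index settings, as in the original question.

```python
def multi_field_mapping(doc_type):
    """Build a mapping where one field is analyzed in two different ways."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    "title": {
                        # Main field: default analysis.
                        "type": "string",
                        "fields": {
                            # Sub-field: same input text, analyzed with the
                            # custom "bigrams" analyzer from the settings.
                            "ngrams": {"type": "string", "analyzer": "bigrams"}
                        },
                    }
                }
            }
        }
    }


mapping = multi_field_mapping("strings")

# The document only contains "title"; the sub-field is addressed
# in queries with dot notation.
document = {"title": "some example text"}
query = {"query": {"match": {"title.ngrams": "example"}}}
```

The document is indexed with just "title", and "title.ngrams" exists only in the index, not in _source.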

Similarly, you can use the copy_to parameter to copy the contents of multiple fields into a single new "virtual field". This is also not present in the original _source, but it is searchable.
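A copy_to mapping could be sketched as follows. The field names ("first_name", "last_name", "full_name") and the helper function are hypothetical, chosen only to show the shape of the mapping.

```python
def copy_to_mapping(doc_type):
    """Build a mapping that copies two fields into one searchable field."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    # Values of both fields are also indexed into "full_name".
                    "first_name": {"type": "string", "copy_to": "full_name"},
                    "last_name": {"type": "string", "copy_to": "full_name"},
                    # Target field: searchable, but never sent by the client.
                    "full_name": {"type": "string"},
                }
            }
        }
    }


mapping = copy_to_mapping("strings")

# The document sent to Elasticsearch does not contain "full_name" ...
document = {"first_name": "John", "last_name": "Smith"}

# ... but a query can still target it.
query = {"query": {"match": {"full_name": "John Smith"}}}
```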

And if it is possible (including applying an analyzer), will it impact search performance? I assume searching inside the _source is costlier than searching outside it. (I'm new to ES, so I might be thinking in the wrong direction.)

This isn't really a problem, as the _source is never actually searched :) That field just holds the original JSON so that you can use it later (to show to users, etc). The actual searching is done on internal data structures that are generated when the document is indexed.
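To make that distinction concrete, here is a toy model in plain Python. It is only a simplified illustration and not how Elasticsearch or Lucene are actually implemented: the class ToyIndex and its whitespace "analyzer" are invented for this sketch. The stored original document plays the role of _source, and lookups only ever touch the separate token index.

```python
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase the text and split it on whitespace."""
    return text.lower().split()


class ToyIndex(object):
    def __init__(self):
        self.sources = {}                 # doc id -> original document ("_source")
        self.postings = defaultdict(set)  # (field, token) -> set of doc ids

    def index(self, doc_id, doc):
        # Keep the original document untouched for later retrieval.
        self.sources[doc_id] = doc
        # Build the search structure at index time.
        for field, value in doc.items():
            for token in analyze(str(value)):
                self.postings[(field, token)].add(doc_id)

    def search(self, field, text):
        tokens = analyze(text)
        if not tokens:
            return []
        # Only the postings are consulted; the stored sources are not scanned.
        matches = [self.postings.get((field, t), set()) for t in tokens]
        ids = set.intersection(*matches)
        # The sources are read only to return the hits.
        return [self.sources[i] for i in sorted(ids)]


idx = ToyIndex()
idx.index("1", {"my_id": "abc def"})
idx.index("2", {"my_id": "def ghi"})
hits = idx.search("my_id", "DEF")
```

Here `hits` contains both documents, because the token "def" was recorded for both ids when they were indexed.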

Thanks a lot Polyfractal, your reply was very informative.