How are multiple types handled at ingestion?

Hello,

I see that a field can have multiple types, but the documentation examples focus on the string vs. not_analyzed kind of situation.

I wonder if it's possible to have multiple types like:

  • numbers as both string and number types
  • dates as both string and date types
  • booleans as both string and boolean types

If so, how would that work?
Assuming the mapping is defined, would Elasticsearch parse the single input into the multiple types, or do I need to parse the input myself and build an object with the various types?
Is there any kind of Logstash plugin that does this?

Thanks

The mapping definition controls how the values in your JSON are mapped to entries in the search index. Normally a field called foo in your JSON is mapped to a field in your index called foo, but you can add additional interpretations in the fields section. Here's an example where we try to coerce a JSON field called foo into an integer field called foo.asInt:

PUT /test
{
   "mappings": {
      "doc": {
         "properties": {
            "foo": {
               "type": "keyword",
               "fields": {
                  "asInt": {
                     "type": "integer",
                     "ignore_malformed": true
                  }
               }
            }
         }
      }
   }
}
POST test/doc
{
   "foo": [1, 2, "n/a"]
}
GET test/_search
{
   "query": {
      "match": {
         "foo.asInt": 2
      }
   }
}
GET test/_search
{
   "query": {
      "match": {
         "foo": "n/a"
      }
   }
}

Hope this helps


Thanks Mark

I see you use the keyword type. Is that the trick here?
I also don't see this keyword feature in the docs for 2.3. Is this a 5.0 feature?

Thanks

Oops. Yes :slight_smile: It's essentially like a string with a not_analyzed setting.
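
If you're on 2.3, the rough equivalent of my earlier example would use string with index: not_analyzed in place of keyword. As a sketch (index and field names as before, untested against 2.3):

PUT /test
{
   "mappings": {
      "doc": {
         "properties": {
            "foo": {
               "type": "string",
               "index": "not_analyzed",
               "fields": {
                  "asInt": {
                     "type": "integer",
                     "ignore_malformed": true
                  }
               }
            }
         }
      }
   }
}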

The keyword type in my example is not central to the question of how you define multiple types. I could have picked any types, e.g. string, integer, etc., and organised them whichever way round I wanted. A common convention with strings is to have foo as the primary field, of type string and analyzed for search, with an alternative foo.raw field mapped as not_analyzed for use in analytics. Some have suggested that Kibana end users may be more inclined to pick the field foo than foo.raw as the subject of a pie chart, so it may make more sense to reverse the configuration: a not_analyzed foo field with an alternative field named foo.search which is analyzed.
Ultimately it's up to you to pick a convention that works.
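
As a sketch of that first convention in 2.x syntax (the index and field names here are just illustrative):

PUT /logs
{
   "mappings": {
      "doc": {
         "properties": {
            "title": {
               "type": "string",
               "fields": {
                  "raw": {
                     "type": "string",
                     "index": "not_analyzed"
                  }
               }
            }
         }
      }
   }
}

You'd then search against title and aggregate or sort on title.raw.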