I am very new to Elasticsearch and am currently working with v5.4.3.
I was going through the concept of mapping but could not get much clarity on the different types of mapping.
For example:
curl -XPUT 'http://localhost:9200/twitter/user/XYZ?pretty' -H 'Content-Type: application/json' -d '{ "name" : "ABC" }'
Is this automatic (default) mapping, since no mapping is defined before the data is actually inserted?
And is it explicit mapping when we define the mapping first and then insert data accordingly, like below?
[root@node1 ~]# curl -X PUT "localhost:9200/test" -H 'Content-Type: application/json' -d'
{
    "settings" : {
        "number_of_shards" : 1,
        "number_of_replicas" : 0
    },
    "mappings" : {
        "type1" : {
            "properties" : {
                "field1" : { "type" : "text" }
            }
        }
    }
}
'
{"acknowledged":true,"shards_acknowledged":true,"index":"test"}
And now I insert data after defining the mapping:
[root@node1 ~]# curl -X PUT "localhost:9200/test/type1/1?" -H 'Content-Type: application/json' -d'
{
"title": "User2-Document6"
}
'
Is my understanding correct? I read the mapping documentation on the official site, but it is still not very clear to me.
Kindly help me understand this.
Any leads would be highly appreciated!
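With your first example, ES used dynamic mapping and turned your name field into both text and keyword (look at multi-fields). The text version gets run through an analyzer, while the keyword version keeps the original string exactly as you sent it.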
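If you want to see this for yourself, you can ask ES for the mapping it generated. This is only a minimal sketch: it assumes the node from your example is reachable on localhost:9200, and the show_mapping helper name is made up for illustration.

```shell
# Assumption: Elasticsearch 5.x is listening here, as in the question.
ES_URL="http://localhost:9200"

# Hypothetical helper: print the mapping of the index passed as $1.
show_mapping() {
  curl -s -X GET "$ES_URL/$1/_mapping?pretty"
}

# Usage (needs a running cluster):
#   show_mapping twitter
#
# On 5.x a dynamically mapped string field should look roughly like:
#   "name" : {
#     "type" : "text",
#     "fields" : {
#       "keyword" : { "type" : "keyword", "ignore_above" : 256 }
#     }
#   }
```

The keyword sub-field is then addressed as name.keyword in queries and aggregations.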
With your second example, you are correct, you are using explicit mapping. You are setting up a mapping type of type1 with properties, and field1 is where you specify the field name.
You then indexed a document with a title field, so you now have mappings for both field1 and title, which I don't think was your intention. You probably wanted just the title field. So you could have done the following
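As a rough sketch of that idea, here is the same index creation with title mapped explicitly instead of field1. It assumes the index does not exist yet (delete the earlier test index first, or pick another name), and the create_index helper is just an illustrative name.

```shell
# Assumption: Elasticsearch 5.x on localhost:9200, as in the question.
ES_URL="http://localhost:9200"

# Same settings as before, but the mapped field is "title".
INDEX_BODY='{
  "settings": { "number_of_shards": 1, "number_of_replicas": 0 },
  "mappings": {
    "type1": {
      "properties": {
        "title": { "type": "text" }
      }
    }
  }
}'

# Hypothetical helper: create the index named in $1 with the body above.
create_index() {
  curl -s -X PUT "$ES_URL/$1?pretty" \
       -H 'Content-Type: application/json' -d "$INDEX_BODY"
}

# Usage (needs a running cluster):
#   create_index test
```

Because title is declared as text only, it will not get the keyword sub-field that dynamic mapping would have added.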
And I inserted the same data 5 times (5 docs):
curl -X PUT "localhost:9200/elasticsearch_data/user/1?" -H 'Content-Type: application/json' -d'
{
"text": "This is a string only "
}
'
Case 2
I'm glad that was helpful! I'll do my best to try and answer your other questions.
String values are automatically mapped as multi-fields by ES: a text field plus a keyword sub-field. The text field is analyzed, by default with the Standard Analyzer, which uses the standard tokenizer to divide the text into tokens for the inverted index. The keyword sub-field is not analyzed; it holds the exact text that was put into ES. There is an older blog post here that could benefit you by explaining why Elastic made the switch to multi-fields for string types.
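To see what the analyzer does to a string, you could try the _analyze API. The snippet below is a sketch under the same assumption of a node on localhost:9200; analyze_text is a made-up helper name, and the sample text is the one from your own document.

```shell
# Assumption: Elasticsearch 5.x on localhost:9200, as in the question.
ES_URL="http://localhost:9200"

# Ask the standard analyzer to tokenize a sample string.
ANALYZE_BODY='{ "analyzer": "standard", "text": "This is a string only" }'

# Hypothetical helper: send the body above to the _analyze endpoint.
analyze_text() {
  curl -s -X POST "$ES_URL/_analyze?pretty" \
       -H 'Content-Type: application/json' -d "$ANALYZE_BODY"
}

# Usage (needs a running cluster):
#   analyze_text
```

The response lists the tokens that would go into the inverted index for a text field, whereas a keyword field would index the whole string as a single term.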
Having your data stored as text allows full-text search. Having your data stored as keyword allows exact-value searches and aggregations to be performed.
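Here is a small sketch of the difference, using the twitter index and name field from your first example. The search_twitter helper and the aggregation name "names" are assumptions chosen for illustration.

```shell
# Assumption: Elasticsearch 5.x on localhost:9200, as in the question.
ES_URL="http://localhost:9200"

# Full-text search on the analyzed text field. The query text is analyzed
# too, so "abc" is compared against the lowercased token of "ABC".
MATCH_QUERY='{ "query": { "match": { "name": "abc" } } }'

# Exact-value search on the keyword sub-field: the value must match as stored.
TERM_QUERY='{ "query": { "term": { "name.keyword": "ABC" } } }'

# Aggregation on the keyword sub-field: document count per distinct name.
TERMS_AGG='{ "size": 0, "aggs": { "names": { "terms": { "field": "name.keyword" } } } }'

# Hypothetical helper: run the request body passed as $1 against the index.
search_twitter() {
  curl -s -X POST "$ES_URL/twitter/_search?pretty" \
       -H 'Content-Type: application/json' -d "$1"
}

# Usage (needs a running cluster):
#   search_twitter "$MATCH_QUERY"
#   search_twitter "$TERM_QUERY"
#   search_twitter "$TERMS_AGG"
```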
Regarding your last question about the difference in the index sizes, I don't know.