'_all' field for multi field data store occupying more space


(anusha) #1

Hello Team,

I am using the '_all' field to query across multiple fields. My mappings are shown below:

"mappings": {
  "ymme_type": {
    "_all": {
      "auto_boost": true,
      "index_analyzer": "wordAnalyzer",
      "search_analyzer": "whitespace_analyzer"
    },
    "properties": {
      "Engine": {
        "type": "string",
        "index": "not_analyzed",
        "fields": {
          "raw": {
            "type": "string"
          }
        }
      },
      "EngineCode": {
        "type": "string",
        "include_in_all": false
      },
      "Make": {
        "type": "string",
        "boost": 3,
        "index": "not_analyzed",
        "norms": {
          "enabled": true
        },
        "fields": {
          "raw": {
            "type": "string"
          }
        }
      },
      "MakeCode": {
        "type": "string",
        "include_in_all": false
      },
      "Model": {
        "type": "string",
        "boost": 2,
        "index": "not_analyzed",
        "norms": {
          "enabled": true
        },
        "fields": {
          "raw": {
            "type": "string"
          }
        }
      },
      "ModelCode": {
        "type": "string",
        "include_in_all": false
      },
      "ShortYear": {
        "type": "string",
        "boost": 4,
        "index": "not_analyzed",
        "norms": {
          "enabled": true
        }
      },
      "Year": {
        "type": "string",
        "boost": 5,
        "index": "not_analyzed",
        "norms": {
          "enabled": true
        },
        "fields": {
          "raw": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      },
      "YearCode": {
        "type": "string",
        "include_in_all": false
      }
    }
  }
}

When I load 34,000 records with this mapping, the index occupies 25 MB.
When I load the same 34,000 records using the mappings without the '_all' field, the index occupies only 6 MB.
May I know the reason for the difference?
And is there any way to make my index use less memory?


(Mark Walkom) #2

What do you mean by "space" here? Are you talking about heap or disk, and how are you measuring it?


(anusha) #3

Hello Mark,
In Sense, when we execute
GET _cat/indices?v

it returns a store.size value. That storage size is what I am talking about.


(anusha) #4

Mark,
May I know the difference between heap and disk space?


(Mark Walkom) #5

Well, _all is a shortcut field for searching, and it contains the analysed contents of every field (except those with include_in_all set to false), so you would expect an index with _all enabled to be larger than the same index with _all disabled.
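
As a rough illustration of why this adds up, here is a toy Python model (not Elasticsearch code). It assumes _all is simply the concatenation of the values of every field that does not set include_in_all to false. The field names come from the mapping in post #1; the sample values and the helper name build_all_field are made up for the illustration.

```python
# Toy model only: Elasticsearch builds _all internally at index time.
# Field names follow the mapping in post #1; the values are invented.

# Fields that the mapping marks with "include_in_all": false.
EXCLUDED = {"EngineCode", "MakeCode", "ModelCode", "YearCode"}


def build_all_field(doc, excluded=EXCLUDED):
    """Concatenate the values of all fields that are included in _all."""
    return " ".join(str(v) for k, v in doc.items() if k not in excluded)


doc = {
    "Year": "2015",
    "ShortYear": "15",
    "Make": "Honda",
    "Model": "Civic",
    "Engine": "1.8L",
    "EngineCode": "E1",
    "MakeCode": "M1",
    "ModelCode": "C1",
    "YearCode": "Y1",
}

all_text = build_all_field(doc)
print(all_text)  # 2015 15 Honda Civic 1.8L
```

Each included value is therefore indexed twice: once in its own field and once, after analysis, in _all. How much extra space that takes depends on the analyzer configured for _all.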

Heap is the JVM memory used for querying and aggregations. Disk is where the data is actually stored, and that is what store.size reports.
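
If searching through _all is not needed, one option for a smaller index is to disable it in the mapping. Below is a minimal Python sketch that builds such a request body. It assumes an Elasticsearch version that still supports the _all field and its enabled setting, as the string / not_analyzed mapping in post #1 implies. The properties are trimmed to two fields for brevity, and the index would have to be recreated and the data reloaded for the change to take effect.

```python
import json

# Sketch of the mapping from post #1 with _all turned off.
# Only two of the original properties are shown to keep it short.
mapping = {
    "mappings": {
        "ymme_type": {
            "_all": {"enabled": False},
            "properties": {
                "Make": {"type": "string", "index": "not_analyzed"},
                "Model": {"type": "string", "index": "not_analyzed"},
            },
        }
    }
}

# JSON request body that could be sent when creating the index, e.g. from Sense.
body = json.dumps(mapping, indent=2)
print(body)
```

With _all disabled, queries have to name the fields they search (for example with a multi_match query) instead of relying on _all.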

