I'm wondering how to create a mapping for the document below, where a nested object has too many unique, unknown keys within a single document.
When I index it with dynamic mapping, I receive the following error:
{
  "took" : 2371,
  "errors" : true,
  "items" : [
    {
      "index" : {
        "_index" : "pricing",
        "_type" : "redshift",
        "_id" : "jvOxcHIBZBHEiq-WwVYW",
        "status" : 400,
        "error" : {
          "type" : "illegal_argument_exception",
          "reason" : "Limit of total fields [1000] in index [pricing] has been exceeded"
        }
      }
    }
  ]
}
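To get a feel for why the limit is hit, here is a rough Python sketch that counts the distinct leaf paths in a trimmed copy of the document. The helper name leaf_paths is my own, and the count is only an approximation: as far as I understand, Elasticsearch also counts object mappings and multi-fields toward the limit, so the real number is likely higher.

```python
def leaf_paths(node, prefix=""):
    """Yield the dotted path of every leaf value in a nested dict."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from leaf_paths(value, path)
    else:
        # Anything that is not a dict (string, number, list) is treated as a leaf.
        yield prefix


# Trimmed-down version of the pricing document, with two SKUs only.
doc = {
    "offerCode": "AmazonRedshift",
    "products": {
        "KNVR5PFC54PXV76M": {
            "sku": "KNVR5PFC54PXV76M",
            "attributes": {"instanceType": "dc2.large"},
        },
        "UUE32SQ6PQ9F6JHJ": {
            "sku": "UUE32SQ6PQ9F6JHJ",
            "attributes": {"instanceType": "dc1.8xlarge"},
        },
    },
}

paths = sorted(set(leaf_paths(doc)))
print(len(paths))  # every additional SKU adds its own set of paths
for p in paths:
    print(p)
```

Because each SKU is used as a key, every product contributes its own copy of the sku and attributes.* paths, so the number of mapped fields grows with the number of products rather than staying fixed.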
How would you create a mapping when the keys within an object are unknown?
For example, products is a nested object, but it can contain many different keys, one for each version of the product:
{
  "formatVersion": "v1.0",
  "disclaimer": "This pricing list is for informational purposes only. All prices are subject to the additional terms included in the pricing pages on http://aws.amazon.com. All Free Tier prices are also subject to the terms included at https://aws.amazon.com/free/",
  "offerCode": "AmazonRedshift",
  "version": "20200511174831",
  "publicationDate": "2020-05-11T17:48:31Z",
  "products": {
    "KNVR5PFC54PXV76M": {
      "sku": "KNVR5PFC54PXV76M",
      "productFamily": "Compute Instance",
      "attributes": {
        "servicecode": "AmazonRedshift",
        "location": "Africa (Cape Town)",
        "locationType": "AWS Region",
        "instanceType": "dc2.large",
        "currentGeneration": "Yes",
        "vcpu": "2",
        "memory": "15 GiB",
        "storage": "0.16TB SSD",
        "io": "0.60 GB/s",
        "usagetype": "AFS1-Node:dc2.large",
        "operation": "RunComputeNode:0001",
        "ecu": "7",
        "servicename": "Amazon Redshift",
        "usageFamily": "Dense Compute"
      }
    },
    "UUE32SQ6PQ9F6JHJ": {
      "sku": "UUE32SQ6PQ9F6JHJ",
      "productFamily": "Compute Instance",
      "attributes": {
        "servicecode": "AmazonRedshift",
        "location": "US East (Ohio)",
        "locationType": "AWS Region",
        "instanceType": "dc1.8xlarge",
        "currentGeneration": "No",
        "vcpu": "32",
        "memory": "244 GiB",
        "storage": "2.56TB SSD",
        "io": "3.70 GB/s",
        "usagetype": "USE2-Node:dw2.8xlarge",
        "operation": "RunComputeNode:0001",
        "ecu": "104",
        "servicename": "Amazon Redshift",
        "usageFamily": "Dense Compute"
      }
    }
  },
  "terms": {
    "OnDemand": {
      "KNVR5PFC54PXV76M": {
        "KNVR5PFC54PXV76M.JRTCKXETXF": {
          "offerTermCode": "JRTCKXETXF",
          "sku": "KNVR5PFC54PXV76M",
          "effectiveDate": "2020-04-01T00:00:00Z",
          "priceDimensions": {
            "KNVR5PFC54PXV76M.JRTCKXETXF.6YS6EN2CT7": {
              "rateCode": "KNVR5PFC54PXV76M.JRTCKXETXF.6YS6EN2CT7",
              "description": "$0.357 per Redshift Dense Compute Large (DC2.L) Compute Node-hour (or partial hour)",
              "beginRange": "0",
              "endRange": "Inf",
              "unit": "Hrs",
              "pricePerUnit": {
                "USD": "0.3570000000"
              },
              "appliesTo": []
            }
          },
          "termAttributes": {}
        }
      }
    },
    "Reserved": {
      "U4V6Y3USKUYCB6Q5": {
        "U4V6Y3USKUYCB6Q5.6QCMYABX3D": {
          "offerTermCode": "6QCMYABX3D",
          "sku": "U4V6Y3USKUYCB6Q5",
          "effectiveDate": "2016-05-31T23:59:59Z",
          "priceDimensions": {
            "U4V6Y3USKUYCB6Q5.6QCMYABX3D.2TG2D8R56U": {
              "rateCode": "U4V6Y3USKUYCB6Q5.6QCMYABX3D.2TG2D8R56U",
              "description": "Upfront Fee",
              "unit": "Quantity",
              "pricePerUnit": {
                "USD": "34800"
              },
              "appliesTo": []
            },
            "U4V6Y3USKUYCB6Q5.6QCMYABX3D.6YS6EN2CT7": {
              "rateCode": "U4V6Y3USKUYCB6Q5.6QCMYABX3D.6YS6EN2CT7",
              "description": "USD 0.0 per Redshift, dw2.8xlarge reserved instance applied",
              "beginRange": "0",
              "endRange": "Inf",
              "unit": "Hrs",
              "pricePerUnit": {
                "USD": "0.0000000000"
              },
              "appliesTo": []
            }
          },
          "termAttributes": {
            "LeaseContractLength": "1yr",
            "OfferingClass": "standard",
            "PurchaseOption": "All Upfront"
          }
        }
      }
    }
  }
}
I tried flattening the object, but the search still returns the entire document:
GET /amzpricereportnew/_search
{
  "query": {
    "nested": {
      "path": "products",
      "query": {
        "bool": {
          "must": [
            {"match": {"products.KNVR5PFC54PXV76M.attributes.instanceType": "dc2.large"}}
          ]
        }
      }
    }
  }
}
Most documentation for nested objects assumes the keys are known in advance. In this case, we don't know the keys beforehand, other than that they are unique. When I run the query above, the entire document is returned.
What we are trying to find is the cost of a product from the terms object. Are there any best practices for designing index mappings for schemas like this, where the object keys have high cardinality?
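One direction I'm considering, shown here only as a rough Python sketch and not as a known best practice, is to reshape the file before indexing so that each SKU becomes its own document with fixed field names, with the SKU stored as a value instead of a key. The function name split_by_sku and the output field names (prices, termType) are my own assumptions, not part of the AWS format.

```python
def split_by_sku(offer):
    """Turn one pricing file into a list of per-SKU documents with fixed field names."""
    docs = []
    for sku, product in offer.get("products", {}).items():
        prices = []
        # terms is keyed by term type (OnDemand, Reserved), then by SKU.
        for term_type, by_sku in offer.get("terms", {}).items():
            for term in by_sku.get(sku, {}).values():
                for dim in term.get("priceDimensions", {}).values():
                    prices.append({
                        "termType": term_type,
                        "offerTermCode": term.get("offerTermCode"),
                        "rateCode": dim.get("rateCode"),
                        "description": dim.get("description"),
                        "unit": dim.get("unit"),
                        "pricePerUnit": dim.get("pricePerUnit", {}),
                    })
        docs.append({
            "sku": sku,
            "productFamily": product.get("productFamily"),
            "attributes": product.get("attributes", {}),
            "prices": prices,
        })
    return docs


# Trimmed-down sample taken from the document above.
offer = {
    "products": {
        "KNVR5PFC54PXV76M": {
            "sku": "KNVR5PFC54PXV76M",
            "productFamily": "Compute Instance",
            "attributes": {"instanceType": "dc2.large"},
        }
    },
    "terms": {
        "OnDemand": {
            "KNVR5PFC54PXV76M": {
                "KNVR5PFC54PXV76M.JRTCKXETXF": {
                    "offerTermCode": "JRTCKXETXF",
                    "sku": "KNVR5PFC54PXV76M",
                    "priceDimensions": {
                        "KNVR5PFC54PXV76M.JRTCKXETXF.6YS6EN2CT7": {
                            "rateCode": "KNVR5PFC54PXV76M.JRTCKXETXF.6YS6EN2CT7",
                            "description": "$0.357 per Redshift Dense Compute Large (DC2.L) Compute Node-hour (or partial hour)",
                            "unit": "Hrs",
                            "pricePerUnit": {"USD": "0.3570000000"},
                        }
                    },
                }
            }
        }
    },
}

docs = split_by_sku(offer)
print(docs[0]["sku"], docs[0]["prices"][0]["pricePerUnit"])
```

With this shape the field names no longer depend on the SKU, so a search on attributes.instanceType would return only the matching per-SKU documents. I have not verified how well this scales or whether prices should be mapped as a nested type.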