I have about 1.5 million documents in Elasticsearch. I want to reindex them into several indices: each index holds the documents that contain certain keywords, plus one (the null index) that holds the documents containing none of the keywords used for the other indices. I'm not sure why my indices ended up with fewer documents than expected. In particular, I expected about 1.2 million documents in the null index, but the new index only contains about 30k. I would appreciate ideas on what I've done wrong here!
This is how I reindex documents that contain certain keywords in any of several fields:
curl --location --request POST 'http://abcdef2344:9200/_reindex' \
--header 'Content-Type: application/json' \
--data-raw '{
  "source": {
    "index": "mydocs_email_*",
    "query": {
      "bool": {
        "filter": [
          {
            "bool": {
              "should": [
                {
                  "multi_match": {
                    "fields": [
                      "content",
                      "meta.raw.Message:Raw-Header:Subject"
                    ],
                    "query": "keyword1"
                  }
                },
                {
                  "multi_match": {
                    "fields": [
                      "content",
                      "meta.raw.Message:Raw-Header:Subject"
                    ],
                    "query": "keyword2"
                  }
                }
              ]
            }
          }
        ]
      }
    }
  },
  "dest": {
    "index": "analysis_keywords"
  }
}'
Then I use must_not to create another index with the documents that do not contain keyword1 or keyword2:
curl --location --request POST 'http://abcdef2344:9200/_reindex' \
--header 'Content-Type: application/json' \
--data-raw '{
  "source": {
    "index": "mydocs_email_*",
    "query": {
      "bool": {
        "filter": [
          {
            "bool": {
              "must_not": [
                {
                  "multi_match": {
                    "fields": [
                      "content",
                      "meta.raw.Message:Raw-Header:Subject"
                    ],
                    "query": "keyword1"
                  }
                },
                {
                  "multi_match": {
                    "fields": [
                      "content",
                      "meta.raw.Message:Raw-Header:Subject"
                    ],
                    "query": "keyword2"
                  }
                }
              ]
            }
          }
        ]
      }
    }
  },
  "dest": {
    "index": "analysis_null"
  }
}'
The null index ended up with 29.7k documents. From the response below, it looks like I should expect about 1.28 million documents (the "total" value). The error also says the limit on the number of fields in the index was exceeded, so I increased that limit after running the commands above, but the document count stayed the same.
{"took":53251,"timed_out":false,"total":1277428,"updated":243,"created":29755,"deleted":0,"batches":30,"version_conflicts":0,"noops":0,"retries":{"bulk":0,"search":0},"throttled_millis":0,"requests_per_second":-1.0,"throttled_until_millis":0,"failures":[{"index":"analysis_null","type":"_doc","id":"/email/.......msg","cause":{"type":"illegal_argument_exception","reason":"Limit of total fields [1000] in index [analysis_null] has been exceeded"},"status":400}]
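For reference, raising the field limit is a settings update on the destination index. The following is only a minimal sketch, assuming the standard index.mapping.total_fields.limit index setting; the helper names settings_body and raise_field_limit and the value 2000 are hypothetical placeholders, not necessarily the exact command that was run:

```shell
# Build the JSON settings body for a given field limit (hypothetical helper).
settings_body() {
  printf '{"index.mapping.total_fields.limit": %d}' "$1"
}

# Apply the limit to an index: raise_field_limit <index> <limit> (hypothetical helper).
raise_field_limit() {
  curl --location --request PUT "http://abcdef2344:9200/$1/_settings" \
    --header 'Content-Type: application/json' \
    --data-raw "$(settings_body "$2")"
}

# Example call; 2000 is an arbitrary placeholder, not a recommended value:
# raise_field_limit analysis_null 2000
```

Changing the setting does not by itself copy any further documents, so it would presumably need to be in place on analysis_null before the reindex is run again.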