Hello,
I recently started using Elasticsearch to store some data.
While learning, I was doing OK until I tried to use the bulk API.
I used something along the lines of:
conn = ES(url, timeout, bulksize)
for tuple in tuples:
    data = something(tuple)
    conn.index(data, index_name, count, bulk=True)
I imagined this would add a large number of items under localhost:9200/(index name)/(type name)/(count), but it ended up adding a large amount of garbage directly under localhost:9200/(index name).
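Written out more fully, the loop was roughly the following. This is with the pyes client; the server address, timeout, bulk size, and the load_rows()/make_document() helpers are placeholders standing in for my real code, not exactly what I ran:

from pyes import ES

# "darknet" is the index name (it shows up in the output further down);
# load_rows() and make_document() stand in for my own parsing code
conn = ES("localhost:9200", timeout=30.0, bulk_size=1000)

count = 0
for row in load_rows():
    data = make_document(row)
    # bulk=True queues the document; pyes sends a bulk request once
    # bulk_size documents have accumulated
    conn.index(data, "darknet", count, bulk=True)  # I expected these to land under /(index name)/(type name)/(count)
    count += 1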
Not sure how to proceed, but knowing that I needed a reset, I went ahead and did a DELETE on localhost:9200/*, which deleted everything as expected. I then attempted to begin again by creating an index with a mapping:
curl -XPUT "http://localhost:9200/(index name)" -d mapping.json
mapping.json:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "ngram_analyzer": {
          "tokenizer": "ngram_tokenizer"
        }
      },
      "tokenizer": {
        "ngram_tokenizer": {
          "type": "nGram",
          "min_gram": "2",
          "max_gram": "3",
          "token_chars": [
            "letter",
            "digit",
            "symbol",
            "punctuation",
            "whitespace"
          ]
        }
      }
    }
  },
  "mappings": {
    "access": {
      "properties": {
        "date": {
          "type": "date",
          "format": "YYYY-MM-dd",
          "analyzer": "ngram_analyzer"
        },
        "time": {
          "type": "date",
          "format": "HH:mm:ss",
          "analyzer": "ngram_analyzer"
        },
        "protocol": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "source ip": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "source port": {
          "type": "integer"
        },
        "country": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "organization": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "dest ip": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "dest port": {
          "type": "integer"
        }
      }
    }
  }
}
This is where the troubles began. While the above command did create the index at (index name), none of my settings or mappings were actually applied.
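To check, I fetched the index back with something along the lines of the command below (if I recall it correctly), and the response that follows is what came back:

curl -XGET "http://localhost:9200/(index name)?pretty"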
{
  "darknet" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1424425712525",
        "mapping" : {
          "json" : ""
        },
        "uuid" : "cgTEPkqnQJKejLPyHqVNYA",
        "number_of_replicas" : "1",
        "number_of_shards" : "5",
        "version" : {
          "created" : "1040399"
        }
      }
    },
    "warmers" : { }
  }
}
I then attempted to update the index with curl -XPUT "http://localhost:9200/(index name)/_setting" -d mapping.json after closing it, but that also left the index blank.
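Spelled out, that attempt was roughly the following sequence (close the index, push the settings, and, I believe, reopen it afterwards; the middle command is the one quoted above):

curl -XPOST "http://localhost:9200/(index name)/_close"
curl -XPUT "http://localhost:9200/(index name)/_setting" -d mapping.json
curl -XPOST "http://localhost:9200/(index name)/_open"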
I deleted the index, created another one, and so on, but to no avail.
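Each of those rounds was essentially just:

curl -XDELETE "http://localhost:9200/(index name)"
curl -XPUT "http://localhost:9200/(index name)" -d mapping.json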
In the end, I did manage to make a change to the index, though not a good one.
I made a call to the update-mapping API, which turned the index into a horrifying monstrosity with repeating tags, like so:
{
  "darknet" : {
    "aliases" : { },
    "mappings" : {
      "1" : {
        "properties" : {
          "mappings" : {
            "properties" : {
              "access" : {
                "properties" : {
                  "properties" : {
                    "properties" : {
                      "country" : {
                        "properties" : {
                          "analyzer" : { "type" : "string" },
                          "type" : { "type" : "string" }
                        }
                      },
                      "date" : {
                        "properties" : {
                          "analyzer" : { "type" : "string" },
                          "format" : { "type" : "string" },
                          "type" : { "type" : "string" }
                        }
                      },
                      "dest ip" : {
                        "properties" : {
                          "analyzer" : { "type" : "string" },
                          "type" : { "type" : "string" }
                        }
                      },
                      "dest port" : {
                        "properties" : {
                          "type" : { "type" : "string" }
                        }
                      },
                      "organization" : {
                        "properties" : {
                          "analyzer" : { "type" : "string" },
                          "type" : { "type" : "string" }
                        }
                      },
                      "protocol" : {
                        "properties" : {
                          "analyzer" : { "type" : "string" },
                          "type" : { "type" : "string" }
                        }
                      },
                      "source ip" : {
                        "properties" : {
                          "analyzer" : { "type" : "string" },
                          "type" : { "type" : "string" }
                        }
                      },
                      "source port" : {
                        "properties" : {
                          "type" : { "type" : "string" }
                        }
                      },
                      "time" : {
                        "properties" : {
                          "analyzer" : { "type" : "string" },
                          "format" : { "type" : "string" },
                          "type" : { "type" : "string" }
                        }
                      }
                    }
                  }
                }
              }
            }
          },
          "settings" : {
            "properties" : {
              "analysis" : {
                "properties" : {
                  "analyzer" : {
                    "properties" : {
                      "ngram_analyzer" : {
                        "properties" : {
                          "tokenizer" : { "type" : "string" }
                        }
                      }
                    }
                  },
                  "tokenizer" : {
                    "properties" : {
                      "ngram_tokenizer" : {
                        "properties" : {
                          "max_gram" : { "type" : "string" },
                          "min_gram" : { "type" : "string" },
                          "token_chars" : { "type" : "string" },
                          "type" : { "type" : "string" }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1424363851475",
        "uuid" : "TNBwPJxURz2kwA-bQJklIQ",
        "number_of_replicas" : "1",
        "number_of_shards" : "5",
        "version" : {
          "created" : "1040399"
        }
      }
    },
    "warmers" : { }
  }
}
I'm guessing I broke something in the process of bulk indexing and subsequent mass deletion.
I'm currently looking for a way to either fix this or reset Elasticsearch on my machine to a "factory state". I would much appreciate it if you could give me a hand with this.