How to start over - Help


(Martin) #1

I need help
I am still a beginner, and I did the following:
I am trying to load a large amount of data for a comparison test.

I have created a simple PHP program to read MySQL data and load it into ES.
I have created an analyzer that I use on 1 field (char_filter:html_strip), and I have the rest of my fields mapped normally (not_analyzed).
I ran my program today and let it run for about 2 hours (it loaded around 50K documents), and then I had to leave work, so I have to start over. It was taking too long, so I am going to try running it via the PHP CLI instead of via HTTP to, hopefully, improve the speed of the load.
I got home and attempted to DELETE the whole mammoth index. It finally showed that it was gone, but it is not.
I ran this with the default settings, which I know now was a rookie mistake.
Now, when I crank up ES, it jumps to about 600 MB of RAM and looks to be INITIALIZING 2 of the 5 shards.
So I can't do anything.

Do I just wait on this INITIALIZING to finish or is there a way to just start over clean?

This is a test on my local PC, and I don't want to burn it up a couple thousand miles from my real home.


(Boaz Leskes) #2

Heya, I strongly recommend you read the definitive guide. It's a great read and will get you started on the basic concepts. It's available online - see https://www.elastic.co/guide/en/elasticsearch/guide/current/index.html

Another good resource is here: https://www.elastic.co/webinars/get-started-with-elasticsearch/?baymax=default&elektra=docs&storm=top-video


(Martin) #3

I was working through it as I went. Being pushed from higher

I take your answer to mean that I have to let Java finish cleaning up, or whatever it is doing.


(Martin) #4

This is a local dev (non-production) solution to my problem.

Answering the question, "How to start over?"

If, by chance, someone does make the same mistake I made and needs an actual answer.

I REPEAT! :smiley: This is a local development environment. It can just go away.

All you need to do is change your config to some other cluster and node name, then go here and delete the old index. I guess deleting the old cluster would be OK too. It is never to be referenced again.

{path to}\elasticsearch\data\{your old cluster name}\nodes\0\indices

Crank up elasticsearch, and the fear of your PC heating the room is gone.

I guess I missed this page in the Definitive Guide. I will, of course, start over at page 1.
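The manual cleanup described above can be sketched in Python. This is only an illustration for a throwaway local dev setup: the function names and the `dry_run` flag are made up for the example, and the directory layout is assumed to match the path quoted in this post (node `0` under the old cluster name). Stop the node before deleting anything.

```python
import shutil
from pathlib import Path


def old_indices_dir(es_home, cluster_name):
    """Return the indices directory of node 0 for the given cluster name."""
    # Layout assumed from the path in the post: data\<cluster>\nodes\0\indices
    return Path(es_home) / "data" / cluster_name / "nodes" / "0" / "indices"


def remove_old_indices(es_home, cluster_name, dry_run=True):
    """List, and optionally delete, index directories left by an old cluster.

    Returns the sorted names of the index directories that were found.
    With dry_run=True (the default) nothing is deleted.
    """
    indices = old_indices_dir(es_home, cluster_name)
    if not indices.is_dir():
        return []
    found = sorted(p.name for p in indices.iterdir() if p.is_dir())
    if not dry_run:
        for name in found:
            # Irreversible: only do this when the data can just go away.
            shutil.rmtree(indices / name)
    return found
```

Calling `remove_old_indices(es_home, "old_cluster_name")` first, with the default `dry_run=True`, shows what would be removed before anything is actually deleted.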


(Christian Dahlqvist) #5

How did you delete the index? Did you use the delete index API?
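For anyone unsure what the delete index API call looks like: it is an HTTP DELETE against the index name. Below is a minimal Python sketch, assuming a node listening on the default `http://localhost:9200`; the helper names and the index name in the usage note are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_delete_request(index, base_url="http://localhost:9200"):
    """Build (but do not send) the HTTP DELETE request for one index."""
    # Percent-encode the index name so it is safe to place in the URL path.
    safe_index = urllib.parse.quote(index, safe="")
    return urllib.request.Request(f"{base_url}/{safe_index}", method="DELETE")


def delete_index(index, base_url="http://localhost:9200"):
    """Send the DELETE request and return the parsed JSON response body."""
    with urllib.request.urlopen(build_delete_request(index, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

`delete_index("my_test_index")` needs a running node; `build_delete_request` alone can be used to inspect the URL and method without touching the cluster.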


(Martin) #6

I did do that initially, and I was also able to query and see that it was not there; but when I stopped and restarted the node, there it was. Really confusing for me. I then reviewed shard status and saw that 3 of my 5 shards were INITIALIZING, and I could not interrupt it. I researched for hours. I went to Task Manager and it was using between 600 and 700 MB of memory. I also pulled up my PC Health Monitor, and my fan was constantly running to keep the CPU cool.
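The shard status check can be tallied with a few lines of Python. This is a sketch that assumes the plain-text output of `GET /_cat/shards` with its default column order (index, shard, prirep, state, ...); the function name is made up for the example.

```python
from collections import Counter


def count_shard_states(cat_shards_text):
    """Count shards per state from the text output of GET /_cat/shards."""
    counts = Counter()
    for line in cat_shards_text.splitlines():
        cols = line.split()
        # The state is assumed to be the 4th column; skip short lines
        # and the header row that appears when ?v is used.
        if len(cols) >= 4 and cols[3] != "state":
            counts[cols[3]] += 1
    return counts
```

A non-zero count for `INITIALIZING` after a restart corresponds to the situation described here.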

I stopped the node, changed my elasticsearch.yml to a new cluster name and node name, and then went through the directory path using Windows Explorer to delete the index as described above. I am using Postman for everything at this time, so I only lost data. You only really need to delete it if you are concerned with disk space, which I am not at this time, but I suffered enough with it to really want it gone.

I am just testing and trying to learn, so the data I had loaded was not important to me. I will have to load it again anyway once I get the analyzers and mappings where they need to be.

In the end, my PC is not a good resource for loading big data. I trimmed down the dataset so that I could complete my testing. I also set up the ES_HEAP_SIZE environment variable. I am running on Windows 8.1.

I hope this helps, and I REPEAT, I am testing in a local dev environment. :slight_smile:

