Elasticsearch reindex only appears to take a few documents


#1

Hello there,

I am trying to reindex some data in index firewall2 to index firewall4. I set up the firewall4 index and its mapping. When I try to run the reindex command, I see the following:

curl -XPOST 'localhost:9200/_reindex?pretty' -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "firewall2"
  },
  "dest": {
    "index": "firewall4"
  }
}
'
{
  "took" : 2,
  "timed_out" : false,
  "total" : 0,
  "updated" : 0,
  "created" : 0,
  "batches" : 0,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : 0,
  "throttled_millis" : 0,
  "requests_per_second" : "unlimited",
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

There are more than 2 records in the original index; there are approximately 40k. This doesn't display any errors, so I would think things are ok, but when I try to make the index pattern (firewall4-*) in Kibana, it says it cannot find data matching the index pattern. There was no data originally in the firewall4 index, so I did not care if anything existing would be deleted or not.

Here are all the indices:

firewall/             firewall2-2018-02-07/ firewall4/       .kibana/ 
firewall2/            firewall3/ 
firewall-2018-02-07/  firewall3-2018-02-07/

The "firewall" index had data as well, but I deleted that as part of a test. What I want to do now is reindex the data from firewall2 and firewall3 to firewall4, but I have no luck.

Any assistance would be great :smiley:


(David Pilato) #2

What makes you think that the reindex operation is done?

Did you check the running tasks?
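
For reference, a minimal sketch of how that check could look, assuming Elasticsearch is reachable on localhost:9200 as in the original post. The ES_HOST variable and the list_reindex_tasks function name are made up for illustration; the detailed and actions query parameters belong to the Task API. Keep in mind that _reindex waits for completion by default, so a response that already shows "total" : 0 generally means the request finished without finding any documents to copy.

```shell
# Assumption: Elasticsearch listens on localhost:9200, as in the thread.
ES_HOST="${ES_HOST:-localhost:9200}"

# Hypothetical helper: list only tasks whose action name ends in "reindex".
# The URL is quoted so the shell does not try to interpret "*", "?" or "&".
list_reindex_tasks() {
  curl -XGET "$ES_HOST/_tasks?pretty&detailed=true&actions=*reindex"
}
```

Calling list_reindex_tasks while a reindex is still running should list it under "tasks"; if nothing matches, no reindex is in flight.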


#3

I am not seeing anything that looks like reindexing, unless this is what it's supposed to look like :smiley:

curl -XGET 'localhost:9200/_tasks/?pretty'
{
  "nodes" : {
    "X24HtNOOSrGFp2KQQldJPw" : {
      "name" : "Scimitar",
      "transport_address" : "localhost:9300",
      "host" : "localhost",
      "ip" : "localhost:9300",
      "tasks" : {
        "X24HtNOOSrGFp2KQQldJPw:590" : {
          "node" : "X24HtNOOSrGFp2KQQldJPw",
          "id" : 590,
          "type" : "transport",
          "action" : "cluster:monitor/tasks/lists",
          "start_time_in_millis" : 1518036782414,
          "running_time_in_nanos" : 316439
        },
        "X24HtNOOSrGFp2KQQldJPw:591" : {
          "node" : "X24HtNOOSrGFp2KQQldJPw",
          "id" : 591,
          "type" : "direct",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1518036782414,
          "running_time_in_nanos" : 139752,
          "parent_task_id" : "X24HtNOOSrGFp2KQQldJPw:590"
        }
      }
    }
  }
}

I started the reindex process a couple hours ago, not sure of the exact time. I also wasn't really sure if it started correctly as it seemed strange that the "took" field stated a number between 1 and 12 while the example from https://www.elastic.co/guide/en/elasticsearch/reference/2.4/docs-reindex.html shows 147.

At any rate, I appreciate your time with this :smiley:


(David Pilato) #4

Could you run:

GET firewall2/_search?size=0
GET firewall4/_search?size=0

Also do you see anything in your logs?


#5

curl -XGET localhost:9200/firewall2/_search?size=0
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":0.0,"hits":[]}}

curl -XGET localhost:9200/firewall4/_search?size=0
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":0.0,"hits":[]}}

#6

There are also no entries in elasticsearch.log since I restarted Elasticsearch earlier today. There is also nothing in the deprecation or index* logs.


#7

Some log entries popped up just a few minutes ago. They all say basically the same thing, but for different indices: firewall2, firewall3, firewall4, etc., plus some other ftp indices I created that don't ship logs in real time.

[2018-02-07 17:10:43,512][DEBUG][action.fieldstats        ] [Scimitar] [firewall4][4], node[X24HtNOOSrGFp2KQQldJPw], [P], v[6], s[STARTED], a[id=PGDwElgYQvelwemlJcPxdQ]: failed to execute [org.elasticsearch.action.fieldstats.FieldStatsRequest@64190be9]
RemoteTransportException[[Scimitar][localhost:9300][indices:data/read/field_stats[s]]]; nested: IllegalArgumentException[field [@timestamp] doesn't exist];
Caused by: java.lang.IllegalArgumentException: field [@timestamp] doesn't exist
        at org.elasticsearch.action.fieldstats.TransportFieldStatsTransportAction.shardOperation(TransportFieldStatsTransportAction.java:166)
        at org.elasticsearch.action.fieldstats.TransportFieldStatsTransportAction.shardOperation(TransportFieldStatsTransportAction.java:54)
        at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$ShardTransportHandler.messageReceived(TransportBroadcastAction.java:282)
        at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$ShardTransportHandler.messageReceived(TransportBroadcastAction.java:278)
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:77)
        at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:378)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

(David Pilato) #8

So, to sum up:

You have no data in index firewall2. And you ran:

curl -XPOST 'localhost:9200/_reindex?pretty' -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "firewall2"
  },
  "dest": {
    "index": "firewall4"
  }
}
'

Which means that you copied "no data" from firewall2 to firewall4.

What did you actually expect?


#9

I have data in firewall2. Here is the data from Kibana.

I have no data in firewall4. It is an empty index that I created along with its mapping. I am trying to practice reindexing data, which is why I am trying to get firewall2 and firewall3 to go to firewall4.


(David Pilato) #10

No you don't:

curl -XGET localhost:9200/firewall2/_search?size=0

gave

{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":0.0,"hits":[]}}

total is 0... No hits.

I'm not sure what your Kibana instance is connected to.


#11

Maybe this is me being new to all of this, but it looks like there is data in the firewall2 index here.


(David Pilato) #12

So you have data in firewall2-2018-02-07 index.
But no data in firewall2 index.
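
One way to see this per index, sketched under the assumption that the cluster is on localhost:9200 as above. ES_HOST and both function names are hypothetical; _cat/indices and _count are standard Elasticsearch endpoints.

```shell
# Assumption: Elasticsearch listens on localhost:9200, as in the thread.
ES_HOST="${ES_HOST:-localhost:9200}"

# Hypothetical helper: one row per index matching firewall*, with a docs.count column.
list_firewall_indices() {
  curl -XGET "$ES_HOST/_cat/indices/firewall*?v"
}

# Hypothetical helper: document count for one index name or wildcard pattern.
# $1 is the index name or pattern, e.g. firewall2 or 'firewall2-*'.
count_docs() {
  curl -XGET "$ES_HOST/$1/_count?pretty"
}
```

Comparing count_docs firewall2 with count_docs 'firewall2-*' should make the difference between the bare index and the dated indices visible.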


#13

Ah, OK. When I look in /var/lib/elasticsearch...../indices, I see the default logstash index, so I figured all the logstash-* indices existed because of that logstash index. I guess this is not the case? Should I instead read from the date-specific indices and create a new index called firewall4-(today) to put the data in? I was hoping I could use the index prefix as a catch-all to move all the firewall2-* and firewall3-* data to firewall4, and that the date suffix would be created when it completed.


(David Pilato) #14

Don't look in

 /var/lib/elasticsearch...../indices

It's internal. Just use the API.

Maybe reindex from firewall2-*?
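
A rough sketch of that suggestion, reusing the request from the first post with wildcard sources. It assumes the cluster is on localhost:9200; ES_HOST, REINDEX_BODY and reindex_firewall are illustrative names, not anything Elasticsearch defines. Everything lands in the single firewall4 index named in "dest", so no date suffix is added.

```shell
# Assumption: Elasticsearch listens on localhost:9200, as in the thread.
ES_HOST="${ES_HOST:-localhost:9200}"

# source.index accepts a list, and each entry may be a wildcard pattern.
# dest.index is a literal name: all documents go into firewall4.
REINDEX_BODY='{
  "source": { "index": ["firewall2-*", "firewall3-*"] },
  "dest":   { "index": "firewall4" }
}'

# Hypothetical helper: send the reindex request built above.
reindex_firewall() {
  curl -XPOST "$ES_HOST/_reindex?pretty" -H 'Content-Type: application/json' -d "$REINDEX_BODY"
}
```

After running reindex_firewall, the "total" and "created" fields in the response should be non-zero if the patterns matched any documents. A Kibana index pattern of firewall4-* would still not match an index named plain firewall4.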


#15

Alrighty. I was thinking it would work the same way as creating an index, where I just use the name and the rest, like the -date, happens on its own. Thank you for clearing that up. :smiley:

