Index not creating from couchdb with logstash


(imad) #1
I am using this config file to import data from CouchDB:

input {
  couchdb_changes {
    db => "roles"
    host => "localhost"
    port => 5984
  }
}

output {
  elasticsearch {
    document_id => "%{[@metadata][_id]}"
    document_type => "%{[@metadata][type]}"
    host => "localhost"
    index => "roles_index"
    protocol => "http"
    port => 9200
  }
}

I was able to run Logstash with this config file and import data
once. I closed the command prompt to shut down Logstash, then reopened it
and ran Logstash with the config file again, but now I cannot see any index
being created. Is there anything I might be doing wrong here? I am using
Ctrl+C to kill Logstash in the command prompt. I will appreciate any help.

Thanks.


(Aaron Mildenstein) #2

If you ran it once, is it possible that the sequence number is stored in the file at sequence_path, so the plugin thinks it has already indexed everything?

If you delete this file, or edit it to 0, it should start over from the beginning and re-index all of your couchdb content.


(imad) #3

If you don't mind, which file is it that I need to delete? Or do I need to set sequence_path to 0?


(imad) #4

Got it. I set sequence_path => "0" in the couchdb_changes { } section of my config file and it worked.

Thanks for your response.


(Aaron Mildenstein) #5

Oh! I'm sorry I didn't make myself clearer. The file identified by sequence_path should contain a zero. What you did instead was name a file 0 in your home directory, and that file now stores the current sequence number. The link I shared previously gives the default file name and path. This worked, but if you later delete a file named 0 in your home directory because you don't know what it does, it could cause issues for you.


(imad) #6

I am still having the problem. Sorry for not understanding it correctly. You were right: a file named "0" got generated in my logstash-1.5.2\bin folder. I deleted it, but the import still didn't run. As per the link that you sent for sequence_path, .couchdb_seq in the bin folder is the file where the sequence numbers are kept. I didn't see .couchdb_seq in my bin folder, though, so I created it myself and put a 0 in it, then ran Logstash again with my config file, but it still does not create any indices.


(Aaron Mildenstein) #7

The 0 file and the default placement of the .couchdb_seq file would not necessarily be in the same place (it ran once correctly, so that file is somewhere, even if it's not in the bin directory).

I would specify a full path to the file, putting it somewhere you can easily find it and where the Logstash process has write permissions.
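
For illustration, a minimal sketch of that suggestion. The path below is a made-up example, not one from this thread; any absolute path the Logstash process can write to should work.

```logstash
input {
  couchdb_changes {
    db => "roles"
    host => "localhost"
    port => 5984
    # Hypothetical absolute path, chosen only for this example.
    # The last processed sequence number is stored in this file.
    sequence_path => "C:/logstash/state/roles_couchdb_seq"
  }
}
```

With an explicit path there is no guessing where the sequence number lives when it needs to be reset.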


(imad) #8

Got it. So I created a "couchdb_seq" file and set sequence_path to it. It works now if I reset the value in the couchdb_seq file to 0 every time I want to run Logstash to import data into the Elasticsearch index. Is there any other way to do this?

I now want to have multiple inputs/outputs based on conditions in my config file, to create more than one index. But with the approach of setting sequence_path to "couchdb_seq", my indices are not getting created. So are there any other ways to resolve this?


(Aaron Mildenstein) #9

The idea of the couchdb_changes feed is that you should only have to import once, and then any changes that come through would be propagated (hence, the _changes feed).

Are you trying to re-run the import multiple times to populate multiple indices?

You can accomplish this with conditionals and multiple elasticsearch output blocks, or with the sprintf format and the index name attribute.

If you need multiple inputs (or have multiple couchdb databases), you can define one couchdb_changes input block per database. Be sure to use a different sequence_path with each db, to keep them from colliding.

Use tags to identify different streams. This will make it easier to control which feeds go to which indices.
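
As a rough sketch of the sprintf variant, assuming the common type input option is used to label each stream (the sequence file paths are hypothetical, and the database names are only examples):

```logstash
input {
  couchdb_changes {
    db => "users"
    host => "localhost"
    port => 5984
    # Hypothetical path; each database gets its own sequence file.
    sequence_path => "C:/logstash/state/users_couchdb_seq"
    type => "users"
  }
  couchdb_changes {
    db => "roles"
    host => "localhost"
    port => 5984
    sequence_path => "C:/logstash/state/roles_couchdb_seq"
    type => "roles"
  }
}

output {
  elasticsearch {
    document_id => "%{[@metadata][_id]}"
    host => "localhost"
    protocol => "http"
    port => 9200
    # sprintf reference: resolves to users_index or roles_index per event.
    index => "%{type}_index"
  }
}
```

A single output block then serves every database, at the cost of tying the index name to the value of type.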


(imad) #10

Yes, I wanted to import data from multiple databases into separate indices. Thanks, Aaron, for helping me out. It worked; I can create indices for each database now. Here is my config file for anyone who is looking to do the same.

input {
  couchdb_changes {
    db => "users"
    host => "localhost"
    port => 5984
    sequence_path => "users_couchdb_seq"
    tags => [ "users" ]
  }
  couchdb_changes {
    db => "roles"
    host => "localhost"
    port => 5984
    sequence_path => "roles_couchdb_seq"
    tags => [ "roles" ]
  }
}

output {
  if "users" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "users_index"
      protocol => "http"
      port => 9200
    }
  }
  if "roles" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "roles_index"
      protocol => "http"
      port => 9200
    }
  }
}


(imad) #11

@theuntergeek is there any other way to recreate indices that have already been created, without using the sequence_path method?


(Aaron Mildenstein) #12

You could duplicate the input and have a separate sequence_path. That's about it.
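
A rough sketch of what that could look like, assuming the plugin starts from sequence 0 when the sequence_path file does not exist yet. The second sequence file name, the roles_rebuild tag, and the roles_index_v2 index name are all hypothetical.

```logstash
input {
  # Existing input: keeps its sequence file and only picks up new changes.
  couchdb_changes {
    db => "roles"
    host => "localhost"
    port => 5984
    sequence_path => "roles_couchdb_seq"
    tags => [ "roles" ]
  }
  # Duplicate input for the same database. Its sequence file (hypothetical name)
  # does not exist yet, so it is assumed to read the changes feed from the start.
  couchdb_changes {
    db => "roles"
    host => "localhost"
    port => 5984
    sequence_path => "roles_rebuild_couchdb_seq"
    tags => [ "roles_rebuild" ]
  }
}

output {
  if "roles" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "roles_index"
      protocol => "http"
      port => 9200
    }
  }
  if "roles_rebuild" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      # Hypothetical name for the rebuilt index.
      index => "roles_index_v2"
      protocol => "http"
      port => 9200
    }
  }
}
```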


(imad) #13

OK. If Logstash had a flag to flush and recreate indices, it would be very handy.


(Aaron Mildenstein) #14

Logstash will not likely ever have this feature. If you want to recreate an index, you really ought to use the sequence_path method. It's by far the fastest and simplest way to achieve your goal. You can drop indices with simple API calls (delete) or use Elasticsearch Curator.

You'd still have to rebuild the pipeline in future releases of Logstash, as the input plugin would still need to restart with the appropriate sequence number. Once the pipeline is established, it will not re-read the sequence id unless there's an error or a disconnect.

