Hi Ted, would you care to share which GeoIP databases are better than MaxMind?
Thanks
hey clint,
thanks for the reply. and the perl script.
funny you should mention maxmind. i've given their premium database a
try and it is quite inaccurate for many addresses. there are several
other databases that are much more precise and more comprehensive, so
the performance advantage provided by their C index does not really
compensate.
and my intent with using elasticsearch for geoip is to subsequently be
able to conduct searches with the geo filter.
this setup may be slightly slower because of the required http
calls, but from my standpoint it is worth it. i'm just looking for
reasonably acceptable latency.
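for what it's worth, here is a rough sketch of the kind of geo filter clause meant above. it only builds the `geo_distance` clause as a Perl hash and prints it as JSON. the helper name `geo_distance_filter` and the field name `location` (assumed to be mapped as a geo_point) are made up for illustration, and how the clause gets wrapped into a full search request depends on the Elasticsearch version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper: build a geo_distance filter clause.
# 'location' is an assumed field name, mapped as geo_point in the index.
sub geo_distance_filter {
    my ( $lat, $lon, $distance ) = @_;
    return {
        geo_distance => {
            distance => $distance,    # e.g. '50km'
            location => { lat => $lat + 0, lon => $lon + 0 },
        }
    };
}

# Example: everything within 50km of an arbitrary point
my $filter = geo_distance_filter( 51.5, -0.12, '50km' );
print JSON::PP->new->canonical->pretty->encode($filter);
```

the point of having the geoip rows in the index is that a clause like this can then be combined with any other query.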
On Tue, Oct 26, 2010 at 5:35 PM, Clinton Gormley
<clinton@iannounce.co.uk> wrote:
hiya ted
The way I see it, there are three broad dimensions:
- Converting CSV to JSON
  - most of the GeoIP databases provide the data in csv
My only experience with mapping IPs to locations is with MaxMind's
GeoIP, but they provide a very compact and fast index for you - I'm not
sure what benefit there would be in putting that into Elasticsearch
(but I may not have understood your intent)
here's a quick Perl script which will index your CSV docs:
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS();
use ElasticSearch();    # legacy ElasticSearch.pm client (capital S)
use Data::Dumper;
my $Index = 'my_index';
my $Type = 'my_type';
my @Col_Names = qw(id name ip_addr foo bar);
my $es = ElasticSearch->new( servers => '127.0.0.1:9200' );
my $csv = Text::CSV_XS->new( { binary => 1 } );
$csv->column_names(@Col_Names);
for my $csv_file (@ARGV) {
    print "Processing file '$csv_file'\n";
    open my $csv_fh, '<:encoding(utf8)', $csv_file
        or die "Couldn't open $csv_file: $!";
    my @rows;
    while ( my $row = $csv->getline_hr($csv_fh) ) {
        push @rows, {
            index => {
                index => $Index,
                type  => $Type,
                id    => $row->{id},    # assuming there is an ID
                data  => $row
            }
        };
        # send the rows in batches of 1000
        if ( @rows >= 1000 ) {
            index_rows( \@rows );
            @rows = ();
        }
    }
    # index whatever is left over from the last partial batch
    index_rows( \@rows ) if @rows;
    close $csv_fh;
}
sub index_rows {
    my $rows   = shift;
    my $result = $es->bulk($rows);
    # Dumper shows the error details even when errors is a reference
    die "Error while indexing: " . Dumper( $result->{errors} )
        if $result->{errors};
}
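To make the CSV-to-JSON step a bit more concrete, here is a minimal sketch of what happens to a single row. It assumes the same column names as in the script; the helper `row_to_doc` and the sample line are made up for illustration. JSON::PP is used only to show the resulting document, the script itself leaves the encoding to the client.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS ();
use JSON::PP     ();

my @col_names = qw(id name ip_addr foo bar);    # assumed column layout
my $csv       = Text::CSV_XS->new( { binary => 1 } );

# Hypothetical helper: turn one CSV line into a hashref keyed by column name
sub row_to_doc {
    my ($line) = @_;
    $csv->parse($line) or die "Couldn't parse line: $line";
    my %doc;
    @doc{@col_names} = $csv->fields;
    return \%doc;
}

# Made-up sample row; the quoted field contains a comma on purpose
my $doc = row_to_doc('1,"Example, Inc.",192.0.2.1,x,y');
print JSON::PP->new->canonical->encode($doc), "\n";
```

Every value comes out of the CSV parser as a string, so any field that should be numeric or a geo point would need converting before indexing.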
clint