I'm starting to transition our ES from our external dev to myself. I'm no developer, but can generally learn most things.
I have a few questions just to start getting my head around things.
Our current mapping does not include a geo_point field, but we do have the coordinates. They were just mapped as text. Can I update this field type without needing to create an entirely new index?
Is it possible to create a new field, and programmatically update this field with new data, for the index to then be able to return that result in near, but not quite real time? E.g. we have some data that is not mapped with avatars. We'd like to get these avatars (company logos) and update on the fly. Is this possible?
We have 300+ million records. Each does have a persistent id. Can we simply update the existing index with a new dataset (from S3), or do we need to create an entirely new index each time with the S3 data?
Is Logstash considered the best tool for indexing data from S3?
You will need to reindex to cast the current text into the new format. The type of an existing field can't be changed in place, so the geo_point mapping has to go on a new index.
Yes, you can create the field now and populate it later. Elasticsearch is near real-time anyway, so I'm not sure how that relates to returning the field in this use case.
Depends on what you want to achieve. If the actual number of updates is low, then consider just updating records as needed. If there are a lot of updates, then it'd be more efficient to create a new index.
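To make the reindex step a bit more concrete, here is a rough sketch of the two request bodies involved: one to create a new index with a geo_point field, and one for the `_reindex` API. The index names (`records`, `records_v2`) and the field name `location` are made up for illustration, and the helper assumes the existing text values look like "lat,lon", which may not match your data.

```python
import json


def geo_point_mapping(field="location"):
    """Body for creating the new index (PUT /<new-index>).

    `location` is a hypothetical field name; use whatever your
    coordinate field is called.
    """
    return {"mappings": {"properties": {field: {"type": "geo_point"}}}}


def reindex_body(source_index, dest_index):
    """Body for POST /_reindex, copying documents into the new index."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def parse_lat_lon(text):
    """Check that a 'lat,lon' string is a valid coordinate pair.

    Assumes the old text field is formatted as 'lat,lon'. geo_point also
    accepts that string form directly, so this is mainly a sanity check
    to run on a sample before reindexing everything.
    """
    lat_str, lon_str = text.split(",")
    lat, lon = float(lat_str), float(lon_str)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {text!r}")
    return {"lat": lat, "lon": lon}


if __name__ == "__main__":
    print(json.dumps(geo_point_mapping(), indent=2))
    print(json.dumps(reindex_body("records", "records_v2"), indent=2))
    print(parse_lat_lon("41.12,-71.34"))
```

If the text values are in some other shape (separate lat/lon fields, "lon,lat" order, and so on), they would need converting during the reindex, for example with an ingest pipeline or a script.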
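For the "just update records as needed" route, something along these lines could build a `_bulk` request body that partially updates documents by their persistent id, e.g. to add an avatar later. This is only a sketch: the index name `records`, the `id` key and the `avatar_url` field are assumptions, not anything from your setup.

```python
import json


def bulk_update_lines(index, docs, id_field="id"):
    """Build an NDJSON body for POST /_bulk using `update` actions.

    Each input dict must carry the persistent id under `id_field`
    (a hypothetical key name). The remaining keys are merged into the
    existing document; `doc_as_upsert` creates the document if the id
    is not in the index yet.
    """
    lines = []
    for doc in docs:
        fields = {k: v for k, v in doc.items() if k != id_field}
        action = {"update": {"_index": index, "_id": str(doc[id_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps({"doc": fields, "doc_as_upsert": True}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = bulk_update_lines(
        "records",
        [{"id": 7, "avatar_url": "https://example.com/logo.png"}],
    )
    print(body, end="")
```

The body is sent with the `application/x-ndjson` content type. Using an `index` action instead of `update` would replace the whole document for that id rather than merging fields into it.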
Yes, it's a combo of company/profile data. I'm going to look at Filebeat now.
Quick question before I go down that rabbit hole: do I need to 'map' the fields before the data is indexed, or can I do that within Filebeat as part of the indexing process?
Elasticsearch will figure it out as best it can - dynamic mapping.
When you are starting, it's usually a good idea to test with a bit of data, then grab the mappings, tweak them as you need, and then set up an index template.
Got it. Unfortunately the dynamic mapping we used previously didn't pick up the coordinates as a geo_point, so we can't use the radius/bounding box queries. That's one of the reasons we need to remap and reindex.
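As a rough illustration of that workflow, the sketch below takes the `properties` you got back from a test index (GET /<index>/_mapping), overrides the fields you want to fix, and wraps the result in a composable index template body (PUT /_index_template/<name>, available from 7.8; older versions use the legacy `_template` endpoint with a slightly different shape). The pattern `records-*` and the field names are placeholders.

```python
import json


def build_index_template(pattern, dynamic_properties, overrides):
    """Merge hand-tuned field mappings over dynamically detected ones.

    `dynamic_properties` is the "properties" object copied from the test
    index's mapping; `overrides` holds the fields you want to pin to a
    specific type. Both are plain dicts.
    """
    properties = dict(dynamic_properties)
    properties.update(overrides)
    return {
        "index_patterns": [pattern],
        "template": {"mappings": {"properties": properties}},
    }


if __name__ == "__main__":
    detected = {
        "name": {"type": "text"},
        "location": {"type": "text"},  # what dynamic mapping guessed
    }
    template = build_index_template(
        "records-*",
        detected,
        {"location": {"type": "geo_point"}},
    )
    print(json.dumps(template, indent=2))
```

Any new index whose name matches the pattern would then pick up the pinned types, while fields not listed still fall back to dynamic mapping.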