A few questions as I take over from an external dev

Hey all,

I'm starting to take over our Elasticsearch (ES) setup from our external dev. I'm not a developer, but I can generally learn most things.

I have a few questions just to start getting my head around things.

  1. Our current mapping does not include a geo_point field, but we do have the coordinates. They were just mapped as text. Can I update this field type without needing to create an entirely new index?

  2. Is it possible to create a new field and programmatically update it with new data, so that the index returns the new values in near (but not quite) real time? E.g. some of our data has no avatars mapped. We'd like to fetch these avatars (company logos) and update the records on the fly. Is this possible?

  3. We have 300+ million records. Each has a persistent ID. Can we simply update the existing index with a new dataset (from S3), or do we need to create an entirely new index each time with the S3 data?

  4. Is Logstash considered the best tool for indexing data from S3?

Thank you!

  1. You will need to reindex. The type of an existing field can't be changed in place, so the current text values have to be indexed again into a field mapped as geo_point.
  2. You can create the field, then update it later, yes. Elasticsearch is near-real-time anyway (an updated document becomes searchable after the next refresh, every second by default), so I'm not sure how that relates to returning the field in this case.
  3. Depends what you want to achieve. If the actual number of updates is low, then consider just updating records as needed. If there are a lot of updates, then it'd be more efficient to create a new index.
  4. Depends what you need to do with the data. If it's just pulling it in, then take a look at the Filebeat S3 input: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-s3.html
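
To make point 1 concrete, here is a minimal sketch of the two request bodies involved: one that creates a new index with the field mapped as geo_point, and one for the `_reindex` API that copies the old documents across. The index names `companies-v1` / `companies-v2` and the field name `location` are made up for illustration. It also assumes the existing text values are already in a form geo_point accepts (for example a `"lat,lon"` string); if they are not, you would need a script or ingest pipeline to convert them during the reindex.

```python
import json

# Hypothetical names, not taken from the thread: adjust to your own cluster.
OLD_INDEX = "companies-v1"
NEW_INDEX = "companies-v2"
GEO_FIELD = "location"


def new_index_body(geo_field):
    """Body for PUT /<new index>: declares the field as geo_point up front."""
    return {"mappings": {"properties": {geo_field: {"type": "geo_point"}}}}


def reindex_body(source, dest):
    """Body for POST /_reindex: copies every document from source into dest."""
    return {"source": {"index": source}, "dest": {"index": dest}}


# Print the requests so they can be pasted into Kibana Dev Tools or sent with curl.
print("PUT /" + NEW_INDEX)
print(json.dumps(new_index_body(GEO_FIELD), indent=2))
print("POST /_reindex")
print(json.dumps(reindex_body(OLD_INDEX, NEW_INDEX), indent=2))
```

Once the reindex has finished and you have checked the new index, you can point your application (or an alias) at it and drop the old one.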

It sounds like this is user profile-like information, is that right?
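
For points 2 and 3, updating records by their persistent ID could look roughly like the sketch below. It builds the newline-delimited body for the `_bulk` API using `update` actions with `doc_as_upsert`, so existing documents get only the listed fields changed and missing ones are created. The index name `companies`, the `id` key and the `avatar_url` field are assumptions for illustration only.

```python
import json


def bulk_update_lines(index, docs, id_field="id"):
    """Build an NDJSON _bulk body that partially updates each doc by its ID."""
    lines = []
    for doc in docs:
        # Everything except the ID is sent as the partial document.
        fields = {k: v for k, v in doc.items() if k != id_field}
        lines.append(json.dumps({"update": {"_index": index, "_id": doc[id_field]}}))
        lines.append(json.dumps({"doc": fields, "doc_as_upsert": True}))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical record: a company that just got its logo.
body = bulk_update_lines(
    "companies",
    [{"id": "42", "avatar_url": "https://example.com/logo.png"}],
)
print(body, end="")
```

You would POST that body to `/_bulk` with the `Content-Type: application/x-ndjson` header, in batches rather than all 300+ million records at once.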


Thank you @warkolm for your answers.

Yes, it's a combo of company/profile data. I'm going to look at Filebeat now.

Quick question before I go down that rabbit hole: do I need to 'map' the fields before the data is indexed, or can I do this within Filebeat as part of the indexing process?

Elasticsearch will figure it out as best it can; this is called dynamic mapping.
When you are starting, it's usually a good idea to index a small sample of data, grab the generated mappings, tweak them as needed, and then set up an index template.
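
As a rough sketch of that workflow: take the response of `GET /<index>/_mapping` from your test index, override the fields dynamic mapping got wrong, and wrap the result in a composable index template body for `PUT /_index_template/<name>`. The index name `companies-test`, the pattern `companies-*` and the `location` field are placeholders I made up, and the sample response is hand-written to resemble what dynamic mapping produces for a string.

```python
import copy
import json


def template_from_mapping(mapping_response, index, pattern, overrides):
    """Turn a GET /<index>/_mapping response into an index template body.

    `overrides` replaces the dynamically guessed definition of selected fields.
    """
    # Copy so the original response is left untouched.
    mappings = copy.deepcopy(mapping_response[index]["mappings"])
    mappings.setdefault("properties", {}).update(overrides)
    return {"index_patterns": [pattern], "template": {"mappings": mappings}}


# Hand-written example of a dynamically mapped string field.
sample = {
    "companies-test": {
        "mappings": {
            "properties": {
                "location": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                }
            }
        }
    }
}

template = template_from_mapping(
    sample, "companies-test", "companies-*", {"location": {"type": "geo_point"}}
)
print(json.dumps(template, indent=2))
```

Any index created later whose name matches the pattern picks up these mappings, so the field is a geo_point from the first document onwards.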

Got it. Unfortunately, the dynamic mapping we used previously didn't pick up the coordinates as a geo_point, so we can't use radius/bounding box queries, which is one of the reasons we need to remap and reindex.

Can I set up a template in Filebeat?
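
For context, these are the kinds of search bodies that only work once the field is a geo_point: a `geo_distance` (radius) filter and a `geo_bounding_box` filter. This is a minimal sketch; the field name `location` and the coordinates are placeholders, not values from our data.

```python
import json

GEO_FIELD = "location"  # hypothetical field name


def radius_query(lat, lon, distance, field=GEO_FIELD):
    """Search body matching documents within `distance` of a point."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


def bounding_box_query(top_left, bottom_right, field=GEO_FIELD):
    """Search body matching documents inside a box given as (lat, lon) corners."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_bounding_box": {
                        field: {
                            "top_left": {"lat": top_left[0], "lon": top_left[1]},
                            "bottom_right": {"lat": bottom_right[0], "lon": bottom_right[1]},
                        }
                    }
                }
            }
        }
    }


print(json.dumps(radius_query(51.5, -0.12, "10km"), indent=2))
print(json.dumps(bounding_box_query((52.0, -1.0), (51.0, 0.5)), indent=2))
```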

Filebeat ships with a built-in index template, but you can also define your own via the `setup.template.*` settings.
