How to store a huge amount of complex data?


#1

Hi,

I need to figure out the best way of storing the information and I would appreciate your advice.

I am receiving 250,000,000 messages.
Each message contains:

  • up to 7 traders
  • up to 999 good items
  • other data (omitted here for simplicity)

Each trader has:

  • identification number
  • name
  • address
  • type

Each good item has:

  • reference
  • description
  • mass
  • ...

My users need to be able to search for traders, good items, ...

A basic example of a query could be:

  • list the traders involved in messages where the good item type is ... and mass > ... and the message was issued by ... between date 1 and date 2, and the trader address "sounds like" ..., grouped by trader type and location

In fact, users will be able to approach the information from any direction (messages, traders, goods).

Therefore, I thought of storing the information in a good-item-oriented way:

For each good item, I would create a document that replicates all the "searchable" information related to:

  • the good item itself
  • all the traders info
  • the other searchable info from the message

The advantage of this solution would be that it flattens the data.
The drawback is the replication of the information.
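
To make the idea concrete, here is a minimal sketch of the flattening step in Python. The field names (`id`, `issued_by`, `issued_at`, `traders`, `good_items`) and the `_id` scheme are assumptions made up for illustration, not an actual message format.

```python
from typing import Any


def flatten_message(message: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one denormalized document per good item in the message."""
    docs = []
    for position, item in enumerate(message["good_items"], start=1):
        docs.append({
            # Hypothetical id scheme: message id plus item position.
            "_id": f'{message["id"]}-{position}',
            # Searchable message-level fields, copied into every document.
            "message": {
                "id": message["id"],
                "issued_by": message["issued_by"],
                "issued_at": message["issued_at"],
            },
            # All traders of the message are replicated per good item.
            "traders": [dict(trader) for trader in message["traders"]],
            "good_item": dict(item),
        })
    return docs


# Made-up sample data, for illustration only.
example_message = {
    "id": "M1",
    "issued_by": "office-A",
    "issued_at": "2016-01-15",
    "traders": [
        {"identification_number": "T-1", "name": "Acme",
         "address": "1 Main Street", "type": "consignor"},
        {"identification_number": "T-2", "name": "Globex",
         "address": "2 Side Street", "type": "consignee"},
    ],
    "good_items": [
        {"reference": "G-001", "description": "steel pipes", "mass": 1200.0},
        {"reference": "G-002", "description": "copper wire", "mass": 35.5},
    ],
}

docs = flatten_message(example_message)
for doc in docs:
    print(doc["_id"], doc["good_item"]["reference"], len(doc["traders"]))
```

With up to 999 good items per message, 250,000,000 messages would produce at most 250,000,000 × 999 = 249,750,000,000 such documents, each carrying its own copy of up to 7 traders. That upper bound is what makes the replication a real concern.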

That's why I would appreciate your expertise...

Many thanks in advance,


(David Pilato) #2

Interesting. That's a similar project to what my former colleagues and I did at French customs with Elasticsearch. :wink:

> The advantage of this solution would be that it flattens the data.
> The drawback is the replication of the information.

Yes, we came to the same conclusion, so we indexed the same data multiple times. I think it's a good trade-off.
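
As a rough illustration of what "indexing the same data multiple times" could look like, the sketch below derives three views from one message: message-oriented, trader-oriented, and good-item-oriented. The index names, id scheme, and field names are hypothetical, and this is not the code used on the customs project.

```python
from typing import Any, Iterator


def build_views(message: dict[str, Any]) -> Iterator[tuple[str, str, dict[str, Any]]]:
    """Yield (index_name, doc_id, document) for each view of one message."""
    header = {"message_id": message["id"], "issued_at": message["issued_at"]}

    # Message-oriented view: one document holding everything.
    yield "messages", message["id"], {
        **header,
        "traders": message["traders"],
        "good_items": message["good_items"],
    }

    # Trader-oriented view: one document per trader.
    for n, trader in enumerate(message["traders"], start=1):
        yield "traders", f'{message["id"]}-t{n}', {
            **header,
            "trader": trader,
            "good_items": message["good_items"],
        }

    # Good-item-oriented view: one document per good item.
    for n, item in enumerate(message["good_items"], start=1):
        yield "good_items", f'{message["id"]}-g{n}', {
            **header,
            "good_item": item,
            "traders": message["traders"],
        }


# Made-up sample data, for illustration only.
sample = {
    "id": "M1",
    "issued_at": "2016-01-15",
    "traders": [{"name": "Acme", "type": "consignor"},
                {"name": "Globex", "type": "consignee"}],
    "good_items": [{"reference": "G-001", "mass": 1200.0},
                   {"reference": "G-002", "mass": 35.5}],
}

views = list(build_views(sample))
for index_name, doc_id, _ in views:
    print(index_name, doc_id)
```

Each tuple could then be turned into one action of a bulk indexing request. Queries pick the index whose documents match the entity the user is searching for, at the cost of storing every message several times.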

