I am working on building a report based on indexed documents. Following is the scenario of the report:
Status field that contains two types of values (ERROR, OK)
Destination field containing all the destination phone numbers (long data type)
Each destination has a status, either ERROR or OK. In some cases a destination number has an ERROR entry, but the same number also has an OK entry with a different timestamp.
One report is of all the numbers delivered (status OK): done
The second report is all the numbers that were not delivered, i.e. the number must not have an OK status at all, only ERROR.
Since the same phone number can be OK or ERROR at different timestamps, what are you returning in the two cases?
When a message is not delivered, the system retries it but also logs an error message. When it is delivered, an OK status for the same number is added as a new log entry.
I have looked at nested queries but I am not sure how they solve my issue.
As you can see, since both ERROR and OK can be part of the same document under a particular phone number, you can't really retrieve what you need (at least at first glance).
This sounds more useful ...
Did you try any query? You wrote "One report is of all the numbers delivered (status OK): done". Is this working? Can you not do something similar for ERROR as well?
@Jaspreet_Singh I can get all numbers that have ERROR status. But the problem is that I want only those numbers that don't have OK status at all. For example, I have a number 123-123-123 and it only has ERROR status, but when I query, 123-456-789 (the number from the above example) is also returned because it also has ERROR status.
But in your above example, 123456789 does not have OK status.
By the way, you can get records that have ONLY ERROR and not OK by combining must and must_not.
Try something like ... (I'm going to assume a few things, but the logic should be similar)
@Jaspreet_Singh I posted two docs for the same destination, one with OK and one with ERROR. I have already tried the method you mentioned in many different ways, but it still returns 123456789 as an error.
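To illustrate why a per-document filter keeps returning 123456789, here is a minimal Python sketch. It assumes each log entry is its own document with `destination` and `status` fields; the sample entries, the `ts` field and the function names are made up for illustration and are not the actual query from this thread.

```python
# Hypothetical log entries: one document per delivery attempt.
docs = [
    {"destination": 123456789, "status": "ERROR", "ts": 1},
    {"destination": 123456789, "status": "OK", "ts": 2},     # retry succeeded
    {"destination": 123123123, "status": "ERROR", "ts": 1},  # never delivered
]


def per_document_error_filter(entries):
    """Mimic must(status=ERROR) + must_not(status=OK), checked per document."""
    return sorted({
        d["destination"]
        for d in entries
        if d["status"] == "ERROR" and d["status"] != "OK"
    })


def error_only_numbers(entries):
    """Numbers with at least one ERROR entry and no OK entry at all."""
    errors = {d["destination"] for d in entries if d["status"] == "ERROR"}
    delivered = {d["destination"] for d in entries if d["status"] == "OK"}
    return sorted(errors - delivered)


print(per_document_error_filter(docs))  # [123123123, 123456789]
print(error_only_numbers(docs))         # [123123123]
```

The first function looks at one document at a time, so the ERROR entry for 123456789 matches even though a later OK entry exists. The second one compares across all entries for a number, which is what the report needs.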
One way to solve this would be through an entity-centric index. You basically create a separate index, where each phone number is represented by a single document with the phone number as a document ID. When data comes in you insert the raw document as usual into the current index, but at the same time also add any relevant information to the appropriate document in the new entity-centric index. If you have structured this correctly, it should be very easy to create your report based on this index.
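A rough sketch of that idea in plain Python, with a dict standing in for the entity-centric index. The field names (`ok_count`, `error_count`, `last_status`, `last_seen`) and the helper functions are assumptions for illustration only; in Elasticsearch the same upsert would presumably go through an update request keyed on the phone number as the document ID.

```python
# In-memory stand-in for the entity-centric index: phone number -> summary doc.
entity_index = {}


def record_event(destination, status, timestamp):
    """Upsert the per-number summary each time a raw log entry arrives."""
    entity = entity_index.setdefault(
        destination,
        {"ok_count": 0, "error_count": 0, "last_status": None, "last_seen": None},
    )
    if status == "OK":
        entity["ok_count"] += 1
    else:
        entity["error_count"] += 1
    # Keep the most recent status, even if events arrive out of order.
    if entity["last_seen"] is None or timestamp >= entity["last_seen"]:
        entity["last_seen"] = timestamp
        entity["last_status"] = status


def never_delivered():
    """Numbers that have logged errors but never a single OK."""
    return sorted(
        number
        for number, entity in entity_index.items()
        if entity["error_count"] > 0 and entity["ok_count"] == 0
    )


# Hypothetical events, matching the numbers discussed in this thread.
record_event(123456789, "ERROR", 1)
record_event(123456789, "OK", 2)
record_event(123123123, "ERROR", 1)
print(never_delivered())  # [123123123]
```

With one summary document per number, the "not delivered" report becomes a simple filter on that document instead of a comparison across many log entries.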
Thanks, everyone, for the help. I have solved the problem by using multiple aggregations: I combined two fields, applied unique and min aggs, and got the results. The visualization shows correct results that I can filter further to get the desired output.
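The post does not spell out the exact aggregations, so the following is only one possible reading of the "group per number, then take a min" idea, simulated in plain Python. The numeric encoding of the status (OK as 0, ERROR as 1), the sample entries and the function names are assumptions.

```python
from collections import defaultdict

# Assumed encoding so that a "min" per number reveals whether any OK exists.
STATUS_RANK = {"OK": 0, "ERROR": 1}


def min_status_per_number(entries):
    """Bucket entries by destination and keep the lowest status rank."""
    buckets = defaultdict(list)
    for entry in entries:
        buckets[entry["destination"]].append(STATUS_RANK[entry["status"]])
    return {number: min(ranks) for number, ranks in buckets.items()}


def undelivered(entries):
    """A minimum equal to the ERROR rank means the number never had an OK."""
    lowest = min_status_per_number(entries)
    return sorted(n for n, rank in lowest.items() if rank == STATUS_RANK["ERROR"])


# Hypothetical log entries for the two numbers discussed above.
logs = [
    {"destination": 123456789, "status": "ERROR"},
    {"destination": 123456789, "status": "OK"},
    {"destination": 123123123, "status": "ERROR"},
    {"destination": 123123123, "status": "ERROR"},
]
print(undelivered(logs))  # [123123123]
```

Each destination forms one bucket, and filtering on the per-bucket minimum plays the role of the extra filter applied to the visualization.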