I've recently upgraded an on-premise Elasticsearch 7.17.19 server to 8.13.0. Since I had problems with the Java client I had (especially with digesting), I decided to also upgrade my Java client to 8.13 with Jackson 2.17.0. I've rewritten all the Elasticsearch code and queries, and I'm now facing runtime problems with HitMetadata. I don't know what I'm doing wrong, but it seems to be in the client API itself.
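For context, the client itself is wired up in the usual way, roughly like this (sketch; host and authentication details left out, the real setup also configures security):
// Roughly how the ElasticsearchClient is built (co.elastic.clients 8.13 with the Jackson mapper).
// Classes: org.elasticsearch.client.RestClient, co.elastic.clients.transport.rest_client.RestClientTransport,
// co.elastic.clients.json.jackson.JacksonJsonpMapper, co.elastic.clients.elasticsearch.ElasticsearchClient.
RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200)).build();
ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
ElasticsearchClient client = new ElasticsearchClient(transport);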
Here is my method:
public <T extends IElasticDocument> ResponseWrapper<T> findElasticDocuments(SearchRequest.Builder query, Highlight.Builder highlight, boolean fullScroll,
        Map<String, Aggregation> aggs, int size, Class<T> regulationClass, SortOptions... sort) {
    String pitId = null;
    try {
        String indexName = ReflectionUtils.getAnnotation(regulationClass, IndexName.class).value();
        String[] fullIndex = getStaticValue(regulationClass, "fullIndex");
        final Time keepAlive = new Time.Builder().time("3m").build();
        final OpenPointInTimeResponse pitResp = client.openPointInTime(req -> req.index(indexName).keepAlive(keepAlive));
        final String initialPitId = pitResp.id();
        pitId = initialPitId;
        query.size(size)
                .pit(pit -> pit.id(initialPitId).keepAlive(keepAlive)) // This is the initial pit. It would be better to use the pit from the last result
                .source(SourceConfig.of(sc ->
                        sc.filter(SourceFilter.of(sf ->
                                sf.includes("", fullIndex))
                        )
                ));
        if (null != highlight)
            query.highlight(h -> highlight);
        if (null != sort && 0 < sort.length)
            query.sort(Arrays.asList(sort));
        if (null != aggs && !aggs.isEmpty())
            query.aggregations(aggs);
        String lastId = null; // holds the last retrieved result id.
        ResponseWrapper<T> wrapper = new ResponseWrapper<>();
        List<T> results = new ArrayList<>();
        long took = 0;
        do {
            if (null != lastId) {
                query = query.searchAfter(FieldValue.of(lastId));
            }
            SearchRequest req = query.build();
            SearchResponse<T> response = client.search(req, regulationClass);
            took += response.took();
            List<Hit<T>> hits = response.hits().hits();
            if (null != hits && !hits.isEmpty() && 1000 >= results.size()) {
                if (fullScroll) {
                    if (1000 >= results.size()) {
                        lastId = convertResults(results, hits);
                    }
                }
                else {
                    wrapper.setScrollId(lastId);
                    lastId = null;
                }
            }
            else {
                wrapper.setScrollId(lastId);
                lastId = null;
            }
        } while (lastId != null);
        wrapper.setTook(took);
        wrapper.setResults(results);
        return wrapper;
    }
    catch (Exception e) {
        logger.error(DEFAULT_ERROR_MSG, e);
        return new ResponseWrapper<>(new ArrayList<>());
    }
    finally {
        ClosePointInTimeResponse pitCloseResp;
        try {
            final String closePitId = pitId;
            pitCloseResp = client.closePointInTime(req -> req.id(closePitId));
            if (!pitCloseResp.succeeded())
                logger.warn("Search request with id: {} failed to close", closePitId);
        }
        catch (ElasticsearchException | IOException e) {
            logger.warn("Search request with id: {} failed to close", pitId);
        }
    }
}
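convertResults isn't shown above; it just copies the hit sources into the result list and returns the id of the last hit, roughly like this (simplified sketch, not the exact code):
// Simplified sketch of convertResults: collect the sources and return the last hit id,
// which is then used as the search_after value for the next iteration.
private <T extends IElasticDocument> String convertResults(List<T> results, List<Hit<T>> hits) {
    String lastId = null;
    for (Hit<T> hit : hits) {
        if (null != hit.source())
            results.add(hit.source());
        lastId = hit.id();
    }
    return lastId;
}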
This method accepts a query built somewhere else and handles some generic stuff that is common to all the indexes I search. With debugging I can see that the JSON request that goes out is as follows:
{
  "_source": {
    "includes": [
      "",
      "neoId",
      "version",
      "createDate",
      "updateDate",
      "superIndex",
      "index",
      "fileNumber",
      "title",
      "status",
      "regulationType",
      "subject",
      "directiveType",
      "directiveNumber",
      "publishDate",
      "deleted",
      "appliesTo",
      "relatedToCodex",
      "relatedCodexIndexNumber",
      "effectiveDate",
      "dateResolution",
      "hasObligation",
      "cancellationDate",
      "newDirectiveIndex",
      "newDirectiveNumber",
      "oldDirectiveIndex",
      "oldDirectiveNumber",
      "searchable",
      "submissionCommentsDate",
      "regulationCount",
      "textEditor",
      "hasNewerRelease"
    ]
  },
  "highlight": {
    "type": "fvh",
    "fields": {
      "content.content": {
        "fragment_size": 200,
        "number_of_fragments": 1
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "searchable": {
              "value": true
            }
          }
        },
        {
          "dis_max": {
            "boost": 1.2000000476837158,
            "queries": [
              {
                "match_phrase": {
                  "title": {
                    "boost": 6.0,
                    "analyzer": "hebrew",
                    "query": "דירקטור"
                  }
                }
              },
              {
                "match": {
                  "title": {
                    "boost": 1.5,
                    "analyzer": "hebrew",
                    "operator": "and",
                    "query": "דירקטור"
                  }
                }
              },
              {
                "match_phrase": {
                  "content.content": {
                    "boost": 4.0,
                    "analyzer": "hebrew",
                    "query": "דירקטור"
                  }
                }
              }
            ],
            "tie_breaker": 0.7
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "directiveType": {
              "value": "87"
            }
          }
        }
      ]
    }
  },
  "size": 1000,
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    },
    {
      "publishDate": {
        "order": "desc"
      }
    },
    {
      "index": {
        "order": "desc"
      }
    },
    {
      "fileNumber": {
        "order": "asc"
      }
    }
  ]
}
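For completeness, a call site that produces a request like the one above looks roughly like this (simplified sketch; RegulationDocument is just a stand-in for one of my document classes and the real query builder is more involved):
// Hypothetical, simplified caller; the real query is the bool/dis_max shown above.
SearchRequest.Builder query = new SearchRequest.Builder()
        .query(q -> q.term(t -> t.field("searchable").value(true)));
Highlight.Builder highlight = new Highlight.Builder()
        .fields("content.content", f -> f.fragmentSize(200).numberOfFragments(1));
ResponseWrapper<RegulationDocument> wrapper = findElasticDocuments(
        query, highlight, true, null, 1000, RegulationDocument.class,
        SortOptions.of(so -> so.score(sc -> sc.order(SortOrder.Desc))),
        SortOptions.of(so -> so.field(f -> f.field("publishDate").order(SortOrder.Desc))));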
Running the request above in Postman yields good results. Here is one of the hits:
{
  "took": 2637,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 365,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "regulation_release",
        "_id": "YsOM1nwB0netDWQ5M7XV",
        "_score": 115.92962,
        "_source": {
          "updateDate": "2021-10-31T15:40:03",
          "fileNumber": 1,
          "subject": [
            "8"
          ],
          "publishDate": "2021-10-31",
          "superIndex": 1544,
          "appliesTo": [
            "2"
          ],
          "title": "some good title data",
          "dateResolution": false,
          "oldDirectiveIndex": null,
          "hasObligation": "NONE",
          "cancellationDate": null,
          "createDate": "2021-10-31T15:31:15",
          "relatedCodexIndexNumber": null,
          "newDirectiveNumber": null,
          "relatedToCodex": false,
          "index": 182118,
          "oldDirectiveNumber": null,
          "version": 3,
          "neoId": 16641,
          "searchable": true,
          "regulationType": "settlement",
          "deleted": false,
          "directiveType": "10",
          "newDirectiveIndex": null,
          "submissionCommentsDate": null,
          "directiveNumber": null,
          "regulationCount": 1,
          "effectiveDate": null,
          "status": "DRAFT"
        },
        "highlight": {
          "content.content": ["some good content highlight data"]
        },
        "sort": [
          115.92962,
          1635638400000,
          182118,
          1
        ]
      }
///
When parsing this response I get the error "Missing required property 'Hit.index'".
I've tried to follow it with the debugger. What I see is that I get a Hit with all the necessary details for the source, and then I get another Hit that contains only the highlight and sort fields and none of the others. As I understand it, the highlight and sort fields should not have been separated from the first hit and should not appear as a hit of their own.
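In other words, I expect each element of response.hits().hits() to carry its source, highlight and sort together, roughly like this (sketch; RegulationDocument is again just a stand-in for my document class):
// What I expect: one Hit per result, carrying _source, highlight and sort together.
for (Hit<RegulationDocument> hit : response.hits().hits()) {
    RegulationDocument doc = hit.source();                   // the _source block
    Map<String, List<String>> highlights = hit.highlight();  // e.g. the "content.content" fragments
    List<FieldValue> sortValues = hit.sort();                // the values under "sort"
}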
If I comment out the highlight, sort and aggregation additions to the request builder, then the request fails completely with a HitMetadata parsing problem: Unexpected JSON event 'END_OBJECT' instead of '[START_OBJECT, KEY_NAME]'.
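If it helps with diagnosing this, I can also run the same body through the low-level RestClient and compare the raw JSON with what the typed client is trying to parse, roughly like this (sketch; requestBodyJson would be the request body shown above, exception handling omitted):
// Sketch: fetch the raw JSON with the low-level client, bypassing the typed client's deserialization.
// Classes: org.elasticsearch.client.Request/Response, org.apache.http.util.EntityUtils.
Request raw = new Request("POST", "/" + indexName + "/_search");
raw.setJsonEntity(requestBodyJson);
Response rawResponse = restClient.performRequest(raw);
String rawJson = EntityUtils.toString(rawResponse.getEntity());
logger.debug("raw search response: {}", rawJson);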
I have no idea how to resolve this or what I might be doing wrong. Any help would be great.
Thanks