Is it possible to run a query with NO term and ONLY filters?


(Tim Bessie) #1

I have a case where I want exact matches ONLY on some fields, but NO query term. Is this possible? I've tried it (using Elasticsearch 1.5.2 - haven't upgraded to 2.0.0 yet), and it returns no results when I provide a null or empty string as a query term.

How do I get results when providing ONLY a map of filters?

Also, in this case, I have the following:

Field A: Date
Field B: Number
Field C: List: (Value1, Value2, Value3)

... what I want to do is match on exact values of A and B, but search for an exact match on ONE element of C. That is, the query would be like:

A: 2015-11-10
B: 12
C: Value2

... and would match on the example record above.

I might also want to query for C: Value1 OR Value2

Can someone tell me if I can do what I need with the above? Can I run with filters only, or exact matches, without using a query term? Or do I need to provide query term(s), along with configuration arguments for the query so that it only accepts exact matches?

  • Tim

(Nik Everett) #2

You're in luck! That is exactly how it's supposed to work.

If you aren't finding any results I imagine your trouble is analysis. Make sure that your list field is not_analyzed (see mapping) and you can use a term filter. There is a terms filter which has the OR behavior you want.

Query vs filter is really about "do I want scoring?" You can use a filtered query that just has a filter if you don't want scoring. That is how to do it in 1.x. For 2.0 see the box at the top of the page - this way is deprecated and will go away in 3.0. The box has the migration path.
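
As a rough illustration of the filter-only approach described above, here is a minimal sketch that assembles a 1.x-style search body: a filtered query with no query part, a bool filter whose must clauses hold a term filter per exact field, and a terms filter for the OR case. The field names A, B and C are the placeholders from the question, the class and helper names are made up for this example, and it uses only plain string building rather than the Elasticsearch client, so treat it as a sketch of the request shape and not as the official way to build it.

```java
import java.util.Arrays;
import java.util.List;

public class FilterOnlyQuery {

    // Quote a string as a JSON value (escapes only backslash and double quote; enough for this sketch).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // term filter: exact match on one value. jsonValue must already be JSON-encoded.
    static String term(String field, String jsonValue) {
        return "{\"term\":{" + quote(field) + ":" + jsonValue + "}}";
    }

    // terms filter: matches if the field holds ANY of the given values (the OR behavior).
    static String terms(String field, List<String> values) {
        StringBuilder sb = new StringBuilder("{\"terms\":{" + quote(field) + ":[");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) sb.append(",");
            sb.append(quote(values.get(i)));
        }
        return sb.append("]}}").toString();
    }

    // filtered query that carries only a filter: every clause in "must" has to match, and nothing is scored.
    static String filterOnly(String... filters) {
        return "{\"query\":{\"filtered\":{\"filter\":{\"bool\":{\"must\":["
                + String.join(",", filters) + "]}}}}}";
    }

    public static void main(String[] args) {
        String body = filterOnly(
                term("A", quote("2015-11-10")),
                term("B", "12"),
                terms("C", Arrays.asList("Value1", "Value2")));
        System.out.println(body);
    }
}
```

The printed string is meant to be used as the body of a search request against a 1.x cluster. The term and terms filters only behave as exact matches on string fields if those fields are mapped as not_analyzed.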


(Tim Bessie) #3

Thanks for your suggestions! I've gotten close, but every time I've tried to set the mappings in Java, I get JSON format errors, of this form:

org.elasticsearch.index.mapper.MapperParsingException: Root type mapping not empty after parsing! Remaining fields:   [Deleted : type=boolean] [Text : type=string] [DocumentName : type=string] [Draft : type=boolean] [OwnerUserId : type=long] [PracticeId : type=long] [AuthorName : type=string] [CreatedDate : type=date,format=dateOptionalTime] [Id : type=long] [Sections : type=string,index=not_analyzed] [PatientId : type=long] [DateOfService : type=date,format=dateOptionalTime]
	at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:278)
	at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:192)
	at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:434)
	at org.elasticsearch.cluster.metadata.MetaDataMappingService$4.execute(MetaDataMappingService.java:505)
	at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:365)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:188)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:158)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

I've tried both using builders to create the JSON structure for the mappers and adding them before index creation, and also creating the index and adding them after the fact, but I always get the same error. This is using Elasticsearch 1.5.2. I know that there had been some bugs with regard to doing this kind of thing in the past, tho' I don't know if there were any in 1.5.2.

Here's the code I'm using currently - note that "IndexedFieldDescriptor" is just a descriptor class for index setup metadata I've created. Nothing funny going on in there. Do you see anything obvious I'm doing wrong? According to the comments in the mapping code, I'm doing it right (unless the comments are misleading):

protected void createIndex(String indexName) throws ElasticsearchException, SearchException {

    IndicesAdminClient iac = getAdminClient().indices();
    CreateIndexRequestBuilder cirb = iac.prepareCreate(indexName);
    
    ActionFuture<CreateIndexResponse> createFuture = cirb.execute();

    CreateIndexResponse createdResponse = createFuture.actionGet(EmrConfig.getInstance().getElasticSearchTimeoutMSLong());

    if (!createdResponse.isAcknowledged()) {
        throw new SearchException("elasticsearch : cannot create index " + indexName, SearchException.Type.ServerError);
    }

    Map<String, String> mapping = new HashMap<String, String>();
    for (IndexedFieldDescriptor desc : entityType.getFields()) {

        Class<?> javaClass = desc.getJavaClass();

        StringBuilder sb = new StringBuilder();
        if (javaClass == String.class) {
            sb.append("type=string");
        } else if (javaClass.isAssignableFrom(Date.class)) {
            sb.append("type=date");
            sb.append(",format=dateOptionalTime");
        } else if (javaClass == Boolean.class) {
            sb.append("type=boolean");
        } else if (javaClass == Long.class) {
            sb.append("type=long");
        } else {
            throw new IllegalStateException("Cannot handle class " + javaClass.getCanonicalName());
        }

        if (!desc.isAnalysed()) {
            sb.append(",index=not_analyzed");
        }

        mapping.put(desc.getName(), sb.toString());
    }

    PutMappingRequestBuilder rb = iac.preparePutMapping(indexName);
    rb = rb.setType(entityType.name().toLowerCase());
    rb = rb.setSource(mapping);

    ActionFuture<PutMappingResponse> mappingFuture = rb.execute();

    PutMappingResponse putMappingResponse = mappingFuture.actionGet(EmrConfig.getInstance().getElasticSearchTimeoutMSLong());

    if (!putMappingResponse.isAcknowledged()) {
        throw new SearchException("elasticsearch : cannot put mapping on index " + indexName, SearchException.Type.ServerError);
    }
}

(Nik Everett) #4

So not_analyzed is only a thing for strings. Also, I believe you aren't making valid JSON. It should look kinda like this:

{
  "properties": {
    "foo": {
      "type": "string",
      "index": "not_analyzed"
    },
    "bar": {
      "type": "long"
    }
  }
}
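
One way to get from flat "type=string,index=not_analyzed" strings to the nested shape above is sketched here. It is a minimal example using only the standard library: the class and method names are hypothetical, the field names foo and bar are the ones from the JSON above, and it assumes every spec is a well-formed list of key=value pairs whose keys and values need no JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MappingSource {

    // Turn "type=string,index=not_analyzed" into {"type":"string","index":"not_analyzed"}.
    // Assumes well-formed key=value pairs with nothing that needs JSON escaping.
    static String fieldToJson(String spec) {
        StringBuilder sb = new StringBuilder("{");
        String[] pairs = spec.split(",");
        for (int i = 0; i < pairs.length; i++) {
            String[] kv = pairs[i].split("=", 2);
            if (i > 0) sb.append(",");
            sb.append("\"").append(kv[0].trim()).append("\":\"")
              .append(kv[1].trim()).append("\"");
        }
        return sb.append("}").toString();
    }

    // Nest every field definition under a single "properties" object.
    static String toMappingSource(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{\"properties\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) sb.append(",");
            first = false;
            sb.append("\"").append(e.getKey()).append("\":")
              .append(fieldToJson(e.getValue()));
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order so the output is predictable.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("foo", "type=string,index=not_analyzed");
        fields.put("bar", "type=long");
        System.out.println(toMappingSource(fields));
    }
}
```

The resulting string is intended as the mapping source for one document type, with the individual fields living under "properties" rather than at the root.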

(Tim Bessie) #5

The only thing I'm using not_analyzed for is Strings anyway, so no problem there.

As for the JSON - I'm using an API for it that is supposed to work like I've used it, I think. The comments say to pass in a map of the form key:var=value,var=value..., which is what I'm doing.

Is it possible that that code doesn't generate valid JSON?

  • Tim

(Tim Bessie) #6

EDIT: I figured out what I did wrong below. The Javadocs for addMapping() say the following:

/**
 * Adds mapping that will be added when the index gets created.
 *
 * @param type   The mapping type
 * @param source The mapping source
 */
public CreateIndexRequestBuilder addMapping(String type, String source) {
    request.mapping(type, source);
    return this;
}

... and I was taking "mapping type" to mean the field within the document, not the document type itself. Is that a standard in the Elasticsearch naming conventions and I just misunderstood, or are terms used a bit loosely?

==============================================

Also, I just tried your suggestion (I think I had tried that already after reading it elsewhere), and the index now looks like the below - note how the target field whose mapping I'm trying to modify is added outside of the main mapping block - it doesn't appear to go in the correct place. In fact, I found that those mapping properties are not applied to it after this index modification is made.

Do let me know if the below looks correct to you - it doesn't to me, but then I'm not familiar with what a proper index should look like after its mapping has been modified (I did a search for a value of "Section" and it returned no results; however, when I searched for it in lower case, it appeared; our default Elasticsearch config file is set up to convert to lowercase during analysis, so those map modifications are not being applied - as part of this test, I rebuild the index from scratch, so if that mapping change were working, I'd see it):

{
  "patdocument-50" : {
    "aliases" : { },
    "mappings" : {
      "patdocument" : {
        "properties" : {
          "AuthorName" : {
            "type" : "string"
          },
          "CreatedDate" : {
            "type" : "date",
            "format" : "dateOptionalTime"
          },
          "DateOfService" : {
            "type" : "date",
            "format" : "dateOptionalTime"
          },
          "Deleted" : {
            "type" : "boolean"
          },
          "DocumentName" : {
            "type" : "string"
          },
          "Draft" : {
            "type" : "boolean"
          },
          "Id" : {
            "type" : "long"
          },
          "OwnerUserId" : {
            "type" : "long"
          },
          "PatientId" : {
            "type" : "long"
          },
          "PracticeId" : {
            "type" : "long"
          },
          "Sections" : {
            "type" : "string"
          },
          "Text" : {
            "type" : "string"
          }
        }
      },
      "Sections" : {
        "properties" : {
          "Sections" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1447825627556",
        "uuid" : "Ht0FqKIkTN-EOSgr_rGFhg",
        "number_of_replicas" : "0",
        "number_of_shards" : "1",
        "version" : {
          "created" : "1050299"
        }
      }
    },
    "warmers" : { }
  }
}
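
The lowercase-only match described above is what one would expect while the field is still analyzed. The sketch below is a simplified model of that behavior, not Lucene's actual analyzer: it assumes analysis amounts to lowercasing and splitting on non-alphanumeric characters, and that the search compares the given term to the indexed tokens exactly, as a term filter does. All names in it are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class AnalysisSketch {

    // Rough stand-in for an analyzer that lowercases and splits on non-alphanumerics.
    static List<String> analyzed(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // not_analyzed: the whole value is indexed as one unchanged token.
    static List<String> notAnalyzed(String value) {
        return Collections.singletonList(value);
    }

    // Exact comparison against the indexed tokens; the search term itself is not analyzed.
    static boolean termMatches(List<String> indexedTokens, String term) {
        return indexedTokens.contains(term);
    }

    public static void main(String[] args) {
        String stored = "Value2";
        System.out.println(termMatches(analyzed(stored), "Value2"));    // false
        System.out.println(termMatches(analyzed(stored), "value2"));    // true
        System.out.println(termMatches(notAnalyzed(stored), "Value2")); // true
    }
}
```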

(Nik Everett) #7

Other than that things are usually named in snake_case, it looks pretty sane. You probably want not_analyzed in all the strings.

To figure out why it isn't working properly for you it'd be best to make a full recreation using bash and curl - one that just creates the index from scratch.


(Tim Bessie) #8

Well, it is working now - I noted that at the top, after I'd written the original (the forum software warned me about responding too many times to my own message).

Did you see my question at the top asking about terminology?

By the way, thank you so much for your help! In my rush I neglected thanking you.

  • Tim
