Elasticsearch Filter

The goal of a filter is to reduce the number of documents that the query has to examine. A query must not only find matching documents but also calculate how relevant each one is, which typically makes queries heavier than filters. Scored query results are also not cacheable. A filter, by contrast, only answers yes or no for each document, so it is quick to evaluate and easy to cache in memory, using only 1 bit per document. Cached filters can then be reused efficiently by subsequent requests.
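
To make the caching argument concrete, here is a minimal sketch in plain Java of a filter result stored as one bit per document. It only illustrates the idea and is not how Elasticsearch implements its cache internally. The class name FilterBitsetSketch, the in-memory ratings array, and the "rating=5" cache key are all assumptions made for this example.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Illustrative sketch only: NOT Elasticsearch's real filter cache.
final class FilterBitsetSketch {

    // Hypothetical "index": position = document id, value = that document's rating.
    private final int[] ratings;

    // Cached filter results, keyed by a hypothetical filter description.
    private final Map<String, BitSet> cache = new HashMap<>();

    // Counts how often a filter had to be evaluated against every document.
    int evaluations = 0;

    FilterBitsetSketch(int[] ratings) {
        this.ratings = ratings.clone();
    }

    // Returns one bit per document: the bit is set when the document passes the filter.
    // The returned BitSet is the cached instance, so callers should not modify it.
    BitSet filter(String key, IntPredicate ratingMatches) {
        return cache.computeIfAbsent(key, k -> {
            evaluations++;
            BitSet bits = new BitSet(ratings.length);
            for (int docId = 0; docId < ratings.length; docId++) {
                if (ratingMatches.test(ratings[docId])) {
                    bits.set(docId);
                }
            }
            return bits;
        });
    }

    public static void main(String[] args) {
        FilterBitsetSketch index = new FilterBitsetSketch(new int[]{5, 3, 5, 1, 4});
        BitSet first = index.filter("rating=5", r -> r == 5);
        BitSet second = index.filter("rating=5", r -> r == 5); // served from the cache
        System.out.println(first + " " + second + " evaluations=" + index.evaluations);
    }
}
```

The second call with the same key never touches the documents again, which is the reuse the paragraph above describes. A relevance score has no such compact yes/no form, which is one way to see why scored results are harder to cache.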

When to use filter vs query

As a general rule, use query clauses for full-text search or for any condition that should affect the relevance score, and use filter clauses for everything else.
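
As a rough illustration of that rule, the sketch below builds a search request body that puts a full-text condition in must (scored) and an exact-value condition in filter (not scored). It uses plain string building so that it does not depend on any client library. The class name QueryVsFilterSketch and the sample search text are hypothetical, and it assumes firstName is mapped as a full-text field.

```java
// Illustrative sketch: builds the JSON body of a search request as a plain string.
final class QueryVsFilterSketch {

    // The full-text condition goes in "must" and contributes to the relevance score;
    // the exact-value condition goes in "filter" and only includes or excludes documents.
    static String searchBody(String nameText, int rating) {
        return String.join("\n",
            "{",
            "  \"query\": {",
            "    \"bool\": {",
            "      \"must\": [",
            "        { \"match\": { \"firstName\": \"" + escape(nameText) + "\" } }",
            "      ],",
            "      \"filter\": [",
            "        { \"term\": { \"rating\": " + rating + " } }",
            "      ]",
            "    }",
            "  }",
            "}");
    }

    // Minimal JSON string escaping: handles backslashes and double quotes only.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // "john" is a made-up search term for this example.
        System.out.println(searchBody("john", 5));
    }
}
```

The printed body can be sent to GET elasticsearch_learning/_search. Only the match clause affects the ordering of the hits, while the term clause just narrows the set of candidates.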

There are two ways to filter search results.

  1. Use a boolean query with a filter clause. Search requests apply boolean filters to both search hits and aggregations.
  2. Use the search API’s post_filter parameter. Search requests apply post filters only to search hits, not aggregations. You can use a post filter to calculate aggregations based on a broader result set, and then further narrow the results. A post filter has no impact on the aggregation results.
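
The difference between the two approaches can be sketched with a small simulation in plain Java. No Elasticsearch is involved here: the ratings array stands in for the documents matched by the query, and counting documents per rating stands in for an aggregation. The class name PostFilterSketch and the sample data are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.IntPredicate;

// Illustrative simulation only: no Elasticsearch involved.
final class PostFilterSketch {

    // Stand-in for an aggregation: number of documents per rating value.
    static Map<Integer, Long> countByRating(int[] ratings) {
        Map<Integer, Long> counts = new TreeMap<>();
        for (int r : ratings) {
            counts.merge(r, 1L, Long::sum);
        }
        return counts;
    }

    // Keeps only the ratings accepted by the filter.
    static int[] keep(int[] ratings, IntPredicate filter) {
        return Arrays.stream(ratings).filter(filter).toArray();
    }

    public static void main(String[] args) {
        int[] docs = {5, 3, 5, 1, 4}; // hypothetical ratings, one per matched document
        IntPredicate ratingIsFive = r -> r == 5;

        // bool filter: hits AND aggregations see only the filtered documents.
        int[] boolHits = keep(docs, ratingIsFive);
        Map<Integer, Long> boolAggs = countByRating(boolHits);

        // post_filter: aggregations see every matched document;
        // only the hits are narrowed afterwards.
        Map<Integer, Long> postAggs = countByRating(docs);
        int[] postHits = keep(docs, ratingIsFive);

        System.out.println("bool filter: hits=" + Arrays.toString(boolHits) + " aggs=" + boolAggs);
        System.out.println("post_filter: hits=" + Arrays.toString(postHits) + " aggs=" + postAggs);
    }
}
```

Both runs return the same two hits, but the bool filter run reports a single bucket for rating 5, while the post_filter run still reports buckets for ratings 1, 3, 4, and 5.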

Term filter

The term filter is used to filter by exact values, be they numbers, dates, Booleans, or exact-value string fields (not_analyzed strings in older versions, keyword fields in recent ones). Here we keep only users whose rating is 5; all other users are filtered out.

GET elasticsearch_learning/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "rating": 5 }}
      ]
    }
  }
}
/**
 * https://www.elastic.co/guide/en/elasticsearch/reference/current/filter-search-results.html
 */
@Test
void filterQuery() {

    int pageNumber = 0;
    int pageSize = 5;

    SearchRequest searchRequest = new SearchRequest(database);
    searchRequest.allowPartialSearchResults(true);
    searchRequest.indicesOptions(IndicesOptions.lenientExpandOpen());

    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.from(pageNumber * pageSize);
    searchSourceBuilder.size(pageSize);
    searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    /**
     * fetch only a few fields
     */
    searchSourceBuilder.fetchSource(new String[]{"id", "firstName", "lastName", "rating", "dateOfBirth"}, new String[]{""});

    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();

    boolQuery.filter(QueryBuilders.termQuery("rating", 5));

    searchSourceBuilder.query(boolQuery);

    searchRequest.source(searchSourceBuilder);

    searchRequest.preference("rating");

    if (searchSourceBuilder.query() != null && searchSourceBuilder.sorts() != null && searchSourceBuilder.sorts().size() > 0) {
        log.info("\n{\n\"query\":{}, \"sort\":{}\n}", searchSourceBuilder.query().toString(), searchSourceBuilder.sorts().toString());
    } else {
        log.info("\n{\n\"query\":{}\n}", searchSourceBuilder.query().toString());
    }

    try {
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        log.info("isTimedOut={}, totalShards={}, totalHits={}", searchResponse.isTimedOut(), searchResponse.getTotalShards(), searchResponse.getHits().getTotalHits().value);

        List<User> users = getResponseResult(searchResponse.getHits());

        log.info("results={}", ObjectUtils.toJson(users));

    } catch (IOException e) {
        log.warn("IOException, msg={}", e.getLocalizedMessage());
        e.printStackTrace();
    } catch (Exception e) {
        log.warn("Exception, msg={}", e.getLocalizedMessage());
        e.printStackTrace();
    }

}

 

Range filter

The range filter allows you to find numbers or dates that fall into a specified range. Here we retrieve only users whose rating is 2, 3, or 4; all other users are filtered out.

GET elasticsearch_learning/_search 
{
  "query":{
    "bool" : {
      "filter" : [
        {
          "range" : {
            "rating" : {
              "gte" : 2,
              "lte" : 4
            }
          }
        }
      ]
    }
  }
}
/**
 * https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
 */
@Test
void filterQueryWithRange() {

    int pageNumber = 0;
    int pageSize = 5;

    SearchRequest searchRequest = new SearchRequest(database);
    searchRequest.allowPartialSearchResults(true);
    searchRequest.indicesOptions(IndicesOptions.lenientExpandOpen());

    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.from(pageNumber * pageSize);
    searchSourceBuilder.size(pageSize);
    searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    /**
     * fetch only a few fields
     */
    searchSourceBuilder.fetchSource(new String[]{"id", "firstName", "lastName", "rating", "dateOfBirth"}, new String[]{""});

    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();

    boolQuery.filter(QueryBuilders.rangeQuery("rating").gte(2).lte(4));

    searchSourceBuilder.query(boolQuery);

    searchRequest.source(searchSourceBuilder);

    searchRequest.preference("rating");

    if (searchSourceBuilder.query() != null && searchSourceBuilder.sorts() != null && searchSourceBuilder.sorts().size() > 0) {
        log.info("\n{\n\"query\":{}, \"sort\":{}\n}", searchSourceBuilder.query().toString(), searchSourceBuilder.sorts().toString());
    } else {
        log.info("\n{\n\"query\":{}\n}", searchSourceBuilder.query().toString());
    }

    try {
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        log.info("isTimedOut={}, totalShards={}, totalHits={}", searchResponse.isTimedOut(), searchResponse.getTotalShards(), searchResponse.getHits().getTotalHits().value);

        List<User> users = getResponseResult(searchResponse.getHits());

        log.info("results={}", ObjectUtils.toJson(users));

    } catch (IOException e) {
        log.warn("IOException, msg={}", e.getLocalizedMessage());
        e.printStackTrace();
    } catch (Exception e) {
        log.warn("Exception, msg={}", e.getLocalizedMessage());
        e.printStackTrace();
    }

}

 

Exists filter

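The exists filter finds documents in which the specified field has one or more values. The older missing filter, which found documents with no value for the field, was removed in Elasticsearch 5.0; the same result is now obtained by wrapping exists in a must_not clause. The two cases are similar in nature to IS NOT NULL (exists) and IS NULL (missing) in SQL.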

{
  "exists": {
    "field": "name"
  }
}

Here we retrieve only users that have logged into the system, that is, users whose lastLoggedInAt field has a value.

GET elasticsearch_learning/_search 
{
  "query": {
    "bool" : {
      "filter" : [
        {
          "exists" : {
            "field" : "lastLoggedInAt"
          }
        }
      ]
    }
  }
}
/**
 * https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html
 */
@Test
void filterQueryWithExists() {

    int pageNumber = 0;
    int pageSize = 5;

    SearchRequest searchRequest = new SearchRequest(database);
    searchRequest.allowPartialSearchResults(true);
    searchRequest.indicesOptions(IndicesOptions.lenientExpandOpen());

    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.from(pageNumber * pageSize);
    searchSourceBuilder.size(pageSize);
    searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    /**
     * fetch only a few fields
     */
    searchSourceBuilder.fetchSource(new String[]{"id", "firstName", "lastName", "rating", "dateOfBirth"}, new String[]{""});

    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();

    boolQuery.filter(QueryBuilders.existsQuery("lastLoggedInAt"));

    searchSourceBuilder.query(boolQuery);

    searchRequest.source(searchSourceBuilder);

    searchRequest.preference("rating");

    if (searchSourceBuilder.query() != null && searchSourceBuilder.sorts() != null && searchSourceBuilder.sorts().size() > 0) {
        log.info("\n{\n\"query\":{}, \"sort\":{}\n}", searchSourceBuilder.query().toString(), searchSourceBuilder.sorts().toString());
    } else {
        log.info("\n{\n\"query\":{}\n}", searchSourceBuilder.query().toString());
    }

    try {
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        log.info("isTimedOut={}, totalShards={}, totalHits={}", searchResponse.isTimedOut(), searchResponse.getTotalShards(), searchResponse.getHits().getTotalHits().value);

        List<User> users = getResponseResult(searchResponse.getHits());

        log.info("results={}", ObjectUtils.toJson(users));

    } catch (IOException e) {
        log.warn("IOException, msg={}", e.getLocalizedMessage());
        e.printStackTrace();
    } catch (Exception e) {
        log.warn("Exception, msg={}", e.getLocalizedMessage());
        e.printStackTrace();
    }

}
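
The inverse case, users who have never logged in, can be expressed by placing the same exists clause under must_not. The sketch below builds that request body as a plain string. The class name MissingFieldSketch is hypothetical, and the escaping is deliberately minimal.

```java
// Illustrative sketch: JSON body that matches documents where a field has no value.
final class MissingFieldSketch {

    static String missingFieldBody(String field) {
        // Only backslashes and double quotes are escaped; enough for simple field names.
        String safeField = field.replace("\\", "\\\\").replace("\"", "\\\"");
        return String.join("\n",
            "{",
            "  \"query\": {",
            "    \"bool\": {",
            "      \"must_not\": [",
            "        { \"exists\": { \"field\": \"" + safeField + "\" } }",
            "      ]",
            "    }",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(missingFieldBody("lastLoggedInAt"));
    }
}
```

Like filter, must_not clauses run in filter context, so they do not contribute to the score. With the Java client used in the tests above, the equivalent would presumably be boolQuery.mustNot(QueryBuilders.existsQuery("lastLoggedInAt")).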

Source code on Github



