Elasticsearch Document API

Elasticsearch provides single-document APIs and multi-document APIs. A single-document call targets one document; a multi-document call targets several documents in one request.

All CRUD APIs are single-index APIs.

Index API

Adds a JSON document to the specified data stream or index and makes it searchable. If the target is an index and the document already exists, the request updates the document and increments its version. You cannot use the index API to send update requests for existing documents to a data stream.

You use one of these options to index a document:

PUT /<target>/_doc/<_id> 
POST /<target>/_doc/
PUT /<target>/_create/<_id>
POST /<target>/_create/<_id>

target - name of the index or data stream. If the target doesn’t exist and doesn’t match a data stream template, this request creates the index.
_id - ID of the document.
Use POST /<target>/_doc/ when you want Elasticsearch to generate an ID for the document.

You can index a new JSON document with the _doc or _create resource. Using _create guarantees that the document is only indexed if it does not already exist. To update an existing document, you must use the _doc resource.
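
The choice between these request forms can be sketched as a small helper. This is a minimal illustration in plain Java, not part of any Elasticsearch client; the class name IndexEndpoint and the method forDocument are hypothetical.

```java
// Hypothetical helper: picks the HTTP method and path for an index request.
final class IndexEndpoint {
    private IndexEndpoint() {}

    /**
     * @param target     index or data stream name
     * @param id         document ID, or null to let Elasticsearch generate one
     * @param createOnly true to index only if the document does not exist yet
     */
    static String forDocument(String target, String id, boolean createOnly) {
        if (id == null) {
            if (createOnly) {
                // The _create resource takes the document ID in the path.
                throw new IllegalArgumentException("_create requires a document ID");
            }
            return "POST /" + target + "/_doc/";
        }
        String resource = createOnly ? "_create" : "_doc";
        return "PUT /" + target + "/" + resource + "/" + id;
    }

    public static void main(String[] args) {
        System.out.println(forDocument("doctor_ut", "1013143536", false));
        System.out.println(forDocument("doctor_ut", null, false));
        System.out.println(forDocument("doctor_ut", "1013143536", true));
    }
}
```

Passing createOnly = true maps to the _create resource, so a second call for the same ID would be rejected by Elasticsearch instead of overwriting the document.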

Example of Index

PUT doctor_ut/_doc/1013143536
{
  "npi" : "1013143536",
  "firstName" : "SHAWN",
  "lastName" : "WRIGHT",
  "fullName" : "SHAWN WRIGHT",
  "credential" : "LICSW",
  "otherLastName" : "WRIGHT",
  "otherFirstName" : "SHAWN",
  "type" : "Individual",
  "gender" : "FEMALE"
}

Java example of Index API

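// utIndex, doctorIndex and searchHit are defined elsewhere in the application
IndexRequest request = new IndexRequest(utIndex);   // target index
request.id(doctorIndex.getNpi());                   // use the NPI as the document ID
request.source(searchHit.getSourceAsString(), XContentType.JSON);   // JSON body of the document
IndexResponse indexResponse = restHighLevelClient.index(request, RequestOptions.DEFAULT);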

 

GET API

Retrieves the specified JSON document from an index.

GET <index>/_doc/<_id>

HEAD <index>/_doc/<_id>

Use GET to retrieve a document and its source or stored fields from a particular index. Use HEAD to verify that a document exists. You can use the _source resource to retrieve just the document source or to verify that it exists.
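
The three lookup variants can be sketched with the JDK's own java.net.http classes. This is only an illustration that builds the requests without sending them; the class name GetRequests and the base address http://localhost:9200 are assumptions, not something this article's setup defines.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) the lookup requests described above.
final class GetRequests {
    // Assumed address of a local node; adjust for your cluster.
    static final String BASE = "http://localhost:9200";

    private GetRequests() {}

    private static URI uri(String index, String resource, String id) {
        return URI.create(BASE + "/" + index + "/" + resource + "/" + id);
    }

    // GET <index>/_doc/<_id>: the document with its metadata
    static HttpRequest document(String index, String id) {
        return HttpRequest.newBuilder(uri(index, "_doc", id)).GET().build();
    }

    // HEAD <index>/_doc/<_id>: existence check, no response body
    static HttpRequest exists(String index, String id) {
        return HttpRequest.newBuilder(uri(index, "_doc", id))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // GET <index>/_source/<_id>: only the document source
    static HttpRequest sourceOnly(String index, String id) {
        return HttpRequest.newBuilder(uri(index, "_source", id)).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest head = exists("doctor_ut", "1013143536");
        System.out.println(head.method() + " " + head.uri());
        HttpRequest source = sourceOnly("doctor_ut", "1013143536");
        System.out.println(source.method() + " " + source.uri());
    }
}
```

To send one of these, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). For the HEAD request, a 200 status means the document exists and a 404 means it does not.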

Example of Get API

GET doctor_ut/_doc/1013143536

You can also limit the response to specific source fields with the _source_includes query parameter.

GET doctors/_doc/1013143536?_source_includes=name,rating

Java example of Get API

public void getDoctorByNPI() {
    String indexName = Index.DOCTOR_UT.name().toLowerCase();
    String npi = "1013143536";

    GetRequest getRequest = new GetRequest(indexName, npi);

    try {
        GetResponse getResponse = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
        if (getResponse.isExists()) {
            log.info(getResponse.getSourceAsString());
        } else {
            // getSourceAsString() returns null when the document is missing
            log.info("No document found for npi={}", npi);
        }
    } catch (Exception e) {
        log.warn(e.getLocalizedMessage());
    }
}

Multi Get API

Retrieves multiple JSON documents by ID. You use mget to retrieve multiple documents from one or more indices. If you specify an index in the request URI, you only need to specify the document IDs in the request body.

GET doctor_ut/_mget
{
  "docs": [
    {
      "_id": "1689633083"
    },
    {
      "_id": "1073924098"
    }
  ]
}

Get multiple documents from different indices

GET _mget
{
  "docs": [
    {
      "_index": "doctor_ut",
      "_id": "1689633083"
    },
    {
      "_index": "doctors",
      "_id": "1073883070"
    }
  ]
}
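
Request bodies like the two above can be generated from a list of IDs. The following is a rough sketch in plain Java; the class MgetBody and its methods are hypothetical, and the string concatenation assumes IDs and index names that need no JSON escaping (such as the numeric NPI values used here).

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical builder for _mget request bodies.
final class MgetBody {
    private MgetBody() {}

    // Body for GET <index>/_mget: only the IDs are needed.
    static String forIds(List<String> ids) {
        String docs = ids.stream()
                .map(id -> "{\"_id\":\"" + id + "\"}")
                .collect(Collectors.joining(","));
        return "{\"docs\":[" + docs + "]}";
    }

    // Body for GET _mget: each entry is an (index, id) pair.
    static String forDocs(List<Map.Entry<String, String>> indexAndId) {
        String docs = indexAndId.stream()
                .map(e -> "{\"_index\":\"" + e.getKey() + "\",\"_id\":\"" + e.getValue() + "\"}")
                .collect(Collectors.joining(","));
        return "{\"docs\":[" + docs + "]}";
    }

    public static void main(String[] args) {
        System.out.println(forIds(List.of("1689633083", "1073924098")));
        System.out.println(forDocs(List.of(
                Map.entry("doctor_ut", "1689633083"),
                Map.entry("doctors", "1073883070"))));
    }
}
```

For anything beyond simple IDs, a JSON library would be the safer choice, since it handles escaping.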

Java example of Multi Get API

public void getMultipleDoctorsByNPIs() {
    String utahDoctorIndex = Index.DOCTOR_UT.name().toLowerCase();
    String doctorsIndex = Index.DOCTORS.name().toLowerCase();

    String npi1 = "1013143536";
    String npi2 = "1073883070";

    MultiGetRequest request = new MultiGetRequest();
    request.add(new MultiGetRequest.Item(utahDoctorIndex, npi1));
    request.add(new MultiGetRequest.Item(doctorsIndex, npi2));

    try {
       MultiGetResponse response = restHighLevelClient.mget(request, RequestOptions.DEFAULT);

       // utah doctor
       MultiGetItemResponse utahDoctor = response.getResponses()[0];
       log.info(utahDoctor.getResponse().getSourceAsString());

       MultiGetItemResponse doctor = response.getResponses()[1];
       log.info(doctor.getResponse().getSourceAsString());
    } catch (Exception e) {
       log.warn(e.getLocalizedMessage());
    }
}

 

 

Update API

Updates a document using the specified script.

POST /<index>/_update/<_id>
{
...
}

The update API also supports passing a partial document, which is merged into the existing document. To fully replace an existing document, use the index API.

The document must still be reindexed, but using update removes some network roundtrips and reduces chances of version conflicts between the GET and the index operation.
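
The effect of a partial-document update can be pictured as a merge of two maps. The sketch below is only a model of that behaviour in plain Java, assuming a simple recursive merge in which nested objects are merged and other values are replaced; PartialDocMerge is a hypothetical name and this is not Elasticsearch's own code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of merging a partial document into an existing one.
final class PartialDocMerge {
    private PartialDocMerge() {}

    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> existing, Map<String, Object> partial) {
        // Work on a copy so the existing document is left untouched.
        Map<String, Object> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Object> entry : partial.entrySet()) {
            Object current = result.get(entry.getKey());
            Object incoming = entry.getValue();
            if (current instanceof Map && incoming instanceof Map) {
                // Both sides are objects: merge them field by field.
                result.put(entry.getKey(),
                        merge((Map<String, Object>) current, (Map<String, Object>) incoming));
            } else {
                // Plain values and arrays are replaced.
                result.put(entry.getKey(), incoming);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> existing = new LinkedHashMap<>();
        existing.put("firstName", "SHAWN");
        existing.put("lastName", "WRIGHT");
        Map<String, Object> partial = new LinkedHashMap<>();
        partial.put("firstName", "Folau");
        System.out.println(merge(existing, partial));
    }
}
```

Fields that the partial document does not mention, such as lastName here, are carried over unchanged.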

The _source field must be enabled to use update. In addition to _source, you can access the following variables through the ctx map: _index, _type, _id, _version, _routing, and _now (the current timestamp).

POST doctor_ut/_update/1013143536
{
  "doc": {
    "firstName": "Folau"
  },
  "doc_as_upsert": true
}
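
The example above sends a partial document. A scripted update uses the same endpoint with a script object in the body. The sketch below builds such a request with the JDK's java.net.http classes without sending it; the class ScriptedUpdate, the parameter name newValue, and the base address http://localhost:9200 are assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) a scripted update request.
final class ScriptedUpdate {
    // Assumed address of a local node; adjust for your cluster.
    static final String BASE = "http://localhost:9200";

    private ScriptedUpdate() {}

    // Painless script that sets one field of ctx._source from a script parameter.
    // Assumes field and value need no JSON escaping.
    static String body(String field, String value) {
        return "{\"script\":{"
                + "\"source\":\"ctx._source." + field + " = params.newValue\","
                + "\"lang\":\"painless\","
                + "\"params\":{\"newValue\":\"" + value + "\"}}}";
    }

    static HttpRequest request(String index, String id, String field, String value) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_update/" + id))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(field, value)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = request("doctor_ut", "1013143536", "firstName", "Folau");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body("firstName", "Folau"));
    }
}
```

Passing the new value through params rather than concatenating it into the script source keeps the script text constant across calls.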

 

Java example of Update API

public void updateDoctor() {
    String indexName = Index.DOCTOR_UT.name().toLowerCase();
    String npi = "1013143536";

    UpdateRequest request = new UpdateRequest(indexName, npi);
    Map<String, Object> jsonMap = new HashMap<>();
    jsonMap.put("firstName", "Folau");

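    request.doc(jsonMap, XContentType.JSON);
    // Ask for the updated source in the response; without this, getGetResult() returns null
    request.fetchSource(true);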

    try {
       UpdateResponse updateResponse = restHighLevelClient.update(request, RequestOptions.DEFAULT);
       log.info(updateResponse.getGetResult().sourceAsString());
    } catch (Exception e) {
       log.warn(e.getLocalizedMessage());
    }
}

Update by query

While processing an update by query request, Elasticsearch performs multiple search requests sequentially to find all of the matching documents. A bulk update request is performed for each batch of matching documents. Any query or update failure causes the update by query request to fail, and the failures are shown in the response. Update requests that completed successfully are kept; they are not rolled back.

POST /<index>/_update_by_query

Updates documents that match the specified query. If no query is specified, performs an update on every document in the data stream or index without modifying the source, which is useful for picking up mapping changes.

POST doctor_ut/_update_by_query
{
  "script": {
    "source": "if (ctx._source.firstName == 'Kinga') {ctx._source.firstName='Tonga';}",
    "lang": "painless"
  },
  "query": {
    "term": {
      "firstName": "Kinga"
    }
  }
}
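
The batch-by-batch behaviour, including the fact that completed updates are not rolled back, can be illustrated with a toy in-memory model. This is a simplified sketch in plain Java and not how Elasticsearch is implemented; the class UpdateByQueryModel is hypothetical, and the rule that processing stops after the first batch containing a failure is an assumption of the model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy model: documents are plain strings, updated in place batch by batch.
final class UpdateByQueryModel {
    static final class Result {
        final int updated;
        final List<String> failures;

        Result(int updated, List<String> failures) {
            this.updated = updated;
            this.failures = failures;
        }
    }

    private UpdateByQueryModel() {}

    static Result run(List<String> docs, int batchSize, UnaryOperator<String> update) {
        int updated = 0;
        List<String> failures = new ArrayList<>();
        // Stop scheduling further batches once a batch has reported a failure.
        for (int start = 0; start < docs.size() && failures.isEmpty(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            for (int i = start; i < end; i++) {
                try {
                    docs.set(i, update.apply(docs.get(i)));
                    updated++;
                } catch (RuntimeException e) {
                    // Record the failure; earlier successful updates stay in place.
                    failures.add(docs.get(i) + ": " + e.getMessage());
                }
            }
        }
        return new Result(updated, failures);
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>(List.of("Kinga", "Kinga", "bad", "Kinga", "Kinga"));
        Result result = run(docs, 2, d -> {
            if (d.equals("bad")) throw new IllegalStateException("boom");
            return "Tonga";
        });
        System.out.println(result.updated + " updated, failures=" + result.failures + ", docs=" + docs);
    }
}
```

In this run the documents updated before the failure keep their new value, and the last document is never reached.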

Java example of Update by query

public void batchUpdateDoctors() {
    String indexName = Index.DOCTOR_UT.name().toLowerCase();

    UpdateByQueryRequest request = new UpdateByQueryRequest(indexName);
    request.setQuery(new TermQueryBuilder("firstName", "new_name1"));
    request.setScript(new Script(ScriptType.INLINE, "painless", "if (ctx._source.firstName == 'new_name1') {ctx._source.firstName='Kinga';}", Collections.emptyMap()));

    try {
        BulkByScrollResponse bulkResponse = restHighLevelClient.updateByQuery(request, RequestOptions.DEFAULT);
        log.info("updated={}", bulkResponse.getStatus().getUpdated());
    } catch (Exception e) {
        log.warn(e.getLocalizedMessage());
    }
}

 

 

 

Delete API

Removes a JSON document from the specified index. You must specify the index name and document ID.

DELETE /<index>/_doc/<_id>
DELETE doctor_ut/_doc/1013143536

Java example of Delete API

public void deleteDoctor() {
    String indexName = Index.DOCTOR_UT.name().toLowerCase();
    String npi = "1013143536";

    DeleteRequest request = new DeleteRequest(indexName, npi); 
    try {
         DeleteResponse deleteResponse = restHighLevelClient.delete(request, RequestOptions.DEFAULT);
         log.info(deleteResponse.getIndex());
    } catch (Exception e) {
         log.warn(e.getLocalizedMessage());
    }
}

 

Reindex API

Copies documents from a source to a destination.

The source and destination can be any pre-existing index, index alias, or data stream. However, the source and destination must be different. For example, you cannot reindex a data stream into itself.

Reindex requires _source to be enabled for all documents in the source.

Configure the destination as needed before calling _reindex. Reindex does not copy the settings from the source or its associated template, so mappings, shard counts, replicas, and so on must be configured ahead of time.

POST _reindex
{
  "source": {
    "index": "doctors"
  },
  "dest": {
    "index": "doctor-ut"
  }
}
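
Because the destination has to be configured first, a reindex is usually a two-step sequence: create the destination index, then call _reindex. The sketch below builds both requests with the JDK's java.net.http classes without sending them; the class ReindexRequests, the base address http://localhost:9200, and the shard and replica counts are illustrative assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) the two requests of a reindex sequence.
final class ReindexRequests {
    // Assumed address of a local node; adjust for your cluster.
    static final String BASE = "http://localhost:9200";

    private ReindexRequests() {}

    // Step 1: PUT /<dest> creates the destination with explicit settings.
    static HttpRequest createDestination(String dest, int shards, int replicas) {
        String body = "{\"settings\":{\"number_of_shards\":" + shards
                + ",\"number_of_replicas\":" + replicas + "}}";
        return HttpRequest.newBuilder(URI.create(BASE + "/" + dest))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    static String reindexBody(String source, String dest) {
        return "{\"source\":{\"index\":\"" + source + "\"},\"dest\":{\"index\":\"" + dest + "\"}}";
    }

    // Step 2: POST /_reindex copies the documents.
    static HttpRequest reindex(String source, String dest) {
        return HttpRequest.newBuilder(URI.create(BASE + "/_reindex"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(reindexBody(source, dest)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest create = createDestination("doctor-ut", 1, 1);
        HttpRequest copy = reindex("doctors", "doctor-ut");
        System.out.println(create.method() + " " + create.uri());
        System.out.println(copy.method() + " " + copy.uri() + " " + reindexBody("doctors", "doctor-ut"));
    }
}
```

Mappings can be supplied in the same create request under a mappings key next to settings.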

 

 

 

 

 



