What is Elasticsearch?

Elasticsearch is a distributed search and analytics engine that provides near real-time search and analytics for all types of data. Whether you have structured or unstructured text, numerical data, or geospatial data, Elasticsearch can efficiently store and index it in a way that supports fast searches. It is accessible through a RESTful web service interface and uses schema-less JSON (JavaScript Object Notation) documents to store data. It is written in Java, so it runs on any platform with a Java runtime. It lets users explore very large amounts of data at very high speed.

General Features

  • Elasticsearch scales up to petabytes of structured and unstructured data.
  • Elasticsearch can be used as a replacement for document stores like MongoDB and RavenDB.
  • Elasticsearch uses denormalization to improve search performance.
  • Elasticsearch began as an open source project under the Apache License 2.0; releases after 7.10 use different license terms.
  • Elasticsearch is one of the most popular enterprise search engines and is used by many large organizations such as Wikipedia, The Guardian, Stack Overflow, and GitHub.
  • Store and analyze logs, metrics, and security event data
  • Use machine learning to automatically model the behavior of your data in real time
  • Automate business workflows using Elasticsearch as a storage engine
  • Manage, integrate, and analyze spatial information using Elasticsearch as a geographic information system (GIS)

Data Stored as Documents

Elasticsearch is a distributed document store. Instead of storing information as rows of columnar data, Elasticsearch stores complex data structures that have been serialized as JSON documents. When you have multiple Elasticsearch nodes in a cluster, stored documents are distributed across the cluster and can be accessed immediately from any node. 

When a document is stored, it is indexed and becomes fully searchable in near real time, typically within 1 second. Elasticsearch uses a data structure called an inverted index that supports very fast full-text searches. An inverted index lists every unique word that appears in any document and identifies all of the documents each word occurs in.
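To make the inverted index idea concrete, here is a minimal in-memory sketch. It is not how Elasticsearch (or Lucene underneath it) implements indexing; the tokenizer, the function names, and the AND-only query semantics are simplifying assumptions made for illustration.

```typescript
// Maps each unique term to the set of document IDs that contain it.
type InvertedIndex = Map<string, Set<number>>;

// Naive tokenizer (assumption): lowercase, then split on non-alphanumeric characters.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Builds the index; a document's ID is its position in the input array.
function buildIndex(docs: string[]): InvertedIndex {
  const index: InvertedIndex = new Map();
  docs.forEach((doc, docId) => {
    for (const term of tokenize(doc)) {
      let postings = index.get(term);
      if (!postings) {
        postings = new Set<number>();
        index.set(term, postings);
      }
      postings.add(docId);
    }
  });
  return index;
}

// Returns the IDs of documents that contain every term of the query (AND semantics).
function search(index: InvertedIndex, query: string): number[] {
  const terms = tokenize(query);
  if (terms.length === 0) return [];
  let result: number[] | null = null;
  for (const term of terms) {
    const postings = index.get(term) ?? new Set<number>();
    result = result === null
      ? Array.from(postings)
      : result.filter((id) => postings.has(id));
  }
  return (result ?? []).sort((a, b) => a - b);
}

const docs = [
  "Elasticsearch is a search engine",
  "A search engine indexes documents",
  "Documents are stored as JSON",
];
const index = buildIndex(docs);
console.log(search(index, "search engine")); // documents 0 and 1
```

A lookup only touches the postings of the query terms rather than scanning every document, which is why this structure suits full-text search.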

An index can be thought of as an optimized collection of documents and each document is a collection of fields, which are the key-value pairs that contain your data. By default, Elasticsearch indexes all data in every field and each indexed field has a dedicated, optimized data structure. For example, text fields are stored in inverted indices, and numeric and geo fields are stored in BKD trees. The ability to use the per-field data structures to assemble and return search results is what makes Elasticsearch so fast.

Elasticsearch also has the ability to be schema-less, which means that documents can be indexed without explicitly specifying how to handle each of the different fields that might occur in a document. When dynamic mapping is enabled, Elasticsearch automatically detects and adds new fields to the index. This default behavior makes it easy to index and explore your data: just start indexing documents and Elasticsearch will detect and map booleans, floating point and integer values, dates, and strings to the appropriate Elasticsearch data types.
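The detection step can be pictured roughly as follows. This is a simplified sketch, not Elasticsearch's actual implementation: the function names are hypothetical, the date check accepts only a basic ISO-8601 shape (real date detection uses configurable formats), and strings are reported simply as text.

```typescript
type EsFieldType = "boolean" | "long" | "float" | "date" | "text" | "object";

// Simplified ISO-8601 check (assumption); real date detection is configurable.
const ISO_DATE = /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?$/;

// Guesses a field type from a parsed JSON value.
// Returns undefined when no field would be added (null values).
function detectType(value: unknown): EsFieldType | undefined {
  if (value === null || value === undefined) return undefined;
  if (Array.isArray(value)) {
    // Arrays take the type of their first non-null element.
    const first = value.find((v) => v !== null && v !== undefined);
    return detectType(first);
  }
  if (typeof value === "boolean") return "boolean";
  // Note: once JSON is parsed in JavaScript, 1.0 and 1 are the same number.
  if (typeof value === "number") return Number.isInteger(value) ? "long" : "float";
  if (typeof value === "string") return ISO_DATE.test(value) ? "date" : "text";
  return "object";
}

// Builds a field-name -> type map for one document.
function inferMapping(doc: Record<string, unknown>): Record<string, EsFieldType> {
  const mapping: Record<string, EsFieldType> = {};
  for (const field of Object.keys(doc)) {
    const type = detectType(doc[field]);
    if (type !== undefined) mapping[field] = type;
  }
  return mapping;
}

const mapping = inferMapping({
  title: "Hello",
  views: 42,
  rating: 4.5,
  published: true,
  createdAt: "2020-09-24",
});
console.log(mapping);
```

Dynamic mapping is convenient for exploration, but an explicit mapping is usually the safer choice once you know how each field should be searched.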

Node

A node is a single running instance (server) of Elasticsearch that belongs to a cluster.

Cluster

A cluster is a collection of nodes. A cluster provides collective indexing and search capabilities across all of its nodes for the entire data set.

Index

An index is a collection of documents and their properties, for example the set of documents that holds the data of a social networking application. An index is divided into shards to improve performance.

Document

A document is a collection of fields defined in JSON format. Every document resides inside an index (and, in older Elasticsearch versions, also belongs to a type) and is associated with a unique identifier called the UID.

Shard

Indexes are horizontally subdivided into shards. Each shard has all the properties of the index's documents but holds fewer JSON objects than the whole index. This horizontal separation makes a shard an independent unit that can be stored on any node. A primary shard is an original horizontal part of an index, and primary shards are then replicated into replica shards.
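One way to picture horizontal subdivision is a routing function that maps each document ID to a shard number. The sketch below uses the simplified form hash(routing) % number_of_primary_shards. The hash here is a basic string hash chosen for illustration (Elasticsearch itself uses Murmur3), and the function names are hypothetical.

```typescript
// Simple deterministic 32-bit string hash (assumption; not Murmur3).
function hashString(s: string): number {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (Math.imul(h, 31) + s.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return h;
}

// Picks the primary shard for a routing value (by default, the document ID).
function shardFor(routing: string, numPrimaryShards: number): number {
  return hashString(routing) % numPrimaryShards;
}

// Groups document IDs by the shard they are routed to.
function distribute(ids: string[], numPrimaryShards: number): string[][] {
  const shards: string[][] = [];
  for (let i = 0; i < numPrimaryShards; i++) shards.push([]);
  for (const id of ids) {
    shards[shardFor(id, numPrimaryShards)].push(id);
  }
  return shards;
}

const shardContents = distribute(
  ["user-1", "user-2", "user-3", "user-4", "user-5", "user-6"],
  3
);
console.log(shardContents);
```

Because the shard number depends on the shard count, the same ID would land somewhere else if the count changed, which is one reason the number of primary shards is fixed when an index is created.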

Replicas

Elasticsearch allows a user to create replicas of their indexes and shards. Replication not only increases the availability of data in case of failure, but also improves search performance by running search operations on the replicas in parallel.

RDBMS and Elasticsearch

Elasticsearch    RDBMS
Cluster          Database
Shard            Shard
Index            Table
Field            Column
Document         Row

Advantages

  • Elasticsearch is developed in Java, which makes it compatible with almost every platform.
  • Elasticsearch is near real time: an added document becomes searchable after about one second.
  • Elasticsearch is distributed, which makes it easy to scale and integrate in any big organization.
  • Creating full backups is easy using the gateway concept, which is present in Elasticsearch.
  • Handling multi-tenancy is very easy in Elasticsearch when compared to Apache Solr.
  • Elasticsearch uses JSON objects as responses, which makes it possible to call the Elasticsearch server from a large number of different programming languages.
  • Elasticsearch supports almost every document type except those that do not support text rendering.

Disadvantages

  • Elasticsearch can occasionally run into split-brain situations.
September 24, 2020

Spring Boot Lombok

Project Lombok is a Java library that automatically plugs into your editor and build tools, spicing up your Java.
Never write another getter or equals method again: with one annotation your class has a fully featured builder, you can automate your logging variables, and much more.

Java can be verbose for routine things such as getter and setter methods. This code often brings no real value to the business side of your application. This is what Lombok is for: it generates the boilerplate so you can focus on business logic. It works by plugging into your build process and generating Java bytecode into your .class files according to the Lombok annotations you add to your code.

Install Lombok on your computer

  1. Add dependency 
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
  2. Find lombok.jar in your project's Maven directory -> Right click -> Run As -> Java Application
  3. Click the Specify Location button to choose the path where STS is installed
  4. Go to Application/Contents/Eclipse/SpringToolSuite4.ini, then click Install -> Quit Installer
  5. Restart STS and you are good to go

Use Lombok in your project

import java.io.Serializable;
import java.util.Date;

import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonInclude.Include;

import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
import lombok.ToString;

@Setter
@Getter
@ToString
@AllArgsConstructor
@NoArgsConstructor
@JsonInclude(value = Include.NON_NULL)
public class User implements Serializable {

    private static final long serialVersionUID       = 1L;

    private Long              id;

    private String            firstName;

    private String            lastName;

    private String            email;

    private String            password;

    private String            phoneNumber;

    private Date              dateOfBirth;

    private String            aboutMe;

    private String            profileImageUrl;

    private String            coverImageUrl;

    private Date              passwordExpirationDate;

    private Integer           invalidPasswordCounter = 0;


}

 

 

September 19, 2020

Spring Data Projection

Spring Data query methods usually return one or multiple instances of the aggregate root managed by the repository. However, it might sometimes be desirable to create projections based on certain attributes of those types. Spring Data allows modeling dedicated return types, to more selectively retrieve partial views of the managed aggregates.

A projection retrieves only certain fields from the database instead of the whole entity, which improves performance because less data is read and transferred.

There are three types of projection: scalar, DTO, and entity.

Scalar Projection

Scalar projection allows you to select columns you need.

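import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.stereotype.Repository;

@Repository
public interface UserRepository extends JpaRepository<User, Long> {
    // Each Object[] holds the selected columns in order: [name, email]
    @Query("SELECT u.name, u.email FROM User u")
    List<Object[]> getNameAndEmail();
}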

DTO Projection

DTO projection uses a constructor which Hibernate uses to populate data from the database.

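import java.util.List;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.CrudRepository;
import org.springframework.data.repository.query.Param;
import org.springframework.stereotype.Repository;

@Repository
public interface UserRepository extends CrudRepository<User, Long> {

    @Query("SELECT new com.kaveinga.user.dto.UserDetailDTO(u.firstName, u.lastName) FROM User u WHERE u.firstName = :firstName")
    List<UserDetailDTO> findByFirstName(@Param("firstName") String firstName);
}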

JPA DTO

As long as the DTO class has only one constructor and its parameter names match your entity class’s attribute names, Spring generates a query with the required constructor expression.

public interface UserRepository extends JpaRepository<User, Long> {
    UserDTO findByEmail(String email);
}

public class UserDTO {

    private Long       id;

    private String     uid;

    private String     name;

    private String     email;

    private int        age;

    private Address    address;

    // Single constructor; parameter names match the entity's attribute names
    public UserDTO(Long id, String uid, String name, String email, int age, Address address) {
        this.id = id;
        this.uid = uid;
        this.name = name;
        this.email = email;
        this.age = age;
        this.address = address;
    }

    // getters and setters
}

JPA DTO as interface

Here is another way you can retrieve data from the database. Instead of having a DTO class you can use an interface. The interface only declares getter methods for the attributes you want to expose, and Spring Data creates the implementation at runtime.

public interface UserRepository extends JpaRepository<User, Long> {

    UserView findByUid(String uid);

}

public interface UserView {

    Long getId();

    String getName();

    String getEmail();

    int getAge();

    String getUid();

    // nested projection; AddressView is another projection interface (not shown)
    AddressView getAddress();

}

 

September 15, 2020

Elasticsearch Snapshot

Snapshot or Backup

A snapshot is a backup taken from a running Elasticsearch cluster. We can take a snapshot of individual indices or of the entire cluster. Snapshots are incremental, which means each snapshot of an index only stores data that is not part of an earlier snapshot.
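As a rough sketch, a snapshot is requested with a PUT to /_snapshot/<repository>/<snapshot>, optionally with a body that limits it to certain indices. The helper below only builds a description of that request so it can be handed to whatever HTTP client you use. The helper name, the repository name my_backup, and the snapshot name snapshot_1 are hypothetical, and the repository is assumed to be registered already.

```typescript
interface SnapshotRequest {
  method: "PUT";
  path: string;
  body: { indices?: string };
}

// Builds the request for creating a snapshot.
// With no indices given, the body is empty and all indices are included.
function buildSnapshotRequest(
  repository: string,
  snapshot: string,
  indices: string[] = []
): SnapshotRequest {
  const path =
    `/_snapshot/${encodeURIComponent(repository)}/${encodeURIComponent(snapshot)}` +
    "?wait_for_completion=true"; // block until the snapshot has finished
  const body: { indices?: string } = {};
  if (indices.length > 0) {
    body.indices = indices.join(","); // comma-separated list of index names
  }
  return { method: "PUT", path, body };
}

// Hypothetical names, for illustration only.
const wholeCluster = buildSnapshotRequest("my_backup", "snapshot_1");
const twoIndices = buildSnapshotRequest("my_backup", "snapshot_2", ["users", "orders"]);
console.log(wholeCluster.path, twoIndices.body);
```

Keeping request construction separate from sending makes the helper easy to test without a running cluster.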

I have found this npm library that does what I want here.

 

 

September 11, 2020

React Update State

How to update state of an object

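We have a user object in state and want to change its name. Spread the current object into a new one and override the field: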

setUser({
  ...user,
  name: "Peter"
});

 

How to update state of list/array of objects

We have a shopping cart which contains a list of menu items

const [shopCart, setShopCart] = useState([{
  name: "",
  price: 0,
  uuid: ""
}]);


Add an item to the shopping cart

const addMenuItemToCart = (menuItem: any) => {
  console.log("addMenuItemToCart, ", menuItem)

  // Return a new array instead of mutating the previous state with push()
  setShopCart(prevCart => [...prevCart, menuItem]);

}
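The same idea, never mutating the array held in state, extends to updating and removing items. Here is a small sketch as plain functions that can be tested without React. The MenuItem shape follows the cart items above; the helper names are hypothetical.

```typescript
interface MenuItem {
  name: string;
  price: number;
  uuid: string;
}

// Each helper returns a new array and leaves its input untouched.
const addItem = (cart: MenuItem[], item: MenuItem): MenuItem[] => [...cart, item];

// Replaces only the matching item with an updated copy.
const updatePrice = (cart: MenuItem[], uuid: string, price: number): MenuItem[] =>
  cart.map((item) => (item.uuid === uuid ? { ...item, price } : item));

// Keeps every item except the one with the given uuid.
const removeItem = (cart: MenuItem[], uuid: string): MenuItem[] =>
  cart.filter((item) => item.uuid !== uuid);

const emptyCart: MenuItem[] = [];
const withTea = addItem(emptyCart, { name: "Tea", price: 2, uuid: "a" });
const withCake = addItem(withTea, { name: "Cake", price: 5, uuid: "b" });
const repriced = updatePrice(withCake, "b", 6);
const withoutTea = removeItem(repriced, "a");
console.log(withoutTea);
```

Inside a component these would be used with the updater form, for example setShopCart(prev => removeItem(prev, uuid)), so React always receives a new array reference.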

 

 

 

September 5, 2020