Over the last couple years I have built a few clusters and have made some observations around how to design and plan when building a new cluster. Some of the considerations described here would also apply to other systems that have a similar approach to scaling and redundancy. Also the designs discussed in this article should work on any version of elasticsearch and the examples are built using Terraform.

Single Node Elasticsearch Cluster

The first solution that almost everybody will build with elasticsearch would be the Single Node Elasticsearch Cluster. Obviously this type of cluster is potentially not ideal in a production environment and would not be able to manage large amounts of data either. It does provide a simple solution for testing and validating scenarios.

Using a single node cluster might make sense if we have all of the following conditions

Data is not critical or sensitive
Data source is not elasticsearch
Recreating Indices is cheap and fast
Amount of data is very small

For most elasticsearch users it is unlikely that this solution would be useful or safe to implement. In previous versions it was possible to have an embedded elasticsearch engine built into your application, but this approach is no longer supported. If you use containers to run your applications, you could build the elasticsearch node as a sidecar in order to provide a similar effect as embedded engine.

provider "docker" {}

resource "docker_container" "snec" {
  image = docker_image.elasticsearch.latest
  name  = "snec"
  env = [
    "discovery.type=single-node"
  ]
  ports {
    internal = 9300
    external = 9300
  }
  ports {
    internal = 9200
    external = 9200
  }
}

resource "docker_image" "elasticsearch" {
  name = "docker.elastic.co/elasticsearch/elasticsearch:7.10.0"
}

Eventually you might need to expand your cluster, or have redundancy concerns. The natural step is to expand your elasticsearch cluster from a single node to a multi node cluster.

One Big Elasticsearch Cluster

The next step from a single node cluster is a multi node cluster. It is really natural to expand the single node cluster to a multi node cluster and it does not introduce any new overhead or changes in how the cluster is being used. The basic idea is to expand the cluster in order to introduce redundancy and improved performance. This approach is the most common solution that most elasticsearch users will encounter when using elasticsearch.

One Big Elasticsearch Cluster

Unlike the single cluster, it is not advisable to sidecar a multi node cluster with each application. It makes much more sense to centralize the cluster instead, but this does come with some drawbacks. The first problem is that centralized data requires a lot more configuration and management in order to ensure that the data is searchable. The security model requires extra work and management as well. Introducing new data types is also really difficult since it can get very difficult to manage, otherwise it requires lengthily and complex data massaging or multiple indices with different types.

While it is easy to scale the cluster horizontally, it does introduce new problems that need to be managed. Scaling a cluster can also be a band aid to solve cluster issues.

provider "docker" {}

variable "number_of_nodes" { default = 3 }
variable "ports" {
  default = [{
    internal = 9200,
    external = 9200
  }]
  type = list(object({
    internal = number
    external = number
  }))
}

resource "docker_network" "elastic" {
  name = "elastic"
}

resource "docker_image" "elasticsearch" {
  name = "docker.elastic.co/elasticsearch/elasticsearch:7.10.0"
}

resource "docker_volume" "elastic" {
  count = var.number_of_nodes
  name = "elastic-${count.index + 1}"
}

resource "docker_container" "node" {
  count = var.number_of_nodes
  image = docker_image.elasticsearch.latest
  name  = "node-${count.index + 1}"
  env = [
    "node.name=node-${count.index + 1}",
    "cluster.name=es-docker-cluster",
    "discovery.seed_hosts=${count.index == 0 ? "node-2,node-3" : ""}${count.index == 1 ? "node-1,node-3" : ""}${count.index == 2 ? "node-1,node-2" : ""}",
    "cluster.initial_master_nodes=node-1,node-2,node-3",
    "bootstrap.memory_lock=true",
    "ES_JAVA_OPTS=-Xms512m -Xmx512m"
  ]
  ulimit {
    name = "memlock"
    soft = -1
    hard = -1
  }
  dynamic "ports" {
    for_each = count.index == 0 ? var.ports : []
    content {
      internal = ports.value.internal
      external = ports.value.external
    }
  }  
  networks_advanced {
    name = docker_network.elastic.name
  }
  volumes {
    volume_name = docker_volume.elastic[count.index].name
    container_path = "/usr/share/elasticsearch/data"
  }
}

I like to think of the multi node elasticsearch cluster as the One Big Elasticsearch Cluster, since over time as the cluster grows it starts getting more and more complex to manage and maintain. Unfortunately I think that a lot of people end up getting stuck in the One Big Cluster trap, which prevents them from easily making changes to the cluster without expensive and time consuming reindexing. Sometimes it is not even possible to reindex the data depending on the ingesting rate. Splitting a cluster like this over multiple regions is not recommended or even supported by elasticsearch.

The issues with One Big Cluster get more severe as more diverse data is ingested into the cluster. It can be pretty hard to determine the best approach to storing long term unstructured documents to metric type events. I find that it becomes very difficult to manage the One Big Cluster with different data types and different tenants.

The final type of cluster I want to detail is the Multiple Elasticsearch Clusters.

Multiple Elasticsearch Clusters

The final solution is to split elasticsearch clusters into smaller clusters focussed to each search solutions. The idea is to run multiple clusters and use the cross cluster solution to expose the clusters much like the One Big Cluster. The big difference between One Big Cluster and Multiple Clusters indicated below.

Multiple Elasticsearch Cluster

The big advantage to the cross cluster search approach is that each cluster is independent from each other. Since each cluster is independent, it allows for a complete cluster to fail, but still allow searches to function across the healthy clusters. The biggest drawback is that cross cluster solutions increase the complexity of the infrastructure and potentially making management much more complex. Since each cluster is independent from each other, changes need to be applied to each cluster. Security configuration and management will also be decentralized.

I feel that most people will build towards the Cross Search decentralized elasticsearch clusters, since it really provides the best of both worlds. Once you decentralize the elasticsearch solution you will remove a lot of the complexity you might experience when looking at different data types.