Service Mesh Uncharted: HashiCorp Consul

Rahul Kumar Singh
7 min read · Mar 16, 2024

This is the first installment of my new series, “Service Mesh Uncharted.” Throughout this series, I (with or without collaborators) will examine various service mesh solutions available in the market and publish tutorials for them. So buckle up, service mesh enthusiasts!

Several service mesh tools are available in the market, and this series will cover a few of the good ones.

This blog is a collaboration between Rahul Kumar Singh and Rohan Singh. Rohan is a Senior Cloud Infrastructure Engineer at SADA who brings wonderful experience in Google Cloud, DevOps, infrastructure, and automation. He is a Google Cloud Champion Innovator for Modern Architecture and actively contributes to the Google Cloud community. We plan to write more blogs together in the future.

The term “service mesh” is scary for some folks and super fun for others. So what is a service mesh in general? Let’s try to understand it before we move on to the main topic of this blog.

Let’s understand service mesh as a technology through a simple apartment analogy. Assume you live in a large apartment building with many residents (microservices) and a busy lobby (service mesh). Each resident has their own apartment (container) and needs to interact with others for various tasks (API calls). To maintain order and ensure everyone gets what they need, you have a building manager (Consul/Istio/Google Cloud Traffic Director/Kong) who handles all the communication and ensures smooth movement (data flow) among them. A service mesh brings the following benefits to your environment:

  • Routes traffic
  • Provides security
  • Monitors performance
  • Simplifies development

In mid-2023 we got an opportunity to explore and implement HashiCorp Consul for one of our customers who heavily relied on Consul’s service discovery feature.

Let’s look at the use case and what we did.

The customer wanted to manage traffic between the various services they run on both GCE and GKE, and also wanted a centralized control plane for managing the network policies that restrict traffic to their sensitive applications.

HashiCorp Consul

To put the aforementioned analogy in technical terms, HashiCorp Consul is a service manager for all of your microservices or machines: it handles the communication between them and provides a single pane of glass to manage them all.

Let’s start by creating an ingress firewall rule that targets the network tag consul and opens the required ports:

gcloud compute --project={GCP_PROJECT_ID} firewall-rules create allow-internal-to-consul \
  --direction=INGRESS \
  --priority=1000 \
  --network={GCP_VPC_NAME} \
  --action=ALLOW \
  --rules=tcp:8300,tcp:8301,tcp:8302,tcp:8500,tcp:8600,tcp:8502 \
  --source-ranges={CIDR.Ranges.to.access.Consul,gke_subnet_range_for_client_connection} \
  --target-tags=consul

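To understand what each port is for, please refer to Consul Servers Ports. In short: 8300 is server RPC, 8301 and 8302 are LAN and WAN gossip, 8500 is the HTTP API and UI, 8502 is gRPC, and 8600 is DNS. Gossip and DNS also use UDP, so consider adding udp:8301, udp:8302, and udp:8600 to the rule.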

To ensure high availability, we will create a three-node Consul cluster on Google Compute Engine (GCE) VMs. This cluster will have one leader node and two follower nodes.

Note: Make sure the node name in the config file is different on each GCE VM instance; otherwise the cluster won’t sync.

Create three instances with the configuration you need, and use the startup script below on all three:

wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install -y consul

Once the installation is complete, the next step is to modify the Consul config file, consul.hcl, located in /etc/consul.d.

Here is the gist link to the configuration file that we have used.
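The gist is not reproduced in this text, so the following is only a minimal sketch of what a server-side consul.hcl could look like, not the exact file we used. The datacenter name, data directory, node name, and tag-based retry_join value are all assumptions to adjust for your environment. The script writes the file to the current directory so you can review it before copying it to /etc/consul.d.

```shell
#!/usr/bin/env bash
# Sketch only: generates an example Consul server config. All values are assumptions.
set -euo pipefail

# Must be unique per VM, e.g. consul-server-1, consul-server-2, consul-server-3.
CONSUL_NODE_NAME="${CONSUL_NODE_NAME:-consul-server-1}"

cat > consul.hcl <<EOF
datacenter = "dc1"
data_dir = "/opt/consul"
node_name = "${CONSUL_NODE_NAME}"

# Run as a server and wait for three servers before electing a leader.
server = true
bootstrap_expect = 3

# Serve the API on all interfaces; advertise the private address to peers.
# The template assumes the VM has a single private IP.
client_addr = "0.0.0.0"
bind_addr = "{{ GetPrivateIP }}"

# Cloud auto-join: find peers by the network tag used in the firewall rule.
retry_join = ["provider=gce tag_value=consul"]

ui_config {
  enabled = true
}

ports {
  grpc = 8502
}
EOF

echo "Wrote consul.hcl for ${CONSUL_NODE_NAME}"
```

Run it once per VM with a different CONSUL_NODE_NAME, copy the result to /etc/consul.d/consul.hcl, and restart the consul service. Tag-based auto-join needs the VM’s service account to be able to read Compute Engine instances; if that is not an option, a list of the servers’ private IPs in retry_join works as well.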

Consul Cluster Overview Image

Consul serves its HTTP API and UI on port 8500, and gRPC on port 8502.

Let’s explore how to achieve granular control over service discovery for your microservices running in a GKE cluster using Consul. This approach allows you to determine which services are visible to others within the network. We’ll configure a Consul client within your GKE cluster. This client acts as an agent, registering your microservices with the Consul servers running on separate Google Compute Engine (GCE) VMs.

To create the GKE cluster, we’ll use the open-source Terraform module provided by GCP. This module offers a convenient way to set up the cluster, and since we’re focusing on a private cluster, it also lets us set the stub_domains variable.

What is a stub domain? A stub domain is a DNS configuration in which a resolver forwards queries for a specific domain to another DNS server instead of resolving them itself. In other words, it bridges queries between two different DNS systems. Stub domains are common in networking setups where different DNS servers handle different parts of the DNS namespace, which allows for more efficient and flexible resolution. To ensure smooth Consul service discovery, we will define a stub domain in our Terraform code so that lookups for Consul’s domain are forwarded from the cluster DNS to Consul DNS.

Alternatively, you can create the GKE cluster using other methods and then manually configure the stub domain after cluster creation.
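For the manual route, here is a rough sketch that assumes the cluster uses kube-dns (not Cloud DNS) and that Consul’s domain is the default, consul. The resolver IP is an assumption: the default below is the ClusterIP of the blog-consul-dns service from our own cluster, which only exists after the Helm install, so substitute your own value.

```shell
#!/usr/bin/env bash
# Sketch only: writes a kube-dns ConfigMap that forwards *.consul lookups to Consul DNS.
set -euo pipefail

# ClusterIP of the Consul DNS service. One way to look it up:
#   kubectl -n consul get svc blog-consul-dns -o jsonpath='{.spec.clusterIP}'
CONSUL_DNS_IP="${CONSUL_DNS_IP:-10.3.0.224}"

cat > kube-dns-stub-domain.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: |
    {"consul": ["${CONSUL_DNS_IP}"]}
EOF

echo "Review kube-dns-stub-domain.yaml, then run: kubectl apply -f kube-dns-stub-domain.yaml"
```

Applying this replaces the data in the existing kube-dns ConfigMap, so merge in any stub domains or upstream nameservers you already have before applying it.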

Run terraform init && terraform apply to create a Private GKE Cluster with an external endpoint.

Note: If GKE cluster node pool creation fails because of a permission issue, grant the permission named in the error to the Google APIs service agent, PROJECT_NUMBER@cloudservices.gserviceaccount.com.

Follow this link for an explanation: https://mouliveera.medium.com/permissions-error-required-compute-instancegroups-update-permission-for-project-8a7f759c30c2

Once the GKE cluster is created, the Consul client must be installed so that it can connect to the Consul servers. We use Helm for this installation.

Below is the consul.yaml used by Helm to install and configure the consul client.

Check all Consul stanzas for helm.
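The original consul.yaml is not reproduced in this text, so the following is a hypothetical values file for a client-only install that joins external servers, not the file we actually used. The datacenter name and server IPs are placeholders, and the exact set of supported keys depends on your chart version, so check it against the chart’s stanza reference.

```shell
#!/usr/bin/env bash
# Sketch only: writes an example Helm values file. IPs and names are placeholders.
set -euo pipefail

cat > consul.yaml <<'EOF'
global:
  # Do not deploy Consul servers in GKE; they already run on the GCE VMs.
  enabled: false
  datacenter: dc1

client:
  enabled: true
  # Private IPs of the three Consul servers (placeholders).
  join:
    - "10.0.0.11"
    - "10.0.0.12"
    - "10.0.0.13"
  # Expose gossip ports on the node so servers outside the cluster can reach the clients.
  exposeGossipPorts: true

dns:
  enabled: true

syncCatalog:
  enabled: true
  toConsul: true
  toK8S: false
  # Sync only services that carry the service-sync annotation.
  default: false
EOF

echo "Wrote consul.yaml"
```

Setting syncCatalog.default to false is what gives the granular control described earlier: nothing is registered in Consul unless a Service explicitly opts in.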

Before installing the Consul client, run the following to give the service account the permissions it needs to install Helm charts:

kubectl create clusterrolebinding admin --clusterrole=cluster-admin --serviceaccount=kube-system:default

Run the following to deploy (Helm 3 syntax, where the release name is a positional argument):

kubectl create ns consul
helm repo add hashicorp https://helm.releases.hashicorp.com
helm install consul-blog hashicorp/consul -f consul.yaml -n consul

Run the following to see the Consul client deployment:

kubectl -n consul get all

Output —

NAME                                            READY   STATUS    RESTARTS        AGE
pod/blog-consul-client-hc49s                    1/1     Running   0               7h35m
pod/blog-consul-sync-catalog-7b949cdd9c-5kd6x   1/1     Running   4 (7h32m ago)   7h35m

NAME                      TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
service/blog-consul-dns   ClusterIP   10.3.0.224   <none>        53/TCP,53/UDP   7h35m
service/nginx-service     ClusterIP   10.3.0.216   <none>        80/TCP          7h25m

NAME                                DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/blog-consul-client   1         1         1       1            1           <none>          7h35m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/blog-consul-sync-catalog   1/1     1            1           7h35m
deployment.apps/nginx                      2/2     2            2           7h25m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/blog-consul-sync-catalog-7b949cdd9c   1         1         1       7h35m
replicaset.apps/nginx-57d84f57dc                      2         2         2       7h25m

The Consul client is visible in Consul UI in the Nodes section after the successful deployment.

Consul Node Visibility in Consul UI

Let’s deploy two identical nginx applications, one in the same namespace as Consul and another in a different namespace, using the nginx.yaml file below.

Make sure to add the two annotations below to the Service object of each application that Consul needs to discover.

# give the service a custom name in Consul
consul.hashicorp.com/service-name: nginx-service

# explicitly mark this Kubernetes service to be synced to Consul
consul.hashicorp.com/service-sync: 'true'

Check all Consul Annotations and Labels.
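To show where the annotations sit, here is a minimal sketch of a Service manifest. The service name and port match the kubectl output earlier in this post, but the app: nginx selector is an assumption, and this is not the original nginx.yaml.

```shell
#!/usr/bin/env bash
# Sketch only: writes an example Service manifest carrying both Consul annotations.
set -euo pipefail

cat > nginx-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  annotations:
    consul.hashicorp.com/service-name: nginx-service
    consul.hashicorp.com/service-sync: "true"
spec:
  type: ClusterIP
  selector:
    app: nginx   # assumed label on the nginx pods
  ports:
    - port: 80
      targetPort: 80
EOF

echo "Wrote nginx-service.yaml"
```

You can validate the file without touching the cluster using kubectl apply --dry-run=client -f nginx-service.yaml, then apply it in each namespace with -n.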

Consul Service Sync

Exec into one of the application containers and use dig to resolve the other application by its Consul service name.
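As a rough sketch of that lookup: Consul serves registered services under names of the form <service>.service.<domain>, where the domain defaults to consul. The helper names below are made up for illustration, and the second one assumes dig is installed in the container image.

```shell
# Build the DNS name Consul serves for a registered service.
consul_dns_name() {
  local service="$1" domain="${2:-consul}"
  printf '%s.service.%s\n' "$service" "$domain"
}

# Resolve a Consul service from inside a pod (needs kubectl access and dig in the image).
dig_from_pod() {
  local namespace="$1" pod="$2" service="$3"
  kubectl -n "$namespace" exec "$pod" -- dig +short "$(consul_dns_name "$service")"
}

consul_dns_name nginx-service   # prints nginx-service.service.consul
```

With a real pod name in hand, dig_from_pod consul <nginx-pod-name> nginx-service should return the IPs registered for the service if the stub domain is working.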

Exec and Dig

You can check the Services section of the Consul UI to see the registered service.

Consul Registered Service

Consul KV

Imagine you have an application running in Google Kubernetes Engine (GKE) that needs to retrieve secrets securely from Vault. To ensure the Vault endpoint address is dynamically accessible at runtime, we can leverage Consul’s Key/Value store (Consul KV).

Consul KV functions as a centralized warehouse for configuration parameters and metadata. In this scenario, we’ll store the Vault endpoint address within Consul KV. This allows microservices running within your GKE cluster to retrieve it using the Consul HTTP API (similar to a curl request).

To make this process smoother, you can use an init container that sends an HTTP API request to Consul KV and fetches the Vault endpoint address. Because an init container cannot set environment variables in another container directly, it typically writes the address to a shared volume, from which the main application container loads it into its environment at startup.
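Here is a minimal sketch of a script such an init container could run. The KV key, the output path, and the HOST_IP variable (assumed to be populated from the pod’s status.hostIP) are all hypothetical; only the /v1/kv/<key>?raw endpoint comes from the Consul HTTP API.

```shell
#!/bin/sh
# Sketch only: fetch a value from Consul KV and write it to a file on a shared volume.

# Address of the node-local Consul client; HOST_IP is an assumed env var.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://${HOST_IP:-127.0.0.1}:8500}"
KV_KEY="${KV_KEY:-config/vault/address}"    # hypothetical key
OUT_FILE="${OUT_FILE:-/shared/vault.env}"   # hypothetical shared-volume path

# ?raw returns the stored value itself instead of base64-encoded JSON.
kv_url() {
  printf '%s/v1/kv/%s?raw' "$CONSUL_HTTP_ADDR" "$1"
}

fetch_kv() {
  curl --fail --silent --show-error "$(kv_url "$1")"
}

write_env_file() {
  printf 'VAULT_ADDR=%s\n' "$1" > "$OUT_FILE"
}

# Entry point for the init container: stop if the lookup fails.
main() {
  addr="$(fetch_kv "$KV_KEY")" || return 1
  write_env_file "$addr"
}
```

The init container would end by calling main, and the application’s entrypoint would source the file before starting. Once ACLs are enabled, the request also needs a token, passed in an X-Consul-Token header.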

Enhancing Consul Security

Now that installation, registration, and key/value access are covered, we need to make sure all of these steps are performed securely, and for that we need access control. Consul provides a built-in access control (ACL) system based on the following:

  • Tokens
  • Policies
  • Roles

Let's peek into each of them:

Tokens

Tokens in Consul are used for authentication and authorization. They grant permissions to perform certain actions within the Consul cluster. Configuration involves generating and managing tokens using the Consul CLI or API.

Policies

Policies in Consul define the permissions associated with tokens. They specify what actions a token can perform within the Consul cluster. Configuration involves creating policies with specific rules and associating them with tokens.

Roles

Roles in Consul group policies together under a single name. A token linked to a role inherits every policy attached to that role, which simplifies permission management when many tokens need the same access. Configuration involves defining roles, associating policies with them, and linking tokens to the roles.
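To see how the three fit together, here is a small sketch using the Consul CLI. The policy, role, and key prefix names are invented for illustration, and it assumes ACLs are already enabled and bootstrapped on the cluster. The script only writes the policy file; the CLI calls are wrapped in a function so nothing touches a cluster until you call it.

```shell
#!/usr/bin/env bash
# Sketch only: policy -> role -> token, with hypothetical names.
set -euo pipefail

# Policy rules: read-only access to one KV prefix.
cat > kv-vault-read.hcl <<'EOF'
key_prefix "config/vault/" {
  policy = "read"
}
EOF

create_acl_objects() {
  # Needs CONSUL_HTTP_ADDR and a CONSUL_HTTP_TOKEN that is allowed to write ACLs.
  consul acl policy create -name "kv-vault-read" -rules @kv-vault-read.hcl
  consul acl role create -name "gke-apps" -policy-name "kv-vault-read"
  consul acl token create -description "GKE init containers" -role-name "gke-apps"
}

echo "Wrote kv-vault-read.hcl; call create_acl_objects against an ACL-enabled cluster."
```

The token printed by the last command is what a workload would present to Consul, and swapping the policies behind the role changes its access without reissuing the token.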

Rule, Policy, Token; Image Credit — HashiCorp

Understand Access Control Privileges

With this, we are done with HashiCorp Consul service discovery, a building block of its service mesh. Follow me to learn more about service mesh solutions in future posts in the Service Mesh Uncharted series.

Rahul Kumar Singh

Staff @ SADA | Building Secure and Reliable solution for the world | Football Freak