Dev Genius

Coding, Tutorials, News, UX, UI and much more related to development

Observability-Landscape-as-Code in Practice

11 min read · Nov 16, 2022

Laser lights. Taken at Casa Loma’s Legends of Horror. Photo by Adri Villela.

with Ana Margarita Medina

If you follow Adriana’s writings on Observability, you may recall a post from back in June introducing the concept of Observability-Landscape-as-Code (OLaC).

An Observability Landscape is made up of the following pieces:

  • Application instrumentation
  • Collecting and storing application telemetry
  • An Observability back-end
  • A set of meaningful SLOs
  • Alerts for on-call Engineers

Keeping that in mind, OLaC is simply the codification of your Observability Landscape, thereby ensuring consistency, maintainability, and reproducibility.

That’s all well and good, but how about seeing this thing in action? Well, my friend, you’ve come to the right place, because today, you get to see a tutorial featuring a number of OLaC practices in action!

  1. Collecting & storing application telemetry
     How: The OpenTelemetry Collector is deployed via code (Helm chart), alongside the various services that make up the OpenTelemetry Demo App.
  2. Codifying your Observability back-end configuration
     How: Using the Lightstep Terraform Provider to create dashboards in Lightstep.

We wanted to showcase OLaC principles with a real-life example using modern cloud-native tooling, which means running our cloud infrastructure on Google Cloud’s Kubernetes offering. Now, since we are good practitioners of OLaC and SRE, we won’t just be setting things up through the clickity click of a UI. No sirreee. Instead, we’ll be #automatingAllTheThings using HashiCorp Terraform. Terraform allows us to do infrastructure-as-code (IaC), and gives us tons of added benefits like better control over our resources and standardization. These are key principles in OLaC and IaC.

We will be deploying the OpenTelemetry Demo App to our cluster. The Demo App has been instrumented using OpenTelemetry, and will send Traces and Metrics through the OpenTelemetry Collector to Lightstep.

Are you ready??? Let’s get started!

Tutorial

Pre-Requisites

Before you begin, you will need the following:

Steps

1- Clone the example repo

Let’s start by cloning the example repo:

git clone https://github.com/lightstep/unified-observability-k8s-kubecon.git

2- Initialize Sub-Modules

This project makes use of a few [Git submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules), so in order to ensure that things work nicely, you’ll need to pull them in:

cd unified-observability-k8s-kubecon
git submodule init && git submodule update
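If you want to double-check that the submodules actually got pulled in, here is a minimal sketch. It relies on the fact that `git submodule status` prefixes uninitialized submodules with a `-`. The helper name `uninitialized_submodules` is ours (hypothetical), and it assumes submodule paths without spaces.

```shell
# Print the paths of submodules that are not yet initialized.
# Reads `git submodule status` output on stdin; uninitialized entries
# start with "-" immediately followed by the commit SHA, then the path.
uninitialized_submodules() {
  awk '/^-/ { print $2 }'
}

# Usage, from the repo root (prints nothing when everything is initialized):
#   git submodule status | uninitialized_submodules
```

As a side note, `git submodule update --init` does the init and the update in a single command.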

3- Google Cloud Login

Before you can create a GKE cluster, you must authenticate with your Google Cloud account:

gcloud auth application-default login --no-launch-browser

You will be presented with a link which you need to open up in a browser, to authenticate your Google ID. Once you are authenticated, the browser will display an authorization token for you to paste in the command line, as follows:

Results of gcloud auth command

4- Create terraform.tfvars

Now that you’re authenticated, let’s get ready to Terraform! Before you can do that, we need to create a terraform.tfvars file.

Lucky for you, we have a handy-dandy template that you can use to get started:

cd k8s-cluster-with-otel-demo/terraform
cp terraform.tfvars.template terraform.tfvars

Next populate the following values in the file:

  • <your_gcp_project>: The name of your Google Cloud project. Don't know your project name? No problem! Just run gcloud config get-value project to find out what it is!
  • <your_gke_cluster_name>: The name you wish to give your GKE cluster. Make sure it follows Kubernetes cluster naming conventions (i.e. no underscores _ or special characters).
  • <your_lightstep_access_token>: Your Lightstep Access Token. This is used to send Traces to your Lightstep Project.
  • <your_lightstep_api_key>: Your Lightstep API key. This is used to create our Metrics dashboards.
  • <your_lightstep_org_name>: Your Lightstep organization name. Not sure what your organization is called? No problem! Log into Lightstep, and click on the person icon on the bottom left of your screen. This will pop up a little menu. The organization name can be found under the "Account Management" heading, like this:
Lightstep UI showing where to get the Lightstep Org Name

Notice that my organization is called “LightStep”. Yours will be different. Note also that Organization names are case-sensitive.

Note: terraform.tfvars is in .gitignore and won't be put into version control.
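To catch a bad cluster name before Terraform does, you could run a quick check like the one below. This is a sketch under an assumption: the rule encoded here (lowercase letters, digits, and hyphens only, starting with a letter, ending with a letter or digit, at most 40 characters) is our understanding of GKE's naming rule, so verify it against the GKE docs. The function name `is_valid_cluster_name` is hypothetical.

```shell
# Return 0 if the name looks like a valid GKE cluster name, 1 otherwise.
# ASSUMPTION: lowercase letters, digits, and hyphens; must start with a
# letter, end with a letter or digit, and be at most 40 characters long.
is_valid_cluster_name() {
  name=$1
  [ "${#name}" -le 40 ] || return 1
  printf '%s\n' "$name" | grep -Eq '^[a-z]([a-z0-9-]*[a-z0-9])?$'
}

# Usage:
#   is_valid_cluster_name "otel-demo-cluster" && echo "looks good"
#   is_valid_cluster_name "otel_demo_cluster" || echo "nope: underscores"
```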

5- Run Terraform

This step will initialize Terraform (install providers locally), and then will apply the Terraform plan.

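Among other things, it will create the GKE cluster, deploy the OTel Demo App (along with the OTel Collector), and create the Metrics dashboards in Lightstep.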

Before running the commands below, make sure that you’re already in the k8s-cluster-with-otel-demo/terraform folder.
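Based on the description above, this step boils down to `terraform init` followed by `terraform apply`. Here is a minimal sketch. The `-auto-approve` flag (which skips the interactive confirmation) is our assumption for the apply, mirroring the teardown command later on, and the `run` and `provision` helpers are hypothetical names that let you preview the commands before executing them.

```shell
# Echo each command when DRY_RUN=1 (the default here); run it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Initialize Terraform (installs providers locally), then apply the plan.
# Stops at the first command that fails.
provision() {
  run terraform init &&
    run terraform apply -auto-approve
}

# Usage: call `provision` to preview the commands; set DRY_RUN=0 first
# to execute them for real.
```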

Please note that this step may take up to 30 minutes, depending on GKE’s disposition. Be patient. 😄

6- Update your kubeconfig

Now that the cluster is created, you can add it to your kubeconfig file! By default, the file is saved at $HOME/.kube/config.

Before you can update your kubeconfig, you first need to make sure that you have the gke-gcloud-auth-plugin installed:

gcloud components install gke-gcloud-auth-plugin
gke-gcloud-auth-plugin --version
echo "export USE_GKE_GCLOUD_AUTH_PLUGIN=True" >> ~/.bashrc
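One thing to keep in mind: the echo above appends a new line to ~/.bashrc every time you run it, and it only takes effect in new shells. A small sketch of an idempotent alternative follows. The helper name `append_once` is hypothetical, and if you use a shell other than bash you would point it at that shell's startup file instead.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
  file=$1
  line=$2
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage:
#   append_once "$HOME/.bashrc" 'export USE_GKE_GCLOUD_AUTH_PLUGIN=True'
#   export USE_GKE_GCLOUD_AUTH_PLUGIN=True   # also apply to the current shell
```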

Now we can add the cluster to kubeconfig:

gcloud container clusters get-credentials $(terraform output -raw kubernetes_cluster_name) --region $(terraform output -raw region)

This gets the kubernetes_cluster_name and region output values from Terraform (that's the terraform output -raw stuff), and plunks those into your gcloud container clusters get-credentials command.

Or, if you closed the terminal in which you were running Terraform and lost your output values, you can also do this:

gcloud container clusters get-credentials <cluster_name> --region <region>

Where <cluster_name> and <region> correspond to the values you entered in Step 4 in your terraform.tfvars file.
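To confirm that kubectl is now pointing at the new cluster, you can compare the current context against the name you expect. This sketch assumes the `gke_<project>_<location>_<cluster>` naming pattern that `gcloud container clusters get-credentials` uses for context names, as far as we know. The helper name `gke_context_name` is hypothetical.

```shell
# Build the kubeconfig context name expected for a GKE cluster.
# ASSUMPTION: contexts are named gke_<project>_<location>_<cluster>.
gke_context_name() {
  printf 'gke_%s_%s_%s\n' "$1" "$2" "$3"
}

# Usage:
#   expected=$(gke_context_name "$(gcloud config get-value project)" "<region>" "<cluster_name>")
#   [ "$(kubectl config current-context)" = "$expected" ] && echo "kubeconfig is set"
```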

7- Check out the OTel Demo app

If you run kubectl get ns, you'll notice that there's now a new namespace called otel-demo:

This is where we deployed the OTel Demo app. Let’s look into this namespace to see what we’ve created. First, let’s look at the pods with kubectl get pods -n otel-demo:

Notice how we deployed a bunch of different services that make up the OTel Demo App, including adservice, cartservice, recommendationservice, etc.
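With that many pods, it can be handy to list only the ones that are not healthy yet. Here is a minimal sketch that filters the output of `kubectl get pods --no-headers`, where the third column is the pod STATUS. The function name `pods_not_running` is hypothetical.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print every pod
# whose STATUS column (the third field) is neither Running nor Completed.
pods_not_running() {
  awk 'NF >= 3 && $3 != "Running" && $3 != "Completed" { print $1 " (" $3 ")" }'
}

# Usage (prints nothing when every pod is Running or Completed):
#   kubectl get pods -n otel-demo --no-headers | pods_not_running
```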

We also deployed an OTel Collector. Its configuration YAML is stored in a configmap. We can take a peek by running kubectl describe configmap otel-demo-app-otelcol -n otel-demo:

You can see that we also reference a variable called ${LS_TOKEN}, which represents your Lightstep Access Token, which you set in terraform.tfvars. But where is it? The token is mounted to the OTel Collector container instance as a secret called otel-collector-secret. Let's take a look at the secret by running kubectl describe secret otel-collector-secret -n otel-demo:

Results of running “kubectl describe secret otel-collector-secret -n otel-demo”
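Note that kubectl describe secret only shows the size of each value, not the value itself. If you ever need to confirm that the right token made it into the cluster, a sketch like the following decodes a single key. The key name LS_TOKEN inside the Secret is an assumption on our part (check the describe output for the actual key), and `decode_secret_value` is a hypothetical helper name.

```shell
# Decode a base64-encoded Secret value read from stdin.
decode_secret_value() {
  base64 --decode
}

# Usage (ASSUMPTION: the Secret stores the token under the key LS_TOKEN):
#   kubectl get secret otel-collector-secret -n otel-demo \
#     -o jsonpath='{.data.LS_TOKEN}' | decode_secret_value
```

Be careful where you run this, since it prints your access token in plain text.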

All this magic happens in otel-demo-app-values-ls.yaml. This is a version of values.yaml from the OTel Demo App Helm Chart, with updates to the Collector configs so that we can configure the OTel Collector to send Traces to Lightstep.

8- Run the OTel Demo App

Okay…enough Kubernetes talk. Let’s look at the OpenTelemetry Demo App! You can access the Demo App using Kubernetes port-forward:

kubectl port-forward -n otel-demo svc/otel-demo-app-frontend 8080:8080

To access the front-end, go to http://localhost:8080:

Go ahead and explore the amazing selection of telescopes and accessories, and buy a few. 😉🔭

9- See Traces in Lightstep

We can now pop over to Lightstep and check out some Traces. Let’s do this by creating a Notebook.

First, click on the little page icon on the left nav bar (highlighted in blue, below). That will bring up this page:

Next, we build our query for our Traces. Let’s look at the traces from the recommendationservice. We'll do this by entering recommendationservice in the field next to "All telemetry". Because this is a service, select the second value from the drop-down, which says, "Use 'recommendationservice' as service value", as per below:

Lightstep Notebook — selecting the recommendationservice

After you select that value, you’ll see a chart like this:

The little green dots represent trace exemplars from that Service. Hover over one of them to see for yourself!

If you click on one of these dots, you’ll get taken to the Trace view. Before you click, be sure to save your Notebook first (don’t worry, you’ll get a reminder before you navigate away from the page)!

Here’s the Trace view we see when we click on the get_product_list dot (Operation) above:

Pretty cool, amirite?

10- See Kubernetes Metrics in Lightstep

Remember when you ran terraform apply? Well, not only did it create a Kubernetes cluster and deploy the OTel Demo App (and OTel Collector), it also created some handy-dandy Metrics dashboards for us.

You can check out the newly-created Metrics dashboards by going to the Dashboards icon (the icon with 4 little squares) on the left navigation bar:

First, let’s check out the Kubernetes / Compute Resources / Cluster dashboard. This dashboard lets you see the state of your cluster.

We then have various other Metrics called Kubernetes Workload Metrics. These are the dashboards with names that start with “Kubernetes / Compute Resources / Workload”. These dashboards are specific to the services you are running. They take into account the Kubernetes Workloads in your various namespaces, using kube-state-metrics. For a closer look, check out otel_demo_app_k8s_dashboard.tf.

We used Lightstep’s Prometheus Kubernetes OpenTelemetry Collector to get these Metrics into Lightstep. This Helm chart is inspired by kube-prometheus-stack, but with one crucial difference: no Prometheus! We’re able to use recent enhancements to the OpenTelemetry Operator for Kubernetes, such as support for Service Monitors, in order to scrape Prometheus metrics from pods, system components, and more.

Note: You can learn more about the Prometheus Kubernetes OpenTelemetry Collector by checking out the docs.

For example, the Kubernetes / Compute Resources / Workload / otel-demo-app-cartservice dashboard displays metrics for the OTel Demo App’s cartservice. In it we can see how our containers and pods are doing based on Metrics such as those for CPU and Memory.

Screen capture of dashboard for otel-demo-app-cartservice, created by the lightstep/lightstep Terraform provider.

11- See Application Metrics in Lightstep

Ah…but we’re not done with Metrics just yet! If you go back to the dashboard view and scroll to the very end of the list, you’ll see the OTel Demo App — Application Metrics dashboard.

Screen capture of the list of dashboards created by the lightstep/lightstep Terraform provider

Let’s click on it to take a quick little peek!

Screen capture of OTel Demo App dashboards created by the lightstep/lightstep Terraform provider

The latest version of the OTel Demo App emits both auto-instrumented and manually-instrumented Metrics. In today’s demo, we wanted to highlight some of the Metrics from the recommendationservice.

First, we have the auto-instrumented Python Metrics, which are captured from the Python runtime:

  • runtime.cpython.cpu_time: Tracks the amount of time being spent in different states of the CPU. This includes user (time running application code) and system (time spent in the operating system). This metric is represented as total elapsed time in seconds.
  • runtime.cpython.memory: Memory utilization
  • runtime.cpython.gc_count: Number of times the garbage collector has been called.
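Because runtime.cpython.cpu_time is a running total of seconds, the raw number keeps growing; what you usually care about is how fast it grows. Here is a small worked example of turning two samples into a utilization percentage. The sample values and the function name `cpu_utilization_percent` are made up for illustration, and this is not necessarily how Lightstep charts the metric.

```shell
# Convert two samples of a cumulative CPU-seconds counter into the
# percentage of one CPU used over the interval between the samples.
#   $1 = earlier sample (seconds), $2 = later sample (seconds),
#   $3 = wall-clock seconds between the two samples
cpu_utilization_percent() {
  awk -v prev="$1" -v curr="$2" -v interval="$3" \
    'BEGIN { printf "%.2f\n", (curr - prev) / interval * 100 }'
}

# Example: the counter went from 120s to 123s over a 60s window,
# so the process used 3 CPU-seconds out of 60.
cpu_utilization_percent 120 123 60   # prints 5.00
```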

We also have one manually-instrumented Metric:

For more on the recommendationservice Metrics, and on Metrics captured by other services, check out the OTel Demo App service docs.

12- Teardown

terraform destroy -auto-approve

If you’re no longer using this environment, don’t forget to tear down its resources, to avoid running up a huge cloud bill. You’re welcome. 😉

This step can take up to 30 minutes, so please be patient! Also, you’ll probably notice that on first run, you’ll see the following error:

Error: uninstall: Release not loaded: otel-operator: release not found

Don’t panic! If you run terraform destroy -auto-approve again, it will finish nukifying all the things.
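If you would rather not babysit the teardown, you could wrap the command in a small retry helper. This is a generic sketch, not something the repo provides; the function name `retry` is hypothetical.

```shell
# Run a command up to N times, stopping at the first success.
#   $1 = maximum number of attempts; the rest = the command to run
retry() {
  attempts=$1
  shift
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    echo "Attempt failed; retrying ($n/$attempts)..." >&2
  done
}

# Usage:
#   retry 2 terraform destroy -auto-approve
```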

Final Thoughts

Today we got to see some aspects of Observability-Landscape-as-Code (OLaC) in practice! Specifically, we looked at the following elements:

  • Application instrumentation with OpenTelemetry
  • Collecting and storing application telemetry via the OTel Collector
  • Configuring an Observability back-end (i.e. Lightstep) through code

We showcased this by using Terraform to:

  • Deploy the OpenTelemetry Demo App to Kubernetes. The OTel Demo App showcases the Traces and Metrics instrumentation of different services in different languages using OpenTelemetry.
  • Deploy an OpenTelemetry Collector to Kubernetes (part of the Demo App deployment). The Collector is used to send application Traces and Metrics to Lightstep.
  • Configure Lightstep dashboards. The Lightstep Terraform provider allowed us to codify this.

Codifying our Observability Landscape means that we can tear down and recreate our application, Collector, and dashboards as needed, knowing that we’ll have consistency across the board every single time. Plus, it means that we can version control it, so that it’s not lost in the ether somewhere, or sitting in a secret server under Bob’s desk. Bonus!

Hopefully this gives you a nice little flavour of the power of OLaC, and will inspire you to go out there and start OLaC-ing too! (I just made up a new verb. You’re welcome.)

Whew! That was a lot to think about and take in! Give yourself a pat on the back, because we’ve covered a LOT! Now, please enjoy this picture of Adriana’s rat, Bunny, enjoying an almond!

Bunny the rat enjoying an almond. Sadly, we lost Bunny shortly after this photo was taken. 😭 Photo by Adri Villela.

Peace, love, and code. 🦄 🌈 💫

The OpenTelemetry Demo App is always looking for feedback and contributors. Please consider joining the OTel Community to help make OpenTelemetry AWESOME!

Got questions about Observability-Landscape-as-Code? Talk to us! Feel free to connect with us through e-mail, or:

Hope to hear from y’all!

Originally published at https://lightstep.com.

Published in Dev Genius

Written by Adriana Villela

DevRel | OTel End User SIG Maintainer | CNCF Ambassador | Podcaster | 🚫BS | Speaker | Boulderer | Computering 24+ years | Opinions my own 🇧🇷🇨🇦