My experience building an OpenTelemetry receiver for KubeArmor as a Linux Foundation intern
Introduction
I was selected as a Linux Foundation Mentorship Program '23 intern for the KubeArmor project. I learned a lot during my 12 weeks building OpenTelemetry support for KubeArmor logs. I had awesome mentors: Anurag Kumar, Barun Acharya, Ankur Kothiwal, and Rahul Jadhav. They were very supportive and worked hard to give me a good mentorship experience, and I am really grateful to them. It was also great meeting and working with Sibasish Behera, my co-mentee. I would also like to thank the Linux Foundation and the Cloud Native Computing Foundation for creating such a program. This blog summarises how I got into the program, the details of building an OpenTelemetry log-based receiver, and my experience as an intern.
Getting Into The Program
I had applied for this program the previous year and was unsuccessful. The difference between that attempt and the successful one is that I put in more effort the second time. After my rejection, I stuck around with the community, attended meetings, attempted to solve issues, and gained a better understanding of the project. During the next application cycle, the knowledge I had gained helped me put in a better application. I was better able to show, with proof of work, that I understood the project, and I came up with an approach for my project idea, which I included in my proposal. (Although the outline of the cover letter on the LFX platform does not ask for an approach to the project, including one in your cover letter could improve your chances. Link to my cover letter.) You can read more about my take on landing open-source internships.
My LFX project
My project was to add OpenTelemetry support to KubeArmor. OpenTelemetry is a collection of APIs and SDKs that provides a vendor-agnostic way of obtaining metrics, traces, and logs from software, so that applications are not tied to a particular observability backend. You can read more about OpenTelemetry in my blog on the benefits of contributing to OpenTelemetry, written during my time as an Outreachy intern for the project. KubeArmor emits three kinds of logs: host visibility logs, alerts, and its own application logs. My task was to create an adapter that makes these logs follow the OpenTelemetry specification.
Approach to the project
After studying the log-based receivers in the OpenTelemetry contrib repository, I decided to follow a similar pattern. Below is the design of the receiver:
The task involved creating three components:
KubeArmor gRPC client
The KubeArmor gRPC client is needed to fetch logs from the KubeArmor relay server. The relay server collects all the messages, alerts, and system logs generated by KubeArmor on each node and lets other logging systems collect them from one place. I initially used the existing KubeArmor log client binary, but that did not work well with a Kubernetes deployment: the binary would have to be downloaded into every OpenTelemetry Collector container running the receiver, which was not scalable. So I had to write a client for the receiver. As stated earlier, KubeArmor emits three kinds of logs. To let users select the kinds of logs they are interested in, I created "filters" for the logs, namely:
system
This filter enables the client to fetch only system insight logs.
policy
This filter enables the client to fetch only alerts i.e. policy violations.
kubearmorLogs
This filter enables the client to fetch only KubeArmor's own application logs.
all
This filter enables the client to fetch all the logs KubeArmor emits.
The existing KubeArmor log client inspired my work on the client.
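The four filters described above could be modelled roughly as follows. This is only a minimal sketch, not the receiver's actual code: the type `LogFilter`, the `Streams` struct, and the `ParseFilter` function are hypothetical names introduced for illustration, and only the filter strings themselves come from the list above.

```go
package main

import (
	"errors"
	"fmt"
)

// LogFilter names the kinds of KubeArmor logs a client can request.
// The type and constant names are hypothetical; only the string values
// come from the filter names listed in this post.
type LogFilter string

const (
	FilterSystem        LogFilter = "system"
	FilterPolicy        LogFilter = "policy"
	FilterKubeArmorLogs LogFilter = "kubearmorLogs"
	FilterAll           LogFilter = "all"
)

// Streams records which of the three log kinds should be fetched.
type Streams struct {
	Visibility bool // system insight (visibility) logs
	Alerts     bool // policy violations
	Messages   bool // KubeArmor's own application logs
}

// ErrUnknownFilter is returned for any filter string not listed above.
var ErrUnknownFilter = errors.New("unknown log filter")

// ParseFilter maps a filter string to the set of streams to fetch.
func ParseFilter(s string) (Streams, error) {
	switch LogFilter(s) {
	case FilterSystem:
		return Streams{Visibility: true}, nil
	case FilterPolicy:
		return Streams{Alerts: true}, nil
	case FilterKubeArmorLogs:
		return Streams{Messages: true}, nil
	case FilterAll:
		return Streams{Visibility: true, Alerts: true, Messages: true}, nil
	default:
		return Streams{}, fmt.Errorf("%w: %q", ErrUnknownFilter, s)
	}
}

func main() {
	for _, f := range []string{"system", "policy", "kubearmorLogs", "all", "bogus"} {
		streams, err := ParseFilter(f)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %+v\n", f, streams)
	}
}
```

Rejecting unknown filter strings up front means a typo in the configuration fails loudly instead of silently fetching nothing.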
Stanza input operator:
Stanza is a lightweight log collection agent and forwarder written in Go, developed by observIQ and donated to OpenTelemetry. Log-based receivers in OpenTelemetry leverage Stanza to convert logs to the OpenTelemetry format. Let us look at what the Stanza package in OpenTelemetry consists of.
Components of the OpenTelemetry Stanza Package
Operators
Stanza has what are known as "operators". An operator is the unit that carries out a task on a log stream pipeline: for example, collecting logs, transforming logs, or outputting logs. There are several types: input operators, parser operators, transform operators, and output operators. This document gives more details about existing operators in the package.
Pipeline
The pipeline is the component in which operators are arranged so that logs move from one operator to the next. Its data structure is a directed graph in which input or processing operators are connected to processing or output operators.
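To make the directed-graph idea concrete, here is a minimal, self-contained sketch. It does not use the Stanza package: `Entry`, `Operator`, `Step`, `Sink`, and `BuildPipeline` are simplified, hypothetical stand-ins, and the severity rule is invented purely for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a simplified stand-in for a log entry moving through the pipeline.
type Entry struct {
	Body     string
	Severity string
}

// Operator is a hypothetical node in the pipeline graph.
type Operator interface {
	Process(e Entry)
}

// Step is an input or processing operator: it applies Fn to an entry and
// forwards the result along every outgoing edge.
type Step struct {
	ID      string
	Fn      func(Entry) Entry
	Outputs []Operator
}

func (s *Step) Process(e Entry) {
	e = s.Fn(e)
	for _, out := range s.Outputs {
		out.Process(e)
	}
}

// Sink is an output operator: it has no outgoing edges and collects entries.
type Sink struct {
	Received []Entry
}

func (s *Sink) Process(e Entry) {
	s.Received = append(s.Received, e)
}

// BuildPipeline wires input -> severity parser -> output and returns both ends.
func BuildPipeline() (*Step, *Sink) {
	out := &Sink{}
	parser := &Step{
		ID: "severity_parser",
		Fn: func(e Entry) Entry {
			// Invented rule, only to show a processing step changing an entry.
			if strings.Contains(e.Body, "denied") {
				e.Severity = "WARN"
			} else {
				e.Severity = "INFO"
			}
			return e
		},
		Outputs: []Operator{out},
	}
	input := &Step{
		ID:      "input",
		Fn:      func(e Entry) Entry { return e },
		Outputs: []Operator{parser},
	}
	return input, out
}

func main() {
	input, out := BuildPipeline()
	input.Process(Entry{Body: "file access denied"})
	input.Process(Entry{Body: "process started"})
	for _, e := range out.Received {
		fmt.Printf("[%s] %s\n", e.Severity, e.Body)
	}
}
```

Because each operator only knows its own outgoing edges, adding a step to the chain means changing one `Outputs` slice rather than rewriting the flow.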
An adapter
This serves as the "glue" between Stanza and the OpenTelemetry Collector component. The whole point of the receiver is to convert any log format to the OpenTelemetry plog.Logs format. OpenTelemetry leverages Stanza, an existing log collection agent, to carry out this conversion, and Stanza in turn uses operators to do the work. The adapter creates what can be described as a generic Stanza receiver, which implements the Go interface required to create an OpenTelemetry receiver. It also defines a "LogReceiverType" interface, which individual log-based receivers implement to define their own configuration. Logs are fed into the Stanza pipeline and converted to the "entry" log format by the input operator. These log entries leave the pipeline through the default output operator, the emitter. The emitter batches the log entries and sends them to the converter. The converter converts each batch of entries to the OpenTelemetry plog.Logs format and sends it to the consumer loop.
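The emitter-to-converter handoff can be sketched as below. This is a simplified illustration under stated assumptions, not the adapter's real code: `Entry`, `Record`, `BatchEmitter`, `NewBatchEmitter`, and `Convert` are hypothetical names, `Record` merely stands in for plog.Logs, and the sketch flushes only when a batch is full or `Flush` is called explicitly.

```go
package main

import "fmt"

// Entry is a simplified stand-in for a log entry leaving the pipeline.
type Entry struct {
	Body string
}

// Record is a hypothetical stand-in for a converted log record
// (the real receiver produces plog.Logs).
type Record struct {
	Body string
}

// BatchEmitter buffers entries and hands them off in batches.
type BatchEmitter struct {
	maxSize int
	buf     []Entry
	handoff func([]Entry)
}

// NewBatchEmitter returns an emitter that calls handoff with each full batch.
func NewBatchEmitter(maxSize int, handoff func([]Entry)) *BatchEmitter {
	return &BatchEmitter{maxSize: maxSize, handoff: handoff}
}

// Process adds an entry and flushes once the batch is full.
func (b *BatchEmitter) Process(e Entry) {
	b.buf = append(b.buf, e)
	if len(b.buf) >= b.maxSize {
		b.Flush()
	}
}

// Flush hands off whatever is buffered, if anything.
func (b *BatchEmitter) Flush() {
	if len(b.buf) == 0 {
		return
	}
	batch := b.buf
	b.buf = nil
	b.handoff(batch)
}

// Convert turns a batch of entries into the output format.
func Convert(batch []Entry) []Record {
	records := make([]Record, 0, len(batch))
	for _, e := range batch {
		records = append(records, Record{Body: e.Body})
	}
	return records
}

func main() {
	// The handoff plays the role of the converter feeding the consumer.
	emitter := NewBatchEmitter(2, func(batch []Entry) {
		fmt.Printf("consumed batch of %d records\n", len(Convert(batch)))
	})
	for _, body := range []string{"alert one", "alert two", "alert three"} {
		emitter.Process(Entry{Body: body})
	}
	emitter.Flush() // hand off the final partial batch
}
```

Batching keeps the converter from being invoked once per log line, which matters when a node produces many entries in a short burst.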
The diagram below shows the general design of a log-based OpenTelemetry receiver that leverages Stanza:
KubeArmor Receiver
This implements the "LogReceiverType" interface from the generic Stanza receiver.
This design documentation explains more about the design of my solution.
My internship experience
As an LFX intern, I took the lead on my project and owned it. This is what I would like prospective interns to note. As a newbie to open-source internships, I used to think that I would hold the crayon while my mentors held my hand and colored for me. In my experience, the reality is that you hold the crayon and color, then call out to your mentors if you need help. I came up with the approach to my project idea on my own and implemented it myself as well. Thanks to my awesome mentors, I always had someone to turn to when I had doubts or questions. I had weekly meetings with my mentors where I kept them updated on my progress, ran my ideas by them, and communicated any blockers. Apart from the weekly meetings, I could also easily reach out to them on Slack with any issues or updates. All of this led to a great internship experience and the successful completion of my project.
Result
I learned a lot from this experience. I gained practical experience creating a component for OpenTelemetry and, through this process, gained more insight into how these components work under the hood. I also learned more about KubeArmor and the different kinds of logs it emits. I even delved a bit into data analysis, as I learned the LogQL query language and used Grafana to create a visualization for KubeArmor logs.
In summary, I was able to create an OpenTelemetry KubeArmor receiver and tried it out with the Loki exporter, creating a Grafana dashboard for KubeArmor logs.