CNOE

CNOE is a platform-building orchestrator, which we chose in 2024, at least as a starting point, to build the EDF.

1 - Analysis of CNOE competitors

We compare CNOE - which we see as an orchestrator - with other platform orchestration tools like Kratix and Humanitec.

Kratix

Kratix is a Kubernetes-native framework that helps platform engineering teams automate the provisioning and management of infrastructure and services through custom-defined abstractions called Promises. It allows teams to extend Kubernetes functionality and provide resources in a self-service manner to developers, streamlining the delivery and management of workloads across environments.

Concepts

Key concepts of Kratix:

  • Workload: This is an abstraction representing any application or service that needs to be deployed within the infrastructure. It defines the requirements and dependent resources necessary to execute this task.
  • Promise: A “Promise” is a ready-to-use infrastructure or service package. Promises allow developers to request specific resources (such as databases, storage, or computing power) through the standard Kubernetes interface. It’s similar to an operator in Kubernetes but more universal and flexible. Kratix simplifies the development and delivery of applications by automating the provisioning and management of infrastructure and resources through simple Kubernetes APIs.
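
As a rough illustration, a developer might request a database by creating a custom resource whose API is defined by a Promise. The group, kind, and fields below are purely hypothetical and depend entirely on how the Promise author designed them:

apiVersion: marketplace.kratix.io/v1alpha1  # hypothetical API group served by a Promise
kind: postgresql                            # resource kind defined by the Promise
metadata:
  name: example-db
  namespace: default
spec:
  # fields exposed by the Promise author; illustrative only
  size: small
  backups: enabled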

Pros of Kratix

  • Resource provisioning automation. Kratix simplifies infrastructure creation for developers through the abstraction of “Promises.” This means developers can simply request the necessary resources (like databases, message queues) without dealing with the intricacies of infrastructure management.

  • Flexibility and adaptability. Platform teams can customize and adapt Kratix to specific needs by creating custom Promises for various services, allowing the infrastructure to meet the specific requirements of the organization.

  • Unified resource request interface. Developers can use a single API (Kubernetes) to request resources, simplifying interaction with infrastructure and reducing complexity when working with different tools and systems.

Cons of Kratix

  • Although Kratix offers great flexibility, it can also lead to more complex setup and platform management processes. Creating custom Promises and configuring their behavior requires time and effort.

  • Kubernetes dependency. Kratix relies on Kubernetes, which makes it less applicable in environments that don’t use Kubernetes or containerization technologies. It might also lead to integration challenges if an organization uses other solutions.

  • Limited ecosystem. Kratix doesn’t have as mature an ecosystem as some other infrastructure management solutions (e.g., Terraform, Pulumi). This may limit the availability of ready-made solutions and tools, increasing the amount of manual work when implementing Kratix.

Humanitec

Humanitec is an Internal Developer Platform (IDP) that helps platform engineering teams automate the provisioning and management of infrastructure and services through dynamic configuration and environment management.

It allows teams to extend their infrastructure capabilities and provide resources in a self-service manner to developers, streamlining the deployment and management of workloads across various environments.

Concepts

Key concepts of Humanitec:

  • Application Definition:
    This is an abstraction where developers define their application, including its services, environments, and dependencies. It abstracts away infrastructure details, allowing developers to focus on building and deploying their applications.

  • Dynamic Configuration Management:
    Humanitec automatically manages the configuration of applications and services across multiple environments (e.g., development, staging, production). It ensures consistency and alignment of configurations as applications move through different stages of deployment.

Humanitec simplifies the development and delivery process by providing self-service deployment options while maintaining centralized governance and control for platform teams.
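
As an illustration, such an application (workload) definition is often expressed in the open Score format, which Humanitec can consume. The snippet below is only a minimal sketch under that assumption; Humanitec’s own abstractions and field names may differ:

apiVersion: score.dev/v1b1
metadata:
  name: my-service            # illustrative workload name
containers:
  web:
    image: nginx:1.27         # illustrative container
resources:
  db:
    type: postgres            # resource request resolved per environment by the platform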

Pros of Humanitec

  • Resource provisioning automation. Humanitec automates infrastructure and environment provisioning, allowing developers to focus on building and deploying applications without worrying about manual configuration.

  • Dynamic environment management. Humanitec manages application configurations across different environments, ensuring consistency and reducing manual configuration errors.

  • Golden Paths. Best-practice workflows and processes that guide developers through infrastructure provisioning and application deployment. This ensures consistency and reduces cognitive load by providing a set of recommended practices.

  • Unified resource management interface. Developers can use Humanitec’s interface to request resources and deploy applications, reducing complexity and improving the development workflow.

Cons of Humanitec

  • Commercial licensing. Humanitec is commercially licensed software.

  • Integration challenges. Humanitec’s dependency on specific cloud-native environments can create challenges for organizations with diverse infrastructures or those using legacy systems.

  • Cost. Depending on usage, Humanitec might introduce additional costs related to the implementation of an Internal Developer Platform, especially for smaller teams.

  • Harder to customise. As a commercial product, it leaves less room for deep customisation than open frameworks such as Kratix.

2 - Included Backstage Templates

Here you will find information about Backstage templates that are included in idpbuilder’s ref-implementation

2.1 - Template for basic Argo Workflow

Backstage Template for Basic Argo Workflow with Spark Job

This Backstage template YAML automates the creation of an Argo Workflow for Kubernetes that includes a basic Spark job, providing a convenient way to configure and deploy workflows involving data processing or machine learning jobs. Users can define key parameters, such as the application name and the path to the main Spark application file. The template creates necessary Kubernetes resources, publishes the application code to a Gitea Git repository, registers the application in the Backstage catalog, and deploys it via ArgoCD for easy CI/CD management.

Use Case

This template is designed for teams that need a streamlined approach to deploy and manage data processing or machine learning jobs using Spark within an Argo Workflow environment. It simplifies the deployment process and integrates the application with a CI/CD pipeline. The template performs the following:

  • Workflow and Spark Job Setup: Defines a basic Argo Workflow and configures a Spark job using the provided application file path, ideal for data processing tasks.
  • Repository Setup: Publishes the workflow configuration to a Gitea repository, enabling version control and easy updates to the job configuration.
  • ArgoCD Integration: Creates an ArgoCD application to manage the Spark job deployment, ensuring continuous delivery and synchronization with Kubernetes.
  • Backstage Registration: Registers the application in Backstage, making it easily discoverable and manageable through the Backstage catalog.

This template boosts productivity by automating steps required for setting up Argo Workflows and Spark jobs, integrating version control, and enabling centralized management and visibility, making it ideal for projects requiring efficient deployment and scalable data processing solutions.
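
A heavily condensed sketch of what such a template can look like is shown below. Parameter names, the skeleton path, and the exact scaffolder actions are illustrative and may differ from the template shipped in the ref-implementation:

apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: basic-argo-workflow                # illustrative name
spec:
  parameters:
    - title: Configuration
      required: [name, mainApplicationFile]
      properties:
        name:
          type: string
        mainApplicationFile:
          type: string
  steps:
    - id: fetch
      action: fetch:template               # render workflow and Spark job manifests from a skeleton
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          mainApplicationFile: ${{ parameters.mainApplicationFile }}
    - id: publish
      action: publish:gitea                # push the generated repository to Gitea
      input:
        repoUrl: gitea.cnoe.localtest.me?repo=${{ parameters.name }}   # illustrative repoUrl
    - id: register
      action: catalog:register             # register the component in the Backstage catalog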

2.2 - Template for basic kubernetes deployment

Backstage Template for Kubernetes Deployment

This Backstage template YAML automates the creation of a basic Kubernetes Deployment, aimed at simplifying the deployment and management of applications in Kubernetes for the user. The template allows users to define essential parameters, such as the application’s name, and then creates and configures the Kubernetes resources, publishes the application code to a Gitea Git repository, and registers the application in the Backstage catalog for tracking and management.

Use Case

The template is designed for teams needing a streamlined approach to deploy applications in Kubernetes while automatically configuring their CI/CD pipelines. It performs the following:

  • Deployment Creation: A Kubernetes Deployment YAML is generated based on the provided application name, specifying a basic setup with an Nginx container.
  • Repository Setup: Publishes the deployment code in a Gitea repository, allowing for version control and future updates.
  • ArgoCD Integration: Automatically creates an ArgoCD application for the deployment, facilitating continuous delivery and synchronization with Kubernetes.
  • Backstage Registration: Registers the application in Backstage to make it discoverable and manageable via the Backstage catalog.

This template enhances productivity by automating several steps required for deployment, version control, and registration, making it ideal for projects where fast, consistent deployment and centralized management are required.
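
For reference, the generated Deployment is roughly of the following shape; labels and the image tag are illustrative, with the name taken from the template parameter:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # from the application name parameter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: nginx
          image: nginx:latest   # basic Nginx container as described above
          ports:
            - containerPort: 80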

3 - idpbuilder

Here you will find information about idpbuilder installation and usage

3.1 - Installation of idpbuilder

Local installation with KIND Kubernetes

The idpbuilder uses KIND as its Kubernetes cluster. It is suggested to use a virtual machine for the installation: MMS Linux clients are unable to run KIND natively on the local machine because of network problems; pods, for example, cannot connect to the internet.

Windows and Mac users already utilize a virtual machine for the Docker Linux environment.

Prerequisites

  • Docker Engine
  • Go
  • kubectl
  • kind

Build process

For building idpbuilder the source code needs to be downloaded and compiled:

git clone https://github.com/cnoe-io/idpbuilder.git
cd idpbuilder
go build

The idpbuilder binary will be created in the current directory.

Start idpbuilder

To start the idpbuilder binary execute the following command:

./idpbuilder create --use-path-routing  --log-level debug --package https://github.com/cnoe-io/stacks//ref-implementation

Logging into ArgoCD

At the end of the idpbuilder execution a link to the installed ArgoCD is shown. The credentials for access can be obtained by executing:

./idpbuilder get secrets

Logging into KIND

A Kubernetes config is created in the default location $HOME/.kube/config. Careful management of the Kubernetes config is recommended so as not to unintentionally delete access to other clusters like the OSC.

To show all running KIND nodes execute:

kubectl get nodes -o wide

To see all running pods:

kubectl get pods -o wide

Next steps

Follow this documentation: https://github.com/cnoe-io/stacks/tree/main/ref-implementation

Delete the idpbuilder KIND cluster

The cluster can be deleted by executing:

idpbuilder delete cluster

Remote installation into a bare metal Kubernetes instance

CNOE provides two implementations of an IDP:

  • Amazon AWS implementation
  • KIND implementation

Neither is usable on bare metal or an OSC instance as-is. The Amazon implementation is complex and makes use of Terraform, which is currently not supported on either bare metal or OSC. Therefore the KIND implementation is used and customized to support the idpbuilder installation. The idpbuilder also does some network magic which needs to be replicated.

Several prerequisites have to be provided to support the idpbuilder on bare metal or the OSC:

  • Kubernetes dependencies
  • Network dependencies
  • Changes to the idpbuilder

Prerequisites

Talos Linux is chosen for the bare metal Kubernetes instance.

  • talosctl
  • Go
  • Docker Engine
  • kubectl
  • kustomize
  • helm
  • nginx

As soon as the idpbuilder works correctly on bare metal, the next step is to apply it to an OSC instance.

Add *.cnoe.localtest.me to hosts file

Append these lines to /etc/hosts:

127.0.0.1 gitea.cnoe.localtest.me
127.0.0.1 cnoe.localtest.me

Install nginx and configure it

Install nginx by executing:

sudo apt install nginx

Replace /etc/nginx/sites-enabled/default with the following content:

server {
        listen 8443 ssl default_server;
        listen [::]:8443 ssl default_server;

        include snippets/snakeoil.conf;

        location / {
                    proxy_pass http://10.5.0.20:80;
                    proxy_http_version                 1.1;
                    proxy_cache_bypass                 $http_upgrade;
                    proxy_set_header Host              $host;
                    proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
                    proxy_set_header X-Real-IP         $remote_addr;
                    proxy_set_header X-Forwarded-Host  $host;
                    proxy_set_header X-Forwarded-Proto $scheme;
        }
}

Start nginx by executing:

sudo systemctl enable nginx
sudo systemctl restart nginx

Building idpbuilder

For building idpbuilder the source code needs to be downloaded and compiled:

git clone https://github.com/cnoe-io/idpbuilder.git
cd idpbuilder
go build

The idpbuilder binary will be created in the current directory.

Configure VS Code launch settings

Open the idpbuilder folder in VS Code:

code .

Create a new launch setting. Add the "args" parameter to the launch setting:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Launch Package",
            "type": "go",
            "request": "launch",
            "mode": "auto",
            "program": "${fileDirname}",
            "args": ["create", "--use-path-routing", "--package", "https://github.com/cnoe-io/stacks//ref-implementation"]
        }
    ]
}

Create the Talos bare metal Kubernetes instance

Talos by default will create docker containers, similar to KIND. Create the cluster by executing:

talosctl cluster create

Install local path provisioning (storage)

mkdir -p localpathprovisioning
cd localpathprovisioning
cat > kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- github.com/rancher/local-path-provisioner/deploy?ref=v0.0.26
patches:
- patch: |-
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: local-path-config
      namespace: local-path-storage
    data:
      config.json: |-
        {
                "nodePathMap":[
                {
                        "node":"DEFAULT_PATH_FOR_NON_LISTED_NODES",
                        "paths":["/var/local-path-provisioner"]
                }
                ]
        }
- patch: |-
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: local-path
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
- patch: |-
    apiVersion: v1
    kind: Namespace
    metadata:
      name: local-path-storage
      labels:
        pod-security.kubernetes.io/enforce: privileged
EOF
kustomize build | kubectl apply -f -
rm kustomization.yaml
cd ..
rmdir localpathprovisioning

Install an external load balancer

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.8/config/manifests/metallb-native.yaml
sleep 50

cat <<EOF | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool
  namespace: metallb-system
spec:
  addresses:
  - 10.5.0.20-10.5.0.130
EOF

cat <<EOF | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: homelab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - first-pool
EOF

Install an ingress controller which uses the external load balancer

helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace
sleep 30

Execute idpbuilder

Modify the idpbuilder source code

Edit the function Run in pkg/build/build.go and comment out the creation of the KIND cluster:

	/*setupLog.Info("Creating kind cluster")
	if err := b.ReconcileKindCluster(ctx, recreateCluster); err != nil {
		return err
	}*/

Compile the idpbuilder

go build

Start idpbuilder

Then, in VS Code, switch to main.go in the root directory of the idpbuilder and start debugging.

Logging into ArgoCD

At the end of the idpbuilder execution a link to the installed ArgoCD is shown. The credentials for access can be obtained by executing:

./idpbuilder get secrets

Logging into Talos cluster

A Kubernetes config is created in the default location $HOME/.kube/config. Careful management of the Kubernetes config is recommended so as not to unintentionally delete access to other clusters like the OSC.

To show all running Talos nodes execute:

kubectl get nodes -o wide

To see all running pods:

kubectl get pods -o wide

Delete the idpbuilder Talos cluster

The cluster can be deleted by executing:

talosctl cluster destroy

TODOs for running idpbuilder on bare metal or OSC

Required:

  • Add *.cnoe.localtest.me to the Talos cluster DNS, pointing to the IP address of the host device that runs nginx.

  • Create an SSL certificate with cnoe.localtest.me as the common name. Edit the nginx config to load this certificate. Configure idpbuilder to distribute this certificate instead of the one it distributes by default.

Optimizations:

  • Implement an idpbuilder uninstall. This is especially important when working on the OSC instance.

  • Remove or configure gitea.cnoe.localtest.me; it does not appear to work even in the local idpbuilder installation with KIND.

  • Improve the idpbuilder to support Kubernetes instances other than KIND. This can be done either by parametrization or by utilizing Terraform / OpenTofu or Crossplane.

3.2 - Http Routing

Routing switch

The idpbuilder supports creating platforms using either path based or subdomain based routing:

idpbuilder create --log-level debug --package https://github.com/cnoe-io/stacks//ref-implementation
idpbuilder create --use-path-routing --log-level debug --package https://github.com/cnoe-io/stacks//ref-implementation

However, even though argo does eventually report all deployments as green, not the entire demo is actually functional (verification needed). This is due to hardcoded values that, for example, point to the path-routed location of gitea to access git repos. Thus, backstage might not be able to access them.

Within the demo / ref-implementation, a simple search & replace is suggested to change urls to fit the given environment. But proper scripting/templating could take care of that as the hostnames and necessary properties should be available. This is, however, a tedious and repetitive task one has to keep in mind throughout the entire system, which might lead to an explosion of config options in the future. Code that addresses correct routing is located in both the stack templates and the idpbuilder code.

Cluster internal routing

For the most part, components communicate with either the cluster API using the default DNS or with each other via http(s) using the public DNS/hostname (+ path-routing scheme). The latter is necessary due to configs that are visible and modifiable by users. This includes for example argocd config for components that has to sync to a gitea git repo. Using the same URL for internal and external resolution is imperative.

The idpbuilder achieves transparent internal DNS resolution by overriding the public DNS name in the cluster’s internal DNS server (coreDNS). Subsequently, within the cluster requests to the public hostnames resolve to the IP of the internal ingress controller service. Thus, internal and external requests take a similar path and run through proper routing (rewrites, ssl/tls, etc).
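
Conceptually, this boils down to a rewrite (or static host entry) in the cluster’s CoreDNS configuration, so that the public hostname resolves to the ingress controller service from inside the cluster. The snippet below is only a sketch of that idea, not the exact configuration idpbuilder generates:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        # illustrative: resolve the public hostname to the in-cluster ingress service
        rewrite name cnoe.localtest.me ingress-nginx-controller.ingress-nginx.svc.cluster.local
        kubernetes cluster.local in-addr.arpa ip6.arpa
        forward . /etc/resolv.conf
    }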

Conclusion

One has to keep in mind that some specific app features might not work properly, or only with hacks, when using path based routing (e.g. the docker registry in gitea). Furthermore, supporting multiple setup strategies will become cumbersome as the platform grows. We should probably only support one type of setup to keep the system as simple as possible, but allow modification if necessary.

DNS solutions like nip.io or the already used localtest.me mitigate the need for path based routing.

Excerpt

HTTP is a cornerstone of the internet due to its high flexibility. Starting from HTTP/1.1, each request in the protocol contains, among other things, a path and a hostname (the Host header). While an HTTP request is sent to a single IP address / server, these two pieces of data allow (distributed) systems to handle requests in various ways.

$ curl -v http://google.com/something > /dev/null

* Connected to google.com (2a00:1450:4001:82f::200e) port 80
* using HTTP/1.x
> GET /something HTTP/1.1
> Host: google.com
> User-Agent: curl/8.10.1
> Accept: */*
...

Path-Routing

Imagine requesting http://myhost.foo/some/file.html. In a simple setup, the web server that myhost.foo resolves to would serve a static file from some directory, e.g. /<some_dir>/some/file.html.

In more complex systems, one might have multiple services that fulfill various roles, for example a service that generates HTML sites of articles from a CMS and a service that can convert images into various formats. Using path-routing both services are available on the same host from a user’s POV.

An article served from http://myhost.foo/articles/news1.html would be generated from the article service and points to an image http://myhost.foo/images/pic.jpg which in turn is generated by the image converter service. When a user sends an HTTP request to myhost.foo, they hit a reverse proxy which forwards the request based on the requested path to some other system, waits for a response, and subsequently returns that response to the user.

Path-Routing Example
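
In a Kubernetes environment, such a path-routed setup could be expressed as an Ingress with one rule per path; the service names below are hypothetical:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myhost-path-routing             # hypothetical
spec:
  rules:
    - host: myhost.foo
      http:
        paths:
          - path: /articles
            pathType: Prefix
            backend:
              service:
                name: article-service   # hypothetical article service
                port:
                  number: 80
          - path: /images
            pathType: Prefix
            backend:
              service:
                name: image-converter   # hypothetical image converter service
                port:
                  number: 80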

Such a setup hides the complexity from the user and allows the creation of large distributed, scalable systems acting as a unified entity from the outside. Since everything is served on the same host, the browser is inclined to trust all downstream services. This allows for easier ‘communication’ between services through the browser. For example, cookies could be valid for the entire host and thus authentication data could be forwarded to requested downstream services without the user having to explicitly re-authenticate.

Furthermore, services ‘know’ their user-facing location by knowing their path and the paths to other services as paths are usually set as a convention and / or hard-coded. In practice, this makes configuration of the entire system somewhat easier, especially if you have various environments for testing, development, and production. The hostname of the system does not matter as one can use hostname-relative URLs, e.g. /some/service.

Load balancing is also easily achievable by multiplying the number of service instances. Most reverse proxy systems are able to apply various load balancing strategies to forward traffic to downstream systems.

Problems might arise if downstream systems are not built with path-routing in mind. Some systems need to be served from the root of a domain, see for example the container registry spec.

Hostname-Routing

Each downstream service in a distributed system is served from a different host, typically a subdomain, e.g. serviceA.myhost.foo and serviceB.myhost.foo. This gives services full control over their respective host, and even allows them to do path-routing within each system. Moreover, hostname-routing allows the entire system to create more flexible and powerful routing schemes in terms of scalability. Intra-system communication becomes somewhat harder as the browser treats each subdomain as a separate host, shielding cookies, for example, from one another.

Each host that serves some services requires a DNS entry that has to be published to the clients (from some DNS server). Depending on the environment this can become quite tedious as DNS resolution on the internet and intranets might have to deviate. This applies to intra-cluster communication as well, as seen with the idpbuilder’s platform. In this case, external DNS resolution has to be replicated within the cluster to be able to use the same URLs to address for example gitea.

The following example depicts DNS-only routing. By defining separate DNS entries for each service / subdomain requests are resolved to the respective servers. In theory, no additional infrastructure is necessary to route user traffic to each service. However, as services are completely separated other infrastructure like authentication possibly has to be duplicated.

DNS-only routing

When using hostname based routing, one does not have to set different IPs for each hostname. Instead, having multiple DNS entries pointing to the same set of IPs allows re-using existing infrastructure. As shown below, a reverse proxy is able to forward requests to downstream services based on the Host request header. This way a specific hostname can be forwarded to a defined service.

Hostname Proxy
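
Expressed as a Kubernetes Ingress, host-based forwarding could look like this (again with hypothetical names):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myhost-hostname-routing       # hypothetical
spec:
  rules:
    - host: serviceA.myhost.foo
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service-a       # hypothetical
                port:
                  number: 80
    - host: serviceB.myhost.foo
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service-b       # hypothetical
                port:
                  number: 80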

At the same time, one could imagine a multi-tenant system that differentiates customer systems by name, e.g. tenant-1.cool.system and tenant-2.cool.system. Configured as a wildcard-style domain, *.cool.system could point to a reverse proxy that forwards requests to a tenant's instance of a system, allowing re-use of central infrastructure while still hosting separate systems per tenant.

The implicit dependency on DNS resolution generally makes this kind of routing more complex and error-prone as changes to DNS server entries are not always possible or modifiable by everyone. Also, local changes to your /etc/hosts file are a constant pain and should be seen as a dirty hack. As mentioned above, dynamic DNS solutions like nip.io are often helpful in this case.

Conclusion

Path and hostname based routing are the two most common methods of HTTP traffic routing. They can be used separately but more often they are used in conjunction. Due to HTTP’s versatility other forms of HTTP routing, for example based on the Content-Type Header are also very common.

4 - ArgoCD

A description of ArgoCD and its role in CNOE

What is ArgoCD?

ArgoCD is a Continuous Delivery tool for kubernetes based on GitOps principles.

ELI5: ArgoCD is an application running in kubernetes which monitors Git repositories containing some sort of kubernetes manifests and automatically deploys them to some configured kubernetes clusters.

From ArgoCD’s perspective, applications are defined as custom resource definitions within the kubernetes clusters that ArgoCD monitors. Such a definition describes a source git repository that contains kubernetes manifests, in the form of a helm chart, kustomize, jsonnet definitions or plain yaml files, as well as a target kubernetes cluster and namespace the manifests should be applied to. Thus, ArgoCD is capable of deploying applications to various (remote) clusters and namespaces.

ArgoCD monitors both the source and the destination. It applies changes from the git repository that acts as the source of truth for the destination as soon as they occur, i.e. if a change was pushed to the git repository, the change is applied to the kubernetes destination by ArgoCD. Subsequently, it checks whether the desired state was established. For example, it verifies that all resources were created, enough replicas started, and that all pods are in the running state and healthy.
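
A minimal Application resource illustrating these fields might look as follows; the repository URL, names, and namespaces are placeholders:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitea.cnoe.localtest.me/giteaAdmin/my-app.git  # placeholder source repository
    targetRevision: HEAD
    path: manifests                         # plain yaml, helm chart, or kustomize directory
  destination:
    server: https://kubernetes.default.svc  # in-cluster destination; can be a remote cluster
    namespace: my-app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true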

Architecture

Core Components

An ArgoCD deployment consists of three main components:

Application Controller

The application controller is a kubernetes operator that synchronizes the live state within a kubernetes cluster with the desired state derived from the git sources. It monitors the live state, can detect deviations, and can perform corrective actions. Additionally, it can execute hooks on life cycle stages such as pre- and post-sync.

Repository Server

The repository server interacts with git repositories and caches their state, to reduce the amount of polling necessary. Furthermore, it is responsible for generating the kubernetes manifests from the resources within the git repositories, i.e. executing helm or jsonnet templates.

API Server

The API Server is a REST/gRPC service that allows the Web UI and CLI, as well as other API clients, to interact with the system. It also acts as the callback for webhooks, particularly from Git repository platforms such as GitHub or GitLab, to reduce repository polling.

Others

The system primarily stores its configuration as kubernetes resources. Thus, other external storage is not vital.

Redis
A Redis store is optional but recommended to be used as a cache to reduce load on ArgoCD components and connected systems, e.g. git repositories.
ApplicationSetController
The ApplicationSet Controller is, similar to the Application Controller, a kubernetes operator; it can deploy applications based on parameterized application templates. This allows the deployment of different versions of an application into various environments from a single template.
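
For example, a single ApplicationSet with a list generator could stamp out one Application per environment; the values below are illustrative:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: my-app-envs
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          - env: staging
          - env: production
  template:
    metadata:
      name: 'my-app-{{env}}'
    spec:
      project: default
      source:
        repoURL: https://gitea.cnoe.localtest.me/giteaAdmin/my-app.git  # placeholder
        targetRevision: HEAD
        path: 'overlays/{{env}}'            # illustrative per-environment overlay
      destination:
        server: https://kubernetes.default.svc
        namespace: 'my-app-{{env}}'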

Overview

Conceptual Architecture

Core components

Role in CNOE

ArgoCD is one of the core components besides gitea/forgejo that is being bootstrapped by the idpbuilder. Future project creation, e.g. through backstage, relies on the availability of ArgoCD.

After the initial bootstrapping phase, effectively all components in the stack that are deployed in kubernetes are managed by ArgoCD. This includes the bootstrapped components of gitea and ArgoCD which are onboarded afterward. Thus, the idpbuilder is only necessary in the bootstrapping phase of the platform and the technical coordination of all components shifts to ArgoCD eventually.

In general, the creation of new projects and applications should take place in Backstage. It is a catalog of software components and best practices that allows developers to grasp and manage their software portfolio. Underneath, however, the deployment of applications and platform components is managed by ArgoCD. Among others, Backstage creates Application CRDs to instruct ArgoCD to manage deployments and subsequently report on their current state.

Glossary

Initially shamelessly copied from the docs

Application
A group of Kubernetes resources as defined by a manifest. This is a Custom Resource Definition (CRD).
ApplicationSet
A CRD that is a template that can create multiple parameterized Applications.
Application source type
Which Tool is used to build the application.
Configuration management tool
See Tool.
Configuration management plugin
A custom tool.
Health
The health of the application, is it running correctly? Can it serve requests?
Live state
The live state of that application. What pods etc are deployed.
Refresh
Compare the latest code in Git with the live state. Figure out what is different.
Sync
The process of making an application move to its target state. E.g. by applying changes to a Kubernetes cluster.
Sync status
Whether or not the live state matches the target state. Is the deployed application the same as Git says it should be?
Sync operation status
Whether or not a sync succeeded.
Target state
The desired state of an application, as represented by files in a Git repository.
Tool
A tool to create manifests from a directory of files. E.g. Kustomize. See Application Source Type.

5 - Validation and Verification

How does CNOE ensure equality between actual and desired state

Definition

The CNOE docs somewhat interchange validation and verification, but for the most part they adhere to the general definitions:

Validation is used when you check your approach before actually executing an action.

Examples:

  • Form validation before processing the data
  • Compiler checking syntax
  • Rust’s borrow checker

Verification describes testing whether your ’thing’ complies with your spec.

Examples:

  • Unit tests
  • Testing availability (ping, curl health check)
  • Checking a ZKP of some computation

In CNOE

It seems that both validation and verification within the CNOE framework are not actually handled by some explicit component but should be addressed throughout the system and workflows.

As stated in the docs, validation takes place in all parts of the stack by enforcing strict API usage and policies (signing, mitigations, security scans, etc.; see the usage of Kyverno, for example, and the policy sketched below), and by using code generation (proven code), linting, formatting, and LSP tooling. Consequently, validation of source code, templates, etc. is more a best practice than a hard feature, and it is up to the user to incorporate it into their workflows and pipelines. This is probably due to the complexity of the entire stack and the individual properties of each component and application.
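
To give a flavour of what such policy-based validation looks like, the illustrative Kyverno ClusterPolicy below rejects pods that are missing a 'team' label; the concrete policies used in a stack would of course differ:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label          # illustrative policy
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "The label 'team' is required on all pods."
        pattern:
          metadata:
            labels:
              team: "?*"            # any non-empty value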

Verification of artifacts and deployments actually exists in a somewhat similar state. The current CNOE reference-implementation does not provide sufficient verification tooling.

However, as stated in the docs, within the framework the cnoe-cli is capable of extremely limited verification of artifacts within kubernetes. The same verification is also available as a step within a backstage plugin, which is pretty much just a wrapper around the CLI tool. The tool consumes CRD-like structures defining the expected state of pods and CRDs and checks for their existence within a live cluster (example).

Depending on the aspiration of ‘verification’ this check is rather superficial and might only suffice as an initial smoke test. Furthermore, it seems like the feature is not actually used within the CNOE stacks repo.

For a live product, more in-depth verification tools and schemes are necessary to verify the correct configuration and authenticity of workloads, which, in the context of traditional cloud systems, is only achievable to a limited degree.

Existing tools within the stack, e.g. Argo, provide some verification capabilities. But further investigation into the general topic is necessary.