Kubernetes Custom Resource Definition (CRD) Controller: A Comprehensive Guide

Kubernetes has changed how teams deploy and manage containerized applications. At its core, it provides built-in resources such as Pods, Services, and Deployments. In many real-world scenarios, however, these built-in resources do not fully cover the needs of an organization or application. This is where Custom Resource Definitions (CRDs) and CRD controllers come in. A CRD lets you define your own resource types, which extends the Kubernetes API. A CRD controller is a piece of software that watches those custom resources and acts to bring the actual state in line with the desired state, just as the built-in controllers do for built-in resources.

Table of Contents

  1. Core Concepts
    • Custom Resource Definitions (CRDs)
    • CRD Controllers
    • Desired State vs. Actual State
  2. Typical Usage Example
    • Creating a CRD
    • Developing a CRD Controller
    • Testing the CRD and Controller
  3. Common Practices
    • Error Handling
    • Event Filtering
    • Caching
  4. Best Practices
    • Modularity and Reusability
    • Monitoring and Logging
    • Versioning of CRDs
  5. Conclusion
  6. References

Core Concepts

Custom Resource Definitions (CRDs)

A CRD teaches Kubernetes about a new kind of object by declaratively defining the schema of a custom resource. For example, if you are building a machine learning platform on top of Kubernetes, you might want to define a custom resource for a “ModelDeployment”. The CRD specifies the structure of this resource, including fields like the model name, version, and the number of replicas.

Here is a simple CRD manifest in YAML:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: modeldeployments.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                modelName:
                  type: string
                modelVersion:
                  type: string
                replicas:
                  type: integer
  scope: Namespaced
  names:
    plural: modeldeployments
    singular: modeldeployment
    kind: ModelDeployment
    shortNames:
      - md
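
On the Go side, the schema above is usually mirrored by typed structs. The following is a minimal sketch using only the standard library; the struct names and the parseModelDeployment helper are illustrative assumptions, and a real project would normally embed the metadata types from k8s.io/apimachinery instead of the simplified Metadata field used here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ModelDeploymentSpec mirrors the spec properties declared in the CRD schema.
type ModelDeploymentSpec struct {
	ModelName    string `json:"modelName"`
	ModelVersion string `json:"modelVersion"`
	Replicas     int    `json:"replicas"`
}

// ModelDeployment is a simplified stand-in for the full API type.
// Only the object name is kept from the metadata.
type ModelDeployment struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Metadata   struct {
		Name string `json:"name"`
	} `json:"metadata"`
	Spec ModelDeploymentSpec `json:"spec"`
}

// parseModelDeployment (hypothetical helper) decodes a JSON manifest
// into the typed struct.
func parseModelDeployment(data []byte) (ModelDeployment, error) {
	var md ModelDeployment
	err := json.Unmarshal(data, &md)
	return md, err
}

func main() {
	manifest := []byte(`{
		"apiVersion": "example.com/v1",
		"kind": "ModelDeployment",
		"metadata": {"name": "demo"},
		"spec": {"modelName": "mymodel", "modelVersion": "v1.0", "replicas": 2}
	}`)
	md, err := parseModelDeployment(manifest)
	if err != nil {
		panic(err)
	}
	fmt.Println(md.Metadata.Name, md.Spec.ModelName, md.Spec.ModelVersion, md.Spec.Replicas)
}
```

Keeping the JSON tags identical to the property names in the schema is what lets the same object round-trip between the API server and the controller.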

CRD Controllers

A CRD controller is a controller loop that continuously watches the API server for changes to the custom resources defined by the CRD. When a change is detected, the controller takes action to reconcile the actual state of the system with the desired state specified in the custom resource. For instance, if a ModelDeployment resource is created with a desired number of replicas, the controller will ensure that the appropriate number of model pods are running.
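
The loop described above can be sketched without any Kubernetes dependency. In this simplified model, a channel stands in for the watch on the API server, and the event type, runController function, and errTransient value are all hypothetical names chosen for illustration. A real controller would also use a rate-limited work queue rather than a plain channel.

```go
package main

import (
	"errors"
	"fmt"
)

// event is a simplified change notification. Key is the namespace/name of
// the custom resource that changed.
type event struct {
	Key string
}

// errTransient is a placeholder failure used to show how errors are counted.
var errTransient = errors.New("transient failure")

// runController drains the event channel and calls reconcile once per event.
// It returns the number of reconcile calls that failed.
func runController(events <-chan event, reconcile func(key string) error) int {
	failures := 0
	for ev := range events {
		if err := reconcile(ev.Key); err != nil {
			failures++
			fmt.Printf("reconcile %s failed: %v\n", ev.Key, err)
		}
	}
	return failures
}

func main() {
	events := make(chan event, 2)
	events <- event{Key: "default/my-model-deployment"}
	events <- event{Key: "default/broken"}
	close(events)

	failures := runController(events, func(key string) error {
		if key == "default/broken" {
			return errTransient
		}
		fmt.Println("reconciled", key)
		return nil
	})
	fmt.Println("failures:", failures)
}
```

Note that the reconcile function receives only a key, not the change itself. This matches the level-triggered style of Kubernetes controllers, where each pass looks up the current state instead of reacting to individual deltas.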

Desired State vs. Actual State

The concept of desired state vs. actual state is fundamental to Kubernetes controllers. The desired state is what the user specifies in the custom resource. For example, in the ModelDeployment resource, the desired number of replicas is part of the desired state. The actual state is the current state of the system. The controller’s job is to bridge the gap between the two states. If the desired number of replicas is 3 and the actual number is 1, the controller will create two more pods.
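
The replica arithmetic in this paragraph can be written as a small pure function. This is a sketch under the assumption that the controller only needs a count of pods to add or remove; replicaPlan and planReplicas are hypothetical names, not part of any Kubernetes library.

```go
package main

import "fmt"

// replicaPlan describes how many pods must be created or deleted to move
// from the actual state to the desired state.
type replicaPlan struct {
	Create int
	Delete int
}

// planReplicas compares the desired and actual replica counts.
func planReplicas(desired, actual int) replicaPlan {
	switch {
	case actual < desired:
		return replicaPlan{Create: desired - actual}
	case actual > desired:
		return replicaPlan{Delete: actual - desired}
	default:
		return replicaPlan{} // already converged, nothing to do
	}
}

func main() {
	// Desired 3, actual 1: two more pods are needed.
	fmt.Printf("%+v\n", planReplicas(3, 1)) // prints {Create:2 Delete:0}
}
```

Because the function has no side effects, calling it again after the pods exist returns an empty plan, which is the idempotence a reconcile loop relies on.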

Typical Usage Example

Creating a CRD

To create the ModelDeployment CRD, you can use the kubectl command:

kubectl apply -f modeldeployment-crd.yaml

You can verify that the CRD has been created successfully by running:

kubectl get crds modeldeployments.example.com

Developing a CRD Controller

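Several frameworks are available for developing CRD controllers, such as controller-runtime and Operator SDK. The following entry point shows how a simple controller is wired up with controller-runtime. It assumes that the API types and a ModelDeploymentReconciler with a SetupWithManager method are defined in the imported project package: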

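package main

import (
    "log"

    "k8s.io/apimachinery/pkg/runtime"
    clientgoscheme "k8s.io/client-go/kubernetes/scheme"
    ctrl "sigs.k8s.io/controller-runtime"

    examplecomv1 "github.com/yourusername/modeldeployment/api/v1"
)

func main() {
    // Register the built-in types (Pods and so on) and the ModelDeployment
    // types so the manager's client can encode and decode both.
    scheme := runtime.NewScheme()
    if err := clientgoscheme.AddToScheme(scheme); err != nil {
        log.Fatalf("failed to add built-in types to scheme: %v", err)
    }
    if err := examplecomv1.AddToScheme(scheme); err != nil {
        log.Fatalf("failed to add scheme: %v", err)
    }

    // The manager owns the shared cache, the client, and the controllers.
    mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
        Scheme: scheme,
    })
    if err != nil {
        log.Fatalf("failed to create manager: %v", err)
    }

    // ModelDeploymentReconciler is assumed to be defined in the imported
    // package; scaffolded projects often keep it in a separate controllers
    // package instead.
    if err = (&examplecomv1.ModelDeploymentReconciler{
        Client: mgr.GetClient(),
        Log:    ctrl.Log.WithName("controllers").WithName("ModelDeployment"),
    }).SetupWithManager(mgr); err != nil {
        log.Fatalf("failed to setup controller: %v", err)
    }

    // Start blocks until the context is cancelled. The signal handler cancels
    // it on SIGINT or SIGTERM so the manager can shut down cleanly.
    log.Println("Starting manager")
    if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
        log.Fatalf("failed to start manager: %v", err)
    }
}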

Testing the CRD and Controller

Create a custom resource instance:

apiVersion: example.com/v1
kind: ModelDeployment
metadata:
  name: my-model-deployment
spec:
  modelName: mymodel
  modelVersion: v1.0
  replicas: 2

Apply the custom resource:

kubectl apply -f my-model-deployment.yaml

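The controller should detect the new resource and start creating the necessary pods. You can confirm that the resource exists with kubectl get modeldeployments (or the short name, md) and watch the pods appear with kubectl get pods.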

Common Practices

Error Handling

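Error handling is crucial in CRD controllers. When an error occurs during reconciliation, the controller should handle it gracefully. For example, if the controller fails to create a pod, it should log the error and retry with an increasing delay rather than in a tight loop. With controller-runtime, returning an error from Reconcile requeues the request through a rate-limited work queue, so transient failures are retried automatically. Errors that cannot be fixed by retrying, such as an invalid spec, are better reported on the resource than retried indefinitely.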
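
A bounded retry with exponential backoff can be sketched as follows. This is a standalone illustration of the retry idea, not the mechanism controller-runtime uses internally; retryWithBackoff and errPodCreate are hypothetical names, and the attempt limit and delays are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errPodCreate is a placeholder for a failed pod creation.
var errPodCreate = errors.New("failed to create pod")

// retryWithBackoff calls op until it succeeds or maxAttempts is reached.
// The delay between attempts starts at initial and doubles after each failure.
func retryWithBackoff(maxAttempts int, initial time.Duration, op func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	delay := initial
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		fmt.Printf("attempt %d failed: %v\n", attempt, err)
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := retryWithBackoff(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errPodCreate // fail the first two attempts
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

Wrapping the last error with %w keeps the original cause available to errors.Is, which helps when the caller needs to tell transient failures from permanent ones.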

Event Filtering

To improve the efficiency of the controller, filter events before they reach the reconcile loop. Events can be filtered by the type of change (create, update, delete) or by the resource’s labels. For example, if the controller only needs to handle updates to a specific set of ModelDeployment resources, it can drop every other event. In controller-runtime, this is done with predicates.
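
The filtering idea can be modeled with composable functions. The types below are simplified stand-ins written for this sketch and are not the controller-runtime predicate API; the label key and value in the usage are likewise made up.

```go
package main

import "fmt"

// eventType names the kind of change that triggered an event.
type eventType string

const (
	created eventType = "create"
	updated eventType = "update"
	deleted eventType = "delete"
)

// resourceEvent is a simplified change notification for one resource.
type resourceEvent struct {
	Type   eventType
	Name   string
	Labels map[string]string
}

// filter reports whether an event should be passed on to reconciliation.
type filter func(resourceEvent) bool

// onlyTypes accepts events whose type is one of the given types.
func onlyTypes(types ...eventType) filter {
	allowed := make(map[eventType]bool, len(types))
	for _, t := range types {
		allowed[t] = true
	}
	return func(ev resourceEvent) bool { return allowed[ev.Type] }
}

// hasLabel accepts events for resources that carry the label key=value.
func hasLabel(key, value string) filter {
	return func(ev resourceEvent) bool {
		v, ok := ev.Labels[key]
		return ok && v == value
	}
}

// allOf accepts an event only if every given filter accepts it.
func allOf(filters ...filter) filter {
	return func(ev resourceEvent) bool {
		for _, f := range filters {
			if !f(ev) {
				return false
			}
		}
		return true
	}
}

func main() {
	accept := allOf(onlyTypes(updated), hasLabel("tier", "prod"))
	events := []resourceEvent{
		{Type: created, Name: "a", Labels: map[string]string{"tier": "prod"}},
		{Type: updated, Name: "b", Labels: map[string]string{"tier": "prod"}},
		{Type: updated, Name: "c", Labels: map[string]string{"tier": "dev"}},
	}
	for _, ev := range events {
		fmt.Println(ev.Name, accept(ev))
	}
}
```

Only event b passes both filters here, so it is the only one that would reach the reconcile loop.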

Caching

Caching can significantly improve the performance of the controller. Instead of querying the API server on every reconciliation, the controller reads from a local cache that is kept up to date by watches. This reduces load on the API server and speeds up reconciliation. With controller-runtime, the client returned by the manager serves reads from such a cache by default.
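
To illustrate the principle, here is a minimal thread-safe cache keyed by namespace/name. It is a sketch only: resourceCache, cachedSpec, and the method names are hypothetical, and frameworks such as controller-runtime provide a watch-backed cache so you would not normally write one yourself.

```go
package main

import (
	"fmt"
	"sync"
)

// cachedSpec holds the fields of a ModelDeployment the controller reads often.
type cachedSpec struct {
	ModelName string
	Replicas  int
}

// resourceCache is an in-memory store that is safe for concurrent use.
// Watch handlers would call set and remove; reconcilers would call get.
type resourceCache struct {
	mu    sync.RWMutex
	items map[string]cachedSpec
}

func newResourceCache() *resourceCache {
	return &resourceCache{items: make(map[string]cachedSpec)}
}

// set stores or replaces the entry for key (namespace/name).
func (c *resourceCache) set(key string, spec cachedSpec) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = spec
}

// remove drops the entry for key, if present.
func (c *resourceCache) remove(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.items, key)
}

// get returns the cached entry and whether it was found.
func (c *resourceCache) get(key string) (cachedSpec, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	spec, ok := c.items[key]
	return spec, ok
}

func main() {
	cache := newResourceCache()
	cache.set("default/my-model-deployment", cachedSpec{ModelName: "mymodel", Replicas: 2})

	if spec, ok := cache.get("default/my-model-deployment"); ok {
		fmt.Println("cache hit:", spec.ModelName, spec.Replicas)
	}
	cache.remove("default/my-model-deployment")
	_, ok := cache.get("default/my-model-deployment")
	fmt.Println("still cached:", ok)
}
```

The read-write mutex lets many reconcilers read concurrently while updates from watch events take the exclusive lock briefly.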

Best Practices

Modularity and Reusability

Design your controller in a modular way. Break down the reconciliation logic into smaller functions that can be easily tested and reused. This makes the code more maintainable and easier to extend.
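
One way to apply this is to split reconciliation into small pure functions and compose them. The sketch below assumes a naming scheme of name-index for pods, which is an illustrative choice rather than anything the ModelDeployment CRD prescribes; all function and type names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// deployment holds the spec fields that the steps below need.
type deployment struct {
	ModelName string
	Replicas  int
}

// validate rejects specs that can never be reconciled.
func validate(d deployment) error {
	if d.ModelName == "" {
		return errors.New("modelName must not be empty")
	}
	if d.Replicas < 0 {
		return errors.New("replicas must not be negative")
	}
	return nil
}

// podNames returns deterministic names for the pods backing a deployment.
func podNames(name string, d deployment) []string {
	if d.Replicas <= 0 {
		return nil
	}
	names := make([]string, 0, d.Replicas)
	for i := 0; i < d.Replicas; i++ {
		names = append(names, fmt.Sprintf("%s-%d", name, i))
	}
	return names
}

// missingPods returns the wanted names that do not exist yet.
func missingPods(wanted, existing []string) []string {
	have := make(map[string]bool, len(existing))
	for _, n := range existing {
		have[n] = true
	}
	var missing []string
	for _, n := range wanted {
		if !have[n] {
			missing = append(missing, n)
		}
	}
	return missing
}

// reconcile composes the steps and returns the pods that must be created.
func reconcile(name string, d deployment, existing []string) ([]string, error) {
	if err := validate(d); err != nil {
		return nil, err
	}
	return missingPods(podNames(name, d), existing), nil
}

func main() {
	toCreate, err := reconcile("my-model-deployment",
		deployment{ModelName: "mymodel", Replicas: 3},
		[]string{"my-model-deployment-1"})
	fmt.Println(toCreate, err)
}
```

Each step can be unit tested without a cluster, and only the thin layer that talks to the API server needs integration tests.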

Monitoring and Logging

Implement proper monitoring and logging in your controller. Use tools like Prometheus and Grafana to monitor the controller’s performance metrics, such as the number of reconciliations per second and the error rate. Log detailed information about the reconciliation process, including the actions taken and any errors encountered.
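
The two metrics named above, reconciliation count and error rate, can be tracked with a pair of counters. This sketch uses only the standard library and assumes Go 1.19 or later for atomic.Int64; in practice you would register equivalent counters with a Prometheus client library. The reconcileMetrics type and its methods are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// errDemo is a placeholder reconciliation failure.
var errDemo = errors.New("reconcile failed")

// reconcileMetrics counts reconciliations and failures.
// The counters are safe to update from multiple goroutines.
type reconcileMetrics struct {
	total  atomic.Int64
	failed atomic.Int64
}

// observe records the outcome of one reconciliation.
func (m *reconcileMetrics) observe(err error) {
	m.total.Add(1)
	if err != nil {
		m.failed.Add(1)
	}
}

// errorRate returns the fraction of reconciliations that failed,
// or 0 if none have been observed yet.
func (m *reconcileMetrics) errorRate() float64 {
	total := m.total.Load()
	if total == 0 {
		return 0
	}
	return float64(m.failed.Load()) / float64(total)
}

func main() {
	m := &reconcileMetrics{}
	m.observe(nil)
	m.observe(nil)
	m.observe(errDemo)
	m.observe(nil)
	fmt.Printf("total=%d failed=%d rate=%.2f\n",
		m.total.Load(), m.failed.Load(), m.errorRate())
}
```

Calling observe from a single deferred statement at the top of the reconcile function keeps the counters accurate on every return path.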

Versioning of CRDs

When evolving your custom resources, use versioning in your CRDs. This allows you to change the resource schema without breaking existing deployments. A CRD can serve multiple versions at once, with exactly one marked as the storage version, so you can migrate users to the new version gradually. When the schemas differ between versions, a conversion webhook translates objects between them.
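
Converting between versions comes down to a pair of mapping functions. The sketch below invents a v2 schema that groups the model fields under a nested object; that v2 layout and all names here are assumptions made for illustration, since the source defines only v1.

```go
package main

import "fmt"

// specV1 mirrors the v1 schema from the CRD.
type specV1 struct {
	ModelName    string
	ModelVersion string
	Replicas     int
}

// modelRef and specV2 form a hypothetical v2 schema that nests the
// model fields under a single "model" object.
type modelRef struct {
	Name    string
	Version string
}

type specV2 struct {
	Model    modelRef
	Replicas int
}

// v1ToV2 upgrades a v1 spec to the hypothetical v2 layout.
func v1ToV2(in specV1) specV2 {
	return specV2{
		Model:    modelRef{Name: in.ModelName, Version: in.ModelVersion},
		Replicas: in.Replicas,
	}
}

// v2ToV1 downgrades a v2 spec back to v1.
func v2ToV1(in specV2) specV1 {
	return specV1{
		ModelName:    in.Model.Name,
		ModelVersion: in.Model.Version,
		Replicas:     in.Replicas,
	}
}

func main() {
	old := specV1{ModelName: "mymodel", ModelVersion: "v1.0", Replicas: 2}
	upgraded := v1ToV2(old)
	fmt.Printf("%+v\n", upgraded)
	fmt.Println("round trip preserved:", v2ToV1(upgraded) == old)
}
```

A lossless round trip matters because the API server may convert a stored object to any served version and back, so fields that exist in only one version need somewhere to live in the other.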

Conclusion

Kubernetes CRD controllers are a powerful tool for extending the Kubernetes API and automating custom workflows. By understanding the core concepts, following common practices, and implementing best practices, you can develop robust and efficient CRD controllers. Whether you are building a custom application platform or integrating with third-party services, CRD controllers can help you manage your custom resources effectively.

References