# Standalone Kubernetes Operator

This guide explains how to run the Teleport Kubernetes Operator against any remote Teleport cluster. If your Teleport cluster is deployed using the `teleport-cluster` Helm chart, you might want to follow [the guide for Helm-deployed clusters](https://goteleport.com/docs/zero-trust-access/infrastructure-as-code/teleport-operator/teleport-operator-helm.md) instead.

## How it works

The Teleport Kubernetes Operator is a Teleport Auth Service client that you install using the `teleport-operator` Helm chart.

For the Operator to manage Teleport resources in your cluster, you need to authenticate it with your Teleport cluster and authorize it to manage Teleport resources. This requires the following additional resources, which we show you how to create in this guide:

- A Teleport role
- A join token
- A Machine & Workload Identity Bot

You can then deploy the Operator by installing the `teleport-operator` chart.

## Prerequisites

- A running Teleport cluster. If you want to get started with Teleport, [sign up](https://goteleport.com/signup) for a free trial or [set up a demo environment](https://goteleport.com/docs/get-started/deploy-community.md).

- The `tctl` and `tsh` clients.

  Installing `tctl` and `tsh` clients

  1. Determine the version of your Teleport cluster. The `tctl` and `tsh` clients must be at most one major version behind your Teleport cluster version. Send a GET request to the Proxy Service at `/v1/webapi/find` and use a JSON query tool to obtain your cluster version. Replace teleport.example.com:443 with the web address of your Teleport Proxy Service:

     ```
     $ TELEPORT_DOMAIN=teleport.example.com:443
     $ TELEPORT_VERSION="$(curl -s https://$TELEPORT_DOMAIN/v1/webapi/find | jq -r '.server_version')"
     ```

  2. Follow the instructions for your platform to install `tctl` and `tsh` clients:

     **Mac**

     Download the signed macOS .pkg installer for Teleport, which includes the `tctl` and `tsh` clients:

     ```
     $ curl -O https://cdn.teleport.dev/teleport-${TELEPORT_VERSION?}.pkg
     ```

     In Finder, double-click the `.pkg` file to begin installation.

     ---

     DANGER

     Using Homebrew to install Teleport is not supported. The Teleport package in Homebrew is not maintained by Teleport and we can't guarantee its reliability or security.

     ---

     **Windows - Powershell**

     ```
     $ curl.exe -O https://cdn.teleport.dev/teleport-v${TELEPORT_VERSION?}-windows-amd64-bin.zip
     # Unzip the archive and move the `tctl` and `tsh` clients to your %PATH%
     # NOTE: Do not place the `tctl` and `tsh` clients in the System32 directory, as this can cause issues when using WinSCP.
     # Use %SystemRoot% (C:\Windows) or %USERPROFILE% (C:\Users\<username>) instead.
     ```

     **Linux**

     All of the Teleport binaries in Linux installations include the `tctl` and `tsh` clients. For more options (including RPM/DEB packages and downloads for i386/ARM/ARM64) see our [installation page](https://goteleport.com/docs/installation.md).

     ```
     $ curl -O https://cdn.teleport.dev/teleport-v${TELEPORT_VERSION?}-linux-amd64-bin.tar.gz
     $ tar -xzf teleport-v${TELEPORT_VERSION?}-linux-amd64-bin.tar.gz
     $ cd teleport
     $ sudo ./install
     Teleport binaries have been copied to /usr/local/bin
     ```

- A Kubernetes cluster. You must be able to create and read Namespace, ServiceAccount, Deployment, Secret, Role, RoleBinding, and CustomResourceDefinition resources.
- [Helm](https://helm.sh/docs/intro/quickstart/)
- [kubectl](https://kubernetes.io/docs/tasks/tools/)

Validate Kubernetes connectivity by running the following command:

```
$ kubectl cluster-info
Kubernetes control plane is running at https://127.0.0.1:6443
CoreDNS is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/https:metrics-server:https/proxy
```
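
The client version requirement from the prerequisites can also be checked with a short script. The following is a minimal sketch rather than official tooling: the function names `major_version` and `client_is_compatible` are hypothetical, and the script assumes version strings of the form `17.2.1` or `v17.2.1`. It only tests the rule stated above, that the client is at most one major version behind the cluster.

```shell
# Print the major component of a version string such as "17.2.1" or "v17.2.1".
major_version() {
  version="${1#v}"          # drop an optional leading "v"
  printf '%s\n' "${version%%.*}"
}

# Succeed when the client is at most one major version behind the cluster.
# Usage: client_is_compatible CLIENT_VERSION CLUSTER_VERSION
client_is_compatible() {
  client_major="$(major_version "$1")"
  cluster_major="$(major_version "$2")"
  [ $(( cluster_major - client_major )) -le 1 ]
}

# Example (assumes TELEPORT_VERSION was set as shown in the prerequisites,
# and that 16.4.0 is the version of your local clients):
#   client_is_compatible "16.4.0" "$TELEPORT_VERSION" || echo "upgrade tctl and tsh"
```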

---

TIP

Users wanting to experiment locally with the operator can use [minikube](https://minikube.sigs.k8s.io/docs/start/) to start a local Kubernetes cluster:

```
$ minikube start
```

---

### Step 1/4. Create the operator role

In this step we create the role the operator uses to interact with Teleport resources.

Download and apply the operator role manifest:

```
$ curl -L https://raw.githubusercontent.com/gravitational/teleport/v19.0.0-dev/integrations/operator/hack/fixture-operator-role.yaml -o operator-role.yaml
$ tctl create -f operator-role.yaml
```

---

NOTE

If you upgrade the operator to a new version that adds support for new Teleport resources, you will need to re-apply the operator role manifest. This will grant the operator access to the new resources.

---

### Step 2/4. Create the operator join token

The join token is used by the operator on each startup to join the Teleport cluster and retrieve its client certificates.

To establish trust between the connecting operator and Teleport, this guide delegates authentication to Kubernetes. Kubernetes has its own internal CA, which signs the ServiceAccount (SA) tokens mounted in pods. In the following setup, Teleport trusts SA tokens signed by Kubernetes and lets their holder join the cluster.

1. Retrieve the Kubernetes JWKS, the set of keys Teleport can use to validate Kubernetes SA tokens:
   ```
   $ export JWKS="$(kubectl get --raw /openid/v1/jwks)"
   ```
2. Create the token manifest that allows the ServiceAccount `teleport-operator` from the namespace `teleport-iac` to join the cluster as the operator:
   ```
   $ cat <<EOF > operator-token.yaml
   kind: token
   version: v2
   metadata:
     name: operator-bot
   spec:
     roles: [Bot]
     # bot_name will match the name of the bot created later in this guide.
     bot_name: operator
     join_method: kubernetes
     kubernetes:
       type: static_jwks
       static_jwks:
         jwks: |
           $JWKS
       allow:
       - service_account: "teleport-iac:teleport-operator" # namespace:serviceaccount
   EOF
   ```
3. Then, apply the token manifest:
   ```
   $ tctl create -f operator-token.yaml
   ```
4. Finally, retrieve the Teleport cluster name that will be required to use the token:
   ```
   $ export CLUSTER_NAME="$(tctl status | awk '/Cluster/ {print $2}')"
   ```
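
To see what the `allow` entry refers to, it can help to look inside a ServiceAccount token. Kubernetes SA tokens are JWTs whose `sub` claim has the form `system:serviceaccount:<namespace>:<name>`. The sketch below decodes the claims of a token without verifying its signature, so it is suitable for inspection only. The function name `jwt_payload` is hypothetical, and the usage comment assumes that the chart creates a ServiceAccount named `teleport-operator` in the `teleport-iac` namespace, which only exists after Step 4/4.

```shell
# Print the decoded payload (claims) of a JWT passed as the first argument.
# The signature is NOT verified; this is for inspection only.
jwt_payload() {
  # The payload is the second dot-separated field, base64url-encoded.
  payload="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
  # base64url omits padding; restore it so that base64 accepts the input.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  # Some older macOS releases need "base64 -D" instead of "base64 -d".
  printf '%s' "$payload" | base64 -d
}

# Example, once the operator ServiceAccount exists (after Step 4/4):
#   jwt_payload "$(kubectl create token teleport-operator -n teleport-iac)"
```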

### Step 3/4. Create the operator bot

In Teleport, a bot is a resource allowing a machine to access Teleport. Create a bot for the operator with the following command:

```
$ tctl bots add operator --token operator-bot --roles operator
```

### Step 4/4. Deploy the operator in the Kubernetes cluster

At this point, you can configure and run the operator.

Configure Helm to fetch Teleport charts from the Teleport Helm repository:

```
$ helm repo add teleport https://charts.releases.teleport.dev
```

Refresh the local Helm cache by fetching the latest charts:

```
$ helm repo update
```

1. Retrieve the version of your Teleport cluster:
   ```
   $ export TELEPORT_VERSION="$(tsh version | awk '/Proxy[[:space:]]version/ {print $3}')"
   $ echo "$TELEPORT_VERSION"
   ```
2. Create the Kubernetes namespace that will contain both the operator Pods and the CustomResources to configure Teleport:
   ```
   $ kubectl create namespace teleport-iac
   ```
3. Apply the strictest Pod Security Standard on the namespace:
   ```
   $ kubectl label namespace teleport-iac 'pod-security.kubernetes.io/enforce=restricted'
   ```
4. Deploy the operator with Helm:
   ```
   $ helm install teleport-operator teleport/teleport-operator -n teleport-iac --version "$TELEPORT_VERSION" --set teleportAddress=teleport.example.com:443 --set "teleportClusterName=$CLUSTER_NAME" --set token=operator-bot
   ```
5. Validate that the operator is running properly (the operator might take a few seconds to start):
   ```
   $ kubectl get pods -n teleport-iac
   ```
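
The `--set` flags in the install command can also be kept in a values file, which is easier to review and reuse. The sketch below writes such a file using only the three values shown above. The function name `write_operator_values` and the file name `operator-values.yaml` are assumptions made for illustration.

```shell
# Write a Helm values file equivalent to the --set flags used in this guide.
# Usage: write_operator_values FILE TELEPORT_ADDRESS CLUSTER_NAME TOKEN_NAME
write_operator_values() {
  cat > "$1" <<EOF
teleportAddress: "$2"
teleportClusterName: "$3"
token: "$4"
EOF
}

# Example:
#   write_operator_values operator-values.yaml teleport.example.com:443 "$CLUSTER_NAME" operator-bot
#   helm install teleport-operator teleport/teleport-operator -n teleport-iac \
#     --version "$TELEPORT_VERSION" -f operator-values.yaml
```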

## Next steps

Follow [the user and role IaC guide](https://goteleport.com/docs/zero-trust-access/infrastructure-as-code/managing-resources/user-and-role.md) to use your newly deployed Teleport Kubernetes Operator to create Teleport users and grant them roles.

Helm Chart parameters are documented in the [`teleport-operator` Helm chart reference](https://goteleport.com/docs/reference/helm-reference/teleport-operator.md).

## Troubleshooting

### The CustomResources (CRs) are not reconciled

The Teleport Operator watches for new resources or changes in Kubernetes. When a change happens, it triggers the reconciliation loop. This loop is in charge of validating the resource, checking if it already exists in Teleport and making calls to the Teleport API to create/update/delete the resource. The reconciliation loop also adds a `status` field on the Kubernetes resource.

If an error happens and the reconciliation loop is not successful, an item in `status.conditions` will describe what went wrong. This allows users to diagnose errors by inspecting Kubernetes resources with `kubectl`:

```
$ kubectl describe teleportusers myuser
```

For example, if a user has been granted a nonexistent role, the status looks like this:

```
apiVersion: resources.teleport.dev/v2
kind: TeleportUser
# [...]
status:
  conditions:
  - lastTransitionTime: "2022-07-25T16:15:52Z"
    message: Teleport resource has the Kubernetes origin label.
    reason: OriginLabelMatching
    status: "True"
    type: TeleportResourceOwned
  - lastTransitionTime: "2022-07-25T17:08:58Z"
    message: 'Teleport returned the error: role my-non-existing-role is not found'
    reason: TeleportError
    status: "False"
    type: SuccessfullyReconciled
```

Here `SuccessfullyReconciled` is `False` and the error is `role my-non-existing-role is not found`.
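
When a resource has several conditions, it can be convenient to print only the failing ones. The following sketch assumes `jq` is installed (it is already used in the prerequisites) and that the resource lives in the `teleport-iac` namespace. The function name `failed_conditions` is hypothetical.

```shell
# Read a resource as JSON on stdin and print "<type>: <message>" for every
# condition whose status is "False". Prints nothing if there is no status.
failed_conditions() {
  jq -r '.status.conditions[]? | select(.status == "False") | "\(.type): \(.message)"'
}

# Example:
#   kubectl get teleportusers myuser -n teleport-iac -o json | failed_conditions
```

For the status shown above, this would print the `SuccessfullyReconciled` condition together with its error message.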

If the status is not present or does not give sufficient information to solve the issue, check the operator logs as described in the next section.

### The CR doesn't have a status

1. Check if the CR is in the same namespace as the operator. The operator only watches for resources in its own namespace.

2. Check if the operator pods are running and healthy:

   ```
   $ kubectl get pods -n "$OPERATOR_NAMESPACE"
   ```

3. Check the operator logs:

   ```
   $ kubectl logs deploy/<OPERATOR_DEPLOYMENT_NAME> -n "$OPERATOR_NAMESPACE"
   ```

   ---

   NOTE

   In multi-replica deployments, only one operator instance runs the reconciliation loop. This instance is called the leader and is the only one producing reconciliation logs. The other operator instances wait and print the following log:

   ```
   leaderelection.go:248] attempting to acquire leader lease teleport/431e83f4.teleport.dev...
   ```

   To diagnose reconciliation issues, you will have to inspect all pods to find the one reconciling the resources.

   ---
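
Finding the leader by hand can be tedious when there are many replicas. The loop below is a rough sketch that searches each pod's logs for the message printed when a leader lease is acquired. It assumes that the namespace contains only operator pods, that the logs still include the startup messages, and that the message text is `successfully acquired lease`, which is what the Kubernetes leader election library logs but which this guide does not confirm. The function name `find_leader_pods` is hypothetical.

```shell
# Print the name of each pod whose logs show that it acquired the leader lease.
# Usage: find_leader_pods NAMESPACE
find_leader_pods() {
  for pod in $(kubectl get pods -n "$1" -o name); do
    # Assumption: the leader logged "successfully acquired lease" at startup.
    if kubectl logs "$pod" -n "$1" 2>/dev/null | grep -q 'successfully acquired lease'; then
      printf '%s\n' "$pod"
    fi
  done
}

# Example:
#   find_leader_pods "$OPERATOR_NAMESPACE"
```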

### I cannot delete the Kubernetes CR

The operator protects Kubernetes CRs from deletion with a finalizer. It does not allow the CR to be deleted until the Teleport resource is deleted as well. This safeguard avoids leaving dangling resources that could grant unintended access.

Teleport can refuse to delete a resource for several reasons. The most frequent one is that another resource depends on it. For example, you cannot delete a role that is still assigned to a user.

If this happens, the operator reports the error sent by Teleport in its logs.

To resolve this lock, you can either:

- Resolve the dependency issue so that the resource is successfully deleted in Teleport. In the role example, this means removing the role from every user who has it.

- Patch the Kubernetes CR to remove the finalizers. This tells Kubernetes to stop waiting for the operator and remove the CR. If you do this, the CR is removed but the Teleport resource remains, and the operator never attempts to remove it again.

  For example, if the role is named `my-role`:

  ```
  $ kubectl patch TeleportRole my-role -p '{"metadata":{"finalizers":null}}' --type=merge
  ```
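
Before removing finalizers, it is worth confirming that the CR is actually waiting on one. The sketch below prints the `deletionTimestamp` and `finalizers` fields from the CR's metadata. A non-empty deletion timestamp together with a non-empty finalizer list indicates that deletion is blocked. The function name `show_deletion_state` is hypothetical, and the example assumes the CR lives in the `teleport-iac` namespace.

```shell
# Print the deletion timestamp and finalizers of a custom resource,
# one per line. Usage: show_deletion_state KIND NAME NAMESPACE
show_deletion_state() {
  kubectl get "$1" "$2" -n "$3" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
}

# Example:
#   show_deletion_state TeleportRole my-role teleport-iac
```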
