Deploy the operator

Helm is the officially supported method to install the ToolHive operator in a Kubernetes cluster.

Prerequisites

A Kubernetes cluster (current and two previous minor versions are supported)
Permissions to create resources in the cluster
kubectl configured to communicate with your cluster
Helm (v3.10 minimum, v3.14+ recommended)

Install the CRDs

The ToolHive operator requires Custom Resource Definitions (CRDs) to manage MCPServer resources. The CRDs define the structure and behavior of MCPServers in your cluster.

helm upgrade --install toolhive-operator-crds oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds

This command installs the latest version of the ToolHive operator CRDs Helm chart. To install a specific version, append --version <VERSION> to the command, for example:

helm upgrade --install toolhive-operator-crds oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds --version 0.0.52

Install the operator

To install the ToolHive operator using default settings, run the following command:

helm upgrade --install toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator -n toolhive-system --create-namespace

This command installs the latest version of the ToolHive operator Helm chart. To install a specific version, append --version <VERSION> to the command, for example:

helm upgrade --install toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator -n toolhive-system --create-namespace --version 0.3.7

Verify the installation:

kubectl get pods -n toolhive-system

After about 30 seconds, you should see the toolhive-operator pod running.
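Rather than re-running the command until the pod appears, you can have kubectl block until the rollout completes. This is a minimal sketch that assumes the Deployment is named toolhive-operator (the name used in the status checks later on this page); the function name and the 120-second default timeout are arbitrary choices for illustration.

```shell
# Wait for the operator Deployment to finish rolling out.
# Assumption: the Deployment is named "toolhive-operator".
# The default timeout is an arbitrary value, not a documented recommendation.
wait_for_operator() {
  local timeout="${1:-120s}"
  kubectl rollout status deployment/toolhive-operator \
    -n toolhive-system --timeout="$timeout"
}

# Usage: wait_for_operator 180s
```

The command exits with a non-zero status if the rollout does not finish within the timeout, which makes it convenient in scripts.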

Check the logs of the operator pod:

kubectl logs -f -n toolhive-system <TOOLHIVE_OPERATOR_POD_NAME>

This shows you the logs of the operator pod, which can help you debug any issues. For comprehensive logging and audit capabilities, see the Logging infrastructure guide.
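If you'd rather not look up the pod name first, kubectl can resolve a pod from its owning Deployment. The sketch below assumes the Deployment is named toolhive-operator; the wrapper function name is hypothetical.

```shell
# Stream operator logs without knowing the pod name.
# Assumption: the Deployment is named "toolhive-operator".
# Extra arguments (for example -f or --tail=100) are passed through to kubectl.
operator_logs() {
  kubectl logs -n toolhive-system deployment/toolhive-operator "$@"
}

# Usage: operator_logs -f --tail=100
```

When the Deployment runs more than one replica, kubectl picks a single pod, so fall back to the explicit pod name if you need logs from a specific replica.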

Customize the operator

You can customize the operator installation by providing a values.yaml file with your configuration settings. For example, to change the number of replicas and set a specific ToolHive version, create a values.yaml file:

values.yaml
operator:
  replicaCount: 2
  toolhiveRunnerImage: ghcr.io/stacklok/toolhive:v0.2.17 # or `latest`

Install the operator with your custom values:

helm upgrade --install toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator \
  -n toolhive-system --create-namespace \
  -f values.yaml

To see all available configuration options, run:

helm show values oci://ghcr.io/stacklok/toolhive/toolhive-operator

Operator deployment modes

The ToolHive operator supports two distinct deployment modes to accommodate different security requirements and organizational structures.

Cluster mode (default)

Cluster mode provides the operator with cluster-wide access to manage MCPServer resources in any namespace. This is the default mode and is suitable for platform teams managing MCPServers across the entire cluster.

Characteristics:

Full cluster-wide access to manage MCPServers in any namespace
Uses ClusterRole and ClusterRoleBinding for broad permissions
Simplest configuration and management
Best for single-tenant clusters or trusted environments

To explicitly configure cluster mode, include the following property in your Helm values.yaml file:

values.yaml
operator:
  rbac:
    scope: 'cluster'

Reference the values.yaml file when you install the operator using Helm:

helm upgrade --install toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator \
  -n toolhive-system --create-namespace \
  -f values.yaml

This is the default configuration used in the standard installation commands.

Namespace mode

Namespace mode restricts the operator's access to the namespaces you specify. This mode is well suited to multi-tenant environments and organizations following the principle of least privilege.

Characteristics:

Restricted access to only specified namespaces
Uses ClusterRole with namespace-specific RoleBindings for precise access control
Enhanced security through reduced blast radius
Ideal for multi-tenant environments and compliance requirements

To configure namespace mode, include the following in your Helm values.yaml:

values.yaml
operator:
  rbac:
    scope: 'namespace'
    allowedNamespaces:
      - 'team-frontend'
      - 'team-backend'
      - 'staging'
      - 'production'

This example lets the operator manage MCPServer resources in the four namespaces listed in the allowedNamespaces property. Adjust the list to match your environment.
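If the namespace list comes from a script or a CI variable, the values file can be generated instead of edited by hand. This is a minimal sketch; the function name render_namespace_values is hypothetical, and the keys are copied from the example above.

```shell
# Print a namespace-mode values file for the namespaces passed as arguments.
# The function name is hypothetical; the YAML keys mirror the example above.
render_namespace_values() {
  local ns
  printf 'operator:\n'
  printf '  rbac:\n'
  printf "    scope: 'namespace'\n"
  printf '    allowedNamespaces:\n'
  for ns in "$@"; do
    printf "      - '%s'\n" "$ns"
  done
}

# Usage: render_namespace_values team-frontend team-backend > values.yaml
```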

Reference the values.yaml file when you install the operator using Helm:

helm upgrade --install toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator \
  -n toolhive-system --create-namespace \
  -f values.yaml

Verify the RoleBindings are created:

kubectl get rolebinding --all-namespaces | grep toolhive

You should see RoleBindings in the specified namespaces, granting the operator access to manage MCPServers. Example output:

NAMESPACE        NAME                                           ROLE
team-frontend    toolhive-operator-manager-rolebinding          ClusterRole/toolhive-operator-manager-role
team-backend     toolhive-operator-manager-rolebinding          ClusterRole/toolhive-operator-manager-role
staging          toolhive-operator-manager-rolebinding          ClusterRole/toolhive-operator-manager-role
production       toolhive-operator-manager-rolebinding          ClusterRole/toolhive-operator-manager-role
toolhive-system  toolhive-operator-leader-election-rolebinding  Role/toolhive-operator-leader-election-role

Migrate between modes

You can switch between cluster mode and namespace mode by updating the values.yaml file and reapplying the Helm chart as shown above. Migration in both directions is supported.
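Before and after a migration, it can help to confirm which scope the release is actually using. The sketch below reads the computed values of the release with helm get values and prints the first scope key it finds. It assumes the release is named toolhive-operator in the toolhive-system namespace, as in the install commands on this page, and that the first key named scope in the output is the RBAC scope. Both the function name and that parsing shortcut are illustrative assumptions.

```shell
# Print the RBAC scope of the installed release ("cluster" or "namespace").
# Assumptions: release "toolhive-operator" in namespace "toolhive-system",
# and the first "scope:" key in the computed values is operator.rbac.scope.
current_rbac_scope() {
  helm get values toolhive-operator -n toolhive-system --all -o yaml \
    | awk '$1 == "scope:" { gsub(/["'\'']/, "", $2); print $2; exit }'
}

# Usage: current_rbac_scope
```

The --all flag includes chart defaults, so the scope is reported even when it was never set explicitly in a values file.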

Check operator status

To verify the operator is working correctly:

# Verify CRDs are installed
kubectl get crd | grep toolhive

# Check operator deployment status
kubectl get deployment -n toolhive-system toolhive-operator

# Check operator service account and RBAC
kubectl get serviceaccount -n toolhive-system
kubectl get clusterrole | grep toolhive
kubectl get clusterrolebinding | grep toolhive

# Check operator pod status
kubectl get pods -n toolhive-system

# Check operator pod logs
kubectl logs -n toolhive-system <TOOLHIVE_OPERATOR_POD_NAME>

Upgrade the operator

To upgrade the ToolHive operator to a new version, you need to upgrade both the CRDs and the operator installation.

Upgrade the CRDs

Upgrade the CRDs before you upgrade the operator. Helm does not upgrade CRDs automatically, so you need to upgrade the CRD Helm chart and then apply the CRDs using kubectl.

First, upgrade the CRD Helm chart to match your target operator version:

helm upgrade -i toolhive-operator-crds oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds --version 0.0.52

Then apply the CRDs from the same version tag:

kubectl apply -f https://raw.githubusercontent.com/stacklok/toolhive/refs/tags/toolhive-operator-crds-0.0.52/deploy/charts/operator-crds/crds/toolhive.stacklok.dev_mcpexternalauthconfigs.yaml
kubectl apply -f https://raw.githubusercontent.com/stacklok/toolhive/refs/tags/toolhive-operator-crds-0.0.52/deploy/charts/operator-crds/crds/toolhive.stacklok.dev_mcptoolconfigs.yaml
kubectl apply -f https://raw.githubusercontent.com/stacklok/toolhive/refs/tags/toolhive-operator-crds-0.0.52/deploy/charts/operator-crds/crds/toolhive.stacklok.dev_mcpremoteproxies.yaml
kubectl apply -f https://raw.githubusercontent.com/stacklok/toolhive/refs/tags/toolhive-operator-crds-0.0.52/deploy/charts/operator-crds/crds/toolhive.stacklok.dev_mcpservers.yaml
kubectl apply -f https://raw.githubusercontent.com/stacklok/toolhive/refs/tags/toolhive-operator-crds-0.0.52/deploy/charts/operator-crds/crds/toolhive.stacklok.dev_mcpgroups.yaml
kubectl apply -f https://raw.githubusercontent.com/stacklok/toolhive/refs/tags/toolhive-operator-crds-0.0.52/deploy/charts/operator-crds/crds/toolhive.stacklok.dev_mcpregistries.yaml

Replace 0.0.52 in both steps (the helm command and every kubectl apply URL) with your target CRD version.
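Because the six URLs differ only in the version and the CRD name, the kubectl step can be scripted so the version is written once. This is a sketch, not an official tool: the function names are hypothetical, and the CRD list is copied from the commands above, so it may not match other chart versions.

```shell
# Build the raw GitHub URL for one CRD manifest at a given CRD chart version.
crd_url() {
  local version="$1" name="$2"
  printf 'https://raw.githubusercontent.com/stacklok/toolhive/refs/tags/toolhive-operator-crds-%s/deploy/charts/operator-crds/crds/toolhive.stacklok.dev_%s.yaml\n' \
    "$version" "$name"
}

# Apply the six CRDs listed above for the given version.
# Assumption: the set of CRDs is the same for the version you pass in.
apply_crds() {
  local version="$1" name
  for name in mcpexternalauthconfigs mcptoolconfigs mcpremoteproxies \
              mcpservers mcpgroups mcpregistries; do
    kubectl apply -f "$(crd_url "$version" "$name")"
  done
}

# Usage: apply_crds 0.0.52
```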

Upgrade the operator Helm release

After the CRDs are upgraded, upgrade the operator installation using Helm:

helm upgrade -i toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator -n toolhive-system --reuse-values

This upgrades the operator to the latest version available in the OCI registry. To upgrade to a specific version, add the --version flag:

helm upgrade -i toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator -n toolhive-system --reuse-values --version 0.3.7

If you have a custom values.yaml file, include it with the -f flag:

helm upgrade -i toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator -n toolhive-system --reuse-values -f values.yaml

Uninstall the operator

To uninstall the operator and CRDs:

First, uninstall the operator:

helm uninstall toolhive-operator -n toolhive-system

Then, if you want to completely remove ToolHive including all CRDs and related resources, delete the CRDs manually.

Warning: This will delete all MCPServer and related resources in your cluster!

kubectl delete crd mcpexternalauthconfigs.toolhive.stacklok.dev
kubectl delete crd mcptoolconfigs.toolhive.stacklok.dev
kubectl delete crd mcpremoteproxies.toolhive.stacklok.dev
kubectl delete crd mcpservers.toolhive.stacklok.dev
kubectl delete crd mcpgroups.toolhive.stacklok.dev
kubectl delete crd mcpregistries.toolhive.stacklok.dev
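Before running the delete commands above, you may want to list the custom resources that would be removed along with the CRDs. The following sketch discovers every CRD in the toolhive.stacklok.dev group and lists its instances across all namespaces; the function name is hypothetical.

```shell
# List all existing custom resources for every ToolHive CRD in the cluster.
list_toolhive_resources() {
  local crd
  for crd in $(kubectl get crd -o name | grep -F 'toolhive.stacklok.dev'); do
    # "kubectl get crd -o name" prints "<kind>/<crd-name>"; keep the part after the slash,
    # for example mcpservers.toolhive.stacklok.dev.
    kubectl get "${crd#*/}" --all-namespaces
  done
}

# Usage: list_toolhive_resources
```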

Finally, uninstall the CRD Helm chart metadata:

helm uninstall toolhive-operator-crds

If you created the toolhive-system namespace with Helm's --create-namespace flag, delete it manually:

kubectl delete namespace toolhive-system

Next steps

See Run MCP servers in Kubernetes to learn how to create and manage MCP servers using the ToolHive operator in your Kubernetes cluster. The operator supports deploying MCPServer resources based on the deployment mode configured during installation.

Related information

Kubernetes introduction - Overview of ToolHive's Kubernetes integration
ToolHive operator tutorial - Step-by-step tutorial for getting started using a local kind cluster

Troubleshooting

Authentication error with ghcr.io

If you encounter an authentication error when pulling the Helm chart, it might indicate a problem with your access to the GitHub Container Registry (ghcr.io).

ToolHive's charts and images are public, but if you've previously logged into ghcr.io using a personal access token, you might need to re-authenticate if your token has expired or been revoked.

See the GitHub documentation to re-authenticate to the registry.
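Because the charts are public, another option is to remove the stale credentials so that Helm pulls anonymously. The sketch below is one way to do that; the function name is hypothetical. It also clears Docker's stored login on the assumption that Helm may fall back to Docker's credential store on your machine.

```shell
# Drop cached ghcr.io credentials so public charts are pulled anonymously.
clear_ghcr_credentials() {
  # Ignore the error Helm reports when no login is stored.
  helm registry logout ghcr.io || true
  # Assumption: Helm may also read Docker's credential store, so clear that too.
  if command -v docker >/dev/null 2>&1; then
    docker logout ghcr.io || true
  fi
}

# Usage: clear_ghcr_credentials
```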

Operator pod fails to start

If the operator pod is not starting or is in a CrashLoopBackOff state, check the pod logs for error messages:

kubectl get pods -n toolhive-system
# Note the name of the toolhive-operator pod

kubectl describe pod -n toolhive-system <TOOLHIVE_OPERATOR_POD_NAME>
kubectl logs -n toolhive-system <TOOLHIVE_OPERATOR_POD_NAME>

Common causes include:

Missing CRDs: Ensure the CRDs were installed successfully before installing the operator. The operator requires the CRDs to function properly.
Configuration errors: Check your values.yaml file for any misconfigurations
Insufficient permissions: Ensure your cluster has the necessary RBAC permissions for the operator to function
Resource constraints: Check if the cluster has sufficient CPU and memory resources available
Image pull issues: Verify that the cluster can pull images from ghcr.io

CRDs installation fails

If the CRDs installation fails, you might see errors about existing resources or permission issues:

# Check if CRDs already exist
kubectl get crd | grep toolhive

# Remove existing CRDs if needed (this will delete all related resources)
kubectl delete crd <CRD_NAME>

To reinstall the CRDs:

helm uninstall toolhive-operator-crds
helm upgrade -i toolhive-operator-crds oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds

Namespace creation issues

If you encounter permission errors when creating the toolhive-system namespace, create it manually first:

kubectl create namespace toolhive-system

Then install the operator without the --create-namespace flag:

helm upgrade -i toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator -n toolhive-system

Helm chart not found

If Helm cannot find the chart, ensure you're using the correct OCI registry URL and that your Helm version supports OCI registries (v3.8.0+):

# Check Helm version
helm version

# Try pulling the chart explicitly
helm pull oci://ghcr.io/stacklok/toolhive/toolhive-operator
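To check the version requirement in a script rather than by eye, you can compare the reported Helm version against the minimum. This is a small sketch for a POSIX-style shell such as bash; the function name version_at_least is hypothetical, and it only handles plain MAJOR.MINOR.PATCH versions with an optional leading v and build suffix.

```shell
# Succeed if version $1 (for example v3.14.2) is at least version $2 (for example 3.8.0).
version_at_least() {
  local have="${1#v}" want="${2#v}"
  have="${have%%[+-]*}"   # drop build metadata such as +g1234abc
  local IFS=.
  # Word-split on dots: "3.14.2" becomes the three fields 3 14 2.
  set -- $have
  local h1="${1:-0}" h2="${2:-0}" h3="${3:-0}"
  set -- $want
  local w1="${1:-0}" w2="${2:-0}" w3="${3:-0}"
  [ "$h1" -gt "$w1" ] && return 0
  [ "$h1" -lt "$w1" ] && return 1
  [ "$h2" -gt "$w2" ] && return 0
  [ "$h2" -lt "$w2" ] && return 1
  [ "$h3" -ge "$w3" ]
}

# Usage:
#   version_at_least "$(helm version --template='{{.Version}}')" 3.8.0 \
#     && echo "This Helm version supports OCI registries"
```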

Network connectivity issues

If you're experiencing network timeouts or connection issues:

Verify your cluster has internet access to reach ghcr.io
Check if your organization uses a proxy or firewall that might block access
Consider using a private registry mirror if direct access is restricted
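A quick way to test basic reachability from your workstation is to request the registry's API root with curl. In this sketch any HTTP status, including an authentication challenge, means the registry answered, while a timeout or connection error points to a network problem. The function name and the 10-second limit are illustrative, and the check says nothing about connectivity from the cluster nodes themselves.

```shell
# Report whether ghcr.io answers HTTP requests from this machine.
# Any HTTP status counts as reachable; curl failures (timeout, DNS, TLS) return 1.
check_ghcr_reachable() {
  local code
  code="$(curl -sS -o /dev/null --max-time 10 -w '%{http_code}' https://ghcr.io/v2/)" || return 1
  echo "ghcr.io responded with HTTP $code"
}

# Usage: check_ghcr_reachable || echo "Cannot reach ghcr.io"
```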
