
Deploying with Helm Charts

This page provides instructions for deploying a Fluss cluster on Kubernetes using Helm charts. The chart creates a distributed streaming storage system with CoordinatorServer and TabletServer components.

Prerequisites

Before installing the Fluss Helm chart, ensure you have:

note

A Fluss cluster deployment requires a running ZooKeeper ensemble. To provide flexibility in deployment and enable reuse of existing infrastructure, the Fluss Helm chart does not include a bundled ZooKeeper cluster. If you don't already have a ZooKeeper ensemble running, the installation steps below include instructions for deploying one using Bitnami's Helm chart.

Supported Versions

Component | Minimum Version | Recommended Version
Kubernetes | v1.19+ | v1.25+
Helm | v3.8.0+ | v3.18.6+
ZooKeeper | v3.6+ | v3.8+
Apache Fluss (Container Image) | 1.0-SNAPSHOT | 1.0-SNAPSHOT
Minikube (Local Development) | v1.25+ | v1.32+
Docker (Local Development) | v20.10+ | v24.0+

Installation

Running Fluss locally with Minikube

For local testing and development, you can deploy Fluss on Minikube. This is ideal for development, testing and learning purposes.

Prerequisites

  • Docker container runtime
  • At least 4GB RAM available for Minikube
  • At least 2 CPU cores available

Start Minikube

# Start Minikube with recommended settings for Fluss
minikube start

# Verify cluster is ready
kubectl cluster-info
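
If your Minikube defaults are smaller than the prerequisites above, you can instead start it with the resource minimums passed explicitly (the values below simply mirror that list; adjust as needed):

# Start Minikube with at least 2 CPUs and 4GB of memory
minikube start --cpus=2 --memory=4096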

Configure Docker Environment (Optional)

To build images directly in Minikube you need to configure the Docker CLI to use Minikube's internal Docker daemon:

# Configure shell to use Minikube's Docker daemon
eval $(minikube docker-env)

To build custom images please refer to Custom Container Images.

Installing the chart on a cluster

This installation process works for both a distributed Kubernetes cluster and a Minikube setup.

Step 1: Deploy ZooKeeper (optional if you already have a ZooKeeper ensemble)

If you have an existing ZooKeeper cluster, you can skip this step. Otherwise, start ZooKeeper using Bitnami's chart or your own deployment. Example with Bitnami's chart:

# Add Bitnami repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Deploy ZooKeeper
helm install zk bitnami/zookeeper
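
Before continuing, you may want to confirm ZooKeeper is ready. Assuming the Bitnami chart's standard labels and the zk release name used above:

# Wait for the ZooKeeper pod(s) to become ready
kubectl get pods -l app.kubernetes.io/name=zookeeper

# The Fluss chart's default zookeeper.address expects this service
kubectl get svc zk-zookeeper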

Step 2: Deploy Fluss

Install from Helm repo

helm repo add fluss https://downloads.apache.org/incubator/fluss/helm-chart
helm repo update
helm install fluss fluss/fluss

Install from Local Chart

helm install fluss ./helm

Install with Custom Values

You can customize the installation by providing your own values.yaml file or setting individual parameters via the --set flag. Using a custom values file:

helm install fluss ./helm -f my-values.yaml

Or, for example, to change the ZooKeeper address via the --set flag:

helm install fluss ./helm \
--set configurationOverrides.zookeeper.address=<my-zk-cluster>:2181
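
After either installation path, you can check the release status and inspect the values Helm applied (standard Helm commands; fluss is the release name used above):

# Show release status
helm status fluss

# Show user-supplied and computed values
helm get values fluss --all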

Cleanup

# Uninstall Fluss
helm uninstall fluss

# Uninstall ZooKeeper
helm uninstall zk

# Delete PVCs
kubectl delete pvc -l app.kubernetes.io/name=fluss

# Stop Minikube
minikube stop

# Delete Minikube cluster
minikube delete

Architecture Overview

The Fluss Helm chart deploys the following Kubernetes resources:

Core Components

  • CoordinatorServer: 1x StatefulSet with Headless Service for cluster coordination
  • TabletServer: 3x StatefulSet with Headless Service for data storage and processing
  • ConfigMap: Configuration management for server.yaml settings
  • Services: Headless services providing stable pod DNS names, plus optional dedicated headless services when metrics are enabled
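
To see what the chart rendered for your release, you can list these resources by the chart's standard labels:

# List the StatefulSets, Services, and ConfigMaps created by the chart
kubectl get statefulsets,services,configmaps -l app.kubernetes.io/name=fluss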

Step 3: Verify Installation

# Check pod status
kubectl get pods -l app.kubernetes.io/name=fluss

# Check services
kubectl get svc -l app.kubernetes.io/name=fluss

# View logs
kubectl logs -l app.kubernetes.io/component=coordinator
kubectl logs -l app.kubernetes.io/component=tablet

Configuration Parameters

The following table lists the configurable parameters of the Fluss chart, and their default values.

Global Parameters

Parameter | Description | Default
nameOverride | Override the name of the chart | ""
fullnameOverride | Override the full name of the resources | ""

Image Parameters

Parameter | Description | Default
image.registry | Container image registry | ""
image.repository | Container image repository | fluss
image.tag | Container image tag | 1.0-SNAPSHOT
image.pullPolicy | Container image pull policy | IfNotPresent
image.pullSecrets | Container image pull secrets | []

Application Configuration

Parameter | Description | Default
listeners.internal.port | Internal communication port | 9123
listeners.client.port | Client port (intra-cluster) | 9124

Security Configuration

Parameter | Description | Default
security.client.sasl.mechanism | Client listener SASL mechanism ("", plain) | ""
security.internal.sasl.mechanism | Internal listener SASL mechanism ("", plain) | ""
security.client.sasl.plain.users | Client listener username and password pairs for PLAIN | []
security.internal.sasl.plain.username | Internal listener PLAIN username | ""
security.internal.sasl.plain.password | Internal listener PLAIN password | ""
security.internal.sasl.plain.existingSecret | Reference to a pre-existing Secret for internal SASL credentials | {}

Only the plain mechanism is supported for now. An empty string disables SASL authentication and maps to the PLAINTEXT protocol.

If the internal SASL username or password is left empty, the chart automatically generates credentials based on the Helm release name:

  • Username is set to "fluss-internal-user-<release-name>"
  • Password is set to the SHA-256 hash of "fluss-internal-password-<release-name>"

It is recommended to set these explicitly in production.
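
For example, a minimal values.yaml snippet setting the internal credentials explicitly (the username and password here are placeholders):

security:
  internal:
    sasl:
      mechanism: plain
      plain:
        username: fluss-internal   # placeholder
        password: change-me        # placeholder; prefer existingSecret in production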

ZooKeeper SASL Parameters

Parameter | Description | Default
security.zookeeper.sasl.mechanism | ZooKeeper SASL mechanism ("", plain) | ""
security.zookeeper.sasl.plain.username | ZooKeeper SASL username | ""
security.zookeeper.sasl.plain.password | ZooKeeper SASL password | ""
security.zookeeper.sasl.plain.loginModuleClass | JAAS login module class for ZooKeeper | org.apache.fluss.shaded.zookeeper3.org.apache.zookeeper.server.auth.DigestLoginModule
security.zookeeper.sasl.plain.existingSecret | Reference to a pre-existing Secret for ZooKeeper SASL credentials | {}

Sourcing SASL Credentials from a Pre-existing Secret

To keep SASL passwords out of values.yaml and the Helm release storage, reference a Secret managed separately — e.g., via External Secrets Operator, Sealed Secrets, or a CI pipeline.

For internal and ZooKeeper listeners, set existingSecret on the listener:

security:
  internal:
    sasl:
      mechanism: plain
      plain:
        existingSecret:
          name: fluss-internal-sasl   # required
          usernameKey: username       # optional, defaults to "username"
          passwordKey: password       # optional, defaults to "password"
  zookeeper:
    sasl:
      mechanism: plain
      plain:
        existingSecret:
          name: fluss-zk-sasl

Client users follow the same shape as internal/ZooKeeper listeners: each entry is either a literal {username, password} pair or an existingSecret reference that sources both fields from a Secret.

security:
  client:
    sasl:
      mechanism: plain
      plain:
        users:
          - username: alice
            password: alice-literal-password   # literal — visible in values.yaml
          - existingSecret:                    # or resolved at pod startup
              name: fluss-client-sasl-bob
              usernameKey: username            # optional, defaults to "username"
              passwordKey: password            # optional, defaults to "password"

Whenever JAAS is required, the chart renders a ConfigMap (<release>-fluss-sasl-jaas-config) containing a jaas.conf template with ${FLUSS_JAAS_…} placeholders — no credentials. An init container mounts that template, runs envsubst with credentials supplied via env vars (either literal value: entries from values.yaml or valueFrom.secretKeyRef to a pre-existing Secret), and writes the resolved jaas.conf to an in-memory emptyDir that the main Fluss container reads.

  • Literal and Secret-sourced credentials can be mixed across listeners.
  • When every credential comes from a Secret, no plaintext password lives in the Helm release.
  • The init container reuses the main Fluss image (already present on the node), keeping zero extra image dependencies.
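
For instance, you can inspect the rendered, credential-free template with kubectl (substitute your release name):

# Show the jaas.conf template with ${FLUSS_JAAS_…} placeholders
kubectl get configmap <release>-fluss-sasl-jaas-config -o yaml
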
Example: External Secrets Operator

If you use External Secrets Operator to sync credentials from an upstream secret manager (AWS Secrets Manager, Vault, GCP Secret Manager, etc.), the flow is: upstream → ExternalSecret CR → a Kubernetes Secret → the chart.

For internal listener credentials stored at prod/fluss/internal in AWS Secrets Manager with fields username and password:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: fluss-internal-sasl
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: SecretStore
  target:
    name: fluss-internal-sasl
  data:
    - secretKey: username
      remoteRef:
        key: prod/fluss/internal
        property: username
    - secretKey: password
      remoteRef:
        key: prod/fluss/internal
        property: password

Then in values.yaml:

security:
  internal:
    sasl:
      mechanism: plain
      plain:
        existingSecret:
          name: fluss-internal-sasl

For the multi-user client listener, provision one Secret per user with username and password keys:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: fluss-client-sasl-alice
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: SecretStore
  target:
    name: fluss-client-sasl-alice
  data:
    - secretKey: username
      remoteRef:
        key: prod/fluss/clients/alice
        property: username
    - secretKey: password
      remoteRef:
        key: prod/fluss/clients/alice
        property: password
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: fluss-client-sasl-bob
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: SecretStore
  target:
    name: fluss-client-sasl-bob
  data:
    - secretKey: username
      remoteRef:
        key: prod/fluss/clients/bob
        property: username
    - secretKey: password
      remoteRef:
        key: prod/fluss/clients/bob
        property: password

Then in values.yaml:

security:
  client:
    sasl:
      mechanism: plain
      plain:
        users:
          - existingSecret: { name: fluss-client-sasl-alice }
          - existingSecret: { name: fluss-client-sasl-bob }

The same pattern works with Sealed Secrets, HashiCorp Vault Agent Injector (producing a native Secret), or any other controller that lands credentials in a Secret — the chart only cares about the final Secret, not how it got there.

Metrics Parameters

Parameter | Description | Default
metrics.reporters | Comma-separated reporter selector; use "" to disable metrics | ""
metrics.jmx.port | JMX reporter port range | 9250
metrics.prometheus.port | Prometheus reporter port | 9249
metrics.prometheus.service.portName | Named port exposed on metrics services | metrics
metrics.prometheus.service.labels | Additional labels added to metrics services | {}
metrics.prometheus.service.annotations | Optional annotations added to metrics services | {}

Fluss Configuration Overrides

Parameter | Description | Default
configurationOverrides.default.bucket.number | Default number of buckets for tables | 3
configurationOverrides.default.replication.factor | Default replication factor | 3
configurationOverrides.zookeeper.path.root | ZooKeeper root path for Fluss | /fluss
configurationOverrides.zookeeper.address | ZooKeeper ensemble address | zk-zookeeper.{{ .Release.Namespace }}.svc.cluster.local:2181
configurationOverrides.remote.data.dir | Remote data directory for snapshots | /tmp/fluss/remote-data
configurationOverrides.data.dir | Local data directory | /tmp/fluss/data
configurationOverrides.internal.listener.name | Internal listener name | INTERNAL

Tablet Server Parameters

Parameter | Description | Default
tablet.numberOfReplicas | Number of TabletServer replicas to deploy | 3
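
For example, to change the replica count at install or upgrade time (5 is only an illustrative value):

helm upgrade --install fluss ./helm --set tablet.numberOfReplicas=5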

Scheduling Parameters

Parameter | Description | Default
tablet.affinity | Affinity rules for TabletServer pods | {}
tablet.nodeSelector | Node selector for TabletServer pods | {}
tablet.tolerations | Tolerations for TabletServer pods | []
tablet.topologySpreadConstraints | Topology spread constraints for TabletServer pods | []
coordinator.affinity | Affinity rules for CoordinatorServer pods | {}
coordinator.nodeSelector | Node selector for CoordinatorServer pods | {}
coordinator.tolerations | Tolerations for CoordinatorServer pods | []
coordinator.topologySpreadConstraints | Topology spread constraints for CoordinatorServer pods | []

Storage Parameters

Parameter | Description | Default
coordinator.storage.enabled | Enable persistent volume claims for CoordinatorServer | false
coordinator.storage.size | Coordinator persistent volume size | 1Gi
coordinator.storage.storageClass | Coordinator storage class name | nil (uses default)
tablet.storage.enabled | Enable persistent volume claims for TabletServer | false
tablet.storage.size | Tablet persistent volume size | 1Gi
tablet.storage.storageClass | Tablet storage class name | nil (uses default)

Resource Parameters

Parameter | Description | Default
resources.coordinatorServer.requests.cpu | CPU requests for coordinator | Not set
resources.coordinatorServer.requests.memory | Memory requests for coordinator | Not set
resources.coordinatorServer.limits.cpu | CPU limits for coordinator | Not set
resources.coordinatorServer.limits.memory | Memory limits for coordinator | Not set
resources.tabletServer.requests.cpu | CPU requests for tablet servers | Not set
resources.tabletServer.requests.memory | Memory requests for tablet servers | Not set
resources.tabletServer.limits.cpu | CPU limits for tablet servers | Not set
resources.tabletServer.limits.memory | Memory limits for tablet servers | Not set
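
Because no defaults are set, you will usually want to define requests and limits yourself. A sketch using the parameter paths above (the sizes are illustrative, not recommendations):

resources:
  coordinatorServer:
    requests:
      cpu: "1"
      memory: 2Gi
    limits:
      cpu: "2"
      memory: 4Gi
  tabletServer:
    requests:
      cpu: "2"
      memory: 4Gi
    limits:
      cpu: "4"
      memory: 8Gi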

Pod Extension Parameters

Parameter | Description | Default
coordinator.extraVolumes | Extra volumes to add to the CoordinatorServer pod spec | []
coordinator.extraVolumeMounts | Extra volume mounts to add to the coordinator container | []
coordinator.initContainers | Init containers to run before the coordinator container starts | []
coordinator.extraEnv | Additional environment variables for the coordinator container | []
coordinator.envFrom | Additional envFrom sources (e.g., Secrets, ConfigMaps) for the coordinator container | []
coordinator.podAnnotations | Annotations to add to CoordinatorServer pods | {}
coordinator.podLabels | Additional labels to add to CoordinatorServer pods | {}
coordinator.podDisruptionBudget.enabled | Enable PodDisruptionBudget for CoordinatorServer | false
coordinator.podDisruptionBudget.minAvailable | Minimum available coordinator pods during disruption | Not set
coordinator.podDisruptionBudget.maxUnavailable | Maximum unavailable coordinator pods during disruption | Not set
tablet.extraVolumes | Extra volumes to add to TabletServer pod specs | []
tablet.extraVolumeMounts | Extra volume mounts to add to the tablet container | []
tablet.initContainers | Init containers to run before the tablet container starts | []
tablet.extraEnv | Additional environment variables for the tablet container | []
tablet.envFrom | Additional envFrom sources (e.g., Secrets, ConfigMaps) for the tablet container | []
tablet.podAnnotations | Annotations to add to TabletServer pods | {}
tablet.podLabels | Additional labels to add to TabletServer pods | {}
tablet.podDisruptionBudget.enabled | Enable PodDisruptionBudget for TabletServer | false
tablet.podDisruptionBudget.minAvailable | Minimum available tablet server pods during disruption | Not set
tablet.podDisruptionBudget.maxUnavailable | Maximum unavailable tablet server pods during disruption | Not set
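
As an illustration, the PodDisruptionBudget parameters above can be combined like this (the thresholds are example values):

coordinator:
  podDisruptionBudget:
    enabled: true
    maxUnavailable: 1

tablet:
  podDisruptionBudget:
    enabled: true
    minAvailable: 2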

Advanced Configuration

Injecting Environment Variables from External Secrets

You can inject environment variables from Kubernetes Secrets or ConfigMaps using envFrom. This is useful when combined with the External Secrets Operator or similar tools that provision Secrets from external stores (AWS Secrets Manager, HashiCorp Vault, etc.).

tablet:
  envFrom:
    - secretRef:
        name: aws-credentials
coordinator:
  envFrom:
    - secretRef:
        name: aws-credentials

You can also set individual environment variables using extraEnv:

tablet:
  extraEnv:
    - name: AWS_REGION
      value: us-east-1
    - name: MY_SECRET
      valueFrom:
        secretKeyRef:
          name: my-secret
          key: password

Custom ZooKeeper Configuration

For external ZooKeeper clusters:

configurationOverrides:
  zookeeper.address: "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"
  zookeeper.path.root: "/my-fluss-cluster"

Network Configuration

The chart automatically configures listeners for internal cluster communication and external client access:

  • Internal Port (9123): Used for internal communication within the cluster
  • Client Port (9124): Used for client connections

Custom listener configuration:

listeners:
  internal:
    port: 9123
  client:
    port: 9124

security:
  client:
    sasl:
      mechanism: ""
  internal:
    sasl:
      mechanism: ""

Enabling Secure Connection

With the Helm deployment, you can specify the authentication mechanism used when connecting to the Fluss cluster.

The following table shows the supported mechanisms and security they provide:

Mechanism | Method | Authentication | TLS Encryption
"" | PLAINTEXT | No | No
plain | SASL | Yes | No

By default, the PLAINTEXT protocol is used.

You can enable SASL authentication by setting the plain mechanism:

security:
  client:
    sasl:
      mechanism: plain
      plain:
        users:
          - username: client-user
            password: client-password
  internal:
    sasl:
      mechanism: plain
      plain:
        username: internal-user
        password: internal-password

Enabling ZooKeeper SASL Authentication

You can enable ZooKeeper ensemble SASL authentication with the following values in the Fluss Helm chart:

security:
  zookeeper:
    sasl:
      mechanism: plain
      plain:
        username: fluss-zk-user
        password: fluss-zk-password

The security.zookeeper.sasl.plain.username and security.zookeeper.sasl.plain.password fields are required when security.zookeeper.sasl.mechanism is set to plain.

ZooKeeper SASL can be enabled independently or together with the listeners SASL authentication.

Metrics and Monitoring

When metrics.reporters is set, the chart adds the following server.yaml configuration entries:

  • metrics.reporters: comma-separated reporter names from metrics.reporters
  • metrics.reporter.<name>.port: port value from metrics.<name>.port

These values are managed by the chart and cannot be set via configurationOverrides. All other metrics reporter options (refer to the Fluss configuration) should be specified via configurationOverrides.
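
For example, with metrics.reporters set to prometheus and the default port, the chart renders entries equivalent to:

metrics.reporters: prometheus
metrics.reporter.prometheus.port: 9249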

Prometheus Annotation Based Scraping

The example values below show how to add annotations to the metrics services so that a Prometheus server can discover and scrape them automatically based on the annotations:

metrics:
  reporters: prometheus
  prometheus:
    port: 9249
    service:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "9249"

Prometheus ServiceMonitor Based Scraping

Similarly, if using the Prometheus Operator, use the values below to add labels to the metrics services:

metrics:
  reporters: prometheus
  prometheus:
    port: 9249
    service:
      portName: metrics
      labels:
        monitoring: enabled

Then create a ServiceMonitor that selects those services by matching the labels:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: fluss-metrics
spec:
  selector:
    matchLabels:
      monitoring: enabled
  endpoints:
    # Matches `metrics.prometheus.service.portName`
    - port: metrics

Storage Configuration

Configure different storage volumes for coordinator or tablet pods:

coordinator:
  storage:
    enabled: true
    size: 5Gi
    storageClass: fast-ssd

tablet:
  storage:
    enabled: true
    size: 20Gi
    storageClass: fast-ssd

Configure remote storage:

configurationOverrides:
  data.dir: "/data/fluss"
  remote.data.dir: "s3://my-bucket/fluss-data"

Pod Scheduling

By default, Kubernetes may schedule all tablet server pods on the same node. Even with replication factor 3, a single node failure could take out all replicas simultaneously, causing data loss for segments not yet tiered to remote storage.

Use pod anti-affinity to spread tablet server pods across availability zones and nodes:

tablet:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            topologyKey: topology.kubernetes.io/zone
            labelSelector:
              matchLabels:
                app.kubernetes.io/instance: <release-name>
                app.kubernetes.io/component: tablet
        - weight: 50
          podAffinityTerm:
            topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                app.kubernetes.io/instance: <release-name>
                app.kubernetes.io/component: tablet

Replace <release-name> with your Helm release name (the value passed to helm install) so the selector scopes to pods of this release only. This matters when multiple Fluss releases share the cluster — otherwise anti-affinity would count pods across releases.

This configuration prioritizes zone-level spreading (weight 100) while also avoiding co-location on the same node (weight 50). For stricter guarantees, use requiredDuringSchedulingIgnoredDuringExecution instead — but note that pods will stay pending if no suitable node is available.

Alternatively, use topologySpreadConstraints for even distribution across failure domains:

tablet:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app.kubernetes.io/instance: <release-name>
          app.kubernetes.io/component: tablet
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app.kubernetes.io/instance: <release-name>
          app.kubernetes.io/component: tablet

You can also pin pods to specific nodes using nodeSelector or allow scheduling on tainted nodes with tolerations:

tablet:
  nodeSelector:
    workload: fluss
  tolerations:
    - key: dedicated
      operator: Equal
      value: fluss
      effect: NoSchedule

The same scheduling fields are available for coordinator servers under coordinator.affinity, coordinator.nodeSelector, coordinator.tolerations, and coordinator.topologySpreadConstraints.

Loading Filesystem Plugins via Init Containers

Fluss discovers filesystem plugins at startup by scanning subdirectories under $FLUSS_HOME/plugins/.
To load a plugin that is not bundled in the base image, you can use an init container to download the plugin jar into a shared emptyDir volume before the main container starts.

The example below loads the Azure filesystem plugin (fluss-fs-azure) so that Fluss can read and write remote data in Azure Blob Storage (the example uses version 0.9; adapt it to your needs):

_fsAzurePlugin: &fsAzurePlugin
  extraVolumes:
    - name: azure-plugin
      emptyDir: {}
  extraVolumeMounts:
    - name: azure-plugin
      mountPath: /opt/fluss/plugins/azure
      subPath: azure
  initContainers:
    - name: download-fs-azure
      image: alpine:3.20
      command:
        - sh
        - -c
        - |
          # Create the target directory inside the shared emptyDir before downloading
          mkdir -p /plugins/azure
          wget -O /plugins/azure/fluss-fs-azure-0.9.jar \
            https://repo1.maven.org/maven2/org/apache/fluss/fluss-fs-azure/0.9.0-incubating/fluss-fs-azure-0.9.0-incubating.jar
      volumeMounts:
        - name: azure-plugin
          mountPath: /plugins

coordinator:
  <<: *fsAzurePlugin

tablet:
  <<: *fsAzurePlugin

Upgrading

Upgrade the Chart

# Upgrade to a newer chart version
helm upgrade fluss ./helm

# Upgrade with new configuration
helm upgrade fluss ./helm -f values-new.yaml

Rolling Updates

The StatefulSets support rolling updates. When you update the configuration, pods will be restarted one by one to maintain availability.
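
To follow a rolling update, you can watch the StatefulSet rollouts; the names below are assumed from the pod names used elsewhere on this page (coordinator-server-0, tablet-server-0):

# Watch the coordinator and tablet server rollouts complete
kubectl rollout status statefulset/coordinator-server
kubectl rollout status statefulset/tablet-server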

Custom Container Images

Building Custom Images

To build and use custom Fluss images:

  1. Build the project with Maven:

     mvn clean package -DskipTests

  2. Build the Docker image:

     # Copy build artifacts
     cp -r build-target/* docker/fluss/build-target

     # Build image
     cd docker
     docker build -t my-registry/fluss:custom-tag .

  3. Use in Helm values:

     image:
       registry: my-registry
       repository: fluss
       tag: custom-tag
       pullPolicy: Always
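
If you built the image outside Minikube's Docker daemon (see Configure Docker Environment above), one option is to load it into the Minikube cluster directly:

# Make the locally built image available inside the Minikube cluster
minikube image load my-registry/fluss:custom-tag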

Monitoring and Observability

Health Checks

The chart includes liveness and readiness probes:

livenessProbe:
  tcpSocket:
    port: 9124
  initialDelaySeconds: 10
  periodSeconds: 3
  failureThreshold: 100

readinessProbe:
  tcpSocket:
    port: 9124
  initialDelaySeconds: 10
  periodSeconds: 3
  failureThreshold: 100

Logs

Access logs from different components:

# Coordinator logs
kubectl logs -l app.kubernetes.io/component=coordinator -f

# Tablet server logs
kubectl logs -l app.kubernetes.io/component=tablet -f

# Specific pod logs
kubectl logs coordinator-server-0 -f
kubectl logs tablet-server-0 -f

Troubleshooting

Common Issues

Pod Startup Issues

Symptoms: Pods stuck in Pending or CrashLoopBackOff state

Solutions:

# Check pod events
kubectl describe pod <pod-name>

# Check resource availability
kubectl describe nodes

# Verify ZooKeeper connectivity
kubectl exec -it <fluss-pod> -- nc -zv <zookeeper-host> 2181

Image Pull Errors

Symptoms: ImagePullBackOff or ErrImagePull

Solutions:

  • Verify image repository and tag exist
  • Check pull secrets configuration
  • Ensure network connectivity to registry

Connection Issues

Symptoms: Clients cannot connect to Fluss cluster

Solutions:

# Check service endpoints
kubectl get endpoints

# Test network connectivity
kubectl exec -it <client-pod> -- nc -zv <fluss-service> 9124

# Verify DNS resolution
kubectl exec -it <client-pod> -- nslookup <fluss-service>

Debug Commands

# Get all resources
kubectl get all -l app.kubernetes.io/name=fluss

# Check configuration
kubectl get configmap fluss-conf-file -o yaml


# Get detailed pod information
kubectl get pods -o wide -l app.kubernetes.io/name=fluss