Configuration #
Specifying Operator Configuration #
The operator allows users to specify default configuration that will be shared by the Flink operator itself and the Flink deployments.
These configuration files are mounted externally via ConfigMaps. Configuration files with default values are shipped in the Helm chart. Review and adjust them as needed in the values.yaml file before deploying the operator in production environments.
To append to the default configuration, define the flink-conf.yaml key in the defaultConfiguration section of the Helm values.yaml file:
defaultConfiguration:
  create: true
  # Set append to false to replace configuration files
  append: true
  flink-conf.yaml: |+
    # Flink Config Overrides
    kubernetes.operator.metrics.reporter.slf4j.factory.class: org.apache.flink.metrics.slf4j.Slf4jReporterFactory
    kubernetes.operator.metrics.reporter.slf4j.interval: 5 MINUTE
    kubernetes.operator.reconciler.reschedule.interval: 15 s
    kubernetes.operator.observer.progress-check.interval: 5 s
To learn more about metrics and logging configuration please refer to the dedicated docs page.
Dynamic Operator Configuration #
The Kubernetes operator supports dynamic config changes through the operator ConfigMaps. Dynamic operator configuration is enabled by default and can be disabled by setting kubernetes.operator.dynamic.config.enabled to false. The interval for checking dynamic config changes is set by kubernetes.operator.dynamic.config.check.interval, which defaults to 5 minutes.
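As a minimal sketch, assuming the same defaultConfiguration section of the Helm values.yaml described above, the two dynamic-config options could be set as follows. The 1 min interval is an illustrative value, not a recommendation.

```yaml
# Sketch only: Helm values.yaml fragment tuning dynamic config checks.
defaultConfiguration:
  create: true
  append: true
  flink-conf.yaml: |+
    # Keep dynamic config updates enabled (this is already the default)
    kubernetes.operator.dynamic.config.enabled: true
    # Check for ConfigMap changes more often than the 5 min default (illustrative value)
    kubernetes.operator.dynamic.config.check.interval: 1 min
```

Setting kubernetes.operator.dynamic.config.enabled to false in the same place would turn the periodic check off.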
To verify that dynamic operator configuration updates are enabled, check that the deploy/flink-kubernetes-operator log contains:
2022-05-28 13:08:29,222 o.a.f.k.o.c.FlinkConfigManager [INFO ] Enabled dynamic config updates, checking config changes every PT5M
To change config values dynamically, edit the ConfigMap directly with the kubectl patch or kubectl edit command. For example, to change the reschedule interval, override kubernetes.operator.reconciler.reschedule.interval.
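For illustration, the relevant part of the operator ConfigMap might look like this after such an edit. The ConfigMap name and namespace are assumptions and may differ in your installation. Only the overridden key is shown; a real ConfigMap would also contain the other default entries.

```yaml
# Sketch only: relevant portion of the operator ConfigMap after editing.
apiVersion: v1
kind: ConfigMap
metadata:
  name: flink-operator-config   # assumed name; check your Helm release
  namespace: default            # assumed namespace
data:
  flink-conf.yaml: |
    # Override the reschedule interval (default: 1 min)
    kubernetes.operator.reconciler.reschedule.interval: 30 s
```

The operator picks up the change at its next dynamic config check, so it can take up to the configured check interval to appear.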
To verify that kubernetes.operator.reconciler.reschedule.interval has been updated to 30 seconds, check that the deploy/flink-kubernetes-operator log contains:
2022-05-28 13:08:30,115 o.a.f.k.o.c.FlinkConfigManager [INFO ] Updating default configuration to {kubernetes.operator.reconciler.reschedule.interval=PT30S}
Operator Configuration Reference #
Key | Default | Type | Description |
---|---|---|---|
kubernetes.operator.config.cache.size | 1000 | Integer | Max config cache size. |
kubernetes.operator.config.cache.timeout | 10 min | Duration | Expiration time for cached configs. |
kubernetes.operator.deployment.readiness.timeout | 1 min | Duration | The timeout for deployments to become ready/stable before being rolled back if rollback is enabled. |
kubernetes.operator.deployment.rollback.enabled | false | Boolean | Whether to enable rolling back failed deployment upgrades. |
kubernetes.operator.dynamic.config.check.interval | 5 min | Duration | Time interval for checking config changes. |
kubernetes.operator.dynamic.config.enabled | true | Boolean | Whether to enable on-the-fly config changes through the operator configmap. |
kubernetes.operator.job.upgrade.ignore-pending-savepoint | false | Boolean | Whether to ignore pending savepoint during job upgrade. |
kubernetes.operator.observer.flink.client.timeout | 10 s | Duration | The timeout for the observer to wait the flink rest client to return. |
kubernetes.operator.observer.progress-check.interval | 10 s | Duration | The interval for observing status for in-progress operations such as deployment and savepoints. |
kubernetes.operator.observer.rest-ready.delay | 10 s | Duration | Final delay before deployment is marked ready after port becomes accessible. |
kubernetes.operator.observer.savepoint.trigger.grace-period | 10 s | Duration | The interval before a savepoint trigger attempt is marked as unsuccessful. |
kubernetes.operator.reconciler.flink.cancel.job.timeout | 1 min | Duration | The timeout for the reconciler to wait for flink to cancel job. |
kubernetes.operator.reconciler.flink.cluster.shutdown.timeout | 1 min | Duration | The timeout for the reconciler to wait for flink to shutdown cluster. |
kubernetes.operator.reconciler.jm-deployment-recovery.enabled | true | Boolean | Whether to enable recovery of missing/deleted jobmanager deployments. |
kubernetes.operator.reconciler.max.parallelism | 5 | Integer | The maximum number of threads running the reconciliation loop. Use -1 for infinite. |
kubernetes.operator.reconciler.reschedule.interval | 1 min | Duration | The interval for the controller to reschedule the reconcile process. |
kubernetes.operator.savepoint.history.max.age | 86400000 ms | Duration | Maximum age for savepoint history entries to retain. Due to lazy clean-up, the most recent savepoint may live longer than the max age. |
kubernetes.operator.savepoint.history.max.count | 10 | Integer | Maximum number of savepoint history entries to retain. |
kubernetes.operator.user.artifacts.base.dir | "/opt/flink/artifacts" | String | The base dir to put the session job artifacts. |
kubernetes.operator.user.artifacts.http.header | (none) | Map | Custom HTTP header for HttpArtifactFetcher. The header will be applied when getting the session job artifacts. Expected format: headerKey1:headerValue1,headerKey2:headerValue2. |
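As a rough sketch of how options from this table combine, the fragment below enables rollback of failed upgrades and adjusts two related settings through the Helm values.yaml defaultConfiguration section. The chosen values are illustrative assumptions, not tuned recommendations.

```yaml
# Sketch only: Helm values.yaml fragment using keys from the reference table.
defaultConfiguration:
  create: true
  append: true
  flink-conf.yaml: |+
    # Roll back failed deployment upgrades (default: false)
    kubernetes.operator.deployment.rollback.enabled: true
    # Allow more time to become ready before a rollback (default: 1 min; value is illustrative)
    kubernetes.operator.deployment.readiness.timeout: 2 min
    # Retain fewer savepoint history entries (default: 10; value is illustrative)
    kubernetes.operator.savepoint.history.max.count: 5
```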
Job Specific Configuration Reference #
Job-specific configuration can be set under spec.flinkConfiguration; it overrides the Flink configuration defined in flink-conf.yaml.
- For application clusters, spec.flinkConfiguration is located in the FlinkDeployment CustomResource.
- For session clusters, spec.flinkConfiguration in the parent FlinkDeployment is applied to all session jobs within the session cluster.
  - You can configure additional job-specific supplemental configuration through spec.flinkConfiguration in the FlinkSessionJob CustomResource. These session-job-level configurations override the parent session cluster's Flink configuration. Note that only the following configurations are considered valid:
    - kubernetes.operator.user.artifacts.http.header
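The session cluster case can be sketched with two resources. This is a partial illustration, not a complete manifest: the resource names, the jar URL, the slot count, and the header value are hypothetical, and fields unrelated to flinkConfiguration are omitted.

```yaml
# Sketch only: parent session cluster with cluster-wide Flink configuration.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example-session-cluster            # hypothetical name
spec:
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"     # applies to all session jobs in this cluster
  # ... image, flinkVersion, jobManager, taskManager, etc. omitted
---
# Sketch only: session job adding a job-level supplemental setting.
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: example-session-job                # hypothetical name
spec:
  deploymentName: example-session-cluster  # refers to the parent FlinkDeployment
  flinkConfiguration:
    # Only the key listed above is treated as valid at the session job level
    kubernetes.operator.user.artifacts.http.header: "X-Example-Header:example-value"
  job:
    jarURI: https://example.com/path/to/job.jar   # hypothetical artifact location
```

For an application cluster, the same spec.flinkConfiguration map would sit directly in the single FlinkDeployment resource.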