Apache Flink provides efficient workloads on top of the JVM by tightly controlling the memory usage of its various components. While the community strives to offer sensible defaults to all configurations, the full breadth of applications that users deploy on Flink means this isn’t always possible. To provide the most production value to our users, Flink allows both high level and fine-grained tuning of memory allocation within clusters.
The further described memory configuration is applicable starting with the release version 1.10. If you upgrade Flink from earlier versions, check the migration guide because many changes were introduced with the 1.10 release.
Note This memory setup guide is relevant only for task executors! Check job manager related configuration options for the memory setup of job manager.
The total process memory of Flink JVM processes consists of memory consumed by Flink application (total Flink memory) and by the JVM to run the process. The total Flink memory consumption includes usage of JVM heap, managed memory (managed by Flink) and other direct (or native) memory.
If you run Flink locally (e.g. from your IDE) without creating a cluster, then only a subset of the memory configuration options are relevant, see also local execution for more details.
Otherwise, the simplest way to setup memory in Flink is to configure either of the two following options:
taskmanager.memory.flink.size
)taskmanager.memory.process.size
)The rest of the memory components will be adjusted automatically, based on default values or additionally configured options. Here are more details about the other memory components.
Configuring total Flink memory is better suited for standalone deployments where you want to declare how much memory is given to Flink itself. The total Flink memory splits up into JVM heap, managed memory size and direct memory.
If you configure total process memory you declare how much memory in total should be assigned to the Flink JVM process. For the containerized deployments it corresponds to the size of the requested container, see also how to configure memory for containers (Kubernetes, Yarn or Mesos).
Another way to setup the memory is to set task heap and managed memory
(taskmanager.memory.task.heap.size
and taskmanager.memory.managed.size
).
This more fine-grained approach is described in more detail here.
Note One of the three mentioned ways has to be used to configure Flink’s memory (except for local execution), or the Flink startup will fail. This means that one of the following option subsets, which do not have default values, have to be configured explicitly:
taskmanager.memory.flink.size
taskmanager.memory.process.size
taskmanager.memory.task.heap.size
and taskmanager.memory.managed.size
Note Explicitly configuring both total process memory and total Flink memory is not recommended. It may lead to deployment failures due to potential memory configuration conflicts. Additional configuration of other memory components also requires caution as it can produce further configuration conflicts.
As mentioned before in total memory description, another way to setup memory in Flink is to specify explicitly both task heap and managed memory. It gives more control over the available JVM heap to Flink’s tasks and its managed memory.
The rest of the memory components will be adjusted automatically, based on default values or additionally configured options. Here are more details about the other memory components.
Note If you have configured the task heap and managed memory explicitly, it is recommended to set neither total process memory nor total Flink memory. Otherwise, it may easily lead to memory configuration conflicts.
If you want to guarantee that a certain amount of JVM heap is available for your user code, you can set the task heap memory
explicitly (taskmanager.memory.task.heap.size
).
It will be added to the JVM heap size and will be dedicated to Flink’s operators running the user code.
Managed memory is managed by Flink and is allocated as native memory (off-heap). The following workloads use managed memory:
The size of managed memory can be
taskmanager.memory.managed.size
taskmanager.memory.managed.fraction
.Size will override fraction, if both are set. If neither size nor fraction is explicitly configured, the default fraction will be used.
See also how to configure memory for state backends and batch jobs.
The off-heap memory which is allocated by user code should be accounted for in task off-heap memory
(taskmanager.memory.task.off-heap.size
).
Note You can also adjust the framework off-heap memory. This option is advanced and only recommended to be changed if you are sure that the Flink framework needs more memory.
Flink includes the framework off-heap memory and task off-heap memory into the direct memory limit of the JVM, see also JVM parameters.
Note Although, native non-direct memory usage can be accounted for as a part of the framework off-heap memory or task off-heap memory, it will result in a higher JVM’s direct memory limit in this case.
Note The network memory is also part of JVM direct memory but it is managed by Flink and guaranteed to never exceed its configured size. Therefore, resizing the network memory will not help in this situation.
See also the detailed memory model.