Apache Flink runs workloads efficiently on top of the JVM by tightly controlling the memory usage of its various components. While the community strives to offer sensible defaults for all configuration options, the full breadth of applications that users deploy on Flink means this isn’t always possible. To provide the most production value to our users, Flink allows both high-level and fine-grained tuning of memory allocation within clusters.
See also task manager configuration options.
The total memory in Flink consists of JVM heap, managed memory and network buffers. Managed memory can be either part of the JVM heap or direct off-heap memory. For containerized deployments, the total memory can additionally include a container cut-off.
All other memory components are computed from the total memory before starting the Flink process. After the start, the managed memory and network memory are adjusted in certain cases based on available JVM memory inside the process (see Adjustments inside Flink process).
When the Flink JVM process is started in standalone mode, on Yarn, or on Mesos, the total memory is defined by the configuration option taskmanager.heap.size (or the deprecated taskmanager.heap.mb). In case of a containerized deployment (Yarn or Mesos), this is the size of the requested container.
In case of a containerized deployment, the total memory is reduced by the cut-off. The cut-off is a fraction (containerized.heap-cutoff-ratio) of the total memory, but it is always greater than or equal to its minimum value (containerized.heap-cutoff-min).
The cut-off is introduced to accommodate other types of consumed memory that are not accounted for in this memory model, e.g. RocksDB native memory, JVM overhead, etc. It also serves as a safety margin to prevent the container from exceeding its memory limit and being killed by the container manager.
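The resulting cut-off can be sketched as follows. The default values used here (a 0.25 ratio and a 600 MB minimum) are assumptions for illustration; check containerized.heap-cutoff-ratio and containerized.heap-cutoff-min in your version’s configuration reference:

```python
# Sketch of the container cut-off described above. The defaults used here
# (ratio 0.25, minimum 600 MB) are illustrative assumptions only.
def container_cutoff(total_mb, cutoff_ratio=0.25, cutoff_min_mb=600):
    # The cut-off is the configured fraction of the total memory,
    # but never less than the configured minimum.
    return max(cutoff_min_mb, cutoff_ratio * total_mb)

# For a 4096 MB container: 0.25 x 4096 = 1024 MB > 600 MB minimum,
# so the Flink process itself is left 4096 - 1024 = 3072 MB.
cutoff = container_cutoff(4096)  # 1024.0
```

Note that for small containers the minimum dominates: a 1024 MB container still loses the full 600 MB minimum cut-off.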
The network memory is used for buffering records while shuffling them between operator tasks and their executors over the network. It is calculated as:
network = Min(max, Max(min, fraction x total))
See also setting memory fractions.
If the above-mentioned options are not set but the legacy option is used, then the network memory is assumed to be set explicitly, without a fraction, as:
network = legacy buffers x page
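Both paths can be sketched together. The function name and parameters are illustrative: fraction, min_mb and max_mb stand for the memory-fraction options referenced above, and the legacy path takes an explicit buffer count and page size:

```python
# Sketch of the two network memory formulas above (names are illustrative).
def network_memory_mb(total_mb, fraction, min_mb, max_mb,
                      legacy_num_buffers=None, page_size_mb=None):
    if legacy_num_buffers is not None:
        # Legacy path: an explicit number of network buffers times the page size.
        return legacy_num_buffers * page_size_mb
    # Fraction path: clamp fraction x total between the configured min and max.
    return min(max_mb, max(min_mb, fraction * total_mb))

network_memory_mb(4096, 0.25, 64, 1024)  # fraction path, capped at the 1024 MB max
network_memory_mb(0, 0, 0, 0,
                  legacy_num_buffers=2048, page_size_mb=0.03125)  # 2048 x 32 KiB pages
```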
The managed memory is used for batch jobs. It helps Flink to run the batch operators efficiently and prevents
OutOfMemoryErrors because Flink knows how much memory it can use to execute operations. If Flink runs out of managed memory,
it utilizes disk space. Using managed memory, some operations can be performed directly on the raw data without having
to deserialize the data to convert it into Java objects. All in all, managed memory improves the robustness and speed of the system.
The managed memory can be either part of the JVM heap (on-heap) or off-heap. It is on-heap by default.
The managed memory size can either be set explicitly (taskmanager.memory.size) or, if not set explicitly, it is defined as a fraction (taskmanager.memory.fraction) of the total memory minus the network memory, calculated the following way:
managed = (total - network) x fraction
The JVM heap size is set by the JVM command line arguments (-Xms/-Xmx) and calculated as:
heap = total - managed (if off-heap) - network
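Putting the formulas together, a minimal sketch of the resulting split (the function name and example numbers are illustrative only):

```python
# Sketch combining the managed memory and heap formulas above.
def memory_breakdown(total_mb, network_mb, managed_fraction, off_heap):
    # managed = (total - network) x fraction
    managed_mb = (total_mb - network_mb) * managed_fraction
    # heap = total - managed (if off-heap) - network
    heap_mb = total_mb - network_mb - (managed_mb if off_heap else 0)
    return {"network": network_mb, "managed": managed_mb, "heap": heap_mb}

memory_breakdown(1000, 100, 0.5, off_heap=True)   # heap 450.0, managed 450.0 off-heap
memory_breakdown(1000, 100, 0.5, off_heap=False)  # heap 900; managed lives inside it
```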
When the Flink process has been started, the sizes of the managed memory and network buffers that are eventually used in the process are calculated in a slightly different way if they are defined as fractions. The values are derived from the available JVM heap size. They should be close to the values calculated before starting the process but can differ.
The JVM heap is estimated in two ways for further computations:

- the max JVM heap: if -Xmx is set, then it is its value, otherwise ¼ of the physical machine memory as estimated by the JVM
- the free JVM heap: the heap memory currently free inside the running JVM

Then the managed memory, if defined as a fraction, is computed from a total memory that is derived from one of these JVM heap estimates:
The free JVM heap is used to derive the total memory for the calculation of the off-heap managed memory.
The max JVM heap is used to derive the total memory for the calculation of network buffers.
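As an illustration (not Flink’s actual code), the total memory can be re-derived by inverting the earlier formula heap = total - network under the assumption of on-heap managed memory, and the network memory then recomputed from that derived total:

```python
# Illustrative sketch: re-deriving network memory from the max JVM heap.
def network_from_max_heap(max_heap_mb, fraction, min_mb, max_mb):
    # If heap = total - network and network = fraction x total
    # (on-heap managed memory), then total = heap / (1 - fraction).
    derived_total = max_heap_mb / (1.0 - fraction)
    return min(max_mb, max(min_mb, fraction * derived_total))

# A 768 MB max JVM heap with a 0.25 network fraction implies a derived
# total of 1024 MB and thus 256 MB of network memory.
network_from_max_heap(768, 0.25, 64, 1024)  # 256.0
```

This is why the values observed inside the process should be close to, but can slightly differ from, the ones computed before startup.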