Metrics #
PyFlink exposes a metric system that allows gathering and exposing metrics to external systems.
Registering metrics #
You can access the metric system from a Python user-defined function by calling function_context.get_metric_group() in the open method. The get_metric_group() method returns a MetricGroup object on which you can create and register new metrics.
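As a minimal sketch of this pattern (the metric_group attribute name is only illustrative), the snippet below just obtains the MetricGroup in open; the sections that follow show how to register each concrete metric type on it.
from pyflink.table.udf import ScalarFunction

class MyUDF(ScalarFunction):

    def open(self, function_context):
        # Obtain the MetricGroup once in open(); counters, gauges,
        # distributions and meters are then registered on it.
        self.metric_group = function_context.get_metric_group()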
Metric types #
PyFlink supports Counters, Gauges, Distributions and Meters.
Counter #
A Counter is used to count something. The current value can be incremented or decremented using inc()/inc(n: int) or dec()/dec(n: int). You can create and register a Counter by calling counter(name: str) on a MetricGroup.
from pyflink.table.udf import ScalarFunction

class MyUDF(ScalarFunction):

    def __init__(self):
        self.counter = None

    def open(self, function_context):
        self.counter = function_context.get_metric_group().counter("my_counter")

    def eval(self, i):
        self.counter.inc(i)
        return i
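The following is a minimal sketch of how such a function could be wired into a Table API job so that the counter is actually updated at runtime; the local streaming TableEnvironment, the registered name "my_udf", the inline sample rows and the BIGINT result type are assumptions for illustration.
from pyflink.table import DataTypes, EnvironmentSettings, TableEnvironment
from pyflink.table.expressions import call, col
from pyflink.table.udf import udf

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Wrap the MyUDF class from above and register it as a temporary function.
t_env.create_temporary_function(
    "my_udf", udf(MyUDF(), result_type=DataTypes.BIGINT()))

# Each row processed by my_udf increments the "my_counter" metric.
t = t_env.from_elements([(1,), (2,), (3,)], ['i'])
t.select(call("my_udf", col("i"))).execute().print()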
Gauge #
A Gauge provides a value on demand. You can register a gauge by calling gauge(name: str, obj: Callable[[], int]) on a MetricGroup. The Callable object will be used to report the values. Gauge metrics are restricted to integer-only values.
from pyflink.table.udf import ScalarFunction

class MyUDF(ScalarFunction):

    def __init__(self):
        self.length = 0

    def open(self, function_context):
        function_context.get_metric_group().gauge("my_gauge", lambda: self.length)

    def eval(self, i):
        self.length = i
        return i - 1
Distribution #
A Distribution is a metric that reports information (sum, count, min, max and mean) about the distribution of reported values. The value can be updated using update(n: int). You can register a distribution by calling distribution(name: str) on a MetricGroup. Distribution metrics are restricted to integer-only distributions.
from pyflink.table.udf import ScalarFunction

class MyUDF(ScalarFunction):

    def __init__(self):
        self.distribution = None

    def open(self, function_context):
        self.distribution = function_context.get_metric_group().distribution("my_distribution")

    def eval(self, i):
        self.distribution.update(i)
        return i - 1
Meter #
A Meter measures an average throughput. The occurrence of an event can be registered with the mark_event() method. The occurrence of multiple events at the same time can be registered with the mark_event(n: int) method. You can register a meter by calling meter(name: str, time_span_in_seconds: int = 60) on a MetricGroup. The default value of time_span_in_seconds is 60.
from pyflink.table.udf import ScalarFunction

class MyUDF(ScalarFunction):

    def __init__(self):
        self.meter = None

    def open(self, function_context):
        # an average rate of events per second over 120s; the default is 60s
        self.meter = function_context.get_metric_group().meter("my_meter", time_span_in_seconds=120)

    def eval(self, i):
        self.meter.mark_event(i)
        return i - 1
Scope #
You can refer to the Java metrics documentation for more details on scope definitions.
User Scope #
You can define a user scope by calling MetricGroup.add_group(key: str, value: str = None). If value is not None, this creates a new key-value MetricGroup pair: the key group is added to this group's sub-groups, and the value group is added to the key group's sub-groups. In that case, the value group is returned and a user variable is defined.
function_context \
    .get_metric_group() \
    .add_group("my_metrics") \
    .counter("my_counter")

function_context \
    .get_metric_group() \
    .add_group("my_metrics_key", "my_metrics_value") \
    .counter("my_counter")
System Scope #
You can refer to the Java metrics documentation for more details on the system scope.
List of all Variables #
You can refer to the Java metrics documentation for more details on the list of all variables.
User Variables #
You can define a user variable by calling MetricGroup.add_group(key: str, value: str = None) and specifying the value parameter.
Important: User variables cannot be used in scope formats.
function_context \
    .get_metric_group() \
    .add_group("my_metrics_key", "my_metrics_value") \
    .counter("my_counter")
Common part between PyFlink and Flink #
You can refer to the Java metrics documentation for more details on the following sections: