pyflink.datastream.data_stream.KeyedStream.sum

KeyedStream.sum(position_to_sum: Union[int, str] = 0) → pyflink.datastream.data_stream.DataStream

Applies an aggregation that gives a rolling sum of the data stream at the given position grouped by the given key. An independent aggregate is kept per key.

Example (tuple data to sum):

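>>> from pyflink.datastream import StreamExecutionEnvironment
>>> env = StreamExecutionEnvironment.get_execution_environment()
>>> ds = env.from_collection([('a', 1), ('a', 2), ('b', 1), ('b', 5)])
>>> ds.key_by(lambda x: x[0]).sum(1)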
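
The rolling behaviour in the tuple example above can be modeled in plain Python. This is a minimal sketch of the semantics, not PyFlink's implementation: `rolling_sum` is a hypothetical helper, and it assumes that one result is emitted per input record and that fields other than the summed one are carried over from the first record seen for each key. In a real job, the interleaving of results across different keys may also differ.

```python
def rolling_sum(records, key_selector, position):
    """Plain-Python model of a keyed rolling sum (hypothetical helper, not a PyFlink API)."""
    state = {}    # one independent running aggregate per key
    emitted = []  # assumption: one result is emitted for every input record
    for record in records:
        key = key_selector(record)
        if key in state:
            previous = state[key]
            merged = list(previous)
            # Only the field at `position` is summed; other fields keep
            # the values of the stored record (an assumption of this sketch).
            merged[position] = previous[position] + record[position]
            state[key] = tuple(merged)
        else:
            # The first record for a key becomes the initial aggregate.
            state[key] = tuple(record)
        emitted.append(state[key])
    return emitted


data = [('a', 1), ('a', 2), ('b', 1), ('b', 5)]
result = rolling_sum(data, lambda x: x[0], 1)
print(result)  # [('a', 1), ('a', 3), ('b', 1), ('b', 6)]
```

Each key accumulates separately, so the records for 'b' start again from their own first value rather than continuing the total for 'a'.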

Example (Row data to sum):

>>> from pyflink.common import Types
>>> ds = env.from_collection([('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2)],
...                          type_info=Types.ROW([Types.STRING(), Types.INT()]))
>>> ds.key_by(lambda x: x[0]).sum(1)

Example (Row data with named fields to sum):

>>> ds = env.from_collection(
...     [('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2)],
...     type_info=Types.ROW_NAMED(["key", "value"], [Types.STRING(), Types.INT()])
... )
>>> ds.key_by(lambda x: x[0]).sum("value")

Parameters

position_to_sum – The field to sum in each data point. An int is the index of the column to operate on; a str is the name of the column to operate on. Defaults to 0.

Returns

The transformed DataStream.

New in version 1.16.0.
