Azure Blob Storage

This documentation is for an out-of-date version of Apache Flink. We recommend you use the latest stable version.

Azure Blob Storage is a Microsoft-managed service providing cloud storage for a variety of use cases. You can use Azure Blob Storage with Flink for reading and writing data, as well as in conjunction with the streaming state backends.


You can use Azure Blob Storage objects like regular files by specifying paths in the following format:

wasb://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>

// SSL encrypted access
wasbs://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>
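
The path format above can be assembled programmatically. The following is a minimal sketch, not part of Flink: the class name WasbPaths and the method wasbUri are hypothetical, and it assumes the account name follows the @ directly and is completed with the .blob.core.windows.net suffix.

```java
import java.net.URI;

// Hypothetical helper for illustration only; it is not a Flink or Hadoop class.
final class WasbPaths {
    private WasbPaths() {}

    /**
     * Builds a wasb:// or wasbs:// URI from its parts.
     *
     * @param secure     true for wasbs:// (SSL encrypted access), false for wasb://
     * @param container  name of the blob container
     * @param account    storage account name, without the domain suffix
     * @param objectPath path of the object inside the container
     */
    static URI wasbUri(boolean secure, String container, String account, String objectPath) {
        String scheme = secure ? "wasbs" : "wasb";
        // Make sure the object path starts with exactly one leading slash.
        String path = objectPath.startsWith("/") ? objectPath : "/" + objectPath;
        // The container sits before the '@', the account forms the first host label.
        return URI.create(scheme + "://" + container + "@" + account
                + ".blob.core.windows.net" + path);
    }

    public static void main(String[] args) {
        System.out.println(wasbUri(true, "my-container", "myaccount", "data/input.txt"));
    }
}
```

The resulting URI string could then be passed wherever a file path is expected, under the assumption that the matching file system is available to Flink.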

See below for how to use Azure Blob Storage in a Flink job:

// Read from Azure Blob Storage
env.readTextFile("wasb://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>");

// Write to Azure Blob Storage
stream.writeAsText("wasb://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>");

// Use Azure Blob Storage as FsStateBackend
env.setStateBackend(new FsStateBackend("wasb://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>"));

Shaded Hadoop Azure Blob Storage file system

To use flink-azure-fs-hadoop, copy the respective JAR file from the opt directory to the plugins directory of your Flink distribution before starting Flink, for example:

mkdir ./plugins/azure-fs-hadoop
cp ./opt/flink-azure-fs-hadoop-1.10.2.jar ./plugins/azure-fs-hadoop/

flink-azure-fs-hadoop registers default FileSystem wrappers for URIs with the wasb:// and wasbs:// (SSL encrypted access) schemes.

Credentials Configuration

Hadoop’s Azure Filesystem supports configuring credentials via the Hadoop configuration, as outlined in the Hadoop Azure Blob Storage documentation. For convenience, Flink forwards all Flink configuration entries whose key starts with fs.azure to the Hadoop configuration of the filesystem. Consequently, the Azure Blob Storage key can be configured in flink-conf.yaml via:

fs.azure.account.key.<account_name>.blob.core.windows.net: <azure_storage_key>
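
To illustrate the prefix-based forwarding described above, here is a rough sketch using plain maps. It does not reproduce Flink's actual implementation; the class AzureConfigForwarding and its methods are hypothetical names, and both configurations are modelled as simple string maps.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; Flink's real forwarding code is not reproduced here.
final class AzureConfigForwarding {
    // Key prefix named in the text above.
    static final String PREFIX = "fs.azure";

    private AzureConfigForwarding() {}

    /** Copies every entry whose key starts with "fs.azure" into a new map. */
    static Map<String, String> forward(Map<String, String> flinkConfig) {
        Map<String, String> hadoopConfig = new HashMap<>();
        for (Map.Entry<String, String> entry : flinkConfig.entrySet()) {
            if (entry.getKey().startsWith(PREFIX)) {
                hadoopConfig.put(entry.getKey(), entry.getValue());
            }
        }
        return hadoopConfig;
    }

    /** Builds the account-key option name for the given storage account. */
    static String accountKeyOption(String accountName) {
        return "fs.azure.account.key." + accountName + ".blob.core.windows.net";
    }

    public static void main(String[] args) {
        Map<String, String> flinkConfig = new HashMap<>();
        flinkConfig.put(accountKeyOption("myaccount"), "<azure_storage_key>");
        flinkConfig.put("parallelism.default", "4");
        // Only the fs.azure entry is carried over; its value is not printed.
        System.out.println(forward(flinkConfig).keySet());
    }
}
```

In this sketch, unrelated options such as parallelism.default are left out of the forwarded map, which mirrors the idea that only keys with the fs.azure prefix reach the filesystem's Hadoop configuration.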

Alternatively, the filesystem can be configured to read the Azure Blob Storage key from the environment variable AZURE_STORAGE_KEY by setting the following configuration key in flink-conf.yaml:

fs.azure.account.keyprovider.<account_name>.blob.core.windows.net: org.apache.flink.fs.azurefs.EnvironmentVariableKeyProvider
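
Conceptually, a key provider of this kind looks up the storage key in the process environment instead of the configuration file. The sketch below shows that idea under stated assumptions: EnvKeyLookup and storageKey are hypothetical names, the code is not the source of EnvironmentVariableKeyProvider, and the environment is passed in as a map so the lookup can be exercised without real credentials.

```java
import java.util.Map;

// Hypothetical stand-in for illustration; not the actual EnvironmentVariableKeyProvider.
final class EnvKeyLookup {
    // Environment variable name given in the text above.
    static final String ENV_VAR = "AZURE_STORAGE_KEY";

    private EnvKeyLookup() {}

    /** Returns the storage key from the given environment, or fails if it is absent. */
    static String storageKey(Map<String, String> env) {
        String key = env.get(ENV_VAR);
        if (key == null || key.isEmpty()) {
            throw new IllegalStateException("Environment variable " + ENV_VAR + " is not set");
        }
        return key;
    }

    public static void main(String[] args) {
        try {
            // System.getenv() exposes the process environment as an unmodifiable map.
            storageKey(System.getenv());
            System.out.println(ENV_VAR + " is set");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this approach the variable has to be present in the environment of every Flink process that accesses the storage account, and the key itself never needs to appear in flink-conf.yaml.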