Class AvroParquetReaders


  • @Experimental
    public class AvroParquetReaders
    extends Object
    A convenience builder to create AvroParquetRecordFormat instances for the different kinds of Avro record types.
    • Method Detail

      • forSpecificRecord

        public static <T extends org.apache.avro.specific.SpecificRecordBase> StreamFormat<T> forSpecificRecord(Class<T> typeClass)
        Creates a new AvroParquetRecordFormat that reads Parquet files into Avro SpecificRecords.

        To read into Avro GenericRecords, use the forGenericRecord(Schema) method.

        See Also:
        forGenericRecord(Schema)
      • forGenericRecord

        public static StreamFormat<org.apache.avro.generic.GenericRecord> forGenericRecord(org.apache.avro.Schema schema)
        Creates a new AvroParquetRecordFormat that reads Parquet files into Avro GenericRecords.

        To read into GenericRecords, this method needs an Avro Schema, because Flink must be able to serialize the results in its data flow, and doing so without the schema is very inefficient. Although the schema is stored in the Avro file header, Flink needs it during 'pre-flight' time, when the data flow is set up and wired, which is before it has access to the files.
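
The two factory methods above can be wired into a file source roughly as follows. This is a minimal sketch, not an excerpt from the Flink documentation: the `User` schema, the file path `/tmp/users.parquet`, the class name `AvroParquetReadersSketch`, and its helper methods are hypothetical names chosen for illustration. It assumes the `flink-parquet`, `flink-connector-files`, and Avro dependencies are on the classpath.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.specific.SpecificRecordBase;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.StreamFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AvroParquetReadersSketch {

    // Hypothetical schema; in practice it must match the records in the Parquet files.
    static final String USER_SCHEMA_JSON =
            "{\"type\":\"record\",\"name\":\"User\",\"namespace\":\"example\","
                    + "\"fields\":["
                    + "{\"name\":\"name\",\"type\":\"string\"},"
                    + "{\"name\":\"age\",\"type\":\"int\"}]}";

    // The schema is built up front because Flink needs it before any file is opened.
    static Schema parseUserSchema() {
        return new Schema.Parser().parse(USER_SCHEMA_JSON);
    }

    // Hypothetical helper: for generated Avro classes, the class itself carries the schema.
    static <T extends SpecificRecordBase> FileSource<T> specificRecordSource(
            Class<T> typeClass, Path path) {
        return FileSource.forRecordStreamFormat(
                AvroParquetReaders.forSpecificRecord(typeClass), path).build();
    }

    public static void main(String[] args) throws Exception {
        Schema schema = parseUserSchema();

        // Create the stream format for GenericRecords from the explicit schema.
        StreamFormat<GenericRecord> format = AvroParquetReaders.forGenericRecord(schema);

        // Hypothetical input path.
        FileSource<GenericRecord> source =
                FileSource.forRecordStreamFormat(format, new Path("/tmp/users.parquet")).build();

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<GenericRecord> users =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "parquet-users");

        users.print();
        env.execute("Read Parquet into GenericRecords");
    }
}
```

For a generated Avro class, `specificRecordSource` would be called with that class and a path, and no separate Schema argument is needed.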