Parquet

The Parquet plugin reads and writes Apache Parquet files. Parquet is a columnar storage format widely used by data engineering tools and services such as Spark, Azure Data Lake, AWS Athena, DuckDB, and pandas.
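To illustrate what "columnar" means, here is a minimal Python sketch that pivots row-oriented records into a column-oriented layout. It only illustrates the idea and is not how Parquet or the plugin stores data internally; the sample records and the `to_columns` helper are hypothetical.

```python
# Row-oriented data: each record holds one value for every column.
rows = [
    {"id": 1, "city": "Oslo", "temp_c": 4.5},
    {"id": 2, "city": "Cairo", "temp_c": 29.0},
    {"id": 3, "city": "Lima", "temp_c": 18.2},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            # All values of one column end up next to each other.
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)
print(columns["city"])
```

Keeping each column's values together is what lets a columnar reader load only the columns it needs and compress values of the same type efficiently.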

The plugin supports flat Parquet schemas. Output column types are detected at design time from a sample file and persisted in the Linx solution.

Functions:

  • ReadParquetFile:
    Read rows from a Parquet file, one row at a time or as a complete list.

  • WriteParquetFile:
    Write Linx data to a Parquet file, with configurable compression and configurable handling of files that already exist.