WriteParquetFile

WriteParquetFile writes data rows to an Apache Parquet file.

The function accepts any Linx data source (a database result set or any list) and writes a Parquet file. The output schema is inferred from the data at runtime. No schema file is required.


File path

The absolute or UNC path for the output .parquet file. Linx expressions are supported.

Data

The data source providing the rows to write. Connect any Linx list of objects. Each item in the source becomes one row in the Parquet file.

The output schema is inferred from the first row at runtime. Each property of the row object maps to a column in the Parquet file.

Compression codec

The compression algorithm applied to each row group written to the file.

| Value | Description |
| --- | --- |
| Snappy | Fast compression with a moderate ratio. Default. |
| Gzip | Slower compression with a better ratio. Use when minimising file size is a priority. |
| Brotli | High compression ratio with higher CPU cost. |
| Lz4 | Very fast compression with a lower ratio. Use when write speed is critical. |
| Zstd | Good compression ratio with fast decompression. Suitable for archival use. |
| None | No compression. Use for maximum read speed or when the data is already compressed. |

Exist option

The behaviour when a file already exists at the specified path.

| Value | Behaviour |
| --- | --- |
| OverwriteFile | Deletes the existing file and writes a new one. |
| IncrementFileName | Appends _1, _2, etc. to the file name until a non-existing name is found (for example, report_1.parquet). |
| ThrowException | Raises an exception if the file already exists. |
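
The three behaviours above can be sketched in a few lines of Python. This is only an illustration of the described logic, not Linx's actual implementation: the function name `resolve_output_path` is hypothetical, and the assumption that the counter is inserted before the file extension is taken from the report_1.parquet example.

```python
from pathlib import Path


def resolve_output_path(path: Path, exist_option: str) -> Path:
    """Return the path to write to, following the documented Exist option.

    Hypothetical helper for illustration only.
    """
    if not path.exists():
        # Nothing to resolve: every option writes to the requested path.
        return path

    if exist_option == "OverwriteFile":
        # Delete the existing file so a new one can be written in its place.
        path.unlink()
        return path

    if exist_option == "IncrementFileName":
        # Try name_1, name_2, ... until a name that does not exist is found.
        counter = 1
        while True:
            candidate = path.with_name(f"{path.stem}_{counter}{path.suffix}")
            if not candidate.exists():
                return candidate
            counter += 1

    if exist_option == "ThrowException":
        raise FileExistsError(f"File already exists: {path}.")

    raise ValueError(f"Unknown exist option: {exist_option}")
```

With IncrementFileName, an existing report.parquet is left untouched and the new file is written to report_1.parquet, or to report_2.parquet if report_1.parquet also exists.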

Row group size

The number of rows buffered in memory before each Parquet row group is flushed to disk. Default: 5000. Minimum: 1.

Lower values reduce memory usage but weaken compression. Higher values improve compression but increase peak memory during writing. The default works for most workloads.
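
To make the memory trade-off concrete, here is a minimal Python sketch of row-group buffering. It assumes rows arrive as an iterable and only shows how they are batched; the name `iter_row_groups` is hypothetical and nothing is actually encoded or written to disk.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_row_groups(rows: Iterable[T], row_group_size: int = 5000) -> Iterator[List[T]]:
    """Yield lists of at most `row_group_size` rows (illustrative only)."""
    if row_group_size < 1:
        # Raised when iteration starts, because this is a generator function.
        raise ValueError("RowGroupSize must be at least 1.")
    iterator = iter(rows)
    while True:
        # Only one group is held in memory at a time.
        group = list(islice(iterator, row_group_size))
        if not group:
            return
        yield group
```

With 12 rows and a row group size of 5, this yields groups of 5, 5, and 2 rows. Peak memory is bounded by one group, which is why a smaller size lowers memory use while giving the compressor less data to work with per group.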


The output schema is inferred from the first row at runtime. Each property becomes a column, with the .NET type mapped to the corresponding Parquet type.

| Linx Type | Parquet Type |
| --- | --- |
| integer | INT32 |
| double | DOUBLE |
| boolean | BOOLEAN |
| string | BYTE_ARRAY (UTF8) |
| decimal | FIXED_LEN_BYTE_ARRAY (DECIMAL) |
| DateTime | INT64 (TIMESTAMP millis) |
| byte | BYTE_ARRAY |
| int?, double?, etc. | Nullable column |

If there are no rows, an empty Parquet file is written with the correct schema.
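
First-row inference can be illustrated with a short Python sketch. It is an analogy rather than the Linx implementation: Python types stand in for the Linx types in the table (for example, `bytes` is assumed to correspond to the byte entry), nullable columns are not modelled, and `infer_schema` and `PARQUET_TYPES` are hypothetical names.

```python
from datetime import datetime
from decimal import Decimal

# Python stand-ins for the Linx types listed in the mapping table (assumption).
PARQUET_TYPES = {
    int: "INT32",
    float: "DOUBLE",
    bool: "BOOLEAN",
    str: "BYTE_ARRAY (UTF8)",
    Decimal: "FIXED_LEN_BYTE_ARRAY (DECIMAL)",
    datetime: "INT64 (TIMESTAMP millis)",
    bytes: "BYTE_ARRAY",
}


def infer_schema(first_row: dict) -> dict:
    """Map each property of the first row to a Parquet type name."""
    schema = {}
    for name, value in first_row.items():
        # Look up the exact type so that bool is not treated as int.
        value_type = type(value)
        if value_type not in PARQUET_TYPES:
            raise TypeError(
                f"No mapping for column {name!r} of type {value_type.__name__}"
            )
        schema[name] = PARQUET_TYPES[value_type]
    return schema
```

For example, `infer_schema({"id": 1, "active": True})` returns `{"id": "INT32", "active": "BOOLEAN"}`. Because only the first row is inspected, later rows are expected to have the same properties and types.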


| Condition | Error |
| --- | --- |
| File path is empty | "File path cannot be null or empty." |
| Output directory does not exist | "Output directory does not exist: {dir}." |
| File exists and Exist option is ThrowException | "File already exists: {path}." |
| Row group size is less than 1 | "RowGroupSize must be at least 1." |

Export data from a database to a Parquet file

This example queries a database and writes the results to a Parquet file.

Steps:

  1. Add an ExecuteSQL function (or equivalent database query function) and set Return options to List of rows.
  2. From the Parquet plugin, drag WriteParquetFile onto the design canvas, placed after the query function.
  3. Set File path to the destination .parquet file path.
  4. Set Data to the result list from the database query (for example, ExecuteSQL.Result).
  5. Choose a Compression codec and Exist option as needed.