WriteParquetFile

WriteParquetFile writes data rows to an Apache Parquet file.

The function accepts any Linx data source (a database result set or any list) and writes its rows to a Parquet file. The output schema is inferred from the data at runtime, so no schema file is required.

Properties

File path

The absolute or UNC path for the output .parquet file. Linx expressions are supported.

Data

The data source providing the rows to write. Connect any Linx list of objects. Each item in the source becomes one row in the Parquet file.

Each property of the row object maps to a column in the Parquet file. The column types are inferred from the first row at runtime, as described under Schema Inference.

Compression codec

The compression algorithm applied to each row group written to the file.

Value	Description
Snappy	Fast compression with a moderate ratio. Default.
Gzip	Slower compression with a better ratio. Use when minimising file size is a priority.
Brotli	High compression ratio with higher CPU cost.
Lz4	Very fast compression with a lower ratio. Use when write speed is critical.
Zstd	Good compression ratio with fast decompression. Suitable for archival use.
None	No compression. Use for maximum read speed or when the data is already compressed.

Exist option

Behaviour when the output file already exists at the path specified.

Value	Behaviour
OverwriteFile	Deletes the existing file and writes a new one.
IncrementFileName	Appends `_1`, `_2`, etc. to the file name until a non-existing name is found (for example, `report_1.parquet`).
ThrowException	Raises an exception if the file already exists.
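
The IncrementFileName behaviour can be pictured with a short Python sketch. This is not the plugin's code; `next_free_path` is a hypothetical helper, and the only thing taken from the table above is the `_1`, `_2`, ... suffix pattern. It assumes the original name is used as-is when no file exists at that path.

```python
from pathlib import Path


def next_free_path(path: str) -> Path:
    """Return `path` if it is unused, else the first `<stem>_<n><suffix>` that does not exist."""
    candidate = Path(path)
    if not candidate.exists():
        return candidate
    n = 1
    while True:
        # report.parquet -> report_1.parquet, report_2.parquet, ...
        numbered = candidate.with_name(f"{candidate.stem}_{n}{candidate.suffix}")
        if not numbered.exists():
            return numbered
        n += 1
```

Under these assumptions, calling `next_free_path` for a `report.parquet` that already exists yields a path ending in `report_1.parquet`, unless that name is also taken.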

Row group size

The number of rows buffered in memory before each Parquet row group is flushed to disk. Default: 5000. Minimum: 1.

Lower values reduce memory usage but weaken compression. Higher values improve compression but increase peak memory during writing. The default works for most workloads.
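
To make the memory trade-off concrete, the sketch below shows one way rows could be buffered into groups of a fixed size before each flush. It is an illustration in Python, not the plugin's implementation; `row_groups` is a hypothetical name, and only the default of 5000 and the minimum of 1 come from this page.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def row_groups(rows: Iterable[dict], row_group_size: int = 5000) -> Iterator[List[dict]]:
    """Yield successive lists of at most `row_group_size` rows."""
    if row_group_size < 1:
        raise ValueError("RowGroupSize must be at least 1.")
    it = iter(rows)
    while True:
        # Only one batch is held in memory at a time.
        batch = list(islice(it, row_group_size))
        if not batch:
            return
        yield batch
```

With 12 rows and a size of 5, this yields groups of 5, 5, and 2 rows. Peak memory tracks the group size, which is why smaller values lower memory usage at the cost of giving the codec less data to compress per group.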

Schema Inference

The output schema is inferred from the first row at runtime. Each property becomes a column, with its type mapped to the corresponding Parquet type:

Linx Type	Parquet Type
`integer`	INT32
`double`	DOUBLE
`boolean`	BOOLEAN
`string`	BYTE_ARRAY (UTF8)
`decimal`	FIXED_LEN_BYTE_ARRAY (DECIMAL)
`DateTime`	INT64 (TIMESTAMP millis)
`byte`	BYTE_ARRAY
`int?`, `double?`, etc.	Nullable column

If there are no rows, an empty Parquet file is written with the correct schema.
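
The idea of first-row inference can be sketched in Python as follows. This is a rough illustration only: Python types stand in for the Linx types in the table, `infer_schema` and `PARQUET_TYPES` are hypothetical names, and the sketch handles neither null values in the first row nor nullable columns.

```python
from datetime import datetime
from decimal import Decimal

# Python stand-ins for the Linx types listed in the table above (an assumption).
PARQUET_TYPES = {
    bool: "BOOLEAN",
    int: "INT32",
    float: "DOUBLE",
    str: "BYTE_ARRAY (UTF8)",
    Decimal: "FIXED_LEN_BYTE_ARRAY (DECIMAL)",
    datetime: "INT64 (TIMESTAMP millis)",
    bytes: "BYTE_ARRAY",
}


def infer_schema(rows):
    """Map each property of the first row to a Parquet type name."""
    first = next(iter(rows), None)
    if first is None:
        # The sketch has no type information without a row.
        return {}
    return {name: PARQUET_TYPES[type(value)] for name, value in first.items()}
```

Because only the first row is inspected, later rows are expected to have the same properties and types. The sketch returns an empty mapping for empty input because it sees nothing but the rows; how the function itself determines the schema in that case is not covered here.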

Validation

Condition	Error
File path is empty	`"File path cannot be null or empty."`
Output directory does not exist	`"Output directory does not exist: {dir}."`
File exists and Exist option is ThrowException	`"File already exists: {path}."`
Row group size is less than 1	`"RowGroupSize must be at least 1."`
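
The checks in the table can be read as a sequence of guard conditions. The Python sketch below mirrors them using the error messages listed above; the function name `validate`, the exception types, and the order of the checks are assumptions made for illustration, not documented behaviour.

```python
import os


def validate(file_path, exist_option="OverwriteFile", row_group_size=5000):
    """Raise an error with the documented message if a condition from the table is met."""
    if not file_path:
        raise ValueError("File path cannot be null or empty.")
    directory = os.path.dirname(file_path)
    if directory and not os.path.isdir(directory):
        raise FileNotFoundError(f"Output directory does not exist: {directory}.")
    if exist_option == "ThrowException" and os.path.exists(file_path):
        raise FileExistsError(f"File already exists: {file_path}.")
    if row_group_size < 1:
        raise ValueError("RowGroupSize must be at least 1.")
```

Note that the output directory must already exist: per the table, a missing directory is reported as an error.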

How To

Export data from a database to a Parquet file

This example queries a database and writes the results to a Parquet file.

Steps:

1. Add an ExecuteSQL function (or equivalent database query function) and set Return options to List of rows.
2. From the Parquet plugin, drag WriteParquetFile onto the design canvas and place it after the query function.
3. Set File path to the destination .parquet file path.
4. Set Data to the result list from the database query (for example, ExecuteSQL.Result).
5. Choose a Compression codec and Exist option as needed.