ReadParquetFile

ReadParquetFile reads rows from an Apache Parquet file.

The function has two output modes. Row by row streams rows one at a time through a ForEachRow execution path. List of rows returns all rows as a typed list.

The output properties become available in the Designer only after you load a schema. Click the ellipsis (…) on the Schema property and select a template .parquet file; the column definitions are then populated automatically.

Properties

File path

The absolute or UNC path to the .parquet file to read. Linx expressions are supported.

Output type

Controls how rows are returned:

Row by row
A ForEachRow execution path is added to the function and runs once per row. Each row exposes one property per column. Use this mode for large files or row-by-row processing.
List of rows
The function returns a typed List of all rows. Use this mode when you need the full dataset at once.
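
The difference between the two modes can be pictured with a small Python sketch. This is only an analogy, not how Linx implements the function: `SAMPLE_ROWS`, `read_row_by_row`, and `read_list_of_rows` are hypothetical names, and the in-memory list stands in for the rows of a Parquet file.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]

# Hypothetical stand-in for the rows stored in a Parquet file.
SAMPLE_ROWS: List[Row] = [
    {"CustomerId": 1, "Name": "Ada", "Amount": 10.5},
    {"CustomerId": 2, "Name": "Grace", "Amount": 20.0},
]


def read_row_by_row(rows: List[Row]) -> Iterator[Row]:
    """Yield one row at a time, similar to the ForEachRow path."""
    for row in rows:
        yield row


def read_list_of_rows(rows: List[Row]) -> List[Row]:
    """Return every row at once, similar to the List of rows mode."""
    return list(rows)


# Row by row: the consumer handles one row per iteration.
total = 0.0
for row in read_row_by_row(SAMPLE_ROWS):
    total += row["Amount"]

# List of rows: the full dataset is available for len() and indexing.
all_rows = read_list_of_rows(SAMPLE_ROWS)
```

In the sketch, the generator only hands over the current row, which is why the row-by-row style suits large files, while the list form suits code that needs random access to the whole dataset.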

Row group index

Specifies which Parquet row group to read. The default value of -1 reads all row groups in sequence.

Set to a non-negative integer to read only that row group (0-based).

If the specified index is out of range, the function raises an error that states the total number of row groups in the file.
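
The selection rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's code: `select_rows` is a hypothetical helper, row groups are modelled as nested lists, and treating negative values other than -1 as out of range is an assumption the documentation does not confirm. The error text follows the message listed in the Validation section.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def select_rows(row_groups: List[List[Row]], row_group_index: int = -1) -> List[Row]:
    """Return rows from one row group, or from all groups when the index is -1."""
    count = len(row_groups)
    if row_group_index == -1:
        # Default: read every row group in sequence.
        return [row for group in row_groups for row in group]
    if not 0 <= row_group_index < count:
        # Assumption: any other index outside 0..count-1 is rejected.
        raise IndexError(
            f"Row group index {row_group_index} is out of range. "
            f"File has {count} row group(s)."
        )
    # 0-based lookup of a single row group.
    return list(row_groups[row_group_index])
```

For example, with two row groups, `select_rows(groups)` returns the rows of both groups in order, `select_rows(groups, 1)` returns only the second group, and `select_rows(groups, 2)` raises the out-of-range error.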

Schema

Stores the column definitions used to type the function output. Click the ellipsis (…) to open the schema loader and select a .parquet template file. The columns are read from the file and stored in this property. The column list cannot be edited directly; to update it, reload the schema using the schema loader.

The template file is only needed at design time and does not need to be present at runtime.

Validation

Condition	Error
File path is empty	`"File path cannot be null or empty."`
File does not exist at runtime	`"Parquet file not found: {path}."`
A specified column is not in the Parquet file's schema	`"Column '{name}' not found in Parquet file schema."`
Row group index is out of range	`"Row group index {n} is out of range. File has {count} row group(s)."`
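
As a rough illustration of the first three checks, the Python sketch below raises the messages listed in the table. It is not the plugin's implementation: `ParquetReadError`, `validate_file_path`, and `validate_columns` are hypothetical names, and comparing the stored schema columns against the columns found in the file is an assumption about when the column error occurs.

```python
import os
from typing import Iterable


class ParquetReadError(Exception):
    """Hypothetical error type used only in this sketch."""


def validate_file_path(file_path: str) -> None:
    """Reject an empty path, then a path that does not point to a file."""
    if not file_path:
        raise ParquetReadError("File path cannot be null or empty.")
    if not os.path.isfile(file_path):
        raise ParquetReadError(f"Parquet file not found: {file_path}.")


def validate_columns(schema_columns: Iterable[str], file_columns: Iterable[str]) -> None:
    """Assumption: every column in the stored schema must exist in the file."""
    available = set(file_columns)
    for name in schema_columns:
        if name not in available:
            raise ParquetReadError(
                f"Column '{name}' not found in Parquet file schema."
            )
```

Checking the path before the columns mirrors the order of the table; the actual order in which the function evaluates these conditions is not stated in this page.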

How To

Read data from a Parquet file into a database

This example reads every row from a Parquet file and inserts each row into a database table.

Steps:

1. Get a sample .parquet file with the same schema as the file you want to read.
2. From the Parquet plugin, drag ReadParquetFile onto the design canvas.
3. Click the ellipsis (…) on the Schema property, select the sample file, and wait for the column list to load.
4. Set File path to the target .parquet file (a static file location or an expression).
5. Set Output type to Row by row.
6. Inside the ForEachRow execution path, add an ExecuteSQL function and write an INSERT statement that references the row fields (for example, ForEachRow.CustomerId, ForEachRow.Name, ForEachRow.Amount).

Note: For large data volumes, use DBBulkCopy instead of row-by-row inserts with ExecuteSQL.
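
The trade-off in the note can be illustrated outside Linx with Python's standard `sqlite3` module. This is only an analogy under stated assumptions: the `Customers` table, the sample rows, and the helper names are hypothetical, and `executemany` stands in for the idea of a batched load rather than for how DBBulkCopy works.

```python
import sqlite3
from typing import Any, Dict, List

Row = Dict[str, Any]

INSERT_SQL = (
    "INSERT INTO Customers (CustomerId, Name, Amount) "
    "VALUES (:CustomerId, :Name, :Amount)"
)


def create_table(conn: sqlite3.Connection) -> None:
    """Create the hypothetical target table."""
    conn.execute(
        "CREATE TABLE Customers (CustomerId INTEGER, Name TEXT, Amount REAL)"
    )


def insert_row_by_row(conn: sqlite3.Connection, rows: List[Row]) -> None:
    """One INSERT per row, similar to ExecuteSQL inside ForEachRow."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
    conn.commit()


def insert_batch(conn: sqlite3.Connection, rows: List[Row]) -> None:
    """One batched call for all rows, the idea behind a bulk load."""
    conn.executemany(INSERT_SQL, rows)
    conn.commit()


def count_rows(conn: sqlite3.Connection) -> int:
    """Return the number of rows currently in the table."""
    return conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
```

Both helpers store the same data; the batched form issues a single call for the whole set, which is why a bulk approach tends to scale better as row counts grow.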

Read data from a Parquet file into a list for downstream processing

Steps:

1. From the Parquet plugin, drag ReadParquetFile onto the design canvas.
2. Click the ellipsis (…) on the Schema property and select a template file to load the column definitions.
3. Set File path to the target .parquet file.
4. Set Output type to List of rows.
5. Use ReadParquetFile.Result in downstream functions such as WriteParquetFile, a REST call, or any function that accepts a list. To process the items one at a time, loop through the list with a ForEach.
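
As a loose analogy for downstream use of the list, the Python sketch below loops over the items and serialises the whole list as it might be sent in a REST request body. The `result` variable is a hypothetical stand-in for ReadParquetFile.Result, and the column names and threshold are made up for illustration.

```python
import json
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical stand-in for ReadParquetFile.Result in List of rows mode.
result: List[Row] = [
    {"CustomerId": 1, "Name": "Ada", "Amount": 10.5},
    {"CustomerId": 2, "Name": "Grace", "Amount": 20.0},
]

# Loop through the items, as a ForEach would, and keep the larger amounts.
large_orders = [row for row in result if row["Amount"] > 15]

# Serialise the full list, for example as the body of a REST call.
payload = json.dumps(result)
```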