What is a Parquet file in PySpark?

Question

What is a Parquet file in PySpark?

1 Answer

rajeshsharma · Answer 1 · 2022-03-13T12:26:38+0000

Parquet is a columnar storage file format supported by many data processing systems. In PySpark, Spark SQL can both read and write Parquet files, and the schema of the data is preserved in the file.

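Because Parquet stores data column by column rather than row by row, it provides the following advantages:

It is compact and consumes less storage space, since values of the same type compress well together.
It lets a query fetch only the specific columns it needs instead of reading whole rows.
It uses type-specific encoding for each column.
It keeps per-column statistics (such as minimum and maximum values) in the file metadata, which gives better-summarized data.
It requires fewer I/O operations, because unneeded columns are not read.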
