in Big Data | Hadoop
What is a Parquet file in Spark?

1 Answer

Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem. Any data processing framework, data model, or programming language can use it.

It is a compressed, space-efficient encoding format common to Hadoop ecosystem projects.

Spark SQL supports both reading and writing Parquet files, and it automatically preserves the schema of the original data.

During write operations, Spark by default converts all columns of a Parquet file to nullable columns, for compatibility reasons.
