The map-reduce API is used for the data partition in Spark. In the input format, one can make more than one partition. For best performance, the HDFS block size is the partition size, but you can change partition sizes with tools like Split.