Repartitioning redistributes the rows of a dataframe into a requested number of partitions, based on a column, an expression, or no key at all (in which case the rows are simply spread evenly). The arguments you pass determine both how rows are assigned and how many partitions are created. Repartitioning is worthwhile when you understand your data well enough to know that partitioning on certain key columns will improve the performance of later dataframe operations. Keep in mind that repartition is a costly operation, because it requires a full shuffle of the data across nodes. You can use “Repartition” to either increase or decrease the number of partitions.
Let’s see this with an example:
scala> df_states.show()
+-----------+----------+-------------+
| state_name|state_abbr|state_capital|
+-----------+----------+-------------+
| Alabama| AL| Montgomery|
| Alaska| AK| Juneau|
| Arizona| AZ| Phoenix|
| Arkansas| AR| Little Rock|
| California| CA| Sacramento|
| Colorado| CO| Denver|
|Connecticut| CT| Hartford|
| Delaware| DE| Dover|
| Florida| FL| Tallahassee|
| Georgia| GA| Atlanta|
| Hawaii| HI| Honolulu|
| Idaho| ID| Boise|
| Illinois| IL| Springfield|
| Indiana| IN| Indianapolis|
| Iowa| IA| Des Moines|
| Kansas| KS| Topeka|
| Kentucky| KY| Frankfort|
| Louisiana| LA| Baton Rouge|
| Maine| ME| Augusta|
| Maryland| MD| Annapolis|
+-----------+----------+-------------+
only showing top 20 rows
Check the number of partitions for this dataframe:
scala> df_states.rdd.partitions.size
res6: Int = 1
This means all the data sits in a single partition.
Now repartition by passing the number of partitions you want (say 5) and verify the partition count.
scala> val df_states_part5 = df_states.repartition(5)
df_states_part5: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [state_name: string, state_abbr: string ... 1 more field]
scala> df_states_part5.rdd.partitions.size
res7: Int = 5
So we have repartitioned the existing dataframe from 1 partition to 5. Because no column was given, the rows were not placed by value: they were dealt out evenly across the requested partitions in round-robin fashion. This can be confirmed from the explain plan.
scala> df_states_part5.explain()
== Physical Plan ==
Exchange RoundRobinPartitioning(5)
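To make the round-robin idea concrete, here is a minimal sketch in plain Scala (no Spark needed). It is only an illustration: the object name RoundRobinSketch and the helper roundRobin are made up for this example, and it assumes a single input partition and a starting offset of 0, whereas Spark chooses its own starting position.

```scala
object RoundRobinSketch {
  // Deal rows out in turn: row i goes to partition i % numPartitions.
  // Assumption: one input partition and start offset 0 (Spark picks its own start).
  def roundRobin[A](rows: Seq[A], numPartitions: Int): Map[Int, Seq[A]] =
    rows.zipWithIndex
      .groupBy { case (_, i) => i % numPartitions }
      .map { case (p, pairs) => p -> pairs.map(_._1) }

  def main(args: Array[String]): Unit = {
    // First ten abbreviations from df_states.
    val abbrs = Seq("AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA")
    roundRobin(abbrs, 5).toSeq.sortBy(_._1).foreach { case (p, rows) =>
      println(s"partition $p: ${rows.mkString(", ")}")
    }
  }
}
```

With 10 rows and 5 partitions, every partition receives exactly 2 rows, regardless of what the values are.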
Similarly, you can specify the column on which the repartition should be based. The data is then repartitioned using “HASH” on that column. When you do not pass a partition count, the number of partitions is taken from spark.sql.shuffle.partitions (200 by default). Change this value if you want a different number of partitions.
scala> val df_states_partCol = df_states.repartition($"state_abbr")
df_states_partCol: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [state_name: string, state_abbr: string ... 1 more field]
scala> df_states_partCol.explain()
== Physical Plan ==
Exchange hashpartitioning(state_abbr#35, 200)
scala> spark.sql("set spark.sql.shuffle.partitions").show(false)
+----------------------------+-----+
|key |value|
+----------------------------+-----+
|spark.sql.shuffle.partitions|200 |
+----------------------------+-----+
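If 200 partitions is not what you want, there are two common ways to change it. The sketch below shows both; it assumes Spark is on the classpath, the object name RepartitionByColumnSketch and the helper byAbbr are made up for this example, and the three sample rows only stand in for df_states. Note that on newer Spark versions with adaptive query execution enabled, the second option may end up with fewer partitions at run time, so only the first option is checked here.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RepartitionByColumnSketch {
  // Option 1: pass the partition count together with the column.
  def byAbbr(df: DataFrame, numPartitions: Int): DataFrame =
    df.repartition(numPartitions, df("state_abbr"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("repartition-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for df_states.
    val df = Seq(
      ("Alabama", "AL", "Montgomery"),
      ("Alaska", "AK", "Juneau"),
      ("Arizona", "AZ", "Phoenix")
    ).toDF("state_name", "state_abbr", "state_capital")

    println(byAbbr(df, 10).rdd.getNumPartitions)

    // Option 2: change the session default used when no count is given.
    spark.conf.set("spark.sql.shuffle.partitions", "10")
    df.repartition($"state_abbr").explain()

    spark.stop()
  }
}
```

Passing the count explicitly keeps the choice local to one dataframe, while changing spark.sql.shuffle.partitions affects every shuffle in the session.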
The number of partitions determines the number of part files created when the dataframe is saved as a file. Since there are only 50 distinct values, and some of them hash to the same partition, we will get 200 part files but most of them will be empty. The target partition is computed roughly as hash(VALUE) % numPartitions; Spark uses a Murmur3 hash of the partitioning column rather than the JVM hashCode(), and a non-negative modulo. Let’s verify this too.
scala> df_states_partCol.write.format("csv").save("/tmp/raj/dfdata")
[root@sandbox-hdp ~]# hdfs dfs -ls -S /tmp/raj/dfdata/
Found 201 items
-rw-r--r-- 1 root hdfs 81 2019-08-21 04:10 /tmp/raj/dfdata/part-00004-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 81 2019-08-21 04:10 /tmp/raj/dfdata/part-00110-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 79 2019-08-21 04:10 /tmp/raj/dfdata/part-00175-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 77 2019-08-21 04:10 /tmp/raj/dfdata/part-00049-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 77 2019-08-21 04:10 /tmp/raj/dfdata/part-00066-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 76 2019-08-21 04:10 /tmp/raj/dfdata/part-00091-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 49 2019-08-21 04:10 /tmp/raj/dfdata/part-00115-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 48 2019-08-21 04:10 /tmp/raj/dfdata/part-00185-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 47 2019-08-21 04:10 /tmp/raj/dfdata/part-00078-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 47 2019-08-21 04:10 /tmp/raj/dfdata/part-00154-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 46 2019-08-21 04:10 /tmp/raj/dfdata/part-00047-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 46 2019-08-21 04:10 /tmp/raj/dfdata/part-00065-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 46 2019-08-21 04:10 /tmp/raj/dfdata/part-00195-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 45 2019-08-21 04:10 /tmp/raj/dfdata/part-00009-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 45 2019-08-21 04:10 /tmp/raj/dfdata/part-00070-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 45 2019-08-21 04:10 /tmp/raj/dfdata/part-00097-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 44 2019-08-21 04:10 /tmp/raj/dfdata/part-00042-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 44 2019-08-21 04:10 /tmp/raj/dfdata/part-00051-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 44 2019-08-21 04:10 /tmp/raj/dfdata/part-00183-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 42 2019-08-21 04:10 /tmp/raj/dfdata/part-00010-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 42 2019-08-21 04:10 /tmp/raj/dfdata/part-00116-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 41 2019-08-21 04:10 /tmp/raj/dfdata/part-00080-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 41 2019-08-21 04:10 /tmp/raj/dfdata/part-00095-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 41 2019-08-21 04:10 /tmp/raj/dfdata/part-00108-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 40 2019-08-21 04:10 /tmp/raj/dfdata/part-00055-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 40 2019-08-21 04:10 /tmp/raj/dfdata/part-00071-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 40 2019-08-21 04:10 /tmp/raj/dfdata/part-00074-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 39 2019-08-21 04:10 /tmp/raj/dfdata/part-00059-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 39 2019-08-21 04:10 /tmp/raj/dfdata/part-00094-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 39 2019-08-21 04:10 /tmp/raj/dfdata/part-00167-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 37 2019-08-21 04:10 /tmp/raj/dfdata/part-00016-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 37 2019-08-21 04:10 /tmp/raj/dfdata/part-00130-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 36 2019-08-21 04:10 /tmp/raj/dfdata/part-00054-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 36 2019-08-21 04:10 /tmp/raj/dfdata/part-00075-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 36 2019-08-21 04:10 /tmp/raj/dfdata/part-00165-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 36 2019-08-21 04:10 /tmp/raj/dfdata/part-00199-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 35 2019-08-21 04:10 /tmp/raj/dfdata/part-00067-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 35 2019-08-21 04:10 /tmp/raj/dfdata/part-00107-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 34 2019-08-21 04:10 /tmp/raj/dfdata/part-00083-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 34 2019-08-21 04:10 /tmp/raj/dfdata/part-00179-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 33 2019-08-21 04:10 /tmp/raj/dfdata/part-00030-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 33 2019-08-21 04:10 /tmp/raj/dfdata/part-00151-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 33 2019-08-21 04:10 /tmp/raj/dfdata/part-00169-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 29 2019-08-21 04:10 /tmp/raj/dfdata/part-00060-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/_SUCCESS
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00000-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00001-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00002-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00003-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00005-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00006-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00007-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00008-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00011-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00012-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00013-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00014-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00015-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00017-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00018-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00019-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00020-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00021-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00022-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00023-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00024-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00025-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00026-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00027-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00028-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00029-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00031-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00032-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00033-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00034-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00035-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00036-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00037-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00038-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00039-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00040-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00041-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00043-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00044-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00045-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00046-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00048-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00050-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00052-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00053-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00056-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00057-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00058-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00061-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00062-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00063-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00064-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00068-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00069-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00072-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00073-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00076-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00077-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00079-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00081-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00082-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00084-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00085-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00086-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00087-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00088-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00089-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00090-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00092-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00093-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00096-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00098-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00099-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00100-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00101-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00102-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00103-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00104-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00105-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00106-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00109-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00111-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00112-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00113-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00114-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00117-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00118-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00119-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00120-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00121-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00122-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00123-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00124-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00125-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00126-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00127-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00128-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00129-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00131-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00132-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00133-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00134-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00135-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00136-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00137-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00138-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00139-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00140-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00141-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00142-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00143-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00144-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00145-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00146-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00147-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00148-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00149-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00150-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00152-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00153-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00155-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00156-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00157-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00158-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00159-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00160-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00161-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00162-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00163-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00164-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00166-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00168-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00170-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00171-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00172-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00173-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00174-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00176-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00177-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00178-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00180-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00181-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00182-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00184-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00186-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00187-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00188-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00189-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00190-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00191-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00192-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00193-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00194-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00196-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00197-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
-rw-r--r-- 1 root hdfs 0 2019-08-21 04:10 /tmp/raj/dfdata/part-00198-f153bf3d-0759-42b5-87b8-c4cc28fb568d-c000.csv
It generated 200 part files (the listing reports 201 items because it also counts the _SUCCESS marker). The top 44 files contain data, while the remaining ones are empty.
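Why fewer non-empty files than distinct values? A small plain-Scala sketch of hash bucketing shows the effect. This is an approximation only: it uses MurmurHash3 from the Scala standard library as a stand-in, so the bucket numbers will not match the part numbers Spark actually produced, and the names HashBucketSketch, bucket, and occupied are made up for this example.

```scala
import scala.util.hashing.MurmurHash3

object HashBucketSketch {
  // Non-negative modulo, so negative hash values still land in 0 until numPartitions.
  def bucket(key: String, numPartitions: Int): Int =
    java.lang.Math.floorMod(MurmurHash3.stringHash(key), numPartitions)

  // Number of distinct buckets that receive at least one key.
  def occupied(keys: Seq[String], numPartitions: Int): Int =
    keys.map(bucket(_, numPartitions)).distinct.size

  def main(args: Array[String]): Unit = {
    // First ten abbreviations from df_states.
    val abbrs = Seq("AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA")
    println(s"${occupied(abbrs, 200)} of 200 buckets hold data")
    abbrs.foreach(a => println(s"$a -> bucket ${bucket(a, 200)}"))
  }
}
```

The number of occupied buckets can never exceed the number of distinct keys, and every collision lowers it further, which is the same reason the 50 states ended up in only 44 non-empty files.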