PySpark RDD operations – Map, Filter, SortBy, reduceByKey, Joins
In the last post, we discussed basic operations on RDDs in PySpark. In this post, we will look at other common operations you can perform on an RDD in PySpark. Let's quickly go through the syntax and examples for various RDD operations: read a file into an RDD; convert each record into a list of elements; remove the header data […]
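As a rough illustration of the first three steps named above (read a file, split each record into a list, drop the header), here is a minimal sketch. It assumes a comma-separated text file whose first line is a header; the names `parse_record`, `build_rdd`, `demo`, and the path passed in are hypothetical and not taken from the original post.

```python
import csv


def parse_record(line):
    """Split one comma-separated text line into a list of fields."""
    return next(csv.reader([line]))


def build_rdd(sc, path):
    """Read a text file, drop its header line, and split each record.

    `sc` is an existing SparkContext; `path` points to a CSV-like file
    whose first line is assumed to be a header.
    """
    lines = sc.textFile(path)      # RDD of raw text lines
    header = lines.first()         # the first line is assumed to be the header
    # Keep every line that differs from the header text.
    data = lines.filter(lambda line: line != header)
    return data.map(parse_record)  # RDD of lists of strings


def demo(path):
    """Hypothetical driver: needs a working PySpark installation."""
    from pyspark import SparkContext  # imported lazily so parse_record works without Spark

    sc = SparkContext.getOrCreate()
    try:
        return build_rdd(sc, path).take(5)
    finally:
        sc.stop()
```

Calling `demo` with the path to your own file returns the first five parsed records. Note that the filter removes any line identical to the header, so a data row that repeats the header text exactly would be dropped as well.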