Spark DataFrame concatenate strings

In many scenarios, you may want to concatenate multiple strings into one. For example, you may want to concatenate a customer’s “FIRST NAME” and “LAST NAME” to show their “FULL NAME”. With a Spark SQL DataFrame, we can use the concat function to join multiple strings into one string.
Let’s look at the example below:

scala> df_pres.select(concat($"pres_id",$"pres_name")).show()
+-------------------------+
|concat(pres_id,pres_name)|
+-------------------------+
|       1George Washington|
|              2John Adams|
|        3Thomas Jefferson|
|           4James Madison|
|            5James Monroe|
|       6John Quincy Adams|
|          7Andrew Jackson|
|        8Martin Van Buren|
|     9William Henry Ha...|
|             10John Tyler|
|          11James K. Polk|
|         12Zachary Taylor|
|       13Millard Fillmore|
|        14Franklin Pierce|
|         15James Buchanan|
|        16Abraham Lincoln|
|         17Andrew Johnson|
|       18Ulysses S. Grant|
|     19Rutherford B. H...|
|      20James A. Garfield|
+-------------------------+
only showing top 20 rows
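
For readers working outside spark-shell, here is a minimal standalone sketch of the same concat call. The object name ConcatSketch, the helper withIdAndName, the alias id_and_name and the two-row stand-in for df_pres are all assumptions made for illustration; only the column names pres_id and pres_name come from the example above. The explicit cast to string is a cautious choice in case pres_id is numeric.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.concat

object ConcatSketch {
  // Hypothetical helper: joins pres_id and pres_name into one column.
  // alias() gives the column a readable name instead of concat(pres_id,pres_name).
  def withIdAndName(df: DataFrame): DataFrame =
    df.select(concat(df("pres_id").cast("string"), df("pres_name")).alias("id_and_name"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("concat-sketch")
      .getOrCreate()
    import spark.implicits._

    // Assumed stand-in for df_pres, using two rows from the output above.
    val dfPres = Seq((1, "George Washington"), (2, "John Adams"))
      .toDF("pres_id", "pres_name")

    // show(false) disables the 20-character truncation that show() applies by default.
    withIdAndName(dfPres).show(false)

    spark.stop()
  }
}
```

Passing false to show() may also help with the rows in the output above that end in "...", since those values were cut off for display only.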

In the output above, we were able to join two columns into one. However, the result is hard to read because nothing separates the ID from the name. Most of the time, we want a delimiter between the first and second strings. To introduce one, we can use the concat_ws function. Its first parameter is the delimiter, which can be a single character or a multi-character string.
Let’s look at an example that uses “-” as the delimiter while concatenating two columns.

scala> df_pres.select(concat_ws("-",$"pres_id",$"pres_name")).show()
+------------------------------+
|concat_ws(-,pres_id,pres_name)|
+------------------------------+
|           1-George Washington|
|                  2-John Adams|
|            3-Thomas Jefferson|
|               4-James Madison|
|                5-James Monroe|
|           6-John Quincy Adams|
|              7-Andrew Jackson|
|            8-Martin Van Buren|
|          9-William Henry H...|
|                 10-John Tyler|
|              11-James K. Polk|
|             12-Zachary Taylor|
|           13-Millard Fillmore|
|            14-Franklin Pierce|
|             15-James Buchanan|
|            16-Abraham Lincoln|
|             17-Andrew Johnson|
|           18-Ulysses S. Grant|
|          19-Rutherford B. ...|
|          20-James A. Garfield|
+------------------------------+
only showing top 20 rows
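
The sketch below illustrates two points from the text: building a “FULL NAME” from first and last name columns, and using a multi-character delimiter with concat_ws. The object name ConcatWsSketch, both helper functions, the columns first_name, last_name, full_name and label, the " :: " delimiter and the sample rows are all assumptions chosen for illustration; they are not part of the df_pres data shown above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws}

object ConcatWsSketch {
  // Hypothetical helper: single-character delimiter (a space) between first and last name.
  def withFullName(df: DataFrame): DataFrame =
    df.withColumn("full_name", concat_ws(" ", col("first_name"), col("last_name")))

  // Hypothetical helper: multi-character delimiter " :: " between id and full name.
  // The id is cast to string explicitly so the sketch does not rely on implicit casting.
  def withLabel(df: DataFrame): DataFrame =
    df.withColumn("label", concat_ws(" :: ", col("pres_id").cast("string"), col("full_name")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("concat-ws-sketch")
      .getOrCreate()
    import spark.implicits._

    // Assumed sample data with separate first and last name columns.
    val names = Seq((1, "George", "Washington"), (2, "John", "Adams"))
      .toDF("pres_id", "first_name", "last_name")

    // Apply both helpers and print the untruncated result.
    withLabel(withFullName(names)).show(false)

    spark.stop()
  }
}
```

With the sample rows above, the label column should hold values such as "1 :: George Washington", which shows that the delimiter is inserted only between the values and not at either end.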
