Spark Dataframe WHEN case

In SQL, if we have to check multiple conditions for any column value then we use case statament. In Spark SQL dataframes also we can replicate same functionality by using WHEN clause multiple times, once for each conditional check. No requirement to add CASE keyword though. So let’s see an example to see how to check for multiple conditions and replicate SQL CASE statement in Spark SQL.

scala> df_pres.select($"pres_name",$"pres_dob",$"pres_bs",
 when($"pres_bs"==="Virginia","VA").when($"pres_bs"==="Massachusetts","MA")
.when($"pres_bs"==="Ohio","OH").otherwise("Others").alias("state_abbr")).show()
+--------------------+----------+--------------------+----------+
|           pres_name|  pres_dob|             pres_bs|state_abbr|
+--------------------+----------+--------------------+----------+
|   George Washington|1732-02-22|            Virginia|        VA|
|          John Adams|1735-10-30|       Massachusetts|        MA|
|    Thomas Jefferson|1743-04-13|            Virginia|        VA|
|       James Madison|1751-03-16|            Virginia|        VA|
|        James Monroe|1758-04-28|            Virginia|        VA|
|   John Quincy Adams|1767-07-11|       Massachusetts|        MA|
|      Andrew Jackson|1767-03-15|South/North Carolina|    Others|
|    Martin Van Buren|1782-12-05|            New York|    Others|
|William Henry Har...|1773-02-09|            Virginia|        VA|
|          John Tyler|1790-03-29|            Virginia|        VA|
|       James K. Polk|1795-11-02|      North Carolina|    Others|
|      Zachary Taylor|1784-11-24|            Virginia|        VA|
|    Millard Fillmore|1800-01-07|            New York|    Others|
|     Franklin Pierce|1804-11-23|       New Hampshire|    Others|
|      James Buchanan|1791-04-23|        Pennsylvania|    Others|
|     Abraham Lincoln|1809-02-12|            Kentucky|    Others|
|      Andrew Johnson|1808-12-29|      North Carolina|    Others|
|    Ulysses S. Grant|1822-04-27|                Ohio|        OH|
| Rutherford B. Hayes|1822-10-04|                Ohio|        OH|
|   James A. Garfield|1831-11-19|                Ohio|        OH|
+--------------------+----------+--------------------+----------+
only showing top 20 rows

1 Comment

  1. Hi Nitin,

    How can we add more conditions within single WHEN clause?
    For example –
    when(condition1 && condition2 , “this value”).otherwise(“that value”)

    Thank you in advance.

Leave a Reply

Your email address will not be published. Required fields are marked *