IN and NOT IN conditions are used in FILTER/WHERE clauses, and even in JOIN conditions, when we need to match a column against several possible values. A row qualifies for IN if the column's value equals any of the values listed inside the IN clause. NOT IN is the opposite: a row qualifies only if its value matches none of the listed values. In the Spark DataFrame API, the IN condition is written with the isin method on a column.

So let's look at an example of the IN condition:

scala> df_pres.filter($"pres_bs".isin("New York","Ohio","Texas")).select($"pres_name",$"pres_dob",$"pres_bs").show()
+--------------------+----------+--------+
|           pres_name|  pres_dob| pres_bs|
+--------------------+----------+--------+
|    Martin Van Buren|1782-12-05|New York|
|    Millard Fillmore|1800-01-07|New York|
|    Ulysses S. Grant|1822-04-27|    Ohio|
| Rutherford B. Hayes|1822-10-04|    Ohio|
|   James A. Garfield|1831-11-19|    Ohio|
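The NOT IN case can be sketched the same way by negating the isin condition with `!`. The snippet below is a minimal, self-contained sketch rather than the original setup: the object name `InNotInSketch`, the helper names `sample`, `bornIn` and `notBornIn`, and the three-row stand-in for `df_pres` are all assumptions made for illustration. Only the column names `pres_name`, `pres_dob` and `pres_bs` come from the example above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object InNotInSketch {

  // Hypothetical stand-in for df_pres: a few rows with the same column names.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      ("Martin Van Buren", "1782-12-05", "New York"),
      ("Ulysses S. Grant", "1822-04-27", "Ohio"),
      ("George Washington", "1732-02-22", "Virginia")
    ).toDF("pres_name", "pres_dob", "pres_bs")
  }

  // IN: keep rows whose pres_bs equals any of the given values.
  def bornIn(df: DataFrame, states: Seq[String]): DataFrame =
    df.filter(col("pres_bs").isin(states: _*))

  // NOT IN: negate the same condition with the unary ! operator on Column.
  def notBornIn(df: DataFrame, states: Seq[String]): DataFrame =
    df.filter(!(col("pres_bs").isin(states: _*)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run, outside spark-shell
      .appName("in-not-in-sketch")
      .getOrCreate()

    val states = Seq("New York", "Ohio", "Texas")
    val df = sample(spark)

    bornIn(df, states).select("pres_name", "pres_dob", "pres_bs").show()
    notBornIn(df, states).select("pres_name", "pres_dob", "pres_bs").show()

    spark.stop()
  }
}
```

In spark-shell the same filters can be typed directly against df_pres, for example df_pres.filter(!($"pres_bs".isin("New York","Ohio","Texas"))). One thing to keep in mind: a row whose pres_bs is null satisfies neither the IN nor the NOT IN filter, because the comparison evaluates to null rather than true.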