Recently I was working on a task where I needed the column list of a Spark DataFrame in a variable, so that further processing could depend on whether certain technical columns were present. You can print the schema of a DataFrame using the printSchema method. It shows the columns as a tree, along with each column's data type and nullability. Example:
scala> df_pres.printSchema()
root
 |-- pres_id: byte (nullable = true)
 |-- pres_name: string (nullable = true)
 |-- pres_dob: date (nullable = true)
 |-- pres_bp: string (nullable = true)
 |-- pres_bs: string (nullable = true)
 |-- pres_in: date (nullable = true)
 |-- pres_out: date (nullable = true)
To fetch just the column names, we can use “columns”, which returns all the column names in the DataFrame as an Array of Strings.
Dataframe Columns
scala> df_pres.columns
res8: Array[String] = Array(pres_id, pres_name, pres_dob, pres_bp, pres_bs, pres_in, pres_out)
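Because “columns” returns a plain Array[String], the usual Scala collection methods work on it directly. Here is a minimal sketch of checking for technical columns. It assumes a hard-coded array standing in for df_pres.columns so that it runs without a Spark session. The list of technical columns, including the name load_ts, is hypothetical and not taken from the original task.

```scala
// Stand-in for df_pres.columns so the snippet runs without a Spark session.
val columns: Array[String] =
  Array("pres_id", "pres_name", "pres_dob", "pres_bp", "pres_bs", "pres_in", "pres_out")

// Hypothetical "technical" columns to look for (load_ts is made up for illustration).
val technicalCols = Seq("pres_in", "pres_out", "load_ts")

// Split the technical columns into those present in the array and those missing.
val (present, missing) = technicalCols.partition(c => columns.contains(c))

println(s"present: ${present.mkString(",")}")
println(s"missing: ${missing.mkString(",")}")
```

Working on the array like this avoids having to parse a delimited string later, so it may be the simpler option when the goal is only a presence check.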
The requirement was to get this info into a variable. We can convert the Array of Strings into a single String using the “mkString” method, which returns a “String”.
scala> df_pres.columns.mkString(",")
res11: String = pres_id,pres_name,pres_dob,pres_bp,pres_bs,pres_in,pres_out
I wanted the column list to be comma separated. Let’s store this output in a variable to be used later for processing.
scala> var ColList = df_pres.columns.mkString(",")
ColList: String = pres_id,pres_name,pres_dob,pres_bp,pres_bs,pres_in,pres_out
To check the value of this variable, we can print it.
scala> print (ColList)
pres_id,pres_name,pres_dob,pres_bp,pres_bs,pres_in,pres_out
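One way the stored string might be used later is to build a query from it. This is only a sketch. It uses a literal value standing in for the result of df_pres.columns.mkString(","), and the table name pres is a hypothetical placeholder that does not come from the original task.

```scala
// Literal stand-in for df_pres.columns.mkString(",").
val colList = "pres_id,pres_name,pres_dob,pres_bp,pres_bs,pres_in,pres_out"

// Build a SELECT statement; "pres" is a hypothetical table name.
val query = s"SELECT $colList FROM pres"
println(query)

// The delimited string can be turned back into an array when needed.
val colArray: Array[String] = colList.split(",")
println(colArray.length)
```

The sketch declares the variable with val rather than var because the value is never reassigned. var works too, as in the REPL session above.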
The argument to mkString is the separator, so you can use any delimiter you like. Below we set it to “|” in place of “,”.
scala> var ColList = df_pres.columns.mkString("|")
ColList: String = pres_id|pres_name|pres_dob|pres_bp|pres_bs|pres_in|pres_out
scala> print (ColList)
pres_id|pres_name|pres_dob|pres_bp|pres_bs|pres_in|pres_out
This way we can fetch all the columns present in a DataFrame and store them in a variable with the desired delimiter.
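For reference, mkString comes in three forms in the Scala standard library: no argument, a separator, or a prefix, separator, and suffix. The sketch below shows all three on a shortened stand-in array (not the real df_pres.columns). It also covers one thing to watch for with “|”: the character is a regex metacharacter, so splitting the string back into an array is safer with the Char overload of split or with an escaped pattern.

```scala
// Shortened stand-in for df_pres.columns.
val columns = Array("pres_id", "pres_name", "pres_dob")

// No argument: elements are concatenated with nothing between them.
val plain = columns.mkString
// One argument: the separator placed between elements.
val piped = columns.mkString("|")
// Three arguments: prefix, separator, suffix.
val wrapped = columns.mkString("(", ", ", ")")

// "|" means alternation in a regex, so split on a Char (or on the escaped "\\|").
val back = piped.split('|')

println(plain)
println(piped)
println(wrapped)
println(back.mkString(","))
```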
