Online SQL to PySpark Converter

Many people have recently asked me to help them learn PySpark, so I thought of building a utility that converts SQL to PySpark code. I am sharing this weekend project with you; it attempts to convert an input SQL query into PySpark DataFrame code.

Feel free to use it and share your feedback. I am reading your feedback and comments and will release new versions of the utility based on them, so please do leave a comment.

Generate PySpark Code Automatically

Enter your SQL here

PySpark output

Important points:

It is almost impossible to cover every type of SQL, and this utility is no exception. Since this is a weekend project that I am still working on, the SQL coverage may not be as broad as you or I would like. That said, here are some points to keep in mind while using the utility.

  • The utility does not support JOINs or subqueries right now.
  • When using aggregate functions, make sure to include a GROUP BY clause too.
  • Try to use an alias for derived columns.
  • Look at the sample query; you can convert similar SQL to PySpark.
  • I have tried to make sure that the generated output is accurate; however, I recommend that you verify the results at your end too.
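
The points above can be turned into a quick pre-check to run on a query before pasting it into the converter. What follows is only a minimal sketch: the helper name check_sql and the list of aggregate functions are my own assumptions, it is not part of the utility, and it relies on simple regex heuristics, so string literals that contain SQL keywords or functions with several arguments can trip it up.

```python
import re
from typing import List

# Aggregate functions this sketch recognises (an assumed, non-exhaustive list).
AGGREGATES = ("sum", "count", "avg", "min", "max")


def check_sql(sql: str) -> List[str]:
    """Hypothetical helper: return warnings for one SELECT statement."""
    warnings = []
    text = " ".join(sql.split())  # collapse newlines and repeated spaces

    # Point 1: JOINs and subqueries are not supported.
    if re.search(r"\bjoin\b", text, re.IGNORECASE):
        warnings.append("JOIN is not supported")
    # Heuristic: a second SELECT keyword is treated as a subquery.
    if len(re.findall(r"\bselect\b", text, re.IGNORECASE)) > 1:
        warnings.append("subqueries are not supported")

    # Everything between the first SELECT and the first FROM.
    match = re.search(r"\bselect\b(.*?)\bfrom\b", text, re.IGNORECASE)
    select_list = match.group(1) if match else ""

    # Point 2: aggregate functions should come with a GROUP BY.
    agg_pattern = r"\b(?:%s)\s*\(" % "|".join(AGGREGATES)
    has_aggregate = re.search(agg_pattern, select_list, re.IGNORECASE)
    has_group_by = re.search(r"\bgroup\s+by\b", text, re.IGNORECASE)
    if has_aggregate and not has_group_by:
        warnings.append("aggregate function used without GROUP BY")

    # Point 3: derived columns (function calls, arithmetic) should have an alias.
    # Naive comma split, adequate only for single-argument functions.
    for column in select_list.split(","):
        column = column.strip()
        if not column or column.endswith("*"):
            continue  # skip empty items and wildcards such as * or t.*
        is_derived = re.search(r"[()+\-*/]", column) is not None
        has_alias = re.search(r"\bas\s+\w+$", column, re.IGNORECASE) is not None
        if is_derived and not has_alias:
            warnings.append(f"derived column without alias: {column}")

    return warnings


if __name__ == "__main__":
    print(check_sql("SELECT dept, SUM(salary) AS total FROM emp GROUP BY dept"))
    print(check_sql("SELECT SUM(salary) FROM emp"))
```

A query that returns no warnings, such as the first one in the sketch, is the kind that would typically be written by hand in PySpark as df.groupBy("dept").agg(F.sum("salary").alias("total")), with F being pyspark.sql.functions. You should still compare the converter's output against your own expectations.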

3 thoughts on “Online SQL to PySpark Converter”

  1. Hi Sir, I am a Data Engineer and I am new to PySpark. I have been assigned a task, so would you be so kind as to help me with it?

