SQL & Hadoop

SQL on Hadoop with Hive, Spark & PySpark on EMR & AWS Glue

  • Home
  • About
  • Contact
  • Privacy Policy

Category: SQL on Hadoop

Columnar Storage & why you must use it

If you store structured data on Hadoop or any other platform, you have probably heard about columnar storage types. The popularity that “columnar” has gained over the past 7-8 years confirms that the buzz is […]

SQL on Hadoop | By Raj | 2 comments

Hadoop & Hive – Introduction for Beginners

Hadoop is a very popular framework for data storage and data processing. It serves two main purposes: distributed data storage using HDFS (Hadoop Distributed File System) and data processing using MapReduce. In Hadoop, everything is stored in file format. It […]

SQL on Hadoop | By Raj | 1 comment

SQL on RDBMS to SQL on Hadoop/Cloud

We have all been using SQL on RDBMS for a long time. The time has come to switch to SQL on Hadoop/Cloud. SQL (Structured Query Language) helps us communicate with any RDBMS, such as Teradata, Oracle, Netezza, SQL […]

SQL on Hadoop | By Raj | 0 comments

© 2023 SQL & Hadoop.