SQL & Hadoop

SQL on Hadoop with Hive, Spark & PySpark on EMR & AWS Glue

Category: Apache HIVE

Apache Hive is one of the most popular SQL frameworks in the Hadoop ecosystem.

How to Subtract TIMESTAMP-DATE-TIME in HIVE

We may want to subtract two timestamps to find the time elapsed between two events. This is a very common operation on any TIMESTAMP, DATE, or TIME data type. Now the question […]

Apache HIVE | By Raj | 0 comments

Handle Date and Timestamp in HIVE like a pro – Everything you must know

Hive supports the traditional UNIX timestamp data type with nanosecond precision, that is, up to 9 decimal places (Teradata's TIMESTAMP data type goes up to 6 decimal places). A typical TIMESTAMP value has a DATE part, which is YEAR/MONTH/DAY, and a TIME part, which is […]

Apache HIVE | By Raj | 1 comment

Create your first Table in HIVE and load data into it.

Once you have access to HIVE, the first thing you will want to do is create a database and create a few tables in it. Before we start with the SQL commands, it is good to know how HIVE stores […]

Apache HIVE | By Raj | 1 comment

Bucketized tables do not support INSERT INTO

Have you encountered this error while working on HIVE tables with clusters and buckets? This is one of the most common errors we face in HIVE while doing transactions on tables. Since transactions in HIVE are still not as […]

Apache HIVE | By Raj | 0 comments

© 2023 SQL & Hadoop.