How to Subtract TIMESTAMP-DATE-TIME in HIVE

We may want to subtract two timestamps in order to find out the difference between occurrence of two events. This is a very common operation which we perform on any TIMESTAMP or DATE or TIME data type. Now the question is how to find out the difference or subtract two timestamp ?

Hive Timestamp Difference

In HIVE we can directly subtract two timestamp columns/values and get the result. Let’s see this with an example:

TIMESTAMP 1: 2020-09-22 17:22:38
TIMESTAMP 2: 2020-09-12 14:53:19

Now I want to find the difference between these 2 timestamp values. I can directly subtract it.

hive subtract timestamp

hive> select CAST('2017-09-22 17:22:38' as timestamp) - CAST('2017-09-12 14:53:19' as timestamp );
 OK
 10 02:29:19.000000000

The OUTPUT is in format “interval_day_time”.
Now if you want the output in SECONDS rather than DAY HOUR:MINUTES:SECOND then you can extract the values and convert it individually to SECONDS.

hive extract day, hour, minute, second from timestamp

hive> select day(CAST('2017-09-22 17:22:38' as timestamp) - CAST('2017-09-12 14:53:19' as timestamp ));
10
hive> select hour(CAST('2017-09-22 17:22:38' as timestamp) - CAST('2017-09-12 14:53:19' as timestamp ));
2
hive> select minute(CAST('2017-09-22 17:22:38' as timestamp) - CAST('2017-09-12 14:53:19' as timestamp ));
29
hive> select second(CAST('2017-09-22 17:22:38' as timestamp) - CAST('2017-09-12 14:53:19' as timestamp ));
19

So 10 DAYS will become 10*24*60*60 = 864000
02 HOUR will become 02*60*60 = 7200
29 MINS will become 29*60 = 1740
19 SECS will remain 19 = 19
So TOTAL SECONDS: 872959 sec

hive subtract interval from timestamp

You can use “INTERVAL” keyword to specify the interval to subtract from given timestamp. This interval could be only hour or could be a combination of day-hour-min-sec.

hive subtract interval from timestamp

hive> select CAST('2017-09-22 17:22:38' as timestamp) - interval '10 02:29:19' day to second;
OK
2017-09-12 14:53:19

hive subtract hours from timestamp

hive> select CAST('2017-09-22 17:22:38' as timestamp) - interval '10' hour as col1;
OK
2017-09-22 07:22:38

hive subtract minutes from timestamp

hive> select CAST('2017-09-22 17:22:38' as timestamp) - interval '10' minute as col1;
OK
2017-09-22 17:12:38

hive subtract seconds from timestamp

hive> select CAST('2017-09-22 17:22:38' as timestamp) - interval '10' second as col1;
OK
2017-09-22 17:22:28

hive subtract date

Similarly we may want to subtract two DATEs and find the difference. There are functions available in HIVE to find difference between two dates however we can follow the same method to find the difference too. Let’s see how we can use DATEDIFF function to get the output:

hive> select datediff(to_date('2017-09-22'), to_date('2017-09-12'));
 OK
 10

The return type is “INT” and it shows the difference in number of DAYS. So in this case output is 10 which means 10 Days difference between 2 dates.
We can use the other method too to directly subtract the dates and output will be in format “interval_day_time”.

hive> select CAST('2017-09-22' as DATE) - CAST('2017-09-12' as DATE );
 OK
 10 00:00:00.000000000

Now if you want the output in SECONDS then just convert DAY into seconds.
So 10 DAYS will become 10*24*60*60 = 864000

hive subtract time

Similarly we want to subtract two TIMEs and find the difference. In this we cannot directly subtract as there is no TIME data type in HIVE. There is DATE and there is TIMESTAMP however presently we don’t have any explicit TIME data type in HIVE. So to subtract 2 TIME (present as String in HIVE) we will use unix_timestamp function. The function return type is “BIGINT” and is the difference between TIMESTAMP and UNIX Epoch.

hive> select unix_timestamp('17:22:38','HH:mm:ss') - unix_timestamp('14:53:19','HH:mm:ss');
 OK
 8959

So there is 8959 seconds difference between these two TIME slots.

Hope the above examples helped you in calculating difference between 2 TIMESTAMPs or 2 DATEs or even 2 TIMEs value.

Leave a Reply