Hive Date Functions

Hive Date Functions – all possible Date operations

Date Functions are used to apply different transformations on DATE datatype in HIVE. DATE consists of primarily 3 components – YEAR , MONTH & DAY. Any function which help us in identifying, extract or modify value of these DATE components can be classified as DATE Function.

Hive Date Functions
Hive Date Functions

Handling DATE in HIVE is not as easy as compared to many traditional RDBMS in the market. One of the reason why I feel so is the inability of HIVE to throw proper error when dealing with wrong date formats. If any function or date format is not compatible or is not used correctly, it will give NULL as output in place of error unless it is syntax error. This makes identification of wrong date manipulation in HIVE more tricky.

hive date functions

There are many date functions available in hive which you can use and below is the list of most common hive date functions.

hive current_date : fetch today’s date in hive

hive> select current_date;
OK
2021-04-27

hive date_add : add number of days to given date

hive> select current_date as today_date, date_add(current_date,10) as add_10_days;
OK
today_date      add_10_days
2021-04-27      2021-05-07
Time taken: 0.046 seconds, Fetched: 1 row(s)

Note : You can also give negative number to subtract days using date_add function.

hive> select current_date as today_date, date_add(current_date,-10) as `add_-10_days`;
OK
today_date      add_-10_days
2021-04-27      2021-04-17
Time taken: 0.045 seconds, Fetched: 1 row(s)

hive date_sub : subtract number of days from given date

hive> select current_date as today_date, date_sub(current_date,10) as sub_10_days;
OK
today_date      sub_10_days
2021-04-27      2021-04-17
Time taken: 0.045 seconds, Fetched: 1 row(s)

Note : If you will pass negative number to date_sub , it will work as date_add.

hive> select current_date as today_date, date_sub(current_date,-10) as `sub_-10_days`;
OK
today_date      sub_-10_days
2021-04-27      2021-05-07
Time taken: 0.045 seconds, Fetched: 1 row(s)

hive current day minus 1 day

hive> select current_date as today_date, date_sub(current_date,1) as sub_1_day;
OK
today_date      sub_1_day
2021-04-27      2021-04-26
Time taken: 0.039 seconds, Fetched: 1 row(s)

hive day of week : fetch weekday number – sunday is 1, monday is 2 …sat is 7

hive> select extract(dayofweek from current_date);
OK
_c0
3
Time taken: 0.044 seconds, Fetched: 1 row(s)

hive add_months : add months to given date

hive> select current_date as today_date, add_months(current_date,1) as add_1_month;
OK
today_date      add_1_month
2021-04-27      2021-05-27
Time taken: 0.042 seconds, Fetched: 1 row(s)

hive first day of month : use trunc (truncate) in hive

hive> select current_date as today_date, trunc(current_date,'MM') as first_day_month;
OK
today_date      first_day_month
2021-04-27      2021-04-01
Time taken: 0.04 seconds, Fetched: 1 row(s)

hive first day of year : use trunc (truncate) in hive

hive> select current_date as today_date, trunc(current_date,'YYYY') as first_day_year;
OK
today_date      first_day_year
2021-04-27      2021-01-01
Time taken: 0.041 seconds, Fetched: 1 row(s)

hive extract year from date

hive> select current_date as today_date, extract(year from current_date) as extract_year;
OK
today_date      extract_year
2021-04-27      2021
Time taken: 0.042 seconds, Fetched: 1 row(s)

hive get month from date

hive> select current_date as today_date, extract(month from current_date) as extract_month;
OK
today_date      extract_month
2021-04-27      4
Time taken: 0.042 seconds, Fetched: 1 row(s)

hive last day of month

hive> select current_date as today_date, last_day(current_date) as last_day_month;
OK
today_date      last_day_month
2021-04-27      2021-04-30
Time taken: 0.039 seconds, Fetched: 1 row(s)

hive datediff

Hive DATEDIFF function is used to calculate the difference between two dates. Datediff returns the number of days between two input dates. DATEDIFF function accepts two input parameters i.e. end date and start date. Make sure to pass end date as first parameter and start date as second parameter to DATEDIFF function in hive. Below is the example:

hive> select datediff('2021-04-27','2021-04-20') as days_diff;
OK
days_diff
7
Time taken: 0.037 seconds, Fetched: 1 row(s)

hive date format

In the below examples have used DATE_FORMAT which is available from Hive 1.2 onwards. There is more than one way to achieve same result and we have listed it where ever possible. Also you can use your own methods to get desired result like for adding a year you can try to add 12 months in place of a year. You can use most of the DATE FORMAT combinations together to change DATE into required format.

DATE_FORMAT will give correct output when your input date is in format ‘yyyy-MM-dd’. If it is in any other format , it will result in NULL output or error. In order to handle such scenarios you have to use combination of from_unixtime & unix_timestamp.

select from_unixtime(unix_timestamp('05-JAN-20', 'dd-MMM-yy'),'yyyy-MM-dd');

30+ hive date format examples

Title SQL Code Output Description
Current Date select current_date; 2018-06-15 How to select present date
Change Date Format select date_format(current_date,’MM/dd/YYYY’); 06/15/2018 Display date in some other date format
Change Date Format select date_format(current_date,’dd-MMM-YYYY’); 15-Jun-2018 Display Month Name in Date
Change Date Format select date_format(current_date,’dd-MMMM-YYYY’); 15-June-2018 Display full Month Name in Date
Day of the Week select date_format(current_date,’E’); Fri Display Day Name of the week
Day of the Week select date_format(current_date,’EEEE’); Friday Display Day Name(Full) of the week
Day of the Week select date_format(current_date,’u’); 5 Display Day Number of the week (1=Mon..5=Fri..7=Sun)
Day of the Year select date_format(current_date,’D’); 166 Display Day of the Year out of 365/366
Day Occurrence of the Month select date_format(current_date,’YYYY-MM-dd EEEE(F)’); 2018-06-15 Friday(3) Display occurrence of day in the month like 3rd Friday of month
Week of the year select date_format(current_date,’w’); 24 Display which week of the year date belongs to
Week of the month select date_format(current_date,’W’); 3 Display which week of the month date belongs to
TimeZone info select date_format(current_date,’z:Z’); UTC:+0000 Display current TIMEZONE info
Year select year(current_date); 2018 Display Year from the date
Month select month(current_date); 6 Display Month from the date
Day select day(current_date); 15 Display Day from the date
Week of the year select weekofyear(current_date); 24 Display which week of the year date belongs to
Add Days select date_add(current_date,2); 2018-06-17 Display date after adding days to it
Add Days select date_format(current_date + interval ‘2’ day,’YYYY-MM-dd’); 2018-06-17 Display date after adding days to it
Add Months select add_months(current_date,2); 2018-08-15 Display date after adding months to it
Add Months select date_format(current_date + interval ‘2’ month,’YYYY-MM-dd’); 2018-08-15 Display date after adding months to it
Add Year select date_format(current_date + interval ‘2’ year,’YYYY-MM-dd’); 2020-06-15 Display date after adding year/s
Add Year select add_months(current_date,12); 2019-06-15 Display date after adding year (or 12 months)
Subtract Days select date_add(current_date,-2); 2018-06-13 Display date after subtracting days to it
Subtract Days select date_sub(current_date,2); 2018-06-13 Display date after subtracting days to it
Subtract Days select date_format(current_date – interval ‘2’ day,’YYYY-MM-dd’); 2018-06-13 Display date after subtracting days
Subtract Days select date_format(current_date + interval ‘-2′ day,’YYYY-MM-dd’); 2018-06-13 Display date after subtracting days
Subtract Months select add_months(current_date,-2); 2018-04-15 Display date after subtracting months
Subtract Months select date_format(current_date + interval ‘-2′ month,’YYYY-MM-dd’); 2018-04-15 Display date after subtracting months
Subtract Months select date_format(current_date – interval ‘2’ month,’YYYY-MM-dd’); 2018-04-15 Display date after subtracting months
Subtract Year select date_format(current_date + interval ‘-2′ year,’YYYY-MM-dd’); 2016-06-15 Display date after subtracting Year
Subtract Year select add_months(current_date,-12); 2017-06-15 Display date after subtracting Year (12 months)
Subtract 2 dates select datediff(‘2018-12-31’,current_date); 199 Display difference of 2 dates

For more details about different formats available kindly check our previous post: Handle Date and Timestamp in HIVE like a pro

Hive Date Functions – FAQs

How do I cast a date format in hive ?

If the date value is in default format of ‘yyyy-MM-dd’ then you can use DATE_FORMAT function to cast a date value to other date format. If the value is in some other format then you can use from_unixtime and unix_timestamp together to cast date to another format.

How do I select current date in hive ?

To select the current date in hive you can use current_date function.

How do I add 7 days to a date in hive ?

You can add days to a given date by using date_add function. The function takes 2 parameters – first is the date value and second is the number of days to add. To add 7 days to current date you can use date_add(current_date,7)

I will continue to update this table with more operations like timestamp handling etc. However if you are looking for anything specific, let me know by dropping a comment. I will try to address that quickly.
Hope this helps.

31 thoughts on “Hive Date Functions – all possible Date operations”

  1. I am wondering if there is a way to preserve time information when adding/subtracting days from a datetime. If I run the following code:
    ‘select min(activity_tsp), min(date_format(date_sub(activity_tsp, 10),’yyyy-MM-dd HH:mm:ss.SSS’)) from…’
    I get the output
    | 2022-05-05 05:45:39 | 2022-04-25 00:00:00.000 |.
    I would like to see the second column as | 2022-04-25 05:45:39 |.
    How could I make that work?

  2. I am getting the output if I use hardcoded dates but I’m not getting the output if I pass the direct column. Could you please suggest where I’m going wrong

    For example:
    I’m getting output for this
    select date_format(‘2019-07-12 22:30:00.000′,’M/d/yyyy’)
    but not for this
    select dintervalstart,date_format(cast(dintervalstart as string),’M/d/yyyy’) FROM audit.iwrkgrpqueuestats_daily_vw

    similarly
    I’m getting output for this
    select from_unixtime(unix_timestamp(‘2019-07-12 22:30:00.000’), ‘M/dd/yyyy’)
    but not for this
    select dintervalstart, from_unixtime(unix_timestamp(dintervalstart, “yyyy-MM-dd HH:mm:ss.sss”), ‘M/dd/yyyy’) from audit.iwrkgrpqueuestats_daily_vw

    1. Hi Avinash
      This is really strange. First thing I will check is “dintervalstart” column sample values. What is the datatype of this column in Hive table. I may also check min & max value for this column to check boundary conditions. Why are you applying “cast as string” in date_format ?
      Please share this info for further debugging.
      If you have already fixed the problem , kindly share the solution.

      Best,
      Raj

  3. Hi

    I have a table emp & Salary.
    Salary table having each employee latest salary see the example details.
    During join one latest record of salary of each emp to be joined.
    I need latest salary, empno & latest date from the salary.

    select * from emp;
    | emp_no | birth_date | first_name | last_name | gender | hire_date |

    | 10001 | 02-09-1953 | Georgi | Facello | M | 26-06-1986 |
    | 10002 | 02-06-1964 | Bezalel | Simmel | F | 21-11-1985 |
    | 10003 | 03-12-1959 | Parto | Bamford | M | 28-08-1986 |
    | 10004 | 01-05-1954 | Chirstian | Koblick | M | 01-12-1986 |
    | 10005 | 21-01-1955 | Kyoichi | Maliniak | M | 12-09-1989 |

    mysql> select * from salary_sample;

    | emp_no | salary | from_date | to_date |

    | 10001 | 60117 | 26-06-1986 | 26-06-1987 |
    | 10001 | 62102 | 26-06-1987 | 25-06-1988 |
    | 10001 | 66074 | 25-06-1988 | 25-06-1989 |
    | 10001 | 66596 | 25-06-1989 | 25-06-1990 |
    | 10001 | 66961 | 25-06-1990 | 25-06-1991 |
    | 10001 | 71046 | 25-06-1991 | 24-06-1992 |
    | 10001 | 74333 | 24-06-1992 | 24-06-1993 |
    | 10001 | 75286 | 24-06-1993 | 24-06-1994 |
    | 10001 | 75994 | 24-06-1994 | 24-06-1995 |
    | 10001 | 76884 | 24-06-1995 | 23-06-1996 |
    | 10001 | 80013 | 23-06-1996 | 23-06-1997 |
    | 10001 | 81025 | 23-06-1997 | 23-06-1998 |
    | 10001 | 81097 | 23-06-1998 | 23-06-1999 |
    | 10001 | 84917 | 23-06-1999 | 22-06-2000 |
    | 10001 | 85112 | 22-06-2000 | 22-06-2001 |
    | 10001 | 85097 | 22-06-2001 | 22-06-2002 |
    | 10001 | 88958 | 22-06-2002 | 01-01-9999 |

    Regards,
    Sai

    1. Hi Sai,

      Please try below:

      select emp_no , salary , from_date
      from
      (
      select emp_no , salary , from_date, row_number() over( partition by emp_no order by from_date desc) as rnk
      from salary_sample )tb1
      where rnk=1;

      I haven’t executed it but I think it should work fine.
      Best,
      Raj

      1. Hi Raj,

        Not working for the above syntax.

        I have a emp table & salary table

        emp table:

        | emp_no | birth_date | first_name | last_name | gender | hire_date |
        +——–+————+————+————-+——–+————+
        | 10001 | 02-09-1953 | Georgi | Facello | M | 26-06-1986 |

        salary table:
        | emp_no | salary | from_date | to_date |
        +——–+——–+————+————+
        | 10001 | 60117 | 26-06-1986 | 26-06-1987 |
        | 10001 | 62102 | 26-06-1987 | 25-06-1988 |
        | 10001 | 66074 | 25-06-1988 | 25-06-1989 |
        | 10001 | 66596 | 25-06-1989 | 25-06-1990 |
        | 10001 | 66961 | 25-06-1990 | 25-06-1991 |
        | 10001 | 71046 | 25-06-1991 | 24-06-1992 |
        | 10001 | 74333 | 24-06-1992 | 24-06-1993 |
        | 10001 | 75286 | 24-06-1993 | 24-06-1994 |
        | 10001 | 75994 | 24-06-1994 | 24-06-1995 |
        | 10001 | 76884 | 24-06-1995 | 23-06-1996 |
        | 10001 | 80013 | 23-06-1996 | 23-06-1997 |
        | 10001 | 81025 | 23-06-1997 | 23-06-1998 |
        | 10001 | 81097 | 23-06-1998 | 23-06-1999 |
        | 10001 | 84917 | 23-06-1999 | 22-06-2000 |
        | 10001 | 85112 | 22-06-2000 | 22-06-2001 |
        | 10001 | 85097 | 22-06-2001 | 22-06-2002 |

        I have joined both the table. But i need During the join one latest record of salary of each emp to be joined.

        Thanks sai

      2. Hi Raj,

        Above query not working.

        I have a emp table:

        mysql> select * from emp_sample;
        +——–+————+————+————-+——–+————+
        | emp_no | birth_date | first_name | last_name | gender | hire_date |
        +——–+————+————+————-+——–+————+
        | 10001 | 02-09-1953 | Georgi | Facello | M | 26-06-1986 |

        Salary table:

        mysql> select * from salary_sample;
        +——–+——–+————+————+
        | emp_no | salary | from_date | to_date |
        +——–+——–+————+————+
        | 10001 | 60117 | 26-06-1986 | 26-06-1987 |
        | 10001 | 62102 | 26-06-1987 | 25-06-1988 |
        | 10001 | 66074 | 25-06-1988 | 25-06-1989 |
        | 10001 | 66596 | 25-06-1989 | 25-06-1990 |
        | 10001 | 66961 | 25-06-1990 | 25-06-1991 |
        | 10001 | 71046 | 25-06-1991 | 24-06-1992 |
        | 10001 | 74333 | 24-06-1992 | 24-06-1993 |
        | 10001 | 75286 | 24-06-1993 | 24-06-1994 |
        | 10001 | 75994 | 24-06-1994 | 24-06-1995 |
        | 10001 | 76884 | 24-06-1995 | 23-06-1996 |
        | 10001 | 80013 | 23-06-1996 | 23-06-1997 |
        | 10001 | 81025 | 23-06-1997 | 23-06-1998 |
        | 10001 | 81097 | 23-06-1998 | 23-06-1999 |
        | 10001 | 84917 | 23-06-1999 | 22-06-2000 |
        | 10001 | 85112 | 22-06-2000 | 22-06-2001 |
        | 10001 | 85097 | 22-06-2001 | 22-06-2002 |

        joined both table and bring emp and salary details together. During the join one latest record of salary of each emp_no to be joined

        thanks sai

        1. Hi Sai

          One request, it really helps when you specify clearly what is the error or issue or wrong output. “Not working” does not help anyone in identifying the problem.
          I have assumed that when you say “not working” you meant that you did not get the right output and there was no ERROR in the query.
          Also I am assuming that you are storing date (from_date & to_date) as string column and not actual date in the hive table.
          If that is true please try below query:

          select emp_no , salary , from_date
          from
          (
          select emp_no , salary , from_date, row_number() over( partition by emp_no order by from_unixtime(unix_timestamp(from_date, 'dd-MM-yyyy')) desc) as rnk
          from salary_sample )tb1
          where rnk=1;

          If not then please share the actual output of your query.

          Best,
          Raj

          1. Hi Raj,
            See the below in detail

            I have created a table in Mysql emp_sample & Salary_sample

            I have 1 to 10 employee records and loaded in emp_sample

            emp_sample:

            mysql> select * from emp_sample;
            +——–+————+————+————-+——–+————+
            | emp_no | birth_date | first_name | last_name | gender | hire_date |
            +——–+————+————+————-+——–+————+
            | 10001 | 02-09-1953 | Georgi | Facello | M | 26-06-1986 |
            | 10002 | 02-06-1964 | Bezalel | Simmel | F | 21-11-1985 |
            | 10003 | 03-12-1959 | Parto | Bamford | M | 28-08-1986 |
            | 10004 | 01-05-1954 | Chirstian | Koblick | M | 01-12-1986 |
            | 10005 | 21-01-1955 | Kyoichi | Maliniak | M | 12-09-1989 |
            | 10006 | 20-04-1953 | Anneke | Preusig | F | 02-06-1989 |
            | 10007 | 23-05-1957 | Tzvetan | Zielinski | F | 10-02-1989 |
            | 10008 | 19-02-1958 | Saniya | Kalloufi | M | 15-09-1994 |
            | 10009 | 19-04-1952 | Sumant | Peac | F | 18-02-1985 |
            | 10010 | 01-06-1963 | Duangkaew | Piveteau | F | 24-08-1989 |

            Each employee have multiple salary based on their appraisal from date range (from_date , to date). see below table

            mysql> select * from salary_sample;
            +——–+——–+————+————+
            | emp_no | salary | from_date | to_date |
            +——–+——–+————+————+
            | 10001 | 60117 | 26-06-1986 | 26-06-1987 |
            | 10001 | 62102 | 26-06-1987 | 25-06-1988 |
            | 10001 | 66074 | 25-06-1988 | 25-06-1989 |
            | 10001 | 66596 | 25-06-1989 | 25-06-1990 |
            | 10001 | 66961 | 25-06-1990 | 25-06-1991 |
            | 10001 | 71046 | 25-06-1991 | 24-06-1992 |
            | 10001 | 74333 | 24-06-1992 | 24-06-1993 |
            | 10001 | 75286 | 24-06-1993 | 24-06-1994 |
            | 10001 | 75994 | 24-06-1994 | 24-06-1995 |
            | 10001 | 76884 | 24-06-1995 | 23-06-1996 |
            | 10001 | 80013 | 23-06-1996 | 23-06-1997 |
            | 10001 | 81025 | 23-06-1997 | 23-06-1998 |
            | 10001 | 81097 | 23-06-1998 | 23-06-1999 |
            | 10001 | 84917 | 23-06-1999 | 22-06-2000 |
            | 10001 | 85112 | 22-06-2000 | 22-06-2001 |
            | 10001 | 85097 | 22-06-2001 | 22-06-2002 |
            | 10001 | 88958 | 22-06-2002 | 01-01-9999 |
            | 10002 | 65828 | 03-08-1996 | 03-08-1997 |
            | 10002 | 65909 | 03-08-1997 | 03-08-1998 |
            | 10002 | 67534 | 03-08-1998 | 03-08-1999 |
            | 10002 | 69366 | 03-08-1999 | 02-08-2000 |
            | 10002 | 71963 | 02-08-2000 | 02-08-2001 |
            | 10002 | 72527 | 02-08-2001 | 01-01-9999 |
            | 10003 | 40006 | 03-12-1995 | 02-12-1996 |
            | 10003 | 43616 | 02-12-1996 | 02-12-1997 |
            | 10003 | 43466 | 02-12-1997 | 02-12-1998 |
            | 10003 | 43636 | 02-12-1998 | 02-12-1999 |
            | 10003 | 43478 | 02-12-1999 | 01-12-2000 |
            | 10003 | 43699 | 01-12-2000 | 01-12-2001 |
            | 10003 | 43311 | 01-12-2001 | 01-01-9999 |
            | 10004 | 40054 | 01-12-1986 | 01-12-1987 |
            | 10004 | 42283 | 01-12-1987 | 30-11-1988 |
            | 10004 | 42542 | 30-11-1988 | 30-11-1989 |
            | 10004 | 46065 | 30-11-1989 | 30-11-1990 |
            | 10004 | 48271 | 30-11-1990 | 30-11-1991 |
            | 10004 | 50594 | 30-11-1991 | 29-11-1992 |
            | 10004 | 52119 | 29-11-1992 | 29-11-1993 |
            | 10004 | 54693 | 29-11-1993 | 29-11-1994 |
            | 10004 | 58326 | 29-11-1994 | 29-11-1995 |
            | 10004 | 60770 | 29-11-1995 | 28-11-1996 |
            | 10004 | 62566 | 28-11-1996 | 28-11-1997 |
            | 10004 | 64340 | 28-11-1997 | 28-11-1998 |
            | 10004 | 67096 | 28-11-1998 | 28-11-1999 |
            | 10004 | 69722 | 28-11-1999 | 27-11-2000 |
            | 10004 | 70698 | 27-11-2000 | 27-11-2001 |
            | 10004 | 74057 | 27-11-2001 | 01-01-9999 |
            | 10005 | 78228 | 12-09-1989 | 12-09-1990 |
            | 10005 | 82621 | 12-09-1990 | 12-09-1991 |
            | 10005 | 83735 | 12-09-1991 | 11-09-1992 |
            | 10005 | 85572 | 11-09-1992 | 11-09-1993 |
            | 10005 | 85076 | 11-09-1993 | 11-09-1994 |
            | 10005 | 86050 | 11-09-1994 | 11-09-1995 |
            | 10005 | 88448 | 11-09-1995 | 10-09-1996 |
            | 10005 | 88063 | 10-09-1996 | 10-09-1997 |
            | 10005 | 89724 | 10-09-1997 | 10-09-1998 |
            | 10005 | 90392 | 10-09-1998 | 10-09-1999 |
            | 10005 | 90531 | 10-09-1999 | 09-09-2000 |
            | 10005 | 91453 | 09-09-2000 | 09-09-2001 |
            | 10005 | 94692 | 09-09-2001 | 01-01-9999 |
            | 10006 | 40000 | 05-08-1990 | 05-08-1991 |
            | 10006 | 42085 | 05-08-1991 | 04-08-1992 |
            | 10006 | 42629 | 04-08-1992 | 04-08-1993 |
            | 10006 | 45844 | 04-08-1993 | 04-08-1994 |
            | 10006 | 47518 | 04-08-1994 | 04-08-1995 |
            | 10006 | 47917 | 04-08-1995 | 03-08-1996 |
            | 10006 | 52255 | 03-08-1996 | 03-08-1997 |
            | 10006 | 53747 | 03-08-1997 | 03-08-1998 |
            | 10006 | 56032 | 03-08-1998 | 03-08-1999 |
            | 10006 | 58299 | 03-08-1999 | 02-08-2000 |
            | 10006 | 60098 | 02-08-2000 | 02-08-2001 |
            | 10006 | 59755 | 02-08-2001 | 01-01-9999 |
            | 10007 | 56724 | 10-02-1989 | 10-02-1990 |
            | 10007 | 60740 | 10-02-1990 | 10-02-1991 |
            | 10007 | 62745 | 10-02-1991 | 10-02-1992 |
            | 10007 | 63475 | 10-02-1992 | 09-02-1993 |
            | 10007 | 63208 | 09-02-1993 | 09-02-1994 |
            | 10007 | 64563 | 09-02-1994 | 09-02-1995 |
            | 10007 | 68833 | 09-02-1995 | 09-02-1996 |
            | 10007 | 70220 | 09-02-1996 | 08-02-1997 |
            | 10007 | 73362 | 08-02-1997 | 08-02-1998 |
            | 10007 | 75582 | 08-02-1998 | 08-02-1999 |
            | 10007 | 79513 | 08-02-1999 | 08-02-2000 |
            | 10007 | 80083 | 08-02-2000 | 07-02-2001 |
            | 10007 | 84456 | 07-02-2001 | 07-02-2002 |
            | 10007 | 88070 | 07-02-2002 | 01-01-9999 |
            | 10008 | 46671 | 11-03-1998 | 11-03-1999 |
            | 10008 | 48584 | 11-03-1999 | 10-03-2000 |
            | 10008 | 52668 | 10-03-2000 | 31-07-2000 |
            | 10009 | 60929 | 18-02-1985 | 18-02-1986 |
            | 10009 | 64604 | 18-02-1986 | 18-02-1987 |
            | 10009 | 64780 | 18-02-1987 | 18-02-1988 |
            | 10009 | 66302 | 18-02-1988 | 17-02-1989 |
            | 10009 | 69042 | 17-02-1989 | 17-02-1990 |
            | 10009 | 70889 | 17-02-1990 | 17-02-1991 |
            | 10009 | 71434 | 17-02-1991 | 17-02-1992 |
            | 10009 | 74612 | 17-02-1992 | 16-02-1993 |
            | 10009 | 76518 | 16-02-1993 | 16-02-1994 |
            | 10009 | 78335 | 16-02-1994 | 16-02-1995 |
            | 10009 | 80944 | 16-02-1995 | 16-02-1996 |
            | 10009 | 82507 | 16-02-1996 | 15-02-1997 |
            | 10009 | 85875 | 15-02-1997 | 15-02-1998 |
            | 10009 | 89324 | 15-02-1998 | 15-02-1999 |
            | 10009 | 90668 | 15-02-1999 | 15-02-2000 |
            | 10009 | 93507 | 15-02-2000 | 14-02-2001 |
            | 10009 | 94443 | 14-02-2001 | 14-02-2002 |
            | 10009 | 94409 | 14-02-2002 | 01-01-9999 |
            | 10010 | 72488 | 24-11-1996 | 24-11-1997 |
            | 10010 | 74347 | 24-11-1997 | 24-11-1998 |
            | 10010 | 75405 | 24-11-1998 | 24-11-1999 |
            | 10010 | 78194 | 24-11-1999 | 23-11-2000 |
            | 10010 | 79580 | 23-11-2000 | 23-11-2001 |
            | 10010 | 80324 | 23-11-2001 | 01-01-9999 |

            I have provided the below query:

            1. Joined both the table and bring emp_sample and salary_smple details together
            2. During the join one latest record of salary_sample of each emp_no to be joined (I mean only one latest salary records to be obtained for each emp_no and joined with emp_sample table)

            mysql> select emp_sample.emp_no, birth_date, first_name, last_name, gender, hire_date, salary, from_date, to_date from emp_sample join salary_sample on emp_sample.emp_no=salary_sample.emp_no where salary in (select max(salary) from salary_sample group by emp_no);

            While provide the above query i found 6th record duplicates see the below details. (Duplicate means minimum salary also present. )

            I need only one latest salary records to be obtained for each emp_no.

            mysql> select emp_sample.emp_no, birth_date, first_name, last_name, gender, hire_date, salary, from_date, to_date from emp_sample join salary_sample on emp_sample.emp_no=salary_sample.emp_no where salary in (select max(salary) from salary_sample group by emp_no);
            +——–+————+————+————-+——–+————+——–+————+————+
            | emp_no | birth_date | first_name | last_name | gender | hire_date | salary | from_date | to_date |
            +——–+————+————+————-+——–+————+——–+————+————+
            | 10001 | 02-09-1953 | Georgi | Facello | M | 26-06-1986 | 88958 | 22-06-2002 | 01-01-9999 |
            | 10002 | 02-06-1964 | Bezalel | Simmel | F | 21-11-1985 | 72527 | 02-08-2001 | 01-01-9999 |
            | 10003 | 03-12-1959 | Parto | Bamford | M | 28-08-1986 | 43699 | 01-12-2000 | 01-12-2001 |
            | 10004 | 01-05-1954 | Chirstian | Koblick | M | 01-12-1986 | 74057 | 27-11-2001 | 01-01-9999 |
            | 10005 | 21-01-1955 | Kyoichi | Maliniak | M | 12-09-1989 | 94692 | 09-09-2001 | 01-01-9999 |
            | 10006 | 20-04-1953 | Anneke | Preusig | F | 02-06-1989 | 40000 | 05-08-1990 | 05-08-1991 |
            | 10006 | 20-04-1953 | Anneke | Preusig | F | 02-06-1989 | 60098 | 02-08-2000 | 02-08-2001 |
            | 10007 | 23-05-1957 | Tzvetan | Zielinski | F | 10-02-1989 | 88070 | 07-02-2002 | 01-01-9999 |
            | 10008 | 19-02-1958 | Saniya | Kalloufi | M | 15-09-1994 | 52668 | 10-03-2000 | 31-07-2000 |
            | 10009 | 19-04-1952 | Sumant | Peac | F | 18-02-1985 | 94443 | 14-02-2001 | 14-02-2002 |
            | 10010 | 01-06-1963 | Duangkaew | Piveteau | F | 24-08-1989 | 80324 | 23-11-2001 | 01-01-9999 |

            Hope this is clear

            Thanks
            Sai

          2. Thank you Sai.This really helps.
            Please try below in hive:

            select t1.emp_no,t1.birth_date,t1.first_name,t1.last_name,t1.gender,t1.hire_date,t2.salary,t2.from_date,t2.to_date from emp_sample t1 inner join (
            select emp_no , salary , from_date,to_date, row_number() over( partition by emp_no order by from_unixtime(unix_timestamp(from_date, 'dd-MM-yyyy')) desc) as rnk
            from salary_sample )t2
            on
            t1.emp_no = t2.emp_no
            where t2.rnk=1;

  4. I have a product table with manufacturing date (Prod_id, prod_name, mfg_date). I need for the following:

    I have created data and stored. And i have queried for the following.

    1. I have added 30 days expiry date from the manufacturer date.

    select prod_id, prod_name, mfg_date, date_add (from_unixtime(unix_timestamp(mfg_date,’dd-MM-yyyy’)),30) as exp_date from Products;
    OK
    prod_id prod_name mfg_date exp_date
    1001 Biscuits 21-05-2021 2021-06-20
    1002 Dates 22-05-2021 2021-06-21
    1003 Ghee 23-05-2021 2021-06-22
    1004 CoffeePowder 24-05-2021 2021-06-23
    1005 Oil 25-05-2021 2021-06-24
    1006 Groundnut 26-05-2021 2021-06-25
    1007 VitaminSyrup 27-05-2021 2021-06-26
    1008 Dhal 28-05-2021 2021-06-27
    1009 Shampoo 28-05-2021 2021-06-27
    1010 CreamPowder 28-05-2021 2021-06-27
    Time taken: 0.212 seconds, Fetched: 10 row(s)
    hive>

    2. I need number of days for the following using date difference from current date to expiry date. Can any one suggest how to query for the above question.

    Thanks

    1. Hi Balaji
      try this :
      select datediff(exp_date,current_date) as days_pending.

      If you want this in same query then you may try this:
      datediff(date_add (from_unixtime(unix_timestamp(mfg_date,’dd-MM-yyyy’)),30),current_date) as days_pending.

      You cannot use derived columns directly in hive. So either replace it with column definition or create derived table and use that on top of it.

      Also will suggest to keep all date columns in default format ‘yyyy-MM-dd’ else you may have to apply date format before applying most of the date transformations.

      Best,
      Raj

    1. Hi Ash
      try this:
      hive> select extract(day from last_day(‘2021-01-15’)) as jan, extract(day from last_day(‘2021-02-15’)) as feb,extract(day from last_day(‘2021-03-15’)) as mar,extract(day from last_day(‘2021-04-15’)) as apr;
      OK
      jan feb mar apr
      31 28 31 30
      Time taken: 0.046 seconds, Fetched: 1 row(s)

      Replace ‘2021-01-15’ with the input date. Will this work ?

        1. I will recommend to identify if the problem is with last_day function or extract. So run below two commands :

          select last_day(‘2021-01-15’);
          select extract(day from current_date);

          Replace the logic with traditional method for the one which is failing.

  5. Hi,
    How to get the output as Dec 31 of the Year prior to some pre-existing date value from a table.
    Example: Data in a table is 5th March of 2020, 03/05/2020
    And expecting an output as 31st of Dec 2019, 12/31/2019.
    I was able to think of adding dateadd function to get the previous year but not able to figure out of adding Dec 31 as the date.
    dateadd(year, -1, sp.start_date)

  6. Raj,

    What would be the best way to calculate the 1st of a month? For example, I want the 1st day of the month two months earlier. Today is 2020-09-02 minus 2 months is 2020-07-02 but I want 2020-07-01.

    Would something like the below work?

    select date_format( add_months( current_date, -2 ), ‘YYYY-MM-01’ );

    1. Hi Sam,
      Yes the approach you have mentioned will definitely work and thanks for sharing this.
      You can also try TRUNC function to truncate date to first of month.
      select trunc(add_months(current_date,-2),’MM’);

      Best,
      Raj

  7. Hi,
    How can I convert day to week starting Monday/Sunday

    e.g. if I have dates(MM/DD/YYYY) as below –
    08/03/2020
    08/04/2020
    08/05/2020
    08/06/2020
    08/07/2020
    08/08/2020
    08/09/2020
    I expect 08/03/2020 for all of them, considering I am taking Monday as the start of the week.

    1. Hi Sourav
      Please try below:
      next_day(date_sub(from_unixtime(unix_timestamp(col1,'MM/dd/yyyy'),'yyyy-MM-dd'),7),'Mo')
      *col1 is column in the table defined as string which stores date in the MM/dd/yyyy format.
      So first you have to convert it into the preferred format which is yyyy-MM-dd and then apply date function on it.
      **Change Mo to Tu,Su etc depending on the day you want to consider as start of the week.


      Raj

  8. Sudhanshu Shekhar

    Fetching the 15th last working day date-yyyyMMdd (excluding only weekends) in Hive

    I have a table with date column (date in string format yyyyMMdd). My requirement is to design a logic to fetch data from the table where “date column value equals to the date of the 15th previous working day” (excluding only Saturdays and Sundays) without using a UDF or a shell script. For example it is 21st Feb 2020 today; the logic should produce an output: 20200203.

  9. Hi,
    How can I fetch the 15th previous working date (yyyyMMdd) – excluding all the Saturdays and Sundays.
    Lets say it is 20th feb 2020 today; I want 20200131 as output.
    I want this in HIVE.

    1. I am sorry, I missed the details over my requirement.

      I have a table with date column – date format is in yyyyMMdd (string). I want to filter data from this table everyday for the 15th previous working day. need to get the value from the date column in my table which satisfies this condition.

  10. Hi,
    How can I fetch the 15th previous working date (yyyyMMdd) – excluding all the Saturdays and Sundays.
    Lets say it is 20th feb 2020 today; I want 20200131 as output.

  11. how can i convert a date like ’05-JAN-20′ to 20200105 (‘yyyymmdd’)?

    i have been on this for days. This is not working

    CAST(date_Format(’05-JAN-20′, ‘yyyymmdd’) AS BIGINT)

    1. Hi Max

      Apologies for delay in reply.
      Please try below:
      select from_unixtime(unix_timestamp(’05-JAN-20′, ‘dd-MMM-yy’),’yyyyMMdd’);

      date_format will only work when you input date is in format ‘yyyy-MM-dd’. If it is in any other format , it will result in NULL output or error. If input is in other format, you have to try from_unixtime & unix_timestamp.

      Hope this helps.

      Best,
      Raj

Leave a Reply