Welcome to my website. I am Nitin Srivastava, a Data Engineer by profession with 14+ years of experience. I have worked with multiple enterprises, using various technologies to support their data analytics requirements.
As a Data Engineer, my primary skill has always been SQL. So when I started working on Hadoop projects, I was excited to explore the different SQL options available on it. I worked a lot on Apache Hive & Apache Spark.
During the early days of Hadoop, enterprises invested heavily in on-premises Hadoop infrastructure. So I got the opportunity to work on the Hortonworks, Cloudera & MapR distributions.
Out of all that experience, enterprises realised that Apache Spark was the best bet, and it turned out to be the best thing to come out of that era. Spark is now widely used by enterprises for a range of data analytics requirements.
After a few years, I got the opportunity to work on Apache Spark/Hive on the AWS platform, primarily leveraging AWS Glue & Amazon EMR.
Get started on Apache Spark with these free resources
Check my blog post list
On this website I share my experience with SQL on the “Hadoop” platform, with posts about Apache Hive, Apache Spark, PySpark, Amazon EMR & AWS Glue.
Apache Hive Basics:
- hive sql tutorial
- hive variables
- hive partition
- hive select query
- hive distinct
- hive where
- hive subquery example
- hive between
- bucketized tables do not support
Apache Hive Date/Timestamp:
Apache Hive Table Design: