Never run INSERT OVERWRITE again – try Hadoop Distcp
Recently, I was working on a project where the ETL requirement was to keep a daily snapshot of a table. The data warehouse had been designed on a data model more than 15 years old, and the client wanted to replicate it on Hadoop. You can convert the ETL to Spark SQL; however, not everything works as-is on […]