- Primary skills required (must have): Cloudera (Hadoop), Spark with Scala or Spark with Java, and SQL.
- The resources should also have a good understanding of Hive and Aerospike.
- The resources should have strong analytical skills.
- Documentation of lineage as per the existing template.
- Understanding the data quality (DQ) rules from the data science team.
- Onboarding of new incremental datasets, including configuration of the DQ rules.
- Perform data validation checks.
- Copying of production data into test environments – to be confirmed.
- Reporting on DQ issues.
From:
madhavi,
veridian
madhavi@veridiants.com
Reply to: madhavi@veridiants.com