C2C
Anywhere
Posted 12 hours ago

Must have: the primary skills required are Cloudera (Hadoop), Spark + Scala or Spark + Java, and SQL.
The resources should also have a good understanding of Hive and Aerospike.
The resources should have strong analytical skills.

Scope of work
Persistent resources will take the knowledge transfer (KT) from the current team members on the developed framework.
The team will need to work on the following aspects:
Documentation of lineage as per the existing template.
Understanding the data quality (DQ) rules from the data science team.
Onboarding of new incremental datasets along with configuration of the DQ rules etc.
Performing data validation checks.
Copying of production data into test environments – to be confirmed.
Reporting on the DQ issues.

From:
Asanraj,
Vysystems
asanraj@vysystems.com
Reply to: asanraj@vysystems.com