Role – Data Architect
Location – San Jose, CA
Remote/Hybrid – Hybrid
C2C
JD
Technical Skills
• In-depth knowledge of Apache Spark and its APIs (Spark SQL, DataFrame APIs, Spark Structured Streaming, and Spark MLlib for analytics) as well as Kafka; able to code in Scala/Java.
• Knowledge of Apache Flink, including its streaming and batch modes, caching, and performance optimization.
• Design and develop analytics workloads using Apache Spark and Scala for big data processing
• Create and optimize data transformation pipelines using Spark or Apache Flink
• Proficiency in performance tuning and optimization of Spark jobs
• Experience migrating existing analytics workloads from cloud platforms to open-source Apache Spark infrastructure running on Kubernetes
• Expertise in data modeling and optimization techniques for large-scale datasets
• Extensive experience operating Spark in production
• Strong understanding of Data Lake, Big Data, ETL processes, and data warehousing concepts
• Good understanding of lakehouse storage technologies like Delta Lake and Apache Iceberg
• Working knowledge of AWS
Other skills
• Technical Leadership: Lead and mentor a team of data engineers, analysts, and architects. Provide guidance on best practices and architectural decisions.
• Collaboration: Work closely with cross-functional teams including data scientists, business analysts, and developers to ensure seamless integration of data solutions.
• Communication: Excellent verbal and written communication skills with the ability to convey complex technical concepts to non-technical stakeholders.
From:
Nitya,
Nitya Software Solutions
ryan@nityainc.com
Reply to: ryan@nityainc.com