Job Title: ETL Architect
Location: Chicago, IL
Work Mode: Hybrid
Duration: Long Term
Manager's Comments: ETL Architect – large-scale ETL for large enterprises; data volumes are huge.
Tech stack: Python, PySpark, AWS Glue, Step Functions, MySQL RDS, Redshift.
Key Responsibilities:
- Lead the architecture and design of robust ETL pipelines to process, transform, and load large datasets (8 billion+ records).
- Design, implement, and optimize data workflows using AWS Glue, Step Functions, Python, and PySpark (a minimal sketch follows this list).
- Build and maintain scalable data architectures for cloud-based systems, focusing on performance, scalability, and reliability.
- Collaborate with data engineers, data analysts, and stakeholders to ensure that data is integrated, processed, and delivered with high quality and speed.
- Oversee the ETL process to manage data flow from source systems to MySQL RDS, Redshift, and other data storage solutions.
- Troubleshoot and resolve performance issues, optimize resource utilization, and ensure data pipelines run efficiently.
- Ensure best practices in data security, governance, and compliance are followed throughout the ETL process.
- Provide technical leadership, mentorship, and guidance to junior team members.
- Work in an agile environment, delivering high-quality solutions on time.
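To give a concrete sense of the stack named above, here is a minimal AWS Glue PySpark job sketch. It is illustrative only: the database, table, and connection names (sales_db, raw_orders, redshift-conn) and the target table are hypothetical placeholders, not details from this posting.

```python
import sys

from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job bootstrap; Glue passes --JOB_NAME and --TempDir.
args = getResolvedOptions(sys.argv, ["JOB_NAME", "TempDir"])
sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Extract: read the source table from the Glue Data Catalog.
source = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders"
)

# Transform with plain PySpark: drop rows missing a key and
# project only the columns the warehouse needs.
df = source.toDF()
clean = df.dropna(subset=["order_id"]).select("order_id", "customer_id", "amount")

# Load: write to Redshift through a Glue catalog connection,
# staging intermediate data in S3 via the job's temp directory.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=DynamicFrame.fromDF(clean, glue_context, "clean_orders"),
    catalog_connection="redshift-conn",
    connection_options={"dbtable": "analytics.orders", "database": "dw"},
    redshift_tmp_dir=args["TempDir"],
)

job.commit()
```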
Skills and Qualifications:
- Strong expertise in Python, PySpark, and cloud-based data engineering technologies.
- Experience with AWS Glue, AWS Step Functions, MySQL RDS, and Amazon Redshift.
- Proven track record of designing scalable, performant ETL systems for large-scale data workloads.
- Familiarity with data modeling, data warehousing, and cloud architectures.
- Deep understanding of data pipeline orchestration and automation (see the Step Functions sketch after this list).
- Experience in working with distributed systems and parallel processing.
- Strong problem-solving skills and ability to optimize data flows and processing.
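On the orchestration point above, one common pattern (an assumption here, not something stated in the posting) is a Step Functions state machine that chains Glue job runs. The job names, state machine name, and IAM role ARN below are hypothetical.

```python
import json

import boto3

# Amazon States Language definition: run two Glue jobs in sequence.
# The .sync integration makes each task wait for its job run to finish.
definition = {
    "StartAt": "ExtractTransform",
    "States": {
        "ExtractTransform": {
            "Type": "Task",
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": "orders-etl"},
            "Next": "LoadToRedshift",
        },
        "LoadToRedshift": {
            "Type": "Task",
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": "orders-redshift-load"},
            "End": True,
        },
    },
}

sfn = boto3.client("stepfunctions")
response = sfn.create_state_machine(
    name="orders-etl-pipeline",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/etl-sfn-role",  # hypothetical role
)
print(response["stateMachineArn"])
```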
From:
Vinci Anuroop,
Whiztekcorp
vinci@whiztekcorp.com
Reply to: vinci@whiztekcorp.com