GCP Lead with Data Integration Experience position in San Jose, CA (Remote, Locals Preferred)

Contract Type: C2C

Title: GCP Lead with Data Integration Experience

Duration: Long Term

Location: San Jose, CA (Remote, Locals Preferred)

 

Job Description:

  • Data Engineer with extensive experience in building data integration pipelines in a CI/CD model
  • Experience: 12+ years overall, including 5+ years in a lead role
  • Ability to design and develop a high-performance data pipeline framework from scratch
  • Data ingestion across systems
  • Data quality and curation
  • Data transformation and efficient data storage
  • Data reconciliation, monitoring and controls
  • Support reporting model and other downstream application needs
  • Skilled in technical design documentation, data modeling, and application performance tuning
  • Lead and manage a team of data engineers, contribute to code reviews, and guide the team in designing and developing complex data pipelines that adhere to the defined standards.
  • Be hands-on: perform POCs on open-source and licensed tools in the market and share recommendations.
  • Provide technical leadership and contribute to definition, development, integration, testing, documentation, and support across multiple platforms (GCP, Python, HANA)
  • Establish a consistent project management framework and develop processes to deliver high-quality software, in rapid iterations, for business partners in multiple geographies
  • Participate in a team that designs, develops, troubleshoots, and debugs software programs for databases, applications, tools, etc.
  • Experience in balancing production platform stability, feature delivery and reduction of technical debt across a broad landscape of technologies.
  • Skills in the following platforms, tools, and technologies:
  • GCP cloud platform – GCS, BigQuery, streaming (Pub/Sub), Dataproc, and Dataflow
  • Python, PySpark, Kafka, SQL, shell scripting, and stored procedures
  • Data warehouse, distributed data platforms and data lake
  • Database definition, schema design, Looker views and models
  • CI/CD pipeline
  • Proven track record in scripting code in Python, PySpark and SQL
  • Excellent structured thinking skills, with the ability to break down multi-dimensional problems
  • Ability to navigate ambiguity and work in a fast-moving environment with multiple stakeholders
  • Good communication skills and the ability to coordinate and work with cross-functional teams.


From:
Arjun Kumar,
Infotech Spectrum, Inc
arjun@infotechspectrum.com
Reply to:   arjun@infotechspectrum.com