Role: Data Architect (Databricks, PySpark)

Location: Southwest Freeway, Sugar Land, TX 77478 (100% onsite)

Duration: Contract

Skills:

10+ years – Enterprise Data Management
10+ years – SQL Server-based development with large datasets
5+ years of Data Warehouse Architecture, with hands-on experience on the Databricks platform and extensive PySpark coding experience. Snowflake experience is good to have
3+ years Python (NumPy, Pandas) coding experience
Experience in Data warehousing – OLTP, OLAP, Dimensions, Facts, and Data modeling
Good knowledge of Azure Cloud and services such as ADF, Active Directory, App Services, and ADLS
Hands-on experience with CI/CD pipeline implementations
Previous experience leading an enterprise-wide Cloud Data Platform migration with strong architectural and design skills
Experience with Snowflake utilities such as SnowSQL and SnowPipe – good to have
Capable of discussing enterprise level services independent of technology stack
Experience with cloud-based data architectures, messaging, and analytics
Superior communication skills
Cloud certification(s)
Any experience with reporting is a plus
Excellent written and verbal communication, intellectual curiosity, a passion for understanding and solving problems, and consulting and customer service skills
Structured and conceptual mindset coupled with strong quantitative and analytical problem-solving aptitude
Exceptional interpersonal and collaboration skills within a team environment

Total experience level: 15+ years

Responsibilities:

Migrate Ab Initio graphs to DBT jobs, and design, develop, and deploy those jobs to process and analyze large volumes of data.
Collaborate with data engineers and data scientists to understand data requirements and implement appropriate data processing pipelines.
Optimize DBT jobs for performance and scalability to handle big data workloads.
Implement best practices for data management, security, and governance within the Databricks environment.
Experience designing and developing Enterprise Data Warehouse solutions.
Demonstrated proficiency with data analytics and data insights.
Proficient in writing SQL queries and in programming, including stored procedures and reverse engineering of existing processes.
Leverage SQL, a programming language (Python or similar), and/or ETL tools (Azure Data Factory, Databricks, Talend, and SnowSQL) to develop data pipeline solutions that ingest and exploit new and existing data sources.
Perform code reviews to ensure fit to requirements, optimal execution patterns and adherence to established standards.
Optimize Databricks jobs for performance and scalability to handle big data workloads.

From:
Sanyogita Dwivedi,
Veridian Tech Solutions
sanyogita@veridiants.com
Reply to: sanyogita@veridiants.com