Role: MLOps Engineer
Location: REMOTE (San Diego, CA)
As an MLOps Engineer, you will play a crucial role in the deployment, monitoring, and maintenance of machine learning pipelines. You will work closely with data scientists, software engineers, and IT operations to ensure that our machine learning models are reliable, scalable, and performing optimally in production environments. Your expertise will be essential in automating and streamlining our ML workflows, enhancing model reproducibility, and ensuring continuous integration and delivery. The MLOps Engineer will report directly to the Head of AI.
Responsibilities:
· Design, build, and maintain the infrastructure required for efficient development, deployment, and monitoring of machine learning models.
· Implement CI/CD pipelines for machine learning applications.
· Develop and manage cloud-based and on-premises solutions for model training, deployment, and monitoring.
· Ensure the scalability, reliability, and performance of machine learning systems.
· Collaborate with data scientists to understand and implement requirements for model serving, versioning, and reproducibility.
· Monitor and optimize model performance in production, identifying and resolving issues proactively.
· Automate repetitive tasks to improve efficiency and reduce the risk of human error.
· Maintain documentation and provide training to team members on MLOps best practices.
· Stay updated with the latest developments in MLOps tools, technologies, and methodologies.
· Communicate and share knowledge with other team members, and actively participate in knowledge-sharing opportunities.
Mandatory skills:
- 4+ years of experience in MLOps, DevOps, or related fields.
- Strong programming skills in Python; experience with other languages such as Java, C++, or Scala is a plus.
- Experience with ML frameworks such as TensorFlow, PyTorch, and/or scikit-learn.
- Proficiency with CI/CD tools such as Jenkins or GitLab CI.
- Extensive experience with cloud platforms such as AWS, Google Cloud, or Azure.
- Hands-on experience with containerization and orchestration tools like Docker and Kubernetes.
- Knowledge of infrastructure-as-code tools such as Terraform or CloudFormation.
- Strong understanding of the machine learning lifecycle, including data preprocessing, model training, evaluation, and deployment.
- Excellent problem-solving skills and the ability to work independently as well as part of a team.
- Strong communication skills and the ability to explain complex technical concepts to non-technical stakeholders.
- Experience with feature stores, model registries, and monitoring tools such as MLflow, Tecton, or Seldon.
- Familiarity with data engineering tools like Apache Spark, Kafka, or Airflow.
- Knowledge of security best practices for machine learning systems.
- Experience with A/B testing and model performance monitoring.
From:
Ved,
PEOPLE FORCE CONSULTING INC
Reply to: vyas.s@pforceinc.com