[F618] - AI/ML INFRASTRUCTURE SPECIALIST

Bebeeinfrastructure


We are seeking a highly skilled AI/ML Infrastructure Specialist to join our team. As a key member of our infrastructure team, you will be responsible for designing and implementing scalable AI/ML infrastructure on Databricks. The ideal candidate will have 3+ years of experience in MLOps, ML Engineering, Data Engineering or related roles, focusing on deploying and managing AI/ML workflows in production environments. You will also have strong knowledge of Git workflows, CI/CD practices, and tools like GitLab or similar.

Responsibilities

1. Design and Implement Scalable AI/ML Infrastructure: Build and maintain scalable AI/ML infrastructure on Databricks, leveraging Unity Catalog and feature stores to support model development and deployment.
2. Drift Detection Frameworks: Design and implement frameworks for detecting data and model drift, ensuring continuous monitoring and high reliability of AI/ML models in production.
3. Model Calibration & Versioning: Develop model calibration frameworks and establish versioning practices to maintain transparency and reproducibility across the AI/ML lifecycle.
4. Low-Latency Orchestration: Design and optimize reinforcement learning (RL) orchestration pipelines, including Contextual Bandits, for real-time execution in low-latency environments.
5. Automated Training Pipelines: Create automated frameworks for training, retraining, and validating AI/ML models, enabling efficient experimentation and deployment.
6. CI/CD for AI/ML: Implement CI/CD best practices to streamline the deployment and monitoring of AI/ML models, integrating with Databricks workflows and Git-based version control systems.
7. Collaboration: Work closely with AI/ML Scientists to ship, deploy, and maintain models.
8. Monitoring & Optimization: Build tools for model performance monitoring, operational analytics, and drift mitigation, ensuring reliable operation in production environments.
Requirements

- MLOps Experience: 3+ years of experience in MLOps, ML Engineering, Data Engineering or related roles, focusing on deploying and managing AI/ML workflows in production environments.
- Programming Skills: Strong programming skills in Python, with 5+ years of experience.
- Infrastructure Knowledge: Proficiency with Databricks (2-3 years), Apache Spark, MLflow, Unity Catalog, and feature stores.
- Git Workflow Expertise: Strong knowledge of Git workflows, CI/CD practices, and tools like GitLab or similar.
- Communication Skills: Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams.

Benefits

- Professional Growth: Opportunities for professional growth and development, with mentorship, TechTalks, and personalized growth roadmaps.
- Competitive Compensation: Competitive USD-based compensation and benefits package.
- Exciting Projects: Work on exciting projects involving modern solutions development for top-tier clients.
- Flextime: Flexible working hours and remote work options.
