We are searching for a skilled DevOps engineer to join our team. The successful candidate will be responsible for developing and maintaining the operations of our AI-driven systems and for ensuring the reliability, scalability, and performance of our services. This role requires a deep understanding of AI, machine learning, data management, and systems administration. As a DevOps engineer, you will work closely with our software engineering and data engineering teams to facilitate continuous integration/continuous delivery (CI/CD) for AI models and to deploy AI-driven services in a scalable and maintainable manner.
• Manage the machine learning lifecycle for AI models end to end, from data preparation through model training, tuning, and optimization to release and serving.
• Develop and implement DevOps processes and best practices for ML workflows, including CI/CD, monitoring, logging, and infrastructure automation.
• Collaborate with software engineers to understand and address system requirements, and ensure smooth integration and deployment of AI models.
• Identify and fix issues related to performance, scalability, and reliability.
• Perform system troubleshooting and problem-solving across platform and application domains.
• Keep up to date with the latest industry trends in AI and DevOps.
• Evaluate and implement new tools and technologies to improve the efficiency of our operations.
• Minimum 3 years of experience in DevOps, with a focus on AI-based systems and applications.
• Bachelor's degree in computer science, software engineering, or a related field; advanced degree preferred.
• Strong understanding of machine learning and natural language processing (NLP) concepts, tools, and techniques, as well as their infrastructure requirements.
• Experience with systems and IT operations, including Linux/Unix administration and scripting languages (Python, Bash).
• Familiarity with modern deployment and automation tools, such as Jenkins, Docker, Kubernetes, Ansible, and Terraform.
• Proficiency in cloud services (AWS, Google Cloud, Azure) and in managing AI services at scale in the cloud.
• Exceptional problem-solving skills and attention to detail.
• Excellent communication and collaboration abilities.