Key Responsibilities

We're looking for a talented data engineer to join our team. The ideal candidate will have expertise in designing, developing, testing, and maintaining strong and scalable data pipelines using Python and tools for large-scale data processing.

- Design and develop data pipelines using Python and tools such as Apache Spark, Dask, or similar on cloud platforms.
- Take ownership of key parts of our machine learning systems, ensuring they are reliable, efficient, and can grow with the business.
- Set up and manage MLOps practices, including automatic updates for machine learning models, model monitoring, and automated launch plans.
- Improve and manage data processing jobs on cloud platforms, including GCP services like Dataproc, BigQuery, Cloud Run, and Cloud Build.

Requirements

To be successful in this role, you'll need:

- 3-5+ years of experience in software engineering, with a strong focus on data engineering, ML engineering, or building applications that use a lot of data.
- Expert-level knowledge of Python, with a strong understanding of object-oriented design, software system design, and experience building high-quality, testable code for production.
- Strong, hands-on experience with tools for handling large amounts of data, such as Apache Spark (PySpark), Dask, or similar.
- Solid experience with cloud platforms, including putting services live, managing them, making them handle more users, and working with large data systems.
- Strong SQL skills and experience working with large, complex datasets.
- A deep understanding of machine learning ideas, the full process of creating a model, and MLOps principles.
- Excellent problem-solving skills, with the ability to fix complex issues in systems that run on many computers and make them perform better and handle more data.
- Advanced English proficiency and excellent communication, teamwork, and consulting skills.
- Passion for building strong, scalable systems and eagerness to guide and work with a team.
- Deep care for code quality, system reliability, and good documentation.
- Languages: Python (expert), SQL (strong)
- Version Control: Git (expert)
- MLOps & Orchestration: familiarity with tools such as Airflow, Kubeflow, Vertex AI Pipelines
- Data Analysis Libraries: Pandas, NumPy (strong proficiency)
- Machine Learning: scikit-learn, TensorFlow/PyTorch (understanding of how to get them to production)
- AI Tools: Claude, Gemini, OpenAI offerings

Benefits

We offer:

- Work from anywhere, during US business hours
- Competitive compensation, including participation in our equity program
- Mentorship from experienced executives
- Opportunity to grow with and make an impact on a rapidly growing startup
- Travel to in-person team off-sites (visa-permitting)
- And more

About Us

We're a rapidly growing startup looking for talented individuals to join our team. If you're passionate about building strong, scalable systems and have a strong background in data engineering, ML engineering, or software engineering, we'd love to hear from you.