Job summary

At Infobip, we dream big. We value creativity, persistence, and innovation, passionately believing that it is through teamwork that we can all reach greater heights. Join us on our mission to create life-changing interactions between humans and online services with new and unseen solutions.

As a part of Reliability Operations, you will work in a team that strives to identify, respond to, and mitigate platform incidents. Does your eye twitch when something breaks, and do you already have a list of possible improvements in your head? This is the place you're looking for.

Qualifications

- You have an engineering or support background and a passion for IT, with at least 1 year of prior experience in the same or a similar role
- You have experience with monitoring tools (Grafana, Prometheus, New Relic, Graylog, Kibana, Elasticsearch, OpenSearch…)
- You have a strong systems-thinking and problem-solving mindset
- You are genuinely interested in how things work, and driven when they don't
- You have strong analytical and investigative skills, combined with the ability to navigate substantial amounts of data to gather critical information in a timely manner
- You are genuinely interested in site reliability and want to learn about mitigation tactics
- Hands-on knowledge of system administration tasks is an advantage, but not a prerequisite
- You can speak fluently with clients and colleagues alike, and have an excellent command of English
- You can exhibit an advanced level of teamwork, excellent communication skills, and a high degree of independence
- You are efficient in execution and inclined toward continuous improvement, experimentation, and self-education

Responsibilities

- Be a first responder to platform alerts
- Monitor our products for issues, prioritize and triage them, and assess client impact
- Detect issues, identify them (affected systems, locations, responsible teams), and respond in a timely manner by utilizing runbooks
- Clearly communicate (summarize) and escalate platform incidents to responsible individuals
- Actively contribute to current runbooks and create new ones
- When an incident is reported, be the driver of the incident resolution (incident commander)
- Based on alerts, try to prevent an issue from becoming an incident

Skills

- tech-savvy
- curious, with attention to detail
- critical thinker
- system knowledge, holistic view
- enjoys troubleshooting
- responsible
- clear communicator
- problem solver
- willing to teach / mentor others