What we’re looking for
We are looking for an experienced Sr. SRE/DevOps engineer with proven experience on Google Cloud Platform (GCP) and Kubernetes engines, as well as in building high-volume, high-performance, highly available payment services, to help us build functional systems that improve customer experience. You will be responsible for managing people on the Platform team, architecting the cloud infrastructure, automating and modernizing the CI/CD flows, deploying new products and their updates, identifying production issues, and implementing integrations that meet our customers' needs.
Your performance will be measured by improved deployment processes and times, a reduction in server alarms, and more efficient use of resources. The goal is for our app traffic to scale without an increase in error reports while we maintain or improve our error rate, response times, and uptime. We need you to have strong scripting skills.
The Role
The Sr. SRE/DevOps will focus on:
- Leveraging Google Cloud Platform and Kubernetes to deliver our platforms to our clients.
- Improving and maintaining the infrastructure behind our continuous integration and delivery pipelines.
- Working closely with business areas to understand how the products are built, designed, and operated, and how important they are to the company.
- Discovering, analyzing, and troubleshooting anomalous application behaviors, and deploying monitoring and infrastructure tools that expose metrics and alerts.
- Designing and implementing cloud-native solutions and cloudware to support the platforms and applications running on top of them.
- Designing and developing automation to support continuous delivery and continuous integration processes.
- Building and setting up new development and CI/CD tools and infrastructure.
- Understanding the needs of stakeholders and conveying this to developers.
- Working on ways to automate and improve development and release processes.
- Testing and examining code written by others and analyzing results.
- Ensuring that systems are safe and secure against cybersecurity threats.
- Identifying technical problems and developing software updates and fixes.
- Planning out projects and being involved in project management decisions.
Duties
- Set up new sites and applications via configuration management.
- Maintain, upgrade, and patch tracking and documentation software.
- Support the development lifecycle of platform architectural design, deployment, and debugging.
- Automate release deployments across development, QA, and production stacks using a combination of scripting languages and other automation toolkits.
- Deploy updates and fixes.
- Provide Level 2 and 3 technical support.
- Build tools to reduce occurrences of errors and improve customer experience.
- Develop automation solutions to improve cloudware support.
- Perform root cause analysis for production errors.
- Investigate and resolve technical issues.
- Develop scripts to automate visualization.
- Design procedures for system troubleshooting and maintenance.

