Site Reliability Engineer - Cloud Developer (Remote)
Toronto, ON, CA; Calgary, AB, CA; Vancouver, BC, CA; Ottawa, ON, CA; Edmonton, AB, CA
Description
We are looking for a talented Senior Cloud Developer with a Site Reliability Engineer (SRE) focus to join us and help drive forward various Enterprise Notification Platform (ECP) projects.
It’s an exciting time to be a part of our mission to drive better outcomes through customer-centric, state-of-the-art software platforms.
Here's how:
We are looking for individuals who will:
- Design and build efficient and maintainable cloud-based microservices, adopting standard components and libraries whenever possible, with an SRE focus on the support, monitoring, alerting, and scalability aspects of the solution
- Leverage DevOps values to drive design decisions that support key operational needs such as configurability, logging and monitoring
- Take responsibility for your components across the end-to-end development lifecycle: implementing, testing, and supporting deployments
- Integrate security into all daily efforts, including secure design and coding practices
- Collaborate with senior architects to identify efficient architectures, and mentor and lead more junior developers
- Estimate the effort needed to execute your end-to-end implementation tasks and support our technical leads in creating unified and reliable delivery plans
- Leverage containers and other cloud-native tools
You're the missing piece of the puzzle
- Bachelor's degree in Computer Science, Software / Computer Engineering, or any equivalent combination of education and experience
- Minimum 3 years of hands-on SRE experience designing SRE solutions for the whole lifecycle of services, from inception and design through deployment, operation, and refinement
- Expertise in building cloud metrics, alerts and dashboards to monitor and measure availability, latency, and overall system health.
- Ability to scale systems sustainably through mechanisms like automation, and to evolve systems by pushing for changes that improve reliability and velocity
- At least 3 years of hands-on experience building applications in the cloud, on platforms such as Google Cloud Platform (GCP) and Amazon Web Services (AWS), using microservices architectures, with extensive knowledge of Kubernetes-based platforms such as Google Kubernetes Engine (GKE) and Red Hat OpenShift
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
- Ability to debug, optimize code, and automate routine tasks.
- Knowledge of PostgreSQL and GCP Big Data services (Cloud Storage, Pub/Sub, Dataproc, Dataflow, Dataprep, Cloud Composer, BigQuery, Bigtable, AI Platform)
- At least 5 years of hands-on working knowledge of deployment tools: Spinnaker, Terraform, Jenkins
- At least 5 years of hands-on working knowledge of one or more of the following languages: Java, Node.js, Python
- At least 5 years in progressive, object-oriented software development roles with proven technical leadership skills, as well as confidence in making and owning technical decisions
- Understanding of Unix/Linux operating systems
- Hands-on experience building REST or Web Services with the Spring Framework
- Hands-on experience defining and integrating SQL or NoSQL databases
- Hands-on experience hosting and running apps/services within a containerized environment
- Experience using testing frameworks, and implementing and testing features by both manual and automated means
- Experience with Agile development practices, DevOps best practices, and a modern CI/CD pipeline
- Experience building positive relationships and collaborating with a variety of diverse groups and technical teams
- Strong verbal and written communication skills
- A creative approach to problem solving and the ability to work independently to manage deliverables in an environment with high levels of ambiguity
Nice to Have:
- Hands-on experience with API management gateways (e.g. Kong)
- Experience working in an Agile delivery environment
- Any type of Cloud certification
- Telecommunications industry knowledge
- Experience in a 24×7 support environment