Site Reliability Engineer-Cloud
Santa Clara, Bay Area | Direct
In this visible role, you will be the seasoned systems operations expert who leads initiatives focused on systems infrastructure management within a high-volume, fast-scaling environment.
Responsibilities
● You will be responsible for the systems deployment, operations, and monitoring for infrastructure, including design and development of infrastructure automation.
● You will get your hands dirty troubleshooting infrastructure and architectural challenges using your existing knowledge and toolkits.
● You will drive the reliability and supportability of the cloud service by creating a knowledge base and, working with DevOps, coordinating change management policies, deploying a ticket/incident management system, triaging service request queues, and implementing auto-remediation.
● You will leverage your advanced system architecture and administration skills to collaborate with engineering, product management, test, and automation teams to architect and develop strategic and tactical solutions.
● You will engage with suppliers for the purposes of infrastructure equipment procurement, technical design exercises, and supplier roadmap reviews.
● You will help develop requirements for customer on-boarding processes, target environment sizing, and migration automation.
To Succeed You Must Have:
● 7 or more years of experience in a technical support and data center operations role, including team and process management responsibilities.
Experience with Cloud data centers is a must. Deep technical roots in data center technologies:
● Prior successful experience of working in an innovative, fast-paced startup with a high rate of flux. The candidate must demonstrate strong entrepreneurial spirit and vigor.
● Demonstrated proficiency in creating detailed technical design documents, facilitating design reviews, and executing design implementation projects.
● Ability to communicate with Network Architects
● Must be an excellent verbal and written communicator
○ Experience in large-scale Linux production environments, preferably as part of a cloud service provider environment.
○ Understanding of data center networking fabric topologies and commonly deployed architectures.
○ Experience with virtualization technologies, in particular the VMware product suite (vCenter, ESXi, NSX), is required. Deep understanding of KVM, Microsoft technologies (Hyper-V, Azure Pack, Azure Stack), and other cloud platforms (Google Cloud, Rackspace, AWS) is a strong plus.
○ Experience with networking concepts is required: Layer 2/3, load balancers, VPN, network virtualization, BGP, OSPF.
○ Reasonable technical understanding of configuration and maintenance of different NAS/SAN and networking systems in a virtualized environment. Understanding of Software Defined Data Center (SDS, SDN) is a strong plus.
○ Understanding of DevOps agility for continuous development and delivery, with hands-on experience in Chef / Puppet / Ansible. Deep understanding of ITIL processes and systems, such as ITSM (ServiceNow), is a strong plus.
● Professional certifications such as CISSP, CCIE, VCIX-NV, or VCDX6-NV are a strong plus
● BS/MS degree in Computer Science or equivalent experience