Job Information
Iron Mountain Site Reliability Engineer in Providence, Rhode Island
At Iron Mountain we protect what our customers value most, from the everyday to the extraordinary, while helping them bridge the physical and digital world. Our people have the opportunity to bring their creativity to a workplace that thrives on change. Here, you will be part of a team that doesn’t just embrace what’s exceptional. It creates exceptional.
As a trusted partner to our clients, we require that our Mountaineers be vaccinated.
Founded in 1951, Iron Mountain Incorporated (NYSE: IRM) is a Fortune 500 global leader in storage and information management services. Iron Mountain is committed to storing, managing and transforming what our customers value most, from paper records to data to priceless works of art and culture. Providing a full suite of solutions – records and information management, data management, digital solutions, data centers and secure destruction – Iron Mountain enables organizations to lower storage costs, comply with regulations, recover from disaster, and protect their data and assets from a complex world. Visit the company website at www.ironmountain.com for more information.
As Iron Mountain continues its digital transformation, we are forming a new site reliability team, directly supporting a specific customer’s digital imaging solutions, while maintaining close ties to similar teams within the Enterprise Information Technology organization. This team is responsible for availability, latency, performance, efficiency, change management, monitoring, security, emergency response, and capacity planning. The Iron Mountain site reliability engineers create a bridge between development and operations by applying a software engineering mindset to system administration topics.
The ideal candidate for this role is a software- and systems-savvy engineer who is comfortable designing, building, and integrating solutions across technical and business capability domains with cost and strategic implications. Solutions may consist of proven or unproven technologies, or multiple implementation technologies at once, within domains that experience rapid change.
Our Site Reliability Engineer needs advanced knowledge of infrastructure, scripting skills, and engineering disciplines. This role requires equal parts development and operations with a software engineering mindset. Using technical and operational skills, the person in this role will increase application reliability at scale.
Primary Skills
Strong hands-on experience with Windows Server and PowerShell scripting
Ability to support Okta, VMware, and Nutanix, and to perform system administration responsibilities such as scripting, automation, Tanium patching, SolarWinds ticketing and monitoring, troubleshooting, bug fixes, rotating on-call duty, operational support for internal customers, and auditing and compliance
Ability to support project responsibilities such as design, solution engineering, and implementation of infrastructure projects
Good experience with Nutanix, networking (firewalls, load balancers, DNS), and Active Directory
What you will do
Build software to help operations and support teams
Write clean, high-performance, well-tested infrastructure code with a focus on reusability and automation
Develop monitoring and define SLAs, SLOs, and error budgets for mission-critical platforms, while helping to coordinate product launches and reliability exercises
Collaborate with other enterprise teams and contribute to Iron Mountain’s digital transformation journey, including its impact on infrastructure, networks, and security
Work closely with Architects and provide support to senior staff, ensuring designs align with the technological and business directions across the company
Support IT deployments involving Platform as a Service (PaaS), Software as a Service (SaaS), or Infrastructure as a Service (IaaS)
Manage central platforms as a service for growth and scale
Implement enhancements to the company's digital and data infrastructure, supporting internal customers' operational needs
Take part in on-call rotations
Document previous “tribal knowledge” and eliminate tech debt
What you bring to Iron Mountain
Bachelor's Degree or equivalent and 3+ years of relevant work experience
1-2 years of experience with provisioning automation
Ability to work as part of a fully remote team
Experience with operating systems, troubleshooting, and coding/scripting using high-level languages
Experience with infrastructure systems that support enterprise data science and analytics capabilities, including streaming and real-time analytics
Deep understanding of common scripting languages (PowerShell, Python, Bash, Go)
Experience working with on-premises hyperconverged environments (Nutanix)
Experience working with virtualization platforms (VMware, Acropolis)
Desire to deliver automation to the Windows Server platform
Experience with current pipeline tools (Terraform, Ansible, Jenkins, Packer, Git)
Involvement in on-premises-to-cloud migration or application modernization efforts
Experience managing a full application stack with high availability requirements
Full-stack troubleshooting experience, including networking, operating systems (Windows Server, RHEL/CentOS), HAProxy, Nginx, and RDBMS, is preferred
Experience using monitoring tools to meet contracted SLAs, with responsibility for SLOs and SLIs
Strong written and verbal communication skills
Able to thrive in a collaborative and cross-functional environment
Category: Information Technology (IT)
Iron Mountain is committed to a policy of equal employment opportunity. We recruit and hire applicants without regard to race, color, religion, sex (including pregnancy), national origin, disability, age, sexual orientation, veteran status, genetic information, gender identity, gender expression, or any other factor prohibited by law.
Requisition: J0041176