Jobs at Alumni Ventures Portfolio Companies


Cloud Operations Engineer

Wasabi Technologies

Software Engineering, Operations
Boston, MA, USA
Posted on Wednesday, April 24, 2024
Role Description: Cloud Operations Engineer
Role Purpose:
Wasabi, the hot cloud storage company, is looking to hire a Hardware Operations Engineer (DevOps) to join the team supporting 24x7x365 service operations. To be successful in this role, you will need a strong technical background combined with excellent communication and interpersonal skills. This job requires staying calm under pressure to resolve production issues and restore service quickly so that service uptime stays within acceptable business targets. You will work with a cross-functional team of employees and partner associates to configure hardware; diagnose and troubleshoot failed hardware; evaluate new storage, network, and server hardware; create and run hardware diagnostics; and analyze logs and alerts to identify, diagnose, and resolve production issues.
You need experience deploying released software and monitoring and operating services. If you thrive on working in a fast-paced startup environment to service a global customer base, we would like to hear from you.
The role reports to the Director, Hardware Operation.
*Principals only. No recruiters.

Responsibilities:

  • Monitor, troubleshoot, and rectify issues in server, storage, and networking hardware, cabling, and power in a 24x7x365 production data center.
  • Work with hardware vendors (server, storage, networking) to create tools, a knowledge base, and a streamlined hardware operations environment.
  • Supervise and deploy equipment in a data center.
  • Evaluate hardware technologies.
  • Hunt and resolve issues on site as well as remotely in data center production environments.
  • Work with the supply chain team to proactively establish a streamlined spares supply globally.
  • Build scripting tools to create acceptance tests for server, storage, and networking equipment.
  • May require on-call, off-hours duty.
  • Travel may be required.

Requirements:

  • 4+ years’ experience in large-scale data center hardware troubleshooting.
  • Expertise in one or more of server, storage, or network hardware.
  • BS in Electrical Engineering or equivalent degree.
  • Strong understanding of data center architecture, heating and cooling, hardware equipment, network design, and cabling/power infrastructure.
  • Deep understanding of and experience with fault localization and service restoration in a 24x7 production environment.
  • Good communication and presentation skills and the ability to describe technical issues to a diverse audience.
  • Ability and experience working in a fast-paced startup environment.
  • Experience in developing MOPs and SOPs and working with other teams to qualify the MOPs to reduce errors/issues in production deployments.
  • Knowledge of and experience with Linux and Python are required.
  • Attention to detail, creative problem solving, and an outcome-focused approach.
  • Positive attitude and a solution-focused approach to helping customers, both internal and external.