So sorry, this position is no longer available.
Please go ahead and submit your application. We may have other positions that would be the perfect fit for you.
Site Reliability Engineer
SoloPoint Solutions, Inc.
https://www.solopointsolutions.com
Date Posted: 2025-09-12
Valid Through: 2026-03-12
Employment Type: Contract
Category: Software Engineer, Application
Job Number: 10574
Pay Rate: $70/hr - $80/hr
Job Description
Qualifications:
- BS in Computer Science, IT, Networking, or a similar field
- A minimum of 5 years of related experience
- Experience with network security: configuring and maintaining ACLs, knowledge of firewalls
- Understanding of networks and network protocols
- Strong hands-on knowledge of the Linux shell and of working in a command-line (e.g., SSH) environment
- Proficiency in C, C++, Perl, Java, Python, or another scripting language, with knowledge of standard software development practices
- Knowledge of and ability to work on large data communications networks and IT infrastructure supporting highly available systems and applications
- Working knowledge of Kubernetes, Prometheus/VictoriaMetrics, Alertmanager, building management software, evaporative cooling, and power utilization is helpful
- Strong communication skills and ability to work effectively across multiple technical teams
- Experience working in a 24/7 onsite team managing large data centers or other large installations
- A certification in system administration (platforms or software), or other advanced education in computer science
Responsibilities:
- Review and respond to alerts from computer systems, storage, network, and other data center/facility related systems
- Create solutions that improve processes, prevent issues from recurring, and automate the response to all routine service conditions
- Identify issues and propose solutions that improve monitoring or provide better automation for monitoring and triage
- Respond to alerts from OMNI to ensure that the system continues to collect data 24/7 and provide real-time information for diagnosis
- Develop and maintain tools within the monitoring pipeline in collaboration with the Operations Team
- Create new software programs to provide alerts and notifications from the HPC system APIs and into the monitoring pipeline
- Create new software configurations and solve technical issues so that programs can scale to denser data and deliver reliably at scale
We greatly appreciate your interest in this position and encourage you to apply.
Please note: By applying to this job posting, you agree to receive communications from SoloPoint Solutions regarding this and other relevant jobs, as well as employment and industry-related news and updates.
We look forward to working with you!