In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Feb 07, 2023 - 03:00 MST
Scheduled - Emergency maintenance will involve a full cluster reboot in an attempt to stabilize transient issues.

Afterwards, storage live migration will move instances back to their intended storage tier. Service performance may be reduced while the migration runs.

Monitoring - Network hardware has arrived and been installed in all cluster nodes. Emergency maintenance is scheduled for Feb 7, 2023, 03:00-07:00 MST in an attempt to fully resolve the infrastructure stability issues. Please view details at the bottom of the page.
Feb 06, 2023 - 01:13 MST
Identified - An issue has been identified in the Computer Science Cloud Storage platform.

Following the recent addition of the Ceph NVMe io2 tier, we are experiencing service degradation caused by network congestion on our cloud object storage platform.

Significant pause frames and packet loss are occurring on many nodes due to recent traffic increases. This can only be remediated by replacing the networking components on these nodes. This hardware has been ordered and is expected to be delivered in two weeks. We hope services will be fully restored by the second week of February.

This will be especially apparent in services that are sensitive to I/O delay caused by flapping.
JupyterHub appears to be the most affected, followed by Moodle.

To mitigate downtime, services are being migrated off the io2 tier (NVMe) to the st1 tier (magnetic media).

Jan 17, 2023 - 15:14 MST
Computer Science Core Infrastructure - Under Maintenance
Science Network - Operational
Red Hat Ceph Object Storage Cluster - Operational (100.0% uptime over the past 90 days)
JupyterHub - Under Maintenance (99.78% uptime over the past 90 days)
CS Cloud OpenStack Platform - Under Maintenance (99.81% uptime over the past 90 days)
CS vSphere for VDI Labs - Operational (100.0% uptime over the past 90 days)
Managed Servers - Operational
Moodle LTI Provider to Canvas - Under Maintenance (99.77% uptime over the past 90 days)
Moodle Computer Science Post-Baccalaureate - Under Maintenance (99.83% uptime over the past 90 days)
ELRA Environment - Operational
CEAS Redirector - Operational
Departmental Sites - Operational
CS Home - Operational
CS Financials - Operational
Foundations - Operational
IEEE FOCS 2021 - Operational
Past Incidents
Feb 7, 2023

Unresolved incident: Full cluster reboot and storage migration.

Feb 6, 2023

Unresolved incident: Degraded Stability in Computer Science Core Infrastructure.

Feb 5, 2023
Resolved - Our monitoring systems alerted us to a sudden increase in storage utilization over the weekend for the JupyterHub datastore. Storage was exhausted at around 5:30 PM MST today, Sunday, February 5th. Intermittent errors, such as load failures and messages similar to "no space left on device", may have occurred during this time. The disk was live-expanded at around 6:00 PM MST.
Feb 5, 17:30 MST
Feb 4, 2023

No incidents reported.

Feb 3, 2023

No incidents reported.

Feb 2, 2023

No incidents reported.

Feb 1, 2023

No incidents reported.

Jan 31, 2023

No incidents reported.

Jan 30, 2023

No incidents reported.

Jan 29, 2023

No incidents reported.

Jan 28, 2023

No incidents reported.

Jan 27, 2023

No incidents reported.

Jan 26, 2023

No incidents reported.

Jan 25, 2023

No incidents reported.

Jan 24, 2023

No incidents reported.