Task division
=============

This page describes the tasks for RPHC administrators and the people responsible for them.

QoS Administration
------------------

**Task:** Monitoring the request channel and asking the other admins whether requests are reasonable, then changing the QoS settings accordingly and setting a reminder to revert them.

**People responsible:** Lishan, Tianyu and Samuele

Project/home/data folder administration
---------------------------------------

**Task:** Monitoring storage space allocation in project/home/data folders and managing snapshots.

**People responsible:** Vanessa and Marek

Tape backups
------------

**Task:** Monitoring tape backups and ensuring that they are running correctly.

**People responsible:** Joren and Marek

Slurm quirk/crash investigations
--------------------------------

**Task:** Investigating Slurm quirks and crashes.

a. "Kill task failed" errors
b. Jobs on nearly-full machines not getting allocated

**People responsible:** No one assigned yet

Minor Ansible tasks
-------------------

**Task:** Minor Ansible tasks.

**People responsible:** Marek and Joren

Patch Monday tasks
------------------

**Task:** All tasks that are associated with patch Monday.

a. Reservations
b. Reviving the cluster
c. Wiping the /processing disks
d. Make playbook?
e. Asking Ameer about the patching workflow
f. Iptables

**People responsible:** Eduardo and Vanessa

GPU utilization monitoring
--------------------------

**Task:** Monitoring GPU utilization and contacting users about suboptimal usage.

**People responsible:** Tianyu and Joren

Documentation updates
---------------------

**Task:** Updating documentation when necessary.

**People responsible:** Iris

DeepOps upstream merge
----------------------

**Task:** Merging DeepOps upstream changes and testing them.

**People responsible:** Joren and the rest of the team