ACCESS Collaboration with NVIDIA: GPU Monitoring Tools to Maximize Application Performance and System Utilization Workshop
OSC Workshop Series with NVIDIA & ACCESS
The Ohio Supercomputer Center (OSC), in collaboration with NVIDIA and ACCESS, will host a workshop on Monday, May 1, 2023, from 2:00 p.m. to 4:00 p.m. US Eastern Time.
In this tutorial, you will learn about the NVIDIA Data Center GPU Manager (DCGM) tools and how to use them to collect a broad set of runtime data for GPUs, including compute engine and memory utilization, tensor core utilization, and NVLink data. We will demonstrate how to collect telemetry data on individual jobs, run GPU diagnostics to verify health and performance, and generate a test load on a GPU. After attending this tutorial, you will be able to determine how an application or job is utilizing one or more GPUs, which may lead to code changes that improve performance or to reassigning a job to more appropriate resources.
Presenters:
Brad Palmer
Brad Palmer is a Senior Solutions Architect at NVIDIA. Brad works with research universities and institutions to optimize the application, performance, and utility of NVIDIA technologies. Brad also delivers workshops and seminars to help researchers and research computing staff choose and implement optimal NVIDIA technologies.
Gianluca Castellani
Gianluca Castellani is a Solutions Architect in the Worldwide Field Organization at NVIDIA. His primary role is to support deep learning and high-performance computing in higher education and research. Prior to joining NVIDIA, Gianluca spent fifteen years working in research computing teams at KAUST and several other government and educational organizations, including CERN, CNAF/INFN, Brandeis University, and the Formula 1 racing team Scuderia Toro Rosso (now known as AlphaTauri). Gianluca holds a Ph.D. in Theoretical Physics from Northeastern University in Boston.
This is a virtual event, offered online via Webex. The link will be sent prior to the start of the workshop.