Knowledge Base Resources
Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!
The Carpentries
3
We teach foundational coding and data science skills to researchers worldwide.
HPC University
3
A comprehensive list of training resources from the HPC University. HPCU is a virtual organization whose primary goal is to provide a cohesive, persistent, and sustainable on-line environment to share educational and training materials for a continuum of high performance computing environments that span desktop computing capabilities to the highest-end of computing facilities offered by HPC centers.
Cornell Virtual Workshop
2
Cornell Virtual Workshop is a comprehensive training resource for high performance computing topics. The Cornell University Center for Advanced Computing (CAC) is a leader in the development and deployment of Web-based training programs. Our Cornell Virtual Workshop learning platform is designed to enhance the computational science skills of researchers, accelerate the adoption of new and emerging technologies, and broaden the participation of underrepresented groups in science and engineering. Over 350,000 unique visitors have accessed Cornell Virtual Workshop training on programming languages, parallel computing, code improvement, and data analysis. The platform supports learning communities around the world, with code examples from national systems such as Frontera, Stampede2, and Jetstream2.
Rust Web Server Tutorial
2
This is a beginner-friendly tutorial on how to set up your web server using Rust!
The Chronicle of Evidence-Based Mentoring
2
This is a great mentoring resource and has many articles related to mentoring. It is a one-stop shop for mentoring, and at the bottom, there are tags based on topics, and interested users can pick and choose articles and resources on different types of mentorship.
Open OnDemand
2
Open OnDemand is an easy-to-use web portal that lets students, researchers, and industry professionals use supercomputers from anywhere. It is installed on supercomputing resources at hundreds of sites. By eliminating the need for client software or command-line interface, Open OnDemand empowers users of all skill levels and significantly speeds up the time to their first computing.
An Introduction to Cryptography with Python
2
This comprehensive workshop is designed to guide participants through the world of cryptography, from foundational concepts to advanced implementations. Starting with the basics of encryption, decryption, and hashing, the workshop discusses real-world applications like SSL, blockchain, and digital signatures. Interactive Python-based coding examples, such as symmetric and asymmetric encryption, will provide hands-on experience. Participants will also learn to identify cryptographic vulnerabilities and perform attacks like length extension. Finally, the workshop also explores future trends such as quantum cryptography and zero-knowledge proofs, providing participants with the knowledge to apply cryptography in securing modern digital systems. Ideal for beginners and intermediate learners alike, this workshop is a step-by-step journey into mastering cryptographic principles and practices.
DARWIN Documentation Pages
1
DARWIN (Delaware Advanced Research Workforce and Innovation Network) is a big data and high performance computing system designed to catalyze Delaware research and education
ACCESS HPC Workshop Series
1
Monthly workshops sponsored by ACCESS on a variety of HPC topics organized by Pittsburgh Supercomputing Center (PSC). Each workshop will be telecast to multiple satellite sites and workshop materials are archived.
Using Linux commands in a python script (and the difference between the subprocess and os python modules)
1
Learn how to use Linux commands in a python script. Specifically, learn how to use the subprocess and os modules in python to run shell commands (which run Linux commands) in a python script that is run on a cluster.
Leveraging AI in Generative Assets and Environments for Play: Insights from the English Department's Digital Media Lab
1
In this presentation, I will explore the recent advancements in AI-driven production of 3D-generative assets and environments, particularly focusing on their application in creating immersive, playful experiences. Platforms such as ChatGPT, Suno, and Speechify have ushered in a new era of digital creativity, facilitating the development of environments that not only entertain but also serve educational purposes. This session will delve into how these technologies are integrated into academic settings, specifically through a case study of the English Department's Digital Media Lab, known as Tech/Tech, which opened in 2022.
NCSA HPC Training Moodle
1
Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Other related topics include 'Cybersecurity for End Users' and 'Developing Webinar Training.' Some of the tutorials also offer digital badges. Many of these tutorials were previously offered on CI-Tutor. A list of open access training courses are provided below.
Parallel Computing on High-Performance Systems
Profiling Python Applications
Using an HPC Cluster for Scientific Applications
Debugging Serial and Parallel Codes
Introduction to MPI
Introduction to OpenMP
Introduction to Visualization
Introduction to Performance Tools
Multilevel Parallel Programming
Introduction to Multi-core Performance
Using the Lustre File System
Attention, Transformers, and LLMs: a hands-on introduction in Pytorch
1
This workshop focuses on developing an understanding of the fundamentals of attention and the transformer architecture so that you can understand how LLMs work and use them in your own projects.
GIS: Geocoding Services
1
Geocoding is the process of taking a street address and converting it into coordinates that can be plotted on a map. This conversion typically requires an API call to a remote server hosted by an organization/institution. The remote server will take the address attributes provided by you and the remote server will compare it to the data it contains and return a best estimate on the coordinates for that location.
There are many geocoding services available with different world coverages, quality of result, and set different rate limits for access. For R, a package called "tidygeocoder" provides an easy way to connect to these different services. As an additional benefit, their documentation provides a good summary of geocoding services available and links to their documentation. The link to the documentation for gecoding services accessible by "tidygeocoder" is provided below.
For Python, geopy package is a library that provides connection to various geocoding services. The link to the documentation for this package is also included below.
HPC Carpentry
1
An HPC focused Carpentry community. Trainings include: HPC fundamentals, python, chapel, LAMMPS, parallelization with python, scaling studies, etc.
PyTorch for Deep Learning and Natural Language Processing
1
PyTorch is a Python library that supports accelerated GPU processing for Machine Learning and Deep Learning. In this tutorial, I will teach the basics of PyTorch from scratch. I will then explore how to use it for some ML projects such as Neural Networks, Multi-layer perceptrons (MLPs), Sentiment analysis with RNN, and Image Classification with CNN.
Tutorial: Localized RAG Chatbot with ACCESS HPC
1
This tutorial shows how to set up an open-source customizable RAG chatbot to answer questions about documents you can choose. It uses Indiana's Jetstream 2 HPC, but should work on any major ACCESS HPC.
Introduction to Deep Learning in Pytorch
1
This workshop series introduces the essential concepts in deep learning and walks through the common steps in a deep learning workflow from data loading and preprocessing to training and model evaluation. Throughout the sessions, students participate in writing and executing simple deep learning programs using Pytorch – a popular Python library for developing, training, and deploying deep learning models.
Enhanced Sampling for MD simulations
1
DeapSECURE – Data-Enabled Advanced Computational Training Platform for Cybersecurity Research and Education
1
DeapSECURE is a training program to infuse high-performance computational techniques into cybersecurity research and education. It is an NSF-funded project of the ODU School of Cybersecurity along with the Department of Electrical and Computer Engineering and the Information Technology Services at ODU. The DeapSECURE team has developed six non-degree training modules to expose cybersecurity students to advanced CI platforms and techniques rooted in big data, machine learning, neural networks, and high-performance programming. Techniques taught in DeapSECURE workshops are rather general and transferable to other areas including science, engineering, finance, linguistics, etc. All lesson materials are made available as open-source educational resources.
ACCESS Pegasus Documentation
1
The documentation provides an overview of using Pegasus, a workflow management system, on ACCESS resources for high throughput computing (HTC) workloads, covering logging in, workflow creation, resource configuration, and monitoring options.
Gentle Introduction to Programming With Python
1
This course from MIT OpenCourseWare (OCW) covers very basic information on how to get started with programming using Python. Lectures are available, along with practice assignments, to users at no cost. Python has many applications in tech today, from web frameworks to machine learning. This course will also instruct users on how to get set up with an IDE, which will allow for way more efficient debugging.
Containerized Jupyter Notebooks for HPCs
1
This tutorial demonstrates how to create, manage, and deploy containerized Jupyter simulations for High-Performance Computing (HPC) environments, specifically using SLAC's S3DF infrastructure. By utilizing Apptainer (formerly Singularity) containers, users can package complex simulations with all necessary dependencies, input files, and configurations, ensuring reproducibility and ease of use for new users. The automated workflows, powered by GitHub Actions, handle building and updating the containers, while Open OnDemand provides an accessible interface for running Jupyter notebooks directly from the HPC environment. This approach eliminates setup errors, saves time, and ensures consistent simulation environments, enabling researchers to focus on their work instead of system configuration.
Managing Python Packages on an HPC Cluster
1
This workshop will go into the different ways python packages can be managed in a cluster environment using conda and python virtual environments both in batch mode from the command line and with Jupyter Notebooks and Jupyter Lab on the cluster. The examples will be run on the GMU HOPPER Cluster.
Data Visualization tools for Python
1
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It makes analyzing and presenting your data extremely easy and works with Python which many people already know.