Knowledge Base Resources

These CI links have been crowd-sourced from the ConnectCI community and represent a “vetted” list of useful websites, training modules and tutorials. CI links show up in a tag search if they have the relevant tag attached. Affinity groups can include relevant CI links on their respective affinity group pages. Additional CI links are always welcome; click the “Add New CI Link” button to suggest one.

HPC University

A comprehensive list of training resources from the HPC University. HPCU is a virtual organization whose primary goal is to provide a cohesive,… more

Learning

debugging hpc-operations professional-development

Beginner, Intermediate, Advanced

The Carpentries

We teach foundational coding and data science skills to researchers worldwide.

Website

administering-hpc training

Beginner, Intermediate, Advanced

The Chronicle of Evidence-Based Mentoring

This is a great mentoring resource and has many articles related to mentoring. It is a one-stop shop for mentoring, and at the bottom, there are tags… more

Website

mentorship

Beginner

ACCESS HPC Workshop Series

Monthly workshops sponsored by ACCESS on a variety of HPC topics organized by Pittsburgh Supercomputing Center (PSC). Each workshop will be telecast… more

Learning

deep-learning machine-learning neural-networks

Beginner, Intermediate

ACCESS Pegasus Documentation

The documentation provides an overview of using Pegasus, a workflow management system, on ACCESS resources for high throughput computing (HTC)… more

Docs

pegasus

Beginner, Intermediate, Advanced

Attention, Transformers, and LLMs: a hands-on introduction in Pytorch

This workshop focuses on developing an understanding of the fundamentals of attention and the transformer architecture so that you can understand how… more

Learning

ai deep-learning machine-learning

Intermediate

Cornell Virtual Workshop

Cornell Virtual Workshop is a comprehensive training resource for high performance computing topics. The Cornell University Center for Advanced… more

Learning

jetstream stampede2 cloud-computing

Beginner, Intermediate, Advanced

DARWIN Documentation Pages

DARWIN (Delaware Advanced Research Workforce and Innovation Network) is a big data and high performance computing system designed to catalyze… more

Docs

darwin big-data

Beginner, Intermediate, Advanced

Data Visualization tools for Python

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It makes analyzing and presenting your… more

Docs

documentation python

Beginner, Intermediate

DeapSECURE – Data-Enabled Advanced Computational Training Platform for Cybersecurity Research and Education

DeapSECURE is a training program to infuse high-performance computational techniques into cybersecurity research and education. It is an NSF-funded… more

Learning

ai deep-learning machine-learning

Beginner

Gentle Introduction to Programming With Python

This course from MIT OpenCourseWare (OCW) covers very basic information on how to get started with programming using Python. Lectures are available,… more

Learning

python

Beginner

GIS: Geocoding Services

Geocoding is the process of taking a street address and converting it into coordinates that can be plotted on a map. This conversion typically… more

Docs

gis

Beginner, Intermediate

HPC Carpentry

An HPC-focused Carpentry community. Trainings include HPC fundamentals, Python, Chapel, LAMMPS, parallelization with Python, scaling studies, and more.

Website

software-carpentry training

Beginner, Intermediate, Advanced

Introduction to Deep Learning in Pytorch

This workshop series introduces the essential concepts in deep learning and walks through the common steps in a deep learning workflow from data… more

Learning

ai deep-learning image-processing

Beginner, Intermediate

Introduction to Python for Digital Humanities and Computational Research

This documentation contains introductory material on Python Programming for Digital Humanities and Computational Research. This can be a go-to… more

Docs

ai big-data data-analysis

Beginner

Managing Python Packages on an HPC Cluster

This workshop will go into the different ways Python packages can be managed in a cluster environment using conda and Python virtual environments… more

Docs

documentation pytorch data-science

Intermediate

Open OnDemand

Open OnDemand is an easy-to-use web portal that lets students, researchers, and industry professionals use supercomputers from anywhere. It is… more

Website

open-ondemand administering-hpc cluster-management

Beginner, Intermediate, Advanced

PyTorch for Deep Learning and Natural Language Processing

PyTorch is a Python library that supports accelerated GPU processing for Machine Learning and Deep Learning. In this tutorial, I will teach the… more

Docs

ai big-data data-analysis

Beginner

Useful R Packages for Data Science and Statistics

This Udacity article lists the most frequently used R packages for data science and statistics. For each package, the article provides the link to… more

Docs

plotting visualization data-analysis

Beginner, Intermediate, Advanced

Using Linux commands in a python script (and the difference between the subprocess and os python modules)

Learn how to use Linux commands in a Python script. Specifically, learn how to use the subprocess and os modules in Python to run shell commands (… more

Learning

cluster-management programming python

Beginner, Intermediate
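
As a rough illustration of the distinction this entry describes, the sketch below runs one command through `subprocess.run` and another through `os.system`. It is not taken from the linked tutorial; the helper names `run_with_subprocess` and `run_with_os_system` are hypothetical, and the commands are chosen only so the example runs on most systems.

```python
import os
import subprocess
import sys


def run_with_subprocess(args):
    """Run a command given as a list of arguments; return (exit_code, stdout)."""
    # capture_output=True collects stdout/stderr instead of printing them;
    # text=True decodes the bytes to str. No shell is involved here.
    completed = subprocess.run(args, capture_output=True, text=True, check=False)
    return completed.returncode, completed.stdout.strip()


def run_with_os_system(command):
    """Run a command string through the shell; only a status code comes back."""
    # The command's output goes straight to the terminal and cannot be
    # captured from the return value.
    return os.system(command)


# Use the current Python interpreter as the child command (an assumption made
# for portability; any executable on the PATH would work the same way).
code, out = run_with_subprocess(
    [sys.executable, "-c", "print('hello from a child process')"]
)
print(code, out)

# os.system prints "hello" itself and returns only the exit status.
status = run_with_os_system("echo hello")
print(status)
```

In this sketch `subprocess.run` hands back the command's output as a string that the script can parse, while `os.system` only reports whether the command succeeded.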

Version control with Git

Understand the benefits of an automated version control system and the basics of how automated version control systems work. Configure git the first… more

Learning

version-control github git

Beginner

Neurostars

A question and answer forum for neuroscience researchers, infrastructure providers and software developers.

Website

documentation image-processing data-sharing

Beginner, Intermediate, Advanced

Paraview UArizona HPC links (advanced)

These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant ([rtdatavis.github.io](… more

Docs

visualization

Intermediate, Advanced

A guide to pip in Python

pip stands for "pip installs packages". It's the go-to package manager for Python, allowing developers to install, update, and manage… more

Learning

pip software-installation

Beginner, Intermediate

A visual introduction to Gaussian Belief Propagation

This website is an interactive introduction to Gaussian Belief Propagation (GBP), a probabilistic inference algorithm that operates by passing… more

Website

ai machine-learning

Beginner, Intermediate

ACCESS - Video for new ACCESS users

This is a short video on how to exchange ACCESS credits and connect to Jetstream2 (please note this was created for Duke users but applies to all).

Video

access-account ACCESS-credits exchange-request

Beginner

ACCESS Campus Champion Example Allocation

ACCESS requests proposals to be written following NSF proposal guidelines. The link provides an example of an ACCESS proposal using an NSF LaTeX… more

Learning

allocations-proposal proposal-request research-facilitation

Beginner

ACCESS Events and Training

Listing of upcoming ACCESS related events and training activities.

Website

professional-development training workforce-development

Beginner

ACCESS Getting Started Quick-Guide

A step-by-step guide to getting your first allocation for ACCESS computing and storage resources.

Website

access-account ACCESS-credits allocations-proposal

Beginner

ACCESS Guide (originally given at Duke OIT)

A guide for Duke OIT on how to advise users on using ACCESS and allocation credits on Jetstream2 for Duke University members. This can be used for… more

Docs

ACCESS-credits adding-users allocation-management

Intermediate, Advanced

ACCESS KB Guide - Anvil

Purdue University is the home of Anvil, a powerful supercomputer that provides advanced computing capabilities to support a wide range of… more

Docs

anvil

Beginner, Intermediate, Advanced

ACCESS KB Guide - DELTA

NCSA is the home of Delta, a computing and data resource that balances cutting-edge graphics processor and CPU architectures with a non-POSIX file… more

Docs

delta

Beginner, Intermediate, Advanced

ACCESS KB Guide - Expanse

Expanse at SDSC is a cluster designed by Dell and SDSC delivering 5.16 peak petaflops, and offers Composable Systems and Cloud Bursting.

Docs

expanse composable-systems gpu

Beginner, Intermediate, Advanced

ACCESS Support Portal

Website

affinity-group pegasus ACCESS-website

Beginner, Intermediate, Advanced

ACCESS Video Learning Center

A library of short videos about ACCESS allocations, resources and support.

Video

training

Beginner

ACES: Charliecloud Containers for Scientific Workflows (Tutorial)

This tutorial introduces the use of containers with the Charliecloud software suite. This tutorial will provide participants with background and… more

Learning

ACES TAMU scratch

Beginner

Active inference textbook

This textbook is the first comprehensive treatment of active inference, an integrative perspective on brain, cognition, and behavior used across… more

Learning

ai machine-learning neural-networks

Beginner, Intermediate, Advanced

Advanced Compilers: The Self-Guided Online Course

This is a self-guided online course on compilers. The topics covered throughout the course include universal compiler topics like intermediate… more

Learning

optimization parallelization training

Advanced

Advanced Mathematical Optimization Techniques

Mathematical optimization deals with the problem of numerically finding the minimums or maximums of a function. This tutorial provides the Python… more

Learning

optimization python

Beginner, Intermediate, Advanced

AHPCC documentary

This link is a documentation website for using AHPCC.

Docs

login batch-jobs slurm

Beginner, Intermediate

AI for improved HPC research - Cursor and Termius - Powerpoint

These slides provide an introduction to how Termius and Cursor, two new freemium apps that use AI to perform more efficient work, can be used for… more

Slides

documentation ai machine-learning

Beginner, Intermediate

AI Institutes Cyberinfrastructure Documents: SAIL Meeting

Materials from the SAIL meeting (https://aiinstitutes.org/2023/06/21/sail-2023-summit-for-ai-leadership/). A space where AI researchers can learn… more

Learning

access-account ai data-analysis

Beginner, Intermediate, Advanced

AI powered VsCode Editor

Cursor: The AI-Powered Code Editor. Cursor is a cutting-edge, AI-first code editor designed to revolutionize the way developers… more

Tool

ai machine-learning workflow

Beginner, Intermediate

AI/ML TechLab - Accelerating AI/ML Workflows on a Composable Cyberinfrastructure

This technology lab contains a set of sessions to help a new user start an AI project on the ACES cluster, a composable accelerator testbed at Texas… more

Docs

ACES documentation TAMU

Intermediate

An Introduction to the Julia Programming Language

The Julia Programming Language is one of the fastest-growing software languages for AI/ML development. It writes in a manner that's similar to… more

Learning

ai data-analysis machine-learning

Beginner

Anvil Home Page

Purdue University is the home of Anvil, a powerful supercomputer that provides advanced computing capabilities to support a wide range of… more

Website

anvil

Beginner, Intermediate, Advanced

Application Fundamentals (Android)

The provided text discusses various aspects of Android app development fundamentals. It covers key concepts related to app components, the… more

Website

license api programming

Beginner, Intermediate

Applications of Machine Learning in Engineering and Parameter Tuning Tutorial

Slides for a tutorial on Machine Learning applications in Engineering and parameter tuning given at the RMACC conference 2019.

Learning

data-analysis machine-learning python

Beginner, Intermediate

Astronomy data analysis with astropy

Astropy is a community-driven package that offers core functionalities needed for astrophysical computations and data analysis. From coordinate… more

Learning

visualization image-processing astrophysics

Intermediate, Advanced

Automated Machine Learning Book

The authoritative book on automated machine learning, which allows practitioners without ML expertise to develop and deploy state-of-the-art machine… more

Learning

ai data-analysis deep-learning

Intermediate, Advanced

Awesome Jupyter Widgets (for building interactive scientific workflows or science gateway tools)

A curated list of awesome Jupyter widget packages and projects for building interactive visualizations for Python code

Learning

ai computer-graphics plotting

Beginner, Intermediate, Advanced

AWS Tutorial For Beginners

An AWS Tutorial for Beginners is a course that teaches the basics of Amazon Web Services (AWS), a cloud computing platform that offers a wide range… more

Video

aws

Beginner, Intermediate

Bash shell tutorial

Training materials for using the bash (and zsh) shell.

Learning

bash

Intermediate

Beautiful Soup - Simple Python Web Scraping

This package lets you easily scrape websites and extract information based on HTML tags and various other metadata found in the page. It can be… more

Tool

documentation ai big-data

Beginner, Intermediate

Benchmarking with a cross-platform open-source flow solver, PyFR

What is PyFR and how does it solve fluid flow problems?
PyFR is an open-source Computational Fluid Dynamics (CFD) solver that is based on… more

Tool

finite-element-analysis benchmarking parallelization

Intermediate

Better Scientific Software (BSSw)

The Better Scientific Software (BSSw) project provides a community to collaborate and learn about best practices in scientific software development.… more

Website

community-outreach project-management research-facilitation

Beginner, Intermediate, Advanced

Bioinformatics Workflow Management with Nextflow

Nextflow is an open-source, domain-specific language and workflow manager designed for the execution and coordination of scientific and data-… more

Docs

cloud-computing parallelization data-management

Beginner, Intermediate

Biopython Tutorial

The Biopython Tutorial and Cookbook website is a dedicated online resource for users in the field of computational biology and bioinformatics. It… more

Learning

bioinformatics genomics python

Beginner, Intermediate, Advanced

Bridges-2 Home Page

Landing Page for Bridges-2 information

Website

bridges-2

Beginner, Intermediate, Advanced

Building Anaconda Navigator applications

This tutorial explains how to create an Anaconda Navigator Application (app) for JupyterLab. It is intended for users of Windows, macOS, and Linux… more

Tool

compiling conda programming

Intermediate

Building the ArduPilot environment for Linux

This article provides instructions for building AirSim, an open-source simulator for autonomous vehicles, on Linux. It outlines the steps to build… more

Docs

profiling data-transfer github

Beginner

C Programming

"These notes are part of the UW Experimental College course on Introductory C Programming. They are based on notes prepared (beginning in Spring… more

Learning

c c++ compiling

Beginner

Campus Champions Home Page

Campus Champions foster a dynamic environment for a diverse community of research computing and data professionals sharing knowledge and experience… more

Website

community-outreach professional-development

Beginner, Intermediate, Advanced

Campus Research Computing Consortium (CaRCC)

CaRCC – the Campus Research Computing Consortium – is an organization of dedicated professionals developing, advocating for, and advancing campus… more

Website

community-outreach professional-development research-facilitation

Beginner, Intermediate, Advanced

CaRCC Data Facing Track

The Data-Facing Track of the People Network brings together people from research computing groups, libraries, research institutes, and other… more

Website

data-analysis data-access-protocols data-lifecycle

Beginner, Intermediate, Advanced

Chameleon

Chameleon is an NSF-funded testbed system for Computer Science experimentation. It is designed to be deeply reconfigurable, with a wide variety of… more

Docs

data-sharing data-reproducibility

Beginner, Intermediate, Advanced

Charliecloud User Group

Announcements for users and developers of Charliecloud, which provides lightweight user-defined software stacks for high-performance computing.

Mailing List

containers

Beginner

CHARMM Links to Install, Run, and Troubleshoot MD Simulations

CHARMM (Chemistry at HARvard Macromolecular Mechanics) is a widely distributed molecular simulation program with a broad array of applications.… more

Learning

charmm molecular-dynamics namd

Beginner, Intermediate

CMake Tutorials

CMake is an open-source tool used to manage the build process in operating systems. This tutorial takes you through how to use CMake from the very… more

Learning

training compiling

Beginner, Intermediate, Advanced

Conda

Conda is a popular package management system. This tutorial introduces you to Conda and walks you through managing Python, your environment, and… more

Tool

anaconda conda python

Beginner

ConnectCI

Connect.Cyberinfrastructure is a family of portals, each representing a program that is serving a segment of the research computing and data community… more

Website

community-outreach

Beginner, Intermediate, Advanced

Containerization Explained

Containerization is a software development method in which applications are packaged into standard units for development, shipment, and deployment.

Video

containers

Beginner

Creating a Mobile Application

Explains in detail how to build an application that can run on Android and iOS devices, using Qt Creator to develop Qt Quick applications.… more

Website

github compiling programming

Intermediate

CUDA Toolkit Documentation

NVIDIA CUDA Toolkit Documentation: If you are working with GPUs in HPC, the NVIDIA CUDA Toolkit is essential. You can access the CUDA Toolkit… more

Docs

documentation c c++

Intermediate, Advanced

Cyber Security

Learning cybersecurity is crucial for personal protection, safeguarding digital assets, financial security, and national security. It is important… more

Learning

training data-security cybersecurity

Beginner

DAGMan for orchestrating complex workflows on HTC resources (High Throughput Computing)

DAGMan (Directed Acyclic Graph Manager) is a meta-scheduler for HTCondor. It manages dependencies between jobs at a higher level than the HTCondor… more

Tool

open-science-grid

Intermediate, Advanced

Data Analysis with R for Educators

This webinar series is an orientation to R. We start with an overview of R’s history and place in the larger data science ecosystem. Next, we… more

Video

data-analysis data-science psychology

Beginner

Data Imputation Methods for Climate Data and Mortality Data

These slides and videos introduce how to use the K-Nearest-Neighbors method to impute climate data and how to use Bayesian Spatio-Temporal models in R-… more

Video

allocation-value documentation ai

Intermediate, Advanced

Data Visualization Tools for Julia

Plots.jl is the most widely used plotting library for the Julia programming language. It's known for being especially powerful in its… more

Tool

plotting visualization julia

Beginner, Intermediate

Data visualization with Matplotlib

Data visualization is a critical aspect of data analysis. It allows for a clear and concise representation of data, making it easier for users to… more

Website

plotting visualization

Beginner

DeepChem

DeepChem is an open-source library built on TensorFlow and PyTorch. It is helpful in applying machine learning algorithms to molecular data.

Tool

pytorch tensorflow computational-chemistry

Beginner, Intermediate, Advanced

DELTA Introductory Video

Introductory video about DELTA. Speaker: Tim Boerner, Senior Assistant Director, NCSA.

Video

delta gpu training

Beginner, Intermediate, Advanced

Developer Stories Podcast

As developers, we get excited to think about challenging problems. When you ask us what we are working on, our eyes light up like children in a candy… more

Website

community-outreach professional-development training

Beginner, Intermediate, Advanced

Discover Data Science

Discover Data Science is all about making connections between prospective students and educational opportunities in an exciting new, hot, and growing… more

Website

data-analysis workforce-development

Beginner

Displaying Scientific Data with Tableau

Tableau is a popular and capable software product for creating charts that present data and dashboards that allow you to explore data. It is… more

Video

big-data data-analysis training

Intermediate

Docker - Containerized, reproducible workflows

Docker allows for containerization of any task - basically a smaller, scalable version of a virtual machine. This is very useful when transferring… more

Tool

documentation cloud-computing deep-learning

Intermediate, Advanced

Docker Container Library

The Docker container library, commonly known as Docker Hub, is a vast repository that hosts a multitude of pre-configured container images,… more

Tool

documentation cloud-computing cloud-open-source

Docker Tutorial for Beginners

A Docker tutorial for beginners is a course that teaches the basics of Docker, a containerization platform that allows you to package your… more

Video

docker

Beginner, Intermediate, Advanced

EasyBuild Documentation

EasyBuild is a software installation framework that allows administrators to easily build and install software on high-performance computing (HPC)… more

Docs

easybuild

Intermediate

Educause HEISC-800-171 Community Group

The purpose of this group is to provide a forum to discuss NIST 800-171 compliance. Participants are encouraged to collaborate and share effective… more

Website

cybersecurity

Beginner, Intermediate, Advanced

Examples of code using JSON nlohmann header only Library for C++

This code showcases how to work with the header-only nlohmann JSON library for C++. In order to compile, change the extensions from json_test.txt to… more

Learning

c++

Advanced

Examples of Thrust code for GPU Parallelization

Some examples for writing Thrust code. To compile, download the CUDA compiler from NVIDIA. This code was tested with CUDA 9.2 but is likely… more

Learning

parallelization gpu cuda

Intermediate, Advanced

Expanse Home Page

Expanse at SDSC is a cluster designed by Dell and SDSC delivering 5.16 peak petaflops, and offers Composable Systems and Cloud Bursting.

Website

big-data

Beginner, Intermediate, Advanced

Factor Graphs and the Sum-Product Algorithm

A tutorial paper that presents a generic message-passing algorithm, the sum-product algorithm, that operates in a factor graph. Following a single,… more

Docs

access-account ai machine-learning

Intermediate

Fairness and Machine Learning

The "Fairness and Machine Learning" book offers a rigorous exploration of fairness in ML and is suitable for researchers, practitioners,… more

Docs

ai data-analysis deep-learning

Intermediate, Advanced

fast.ai

Fastai offers many tools to people working with machine learning and artificial intelligence, including tutorials on PyTorch in addition to their own… more

Website

ai machine-learning pytorch

Beginner, Intermediate, Advanced

Feed Forward NNs and Gradient Descent

Feed-forward neural networks are a simple type of network that relies on data being "fed-forward" through a series of layers that… more

Website

deep-learning machine-learning neural-networks

Intermediate

File management of Visual Studio Code on clusters

Visual Studio Code, commonly known as VSCode, is a popular tool used by programmers worldwide. It serves as a text editor and an Integrated… more

Learning

faster file-limit scratch

Intermediate

Fine-tuning LLMs with PEFT and LoRA

As LLMs get larger, full fine-tuning can become difficult to run on consumer hardware. Storing and deploying these tuned models can… more

Video

faster optimization performance-tuning

Intermediate, Advanced

Framework to help in scaling Machine Learning/Deep Learning/AI/NLP Models to Web Application level

This framework will help in scaling Machine Learning/Deep Learning/Artificial Intelligence/Natural Language Processing Models to Web Application… more

Learning

ai deep-learning machine-learning

Intermediate

FreeSurfer Tutorials

The official MGH / Harvard tutorial page for FreeSurfer. The FreeSurfer group has provided and designed a series of tutorials for using FreeSurfer… more

Learning

data-analysis image-processing psychology

Beginner, Intermediate

FSL Lectures

This is the official University of Oxford FSL group lecture page. This includes information on upcoming and past courses (online and in-person), as… more

Learning

data-analysis image-processing psychology

Beginner, Intermediate, Advanced

Fundamentals of Cloud Computing

An introduction to Cloud Computing

Website

cloud-computing

Beginner

Fundamentals of R Programming

This course is an introduction to the R programming language and covers the fundamental concepts needed to operate in the R environment. This course… more

Learning

ACES TAMU plotting

Beginner

Gaussian 16

Gaussian 16 is a computational chemistry package that is used in predicting molecular properties and understanding molecular behavior at a quantum… more

Tool

gaussian computational-chemistry

Intermediate, Advanced

GDAL Multi-threading

Multi-threading guidance when using GDAL.

Learning

parallelization gis

Intermediate

Geocomputation with R (Free Reference Book)

Below is a link for a book that focuses on how to use "sf" and "terra" packages for GIS computations. As of 5/1/2023, this book… more

Learning

r

Beginner, Intermediate

GIS: Projections and their distortions

In GIS, projections are helpful to take something plotted on a globe and convert it to a flat map that we can print or show on a screen.… more

Learning

gis

Beginner, Intermediate

GIS: What is a Geodetic Datum?

Often when working with GIS, or spatial data, one encounters the word "datum" and it may require that you choose a "datum" when… more

Learning

arcgis gis

Beginner

Git Branching Workflow and Maneuvers

A couple of resources that:

1.) Presents and defends a git branching workflow for stable collaborative git based projects. ("A… more

Learning

github git

Beginner, Intermediate, Advanced

Globus Documentation

Globus is a data transfer, sharing, automation, and discovery service used by hundreds of thousands of researchers to manage "big data" at… more

Docs

cloud-storage data-sharing data-management

Beginner, Intermediate, Advanced

GPU Computing Workshop Series for the Earth Science Community

GPU training series for scientists, software engineers, and students, with emphasis on Earth science applications.

The content of this… more

Learning

optimization performance-tuning profiling

Beginner

Guide to building AirSim on Linux machines

This article provides step-by-step instructions on how to build AirSim, a simulator for autonomous vehicles, on Linux. It includes both Docker and… more

Docs

documentation github github-pages

Beginner, Intermediate

Handwritten Digits Tutorial in PyTorch

This tutorial is essentially the "hello world" of image recognition and feed-forward neural network (using PyTorch). Using the MNIST… more

Website

ai visualization deep-learning

Intermediate

Harnessing the Power of Cloud and Machine Learning for Climate and Ocean Advances

Documentation and presentation on how to use machine learning and deep learning frameworks (TensorFlow, Keras, and scikit-learn) for Climate and… more

Learning

machine-learning

Intermediate

Header-only C++ JSON library

JSON is a lightweight format for storing and transporting data, for example in a config file. This library is header-only, and has easy-to-read… more

Learning

resources c++

Intermediate, Advanced

High Performance Computing (HPC) 101 - Cluster

High Performance Computing (HPC) Cluster

Video

hpc-cluster-build

Beginner, Intermediate

High performance computing 101

An introductory guide to High Performance Computing.

Website

administering-hpc

Beginner

Horovod: Distributed deep learning training framework

Horovod is a distributed deep learning training framework. Using Horovod, a single-GPU training script can be scaled to train across many GPUs in… more

Tool

deep-learning distributed-computing gpu

Intermediate, Advanced

Hour of CI

Hour of Cyberinfrastructure (Hour of CI) is a nationwide campaign to introduce undergraduate and graduate students to cyberinfrastructure and… more

Learning

arcgis gis administering-hpc

Beginner

How the Little Jupyter Notebook Became a Web App: Managing Increasing Complexity with nbdev

A tutorial entitled "How the Little Jupyter Notebook Became a Web App: Managing Increasing Complexity with nbdev" presented at SciPy 2023… more

Learning

data-sharing data-management-software data-reproducibility

Beginner, Intermediate, Advanced

How to use Rclone

Learn how to use Rclone to transfer data, specifically from your local drive to the Open Storage Network and vice versa.

Learning

data-transfer

Beginner

HPCwire

HPCwire is a prominent news and information source for the HPC community. Their website offers articles, analysis, and reports on HPC technologies,… more

Website

documentation pytorch data-science

Beginner, Intermediate, Advanced

Implementing Markov Processes with Julia

The following link provides an easy method of implementing Markov Decision Processes (MDP) in the Julia computing language. MDPs are a class of… more

Tool

ai machine-learning julia

Intermediate, Advanced

Info about retiring of R GIS packages rgdal, rgeos, maptools in 2023

R GIS packages "rgdal", "rgeos", and "maptools" are packages set to be archived and no longer supported by the end of 2023… more

Docs

r

Beginner, Intermediate, Advanced

InsideHPC

InsideHPC is an informational site that offers videos, research papers, articles, and other resources focused on machine learning and quantum computing… more

Website

ai machine-learning community-outreach

Beginner, Intermediate, Advanced

Installing Rocky Linux Operating System

Rocky Linux is an open-source enterprise operating system. It is compatible with Red Hat Enterprise Linux (RHEL). It is a community-driven project… more

Learning

unix-environment software-installation

Beginner

Intro to Machine Learning on HPC

This tutorial introduces machine learning on high performance computing (HPC) clusters. While it focuses on the HPC clusters at The University of… more

Docs

ai supervised-learning unsupervised-learning

Beginner

Intro to Statistical Computing with Stan

The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function. Here… more

Docs

data-analysis machine-learning monte-carlo

Beginner, Intermediate

Introduction to GPU/Parallel Programming using OpenACC

Introduction to the basics of OpenACC.

Slides

gpu c c++

Beginner

Introduction to OpenMP

OpenMP (Open Multi-Processing) is an API designed to simplify the integration of parallelism in software development, particularly for applications running… more

Slides

expanse faster c

Intermediate

Introduction to Parallel Computing Tutorial

The tutorial is intended to provide a brief overview of the extensive and broad topic of Parallel Computing. It covers the basics of parallel… more

Learning

parallelization

Beginner

Introduction to Probabilistic Graphical Models

This website summarizes the notes of Stanford's introductory course on probabilistic graphical models.
It starts from the very basics and… more

Learning

ai machine-learning

Beginner, Intermediate

Introduction to Visualization on HPC Using Python

This workshop has an introduction to the concepts of visualization followed by hands-on exercises. The concepts section has Speaker Notes, and the… more

Learning

visualization documentation training

Beginner

Introductory Tutorial to Numpy and Pandas for Data Analysis

In this tutorial, I present an overview with many examples of the use of Numpy and Pandas for data analysis. Beginners in the field of data analysis… more

Docs

ai big-data data-analysis

Beginner

Jetstream Home

Jetstream2 makes cutting-edge high-performance computing and software easy to use for your research regardless of your project’s scale—even if you… more

Website

jetstream

Beginner, Intermediate, Advanced

Jetstream2 Docs Site

Jetstream2 makes cutting-edge high-performance computing and software easy to use for your research regardless of your project’s scale—even if you… more

Docs

jetstream

Beginner, Intermediate, Advanced

Jetstream2 Status

Jetstream2 makes cutting-edge high-performance computing and software easy to use for your research regardless of your project’s scale—even if you… more

Website

jetstream

Beginner, Intermediate, Advanced

Language models and using HPC resources

Documentation and research based on the latest NLP text generation detection methods for 2023.

Learning

natural-language-processing

Intermediate

Linux Tutorial from Ryan's Tutorials

The following pages are intended to give you a solid foundation in how to use the terminal, to get the computer to do useful work for you. You won… more

Learning

file-systems bash unix-environment

Beginner

Long Tales of Science: A podcast about women in HPC

A series of interviews with women in the HPC community

Website

science-gateway community-outreach professional-development

Beginner, Intermediate, Advanced

Machine Learning in Astrophysics

Machine learning is becoming increasingly important in fields with large data sets, such as astrophysics. AstroML is a Python module for machine learning… more

Docs

plotting big-data image-processing

Intermediate

Machine Learning in R online book

The free online book for the mlr3 machine learning framework for R. Gives a comprehensive overview of the package and ecosystem, suitable from… more

Learning

data-analysis machine-learning r

Beginner, Intermediate, Advanced

Machine Learning with scikit-learn

In the realm of Python-based machine learning, Scikit-Learn stands out as one of the most powerful and versatile tools available. This introductory… more

Learning

ai big-data machine-learning

Beginner

Managing and Optimizing Your Jobs on HPC

An overview of tools and methods to manage and optimize jobs and HPC workflows

Video

memory optimization batch-jobs

Intermediate

MATLAB bioinformatics toolbox

Bioinformatics Toolbox provides algorithms and apps for Next Generation Sequencing (NGS), microarray analysis, mass spectrometry, and gene ontology.… more

Tool

visualization data-analysis bioinformatics

Beginner, Intermediate, Advanced

MATLAB with other Programming Languages

MATLAB is a really useful tool for data analysis among other computational work. This tutorial takes you through using MATLAB with other programming… more

Tool

c c++ fortran

Beginner, Intermediate, Advanced

MDAnalysis - Python library for the analysis of molecular dynamics simulations

MDAnalysis is a Python-based library of tools for the analysis of molecular dynamics simulations. It is able to read and write many popular… more

Tool

computational-chemistry materials-science python

Beginner, Intermediate, Advanced

Mechanism and Implementation of Various MPI Libraries

A detailed explanation of the communication routines and management methods of different MPI libraries, as well as several exercises designed… more

Website

compiling mpi

Beginner

Metadata Systems

Metadata is a vital topic in libraries and librarianship, encompassing structured information used for accessing digital resources. The definition of… more

Learning

metadata

Intermediate

Molecular Dynamics Tutorials for Beginners

Links to MD tutorials for beginners across various simulation platforms.

Learning

cloud-computing amber charmm

Beginner

MOPAC

MOPAC (Molecular Orbital PACkage) is a semi-empirical quantum chemistry package used to compute molecular properties and structures by using… more

Tool

computational-chemistry

Intermediate, Advanced

Moving-Lid-Driven Flow Simulation by Finite Difference Method

The listed repository contains code written in C++ to model the flow inside a cavity with a lid moving above from left to right by discretizing… more

Docs

fluid-dynamics

Intermediate

MPI Resources

Workshop for beginner and intermediate students in MPI, which includes helpful exercises. Open MPI documentation.

Learning

parallelization mpi

Beginner, Intermediate

Natural Language Processing with Deep Learning

CS224N is a renowned natural language processing course offered by Stanford University and taught by Christopher Manning. It covers a wide range of… more

Video

natural-language-processing training workforce-development

Beginner, Intermediate

NCSA HPC Training Moodle

Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Other related topics… more

Learning

performance-tuning profiling parallelization

Beginner, Intermediate

NCSA HPC-Moodle

Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Some of the tutorials… more

Learning

training workforce-development

Beginner, Intermediate, Advanced

Neocortex Documentation

Neocortex is a new supercomputing cluster at the Pittsburgh Supercomputing Center (PSC) that features groundbreaking AI hardware from Cerebras… more

Docs

documentation ai deep-learning

Beginner

NERSC Training and Tutorials

A comprehensive collection of NERSC developed training and tutorial events, offered on regular schedules. All sessions are archived, including slide… more

Learning

training

Beginner, Intermediate, Advanced

Neural Networks in Julia

Making a neural network has never been easier! The following link directs users to the Flux.jl package, the easiest way of programming a neural… more

Tool

ai deep-learning machine-learning

Intermediate, Advanced

Neurodesk

Neurodesk provides a containerised data analysis environment to facilitate reproducible analysis of neuroimaging data. Analysis pipelines for… more

Website

psychology containers software-installation

Beginner, Intermediate, Advanced

NITRC

The Neuroimaging Tools and Resources Collaboratory (NITRC) is a neuroimaging informatics knowledge environment for MR, PET/SPECT, CT, EEG/MEG,… more

Website

data-analysis image-processing data-sharing

Beginner, Intermediate, Advanced

Numba: Compiler for Python

Numba is a Python compiler designed for accelerating numerical and array operations, enabling users to enhance their application's performance… more

Docs

vectorization optimization performance-tuning

Intermediate, Advanced

Numpy - a Python Library

NumPy is a Python package that leverages types and compiled C code to make many math operations in Python efficient. It is especially useful for… more

Tool

documentation big-data data-analysis

Beginner, Intermediate

Oak Ridge Leadership Computing Facility (OLCF) Training Events and Archive

Upcoming training events and archives of training materials detailing general HPC best practices as well as how to use OLCF resources and services.

Learning

training

Beginner, Intermediate, Advanced

Official Documentation for PyTorch and NumPy

The official documentation for PyTorch, a tensor-based machine learning framework, and NumPy, which provides support for ndarrays, which is useful… more

Docs

deep-learning neural-networks pytorch

Beginner

Official Documentation of VisIt

VisIt is a prominent open-source, interactive parallel visualization and graphical analysis tool predominantly used for viewing scientific data. Its… more

Docs

visIt novel-accelerators particle-physics

Intermediate, Advanced

Official Python Documentation

The official documentation for Python 3.11.5. Python comes with a lot of features built into the language, so it is worth taking a look as you code.

Docs

documentation python

OnShape Documentation

This contains documentation for getting started with using OnShape for CAD. OnShape is cloud-hosted CAD software that lets you work with others like on… more

Tool

documentation faster

Beginner

OnShape FeatureScripts: Custom features for everyone

OnShape FeatureScripts allow users to create their own features via OnShape's programming language. The user can make these as simple or complex… more

Tool

documentation materials-science particle-physics

Intermediate, Advanced

Open Storage Network

The Open Storage Network, a national resource available through the XSEDE resource allocation system, is high quality, sustainable, distributed… more

Website

data-management data-retention open-storage-network

Beginner, Intermediate, Advanced

Open-Source Server Virtualization Platform

Proxmox Virtual Environment is open-source hyper-converged infrastructure software. It is a hosted hypervisor that can run operating systems… more

Learning

software-installation

Beginner

OpenMP and Multithreaded Jobs in GRASS

Techniques and support for multithreaded geospatial data processing in GRASS.

Tool

parallelization gis openmp

Intermediate

OpenStack Tutorial For Beginners

OpenStack Tutorial For Beginners

Video

openstack

Beginner

Optimizing Research Workflows - A Documentation of Snakemake

Snakemake is a powerful and versatile workflow management system that simplifies the creation, execution, and management of data analysis pipelines.… more

Docs

documentation data-analysis data-reproducibility

Intermediate, Advanced

Pandas - Python

pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language… more

Docs

documentation ai big-data

Beginner, Intermediate

Paraview UArizona HPC links (beginner)

These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant (rtdatavis.github.io). The… more

Docs

visualization

Beginner

Performance Engineering Of Software Systems

A class from MIT OpenCourseWare that gives a hands-on approach to building scalable and high-performance software systems. Topics include performance… more

Learning

optimization parallelization training

Intermediate, Advanced

Practical Machine Learning with Python

This video series provides a holistic understanding of machine learning, covering theory, application, and inner workings of supervised, unsupervised… more

Video

machine-learning programming python

Advanced

Probabilistic Semantic Data Association for Collaborative Human-Robot Sensing

Humans cannot always be treated as oracles for collaborative sensing. Robots thus need to maintain beliefs over unknown world states when receiving… more

Docs

ai machine-learning

Advanced

Python

Python course offered by Texas A&M HPRC

Learning

python

Beginner

Python Data and Viz Training (CCEP Program)

Five days of recordings of Python data analysis and visualization training.

Learning

data-science python

Beginner, Intermediate

Python Tools for Data Science

Python has become a very popular programming language and software ecosystem for work in Data Science, integrating support for data access, data… more

Video

ai machine-learning big-data

Intermediate

PyTorch Introduction

This is a very barebones introduction to the PyTorch framework used to implement machine learning. This tutorial implements a feed-forward neural… more

Website

deep-learning machine-learning neural-networks

Intermediate

QGIS Processing Executor

Running QGIS tools from the command line

Docs

gis

Intermediate

Quick and Robust Data Augmentation with Albumentations Library

Data augmentation is a crucial step in the pipeline for image classification with deep learning. Albumentations is an extremely versatile Python… more

Tool

deep-learning python

Advanced

R for Data Science

R for Data Science is a comprehensive resource for individuals looking to harness the power of the R programming language for data analysis,… more

Learning

visualization data-analysis data-science

Beginner, Intermediate, Advanced

R for Research Scientists

A book for researchers who contribute code to R projects: This booklet is the result of my work with the Social Cognition for Social Justice lab. It… more

Learning

software-carpentry workforce-development r

Beginner, Intermediate

Raftlib: Open Source library for concurrent data processing pipelines

Raftlib is an open-source C++ Library that provides a framework for implementing parallel and concurrent data processing pipelines. It is designed… more

Tool

parallelization pthreads openmp

Intermediate, Advanced

Recommended Libraries for Cyberinfrastructure Users Developing Jupyter Notebooks

This repository contains information about Jupyter Widgets and how they can be used to develop interactive workflows, data dashboards, and web… more

Website

ai big-data data-analysis

Beginner, Intermediate, Advanced

Regular Expressions

Regular expressions (sometimes referred to as RegEx) are an incredibly powerful tool used to define string patterns for "find" or… more

Learning

perl programming python

Beginner, Intermediate

Regulated Research Community of Practice

The daily news clearly shows the increasing threat to safety and privacy of data, personal as well as intellectual property. While the requirements… more

Website

community-outreach cybersecurity

Beginner, Intermediate, Advanced

Reinforcement Learning For Beginners with Python

This course takes you through the fundamentals required to get started with reinforcement learning with Python, OpenAI Gym and Stable Baselines. You… more

Video

deep-learning machine-learning tensorflow

Beginner

Representation Learning in Deep Learning

Representation learning is a fundamental concept in machine learning and artificial intelligence, particularly in the field of deep learning. At its… more

Docs

deep-learning image-processing machine-learning

Intermediate

Research Security Operations Center at IU

The NSF-funded ResearchSOC helps make scientific computing resilient to cyberattacks and capable of supporting trustworthy, productive research… more

Website

cybersecurity

Beginner, Intermediate, Advanced

Research Software Development in JupyterLab: A Platform for Collaboration Between Scientists and RSEs

Iterative Programming takes place when you can explore your code and play with your objects and functions without needing to save, recompile, or… more

Learning

ai visualization big-data

Beginner, Intermediate

Research Software Engineering Training Materials

An ongoing collection of RSE training material, workshops, and resources. We are compiling this list as a starting point for future activities. We… more

Website

astrophysics data-science novel-accelerators

Beginner, Intermediate, Advanced

Resource to active inference

Active inference is an emerging study field in machine learning and computational neuroscience. This website in particular introduces "active… more

Website

ai

Beginner, Intermediate, Advanced

RMACC Website

Rocky Mountain Advanced Computing Consortium Website

Website

community-outreach

Beginner, Intermediate, Advanced

Rockfish at Johns Hopkins University

Resources and User Guide available at Rockfish

Docs

rockfish

Intermediate

Running Particle-in-Cell Simulations on HPC

WarpX is an advanced particle-in-cell code used to model particle accelerators, which needs to be run on HPC. This website contains the tutorial on… more

Docs

github github-pages novel-accelerators

Intermediate

Samtools Documentation

Samtools is a suite of programs for interacting with high-throughput sequencing data, especially in the SAM/BAM format. It offers various utilities… more

Docs

documentation data-analysis bioinformatics

Beginner, Intermediate, Advanced

Science Gateway Tool/Web App Template (Jupyter Notebook + ipywidgets)

Use this template to turn any science gateway workflow into a web application!

Learning

data-analysis github astrophysics

Beginner

Scikit-Learn: Easy Machine Learning and Modeling

Scikit-learn is a free software machine learning library for Python. It has a variety of features you can use on data, from linear regression… more

Tool

documentation ai plotting

Beginner, Intermediate

Scipy Lecture Notes

Comprehensive tutorials and lecture notes covering various aspects of scientific computing using Python and Scipy.

Learning

visualization data-analysis machine-learning

Beginner, Intermediate

Set Up VSCode for Python and Github

VSCode is a popular IDE that runs on Windows, MacOS, and Linux. This tutorial will explain how to get set up with VSCode to code in Python. It will… more

Learning

git python

Setting up PyFR flow solver on clusters

These instructions were executed on the FASTER and Grace cluster computing facilities at Texas A&M University. However, the process can be… more

Learning

faster fluid-dynamics c++

Advanced

Singularity/Apptainer User Manuals

Singularity/Apptainer is a free and open-source container platform that allows users to build and run containers on high performance computing… more

Docs

containers singularity

Intermediate

Slurm Scheduling Software Documentation

Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. Slurm… more

Website

cluster-management cluster-support slurm

Intermediate, Advanced

Slurm Tutorials

Introduction to the Slurm Workload Manager for users and system administrators, plus some material for Slurm programmers.

Learning

administering-hpc cluster-management hpc-cluster-architecture

Beginner

Slurm User Group Mailing List

Mailing List

slurm schedulers

Beginner, Intermediate, Advanced

Solving differential equations with Physics-informed Neural Network

Differential equations, the backbone of countless physical phenomena, have traditionally been solved using numerical methods or analytical techniques… more

Learning

neural-networks

Beginner, Intermediate

Spack Documentation

Spack is a package manager for supercomputers that can help administrators install scientific software and libraries for multiple complex software… more

Docs

spack

Intermediate

TensorFlow for Deep Neural Networks

TensorFlow is a powerful framework for deep learning, developed by Google. This is specifically their Python package, which is easy to use and can be… more

Tool

documentation faster tensorflow

Intermediate, Advanced

Termius - Modern ssh platform

Termius: The Modern SSH Client for 2023

Termius is the future-facing SSH client that's redefining remote server access in… more

Website

cloud-computing data-sharing data-transfer

Beginner, Intermediate

Texas A&M HPRC Training Site

Training Resources and Courses offered by Texas A&M's Research Computing Group

Learning

ACES TAMU

Beginner, Intermediate, Advanced

The Official Documentation of Pandas

Pandas is one of the most essential Python libraries for data analysis and manipulation. It provides high-performance, easy-to-use data structures,… more

Docs

plotting visualization

Beginner, Intermediate

The Theory Behind Neural Networks (Very Simplified)

This video by the YouTube channel 3Blue1Brown provides a very simplified introduction to the theory behind neural networks. This tutorial is perfect… more

Video

neural-networks

Beginner

Thrust resources

Thrust is a CUDA library that optimizes parallelization on the GPU for you. The Thrust tutorial is great for beginners. The documentation is helpful… more

Learning

parallelization gpu resources

Intermediate, Advanced

Time-Series LSTMs Python Walkthrough

A walkthrough (with a Google Colab link) on how to implement your own LSTM to observe time-dependent behavior.

Website

ai deep-learning machine-learning

Advanced

Trinity Tutorial for Transcriptome Assembly

Trinity is one of the most popular tools for assembling transcripts from RNA-Seq short reads. In this tutorial, we will cover the basic usage of Trinity… more

Learning

biology

Beginner

Trusted CI

The mission of Trusted CI is to lead in the development of an NSF Cybersecurity Ecosystem with the workforce, knowledge, processes, and… more

Website

cybersecurity training

Beginner, Intermediate, Advanced

Trusted CI Resources Page

Very helpful list of external resources from Trusted CI

Website

cybersecurity

Beginner, Intermediate, Advanced

Tutorial for OpenMP Building up and Utilization

The following link elaborates on the usage of the OpenMP API and its related syntax. There are also several exercises available for learners to help them… more

Website

openmp

Beginner

Ultimate guide to Unix

Unix is incredibly common and useful. This website provides all the common commands and explanations for one to get started with a Unix system.

Website

bash

Beginner

Understanding LLM Fine-tuning

With the recent rise of LLMs, many businesses are looking at ways to adopt these LLMs and fine-tune these models on specific data sets to… more

Learning

big-data training

Beginner, Intermediate

UNIX/command line basics tutorial

Introductory training materials for working on the UNIX command line.

Learning

bash

Beginner

Use Windows Subsystem for Linux for HPC Command Line Access from Windows

Windows Subsystem for Linux (WSL) provides a Linux environment for Windows users to access HPC resources quickly and efficiently.

Tool

workflow ssh

Beginner

Using Dask on HPC Systems

A tutorial on the effective use of Dask on HPC resources. The four-hour tutorial will be split into two sections, with early topics focused on novice… more

Learning

training jupyterhub python

Beginner, Intermediate

Vulkan Support Survey across Systems

It's not uncommon to see beautiful visualizations in HPC center galleries, but the majority of these are either rendered off the HPC or created… more

Docs

anvil bridges-2 darwin

Beginner, Intermediate

Vulkan Support Survey across Systems

It's not uncommon to see beautiful visualizations in HPC center galleries, but the majority of these are either rendered off the HPC or created… more

Learning

big-data computer-graphics workflow

Intermediate

Warewulf documentation

Warewulf is an operating system provisioning platform for Linux that is designed to produce secure, scalable, turnkey cluster deployments that… more

Website

documentation administering-hpc distributed-computing

Beginner, Intermediate

Weka

Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression,… more

Tool

big-data data-analysis machine-learning

Intermediate, Advanced

What are LSTMs?

This reading will explain what a long short-term memory neural network is. LSTMs are a type of neural network that relies on both past and present… more

Learning

ai deep-learning machine-learning

Intermediate, Advanced

What is fairness in ML?

This article discusses the importance of fairness in machine learning and provides insights into how Google approaches fairness in their ML models.… more

Docs

ai visualization data-analysis

Intermediate

What is VPN? How It Works, Types of VPN

A VPN, or Virtual Private Network, is a technology that creates a secure tunnel between your device and a VPN server. This tunnel encrypts all of… more

Website

vpn

Beginner

Why 'N How: Martinos Center for Biomedical Imaging

The Why & How seminar series is designed to introduce research assistants, graduate students, and postdoctoral and clinical fellows – really,… more

Learning

image-processing

Beginner, Intermediate, Advanced

Wiki for Onboarding onto the C3DDB Cluster at MGHPCC

This is a resource for researchers and students looking to on-board onto the c3ddb cluster at MGHPCC. In the code section, there are example job… more

Learning

cluster-support

Beginner

Women in HPC

Through collaboration and networking, WHPC strives to bring together women in HPC and technical computing while encouraging women to engage in… more

Website

community-outreach

Beginner

WRF in the Public Cloud

CAC summer student employee Jeff Lantz describes his experiences in running the WRF weather forecasting application in the public cloud. He compares… more

Video

aws azure cloud-commercial

Advanced