Knowledge Base Resources
These resources have been contributed and “vetted” by the community of cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators) that are participating in programs such as this one, that are supported by the ConnectCI community management platform. Additional Knowledge Base Resources are always welcome!
R for Research Scientists
0
A book for researchers who contribute code to R projects: This booklet is the result of my work with the Social Cognition for Social Justice lab. It was developed in response to questions I was getting from students; both grad students that were making software design decisions, and undergraduates who were using things like version control for the first time. Although many tutorials and resources exist for these topics, there was not a single source that I thought covered just enough material to build up to the workflow used by the lab without extraneous detail.
Scipy Lecture Notes
0
Comprehensive tutorials and lecture notes covering various aspects of scientific computing using Python and Scipy.
What is fairness in ML?
0
This article discusses the importance of fairness in machine learning and provides insights into how Google approaches fairness in their ML models.
The article covers several key topics:
Introduction to fairness in ML: It provides an overview of why fairness is essential in machine learning systems, the potential biases that can arise, and the impact of biased models on different communities.
Defining fairness: The article discusses various definitions of fairness, including individual fairness, group fairness, and disparate impact. It explains the challenges in achieving fairness due to trade-offs and the need for thoughtful considerations.
Addressing bias in training data: It explores how biases can be present in training data and offers strategies to identify and mitigate these biases. Techniques like data preprocessing, data augmentation, and synthetic data generation are discussed.
Fairness in ML algorithms: The article examines the potential biases that can arise from different machine learning algorithms, such as classification and recommendation systems. It highlights the importance of evaluating and monitoring models for fairness throughout their lifecycle.
Fairness tools and resources: It showcases various tools and resources available to practitioners and developers to help measure, understand, and mitigate bias in machine learning models. Google's TensorFlow Extended (TFX) and What-If Tool are mentioned as examples.
Google's approach to fairness: The article highlights Google's commitment to fairness and the steps they take to address fairness challenges in their ML models. It mentions the use of fairness indicators, ongoing research, and partnerships to advance fairness in AI.
Overall, the article provides a comprehensive overview of fairness in machine learning and offers insights into Google's approach to building fair ML models.
EasyBuild Documentation
0
EasyBuild is a software installation framework that allows administrators to easily build and install software on high-performance computing (HPC) systems. It supports a wide range of software packages, toolchains, and compilers.
Supported software are found in the EasyConfigs repository, one of several resositories in EasyBuild project.
MDAnalysis - Python library for the analysis of molecular dynamics simulations
0
MDAnalysis is a python based library of tools for the analysis of molecular dynamics simulations. It is able to read and write many popular simulation formats including CHARMM, LAMMPS, GROMACS, and AMBER and more. This link contains the documentation pages of all MDAnalysis functions and has links to tutorials using Jupyter Notebooks.
Docker - Containerized, reproducible workflows
0
Docker allows for containerization of any task - basically a smaller, scalable version of a virtual machine. This is very useful when transferring work across computing environments, as it ensures reproducibility.
ACCESS Getting Started Quick-Guide
0
A step-by-step guide to getting your first allocation for Access computing and storage resources.
Trusted CI Resources Page
0
Very helpful list of external resources from Trusted CI
Bridges-2 Home Page
0
Landing Page for Bridges-2 information
UNIX/command line basics tutorial
0
Introductory training materials for working on the UNIX command line.
Official Python Documentation
0
The official documentation for Python 3.11.5. Python comes with a lot of features built into the language, so it is worth taking a look as you code.
Mechanism and Implementation of Various MPI Libraries
0
There is a detailed explanation about communication routines and managing methods of different MPI libraries, as well as several exercises designed for users to get familiar with the implementation of MPI build process.
Understanding LLM Fine-tuning
0
With the recent uprising of LLM's many business are looking at way to adopt these LLMs and fine-tuning these models on specfic data sets to ensure accuracy. These models when fine-tuned can be optimal for fulfilling the specific needs of a company. This site explains explicitly when, how, and why models should be trained. It goes over various strategies for LLM fine -tuning.
NITRC
0
The Neuroimaging Tools and Resources Collaboratory (NITRC) is a neuroimaging informatics knowledge environment for MR, PET/SPECT, CT, EEG/MEG, optical imaging, clinical neuroinformatics, imaging genomics, and computational neuroscience tools and resources.
Open-Source Server Virtualization Platform
0
Proxmox Virtual Environment is a hyper-converged infrastructure open-source software. It is a hosted hypervisor that can run operating systems including Linux and Windows on x64 hardware.
Performance Engineering Of Software Systems
0
A class from MITOpenCourseware that gives a hands on approach to building scalable and high-performance software systems. Topics include performance analysis, algorithmic techniques for high performance, instruction-level optimizations, caching optimizations, parallel programming, and building scalable systems.
Wiki for Onboarding onto the C3DDB Cluster at MGHPCC
0
This is a resource for researchers and students looking to on-board onto the c3ddb cluster at MGHPCC. In the code section, there are example job submission scripts for the different queues on c3ddb.
NCSA HPC Training Moodle
0
Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Other related topics include 'Cybersecurity for End Users' and 'Developing Webinar Training.' Some of the tutorials also offer digital badges. Many of these tutorials were previously offered on CI-Tutor. A list of open access training courses are provided below.
Parallel Computing on High-Performance Systems
Profiling Python Applications
Using an HPC Cluster for Scientific Applications
Debugging Serial and Parallel Codes
Introduction to MPI
Introduction to OpenMP
Introduction to Visualization
Introduction to Performance Tools
Multilevel Parallel Programming
Introduction to Multi-core Performance
Using the Lustre File System
Machine Learning in Astrophysics
0
Machine learning is becoming increasingly important in field with large data such as astrophysics. AstroML is a Python module for machine learning and data mining built on numpy, scipy, scikit-learn, matplotlib, and astropy allowing for a range of statistical and machine learning routines to analyze astronomical data in Python. In particular, it has loaders for many open astronomical datasets with examples on how to visualize such complicated and large datasets.
A visual introduction to Gaussian Belief Propagation
0
This website is an interactive introduction to Gaussian Belief Propagation (GBP). A probabilistic inference algorithm that operates by passing messages between the nodes of arbitrarily structured factor graphs. A special case of loopy belief propagation, GBP updates rely only on local information and will converge independently of the message schedule. The key argument is that, given recent trends in computing hardware, GBP has the right computational properties to act as a scalable distributed probabilistic inference framework for future machine learning systems.
AI/ML TechLab - Accelerating AI/ML Workflows on a Composable Cyberinfrastructure
0
This technology lab contains a set of sessions to help a new user start an AI project on the ACES cluster, a composable accelerator testbed at Texas A&M University. You will learn how to create and activate a virtual environment, manipulate and visualize data with Pandas and Matplotlib, use Scikit-learn for linear regression and classification applications, and use Pytorch to create and train a simple image classification model with deep neural networks (DNN).
RMACC Website
0
Rocky Mountain Advanced Computing Consortium Website
How the Little Jupyter Notebook Became a Web App: Managing Increasing Complexity with nbdev
0
A tutorial entitled "How the Little Jupyter Notebook Became a Web App: Managing Increasing Complexity with nbdev" presented at SciPy 2023 in Austin, TX. This tutorial is hosted in a series of Jupyter Notebooks which can be accessed in the click of a button using Binder. See the README for more information.
UCLA Extended Reality (XR) collaboration resources and Workshop
0
Comprehensive Extended Reality (XR) collaboration resources for building a high performance extended reality (XR), augmented reality (AR), virtual reality (VR) and mixed reality campus teams. The tags set are a small subset of the the topics covered.
HPCwire
0
HPCwire is a prominent news and information source for the HPC community. Their website offers articles, analysis, and reports on HPC technologies, applications, and industry trends.