Knowledge Base Resources
These resources have been contributed and “vetted” by the community of cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators) that are participating in programs such as this one, that are supported by the ConnectCI community management platform. Additional Knowledge Base Resources are always welcome!
Anvil Home Page
0
Slurm Scheduling Software Documentation
0
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. Slurm requires no kernel modifications for its operation and is relatively self-contained. As a cluster workload manager, Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.
Official Documentation of VisIt
0
VisIt is a prominent open-source, interactive parallel visualization and graphical analysis tool predominantly used for viewing scientific data. Its GitHub repository offers a detailed insight into the software's source code, documentation, and contribution guidelines. In particular, it offers useful examples on how it
Advanced Compilers: The Self-Guided Online Course
0
This is a self guided online course on compilers. The topics covered throughout the course include universal compilers topics like intermediate representations, data flow, and “classic” optimizations as well as more research focusedtopics such as parallelization, just-in-time compilation, and garbage collection.
A survey on datasets for fairness-aware machine learning
0
The research paper provides an overview of various datasets that have been used to study fairness in machine learning. It discusses the characteristics of these datasets, such as their size, diversity, and the fairness-related challenges they address. The paper also examines the different domains and applications covered by these datasets.
Slurm User Group Mailing List
0
Fine-tuning LLMs with PEFT and LoRA
0
As LLMs get larger fine-tuning to the full extent can become difficult to train on consumer hardware. Storing and deploying these tuned models can also be quite expensive and difficult to store. With PEFT (parameter -efficent fine tuning), it approaches fine-tune on a smaller scale of model parameters while freezing most parameters of the pretrained LLMs. Basically it is providing full performance that which is similar if not better than full fine tuning while only having a small number of trainable parameters. This source explains that as well as going over LORA diagrams and a code walk through.
Long Tales of Science: A podcast about women in HPC
0
A series of interviews with women in the HPC community
MDAnalysis - Python library for the analysis of molecular dynamics simulations
0
MDAnalysis is a python based library of tools for the analysis of molecular dynamics simulations. It is able to read and write many popular simulation formats including CHARMM, LAMMPS, GROMACS, and AMBER and more. This link contains the documentation pages of all MDAnalysis functions and has links to tutorials using Jupyter Notebooks.
DELTA Introductory Video
0
Introductory video about DELTA. Speaker Tim Boerner, Senior Assistant Director, NCSA
R for Data Science
0
R for Data Science is a comprehensive resource for individuals looking to harness the power of the R programming language for data analysis, visualization, and statistical modeling. Whether you're a beginner or an experienced data scientist, this guide will help you unlock the full potential of R in the realm of data science.
Docker Tutorial for Beginners
0
A Docker tutorial for beginners is a course that teaches the basics of Docker, a containerization platform that allows you to package your application and its dependencies into a standardized unit for development, shipment, and deployment.
UCLA Extended Reality (XR) collaboration resources and Workshop
0
Comprehensive Extended Reality (XR) collaboration resources for building a high performance extended reality (XR), augmented reality (AR), virtual reality (VR) and mixed reality campus teams. The tags set are a small subset of the the topics covered.
MOPAC
0
MOPAC (Molecular Orbital PACkage) is a semi-empirical quantum chemistry package used to compute molecular properties and structures by using approximations of the Schrödinger equation. This tutorial explains the process of using MOPAC for different forms of calculations.
Horovod: Distributed deep learning training framework
0
Horovod is a distributed deep learning training framework. Using horovod, a single-GPU training script can be scaled to train across many GPUs in parallel. The library supports popular deep learning framework such as TensorFlow, Keras, PyTorch, and Apache MXNet.
Trusted CI Resources Page
0
Very helpful list of external resources from Trusted CI
Probabilistic Semantic Data Association for Collaborative Human-Robot Sensing
0
Humans cannot always be treated as oracles for collaborative sensing. Robots thus need to maintain beliefs over unknown world states when receiving semantic data from humans, as well as account for possible discrepancies between human-provided data and these beliefs. To this end, this paper introduces the problem of semantic data association (SDA) in relation to conventional data association problems for sensor fusion. It then, develops a novel probabilistic semantic data association (PSDA) algorithm to rigorously address SDA in general settings. Simulations of a multi-object search task show that PSDA enables robust collaborative state estimation under a wide range of conditions.
AI Institutes Cyberinfrastructure Documents: SAIL Meeting
0
Materials from the SAIL meeting (https://aiinstitutes.org/2023/06/21/sail-2023-summit-for-ai-leadership/). A space where AI researchers can learn about using ACCESS resources for AI applications and research.
OpenMP Tutorial
0
OpenMP (Open Multi-Processing) is an API that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran on many platforms, instruction-set architectures and operating systems, including Solaris, AIX, FreeBSD, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.
Recommended Libraries for Cyberinfrastructure Users Developing Jupyter Notebooks
0
This repository contains information about Jupyter Widgets and how they can be used to develop interactive workflows, data dashboards, and web applications that can be run on HPC systems and science gateways. Easy to build web applications are not only useful for scientists. They can also be used by software engineers and system admins who want to quickly create tools tools for file management and more!
ACCESS KB Guide - Anvil
0
Purdue University is the home of Anvil, a powerful supercomputer that provides advanced computing capabilities to support a wide range of computational and data-intensive research spanning from traditional high-performance computing to modern artificial intelligence applications.
Git Branching Workflow and Maneuvers
0
A couple of resources that:
1.) Presents and defends a git branching workflow for stable collaborative git based projects. ("A Successful Git Branching Model")
2.) Maps "What do you want to do?" to the commands necessary to accomplish it. ("Git Flight Rules")
Astronomy data analysis with astropy
0
Astropy is a community-driven package that offers core functionalities needed for astrophysical computations and data analysis. From coordinate transformations to time and date handling, unit conversions, and cosmological calculations, Astropy ensures that astronomers can focus on their research without getting bogged down by the intricacies of programming. This guide walks you through practical usage of astropy from CCD data reduction to computing galactic orbits of stars.
DeepChem
0
DeepChem is an open-source library built on TensorFlow and PyTorch. It is helpful in applying machine learning algorithms to molecular data.
Fairness and Machine Learning
0
The "Fairness and Machine Learning" book offers a rigorous exploration of fairness in ML and is suitable for researchers, practitioners, and anyone interested in understanding the complexities and implications of fairness in machine learning.