Knowledge Base Resources
Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!
CyberAmbassadors: Professional Skills for Interdisciplinary Work
The CyberAmbassadors project was funded through a workforce development grant from the National Science Foundation (Award #1730137). Starting in 2017, the initial focus of this project was to develop, test, and refine a new curriculum to help CyberInfrastructure (CI) Professionals strengthen their communication, teamwork, and leadership skills. With support and collaboration from a number of academic and professional organizations, the CyberAmbassadors project was expanded to offer professional skills training to college students and professionals working across STEM (science, technology, engineering, math) disciplines.
Molecular Dynamics Tutorials for Beginners
Links to MD tutorials for beginners across various simulation platforms.
Official Documentation for PyTorch and NumPy
The official documentation for PyTorch, a tensor-based machine learning framework, and NumPy, which provides the ndarray type that is useful for building tensors when implementing neural networks. Both libraries can be installed with pip.
The Official Documentation of Pandas
Pandas is one of the most essential Python libraries for data analysis and manipulation. It provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. The official documentation serves as an in-depth guide to using this tool, including explanations and examples.
Gesture Classifier Model using MediaPipe
MediaPipe is Google's open-source framework for building multimodal (e.g., video and audio) machine learning pipelines. It is efficient and versatile, which makes it well suited to tasks like gesture recognition.
This is a tutorial on how to make a custom model for gesture recognition tasks based on the Google MediaPipe API. The tutorial specifically covers video playback, though it could be generalized to image and live video feed recognition.
Neurostars
A question and answer forum for neuroscience researchers, infrastructure providers and software developers.
Long Tales of Science: A podcast about women in HPC
A series of interviews with women in the HPC community.
Regular Expressions
Regular expressions (sometimes referred to as RegEx) are a powerful tool for defining string patterns for "find" or "find and replace" operations on strings, or for input validation. Regular expressions are used in search engines, in the search and replace dialogs of word processors and text editors, and in text-processing Linux utilities such as sed and awk. They are supported in many programming languages, including Python, R, Perl, Java, and others.
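As a rough illustration of the three uses mentioned above (find, find and replace, and input validation), here is a minimal sketch using Python's standard `re` module. The sample strings, the simplified email pattern, and the `is_valid_username` helper are assumptions made up for this example; real-world patterns are usually stricter.

```python
import re

# "Find": pull every email-like token out of a string.
# The pattern is deliberately simplified and is only an illustration.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
emails = EMAIL.findall("Contact: alice@example.org, bob@example.com")
print(emails)  # ['alice@example.org', 'bob@example.com']

# "Find and replace": rewrite MM/DD/YYYY dates as YYYY-MM-DD
# by reordering the three captured groups.
iso_dates = re.sub(r"(\d{2})/(\d{2})/(\d{4})", r"\3-\1-\2", "Due 03/15/2024")
print(iso_dates)  # Due 2024-03-15

# "Input validation": hypothetical rule for usernames, namely a lowercase
# letter followed by 2 to 15 lowercase letters, digits, or underscores.
def is_valid_username(name: str) -> bool:
    return re.fullmatch(r"[a-z][a-z0-9_]{2,15}", name) is not None

print(is_valid_username("hpc_user1"))  # True
print(is_valid_username("1bad"))       # False
```

The same patterns carry over, with minor syntax differences, to tools such as sed and awk and to the other languages listed.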
Machine Learning in R online book
The free online book for the mlr3 machine learning framework for R. It gives a comprehensive overview of the package and ecosystem and is suitable for beginners and experts alike. You'll learn how to build and evaluate machine learning models, build complex machine learning pipelines, tune their performance automatically, and explain how machine learning models arrive at their predictions.
fast.ai
fast.ai offers many tools for people working with machine learning and artificial intelligence, including tutorials on PyTorch, its own library built on PyTorch, news articles, and other resources for exploring the field.
Examples of code using the header-only nlohmann JSON library for C++
This code showcases how to work with the header-only nlohmann JSON library for C++. To compile it, rename json_test.txt to json_test.cpp and test.txt to test.json. You must also download the header files from https://github.com/nlohmann/json. Compilation instructions are at the bottom of json_test. This code is very helpful for creating config files, for example.
DAGMan for orchestrating complex workflows on HTC resources (High Throughput Computing)
DAGMan (Directed Acyclic Graph Manager) is a meta-scheduler for HTCondor. It manages dependencies between jobs at a higher level than the HTCondor Scheduler.
It is a workflow management system developed by the High-Throughput Computing (HTC) community, specifically for managing large-scale scientific computations and data analysis tasks. It enables users to define complex workflows as directed acyclic graphs (DAGs). In a DAG, nodes represent individual computational tasks, and the directed edges represent dependencies between the tasks. DAGMan manages the execution of these tasks and ensures that they are executed in the correct order based on their dependencies.
The primary purpose of DAGMan is to simplify the management of large-scale computations that consist of numerous interdependent tasks. By defining the dependencies between tasks in a DAG, users can easily express the order of execution and allow DAGMan to handle the scheduling and coordination of the tasks. This simplifies the development and execution of complex scientific workflows, making it easier to manage and track the progress of computations.
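To make the idea of dependency-ordered execution concrete, here is a minimal sketch in plain Python. It does not use DAGMan or HTCondor, and it is not DAGMan's input format or API; it only illustrates how a set of jobs with dependencies can be grouped into batches that are safe to run in order. The job names are hypothetical, and the sketch assumes Python 3.9 or later for the standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a job, and each value is the set of
# jobs that must finish before that job can start.
dependencies = {
    "preprocess": set(),
    "simulate_a": {"preprocess"},
    "simulate_b": {"preprocess"},
    "analyze": {"simulate_a", "simulate_b"},
}

def execution_batches(deps):
    """Group jobs into batches; jobs in one batch have no unmet dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose parents are all done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock its children
    return batches

batches = execution_batches(dependencies)
print(batches)
# [['preprocess'], ['simulate_a', 'simulate_b'], ['analyze']]
```

Jobs within a batch do not depend on each other, so a scheduler is free to run them concurrently; a workflow manager such as DAGMan additionally handles job submission, monitoring, and coordination.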
Discover Data Science
Discover Data Science is all about making connections between prospective students and educational opportunities in an exciting, fast-growing field: data science.
NumPy - a Python Library
NumPy is a Python package that uses typed arrays and compiled C code to make many math operations in Python efficient. It is especially useful for matrix manipulation and operations.
RMACC Systems Administrator Workshop Slides
A compilation of the slides from this year's RMACC Sys Admin Workshop.
RMACC Sys Admin Workshop Schedule:
Tuesday
12:00 PM Sign-in
1:00 PM Introductions
1:30 PM Lightning Talk - HPC Survival guide
2:00 PM Node Management - Scott Serr
2:30 PM Lightning Talk - Warewulf
3:00 PM Urgent HPC - Coltran Hophan-Nichols and Alexander Salois
Wednesday
9:00 AM Breakfast
10:00 AM Round table Sites - BYU, INL, UMT, ASU, MSU
11:00 AM Open OnDemand setup - Dean Anderson
11:30 AM Lightning Talk - Long-term hardware support
12:00 PM Lunch
1:00 PM HPC Security - Matt Bidwell
2:00 PM Lightning Talk - Security
2:30 PM ACCESS resources - Couso
3:00 PM Easybuild tutorial - Alexander Salois
3:30 PM General Q & A
Thursday
9:00 AM Breakfast
10:00 AM Lightning Talk - Containers and Virtual Machines
11:00 AM University of Montana - Hellgate Site Tour
11:30 AM Closing Remarks
MATLAB Bioinformatics Toolbox
Bioinformatics Toolbox provides algorithms and apps for Next Generation Sequencing (NGS), microarray analysis, mass spectrometry, and gene ontology. Using toolbox functions, you can read genomic and proteomic data from standard file formats such as SAM, FASTA, CEL, and CDF, as well as from online databases such as the NCBI Gene Expression Omnibus and GenBank.
Oak Ridge Leadership Computing Facility (OLCF) Training Events and Archive
Upcoming training events and archives of training materials detailing general HPC best practices as well as how to use OLCF resources and services.
Spack Documentation
Spack is a package manager for supercomputers that can help administrators install scientific software and libraries for multiple complex software stacks.
Active inference textbook
This textbook is the first comprehensive treatment of active inference, an integrative perspective on brain, cognition, and behavior used across multiple disciplines including computational neuroscience, machine learning, artificial intelligence, and robotics. It was published in 2022 and is open access at this time. The contents should be educational for those who want to understand how the free energy principle is applied to the normative behavior of living organisms and who want to widen their knowledge of sequential decision making under uncertainty.
Fine-tuning LLMs with PEFT and LoRA
As LLMs get larger, full fine-tuning becomes difficult on consumer hardware, and storing and deploying the tuned models can be expensive. PEFT (parameter-efficient fine-tuning) fine-tunes only a small number of model parameters while freezing most of the parameters of the pretrained LLM. It can deliver performance comparable to, if not better than, full fine-tuning while having only a small number of trainable parameters. This resource explains the approach and walks through LoRA diagrams and code.
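To give a sense of scale for "a small number of trainable parameters", here is a back-of-the-envelope sketch in plain Python. It assumes the standard LoRA formulation, in which a frozen weight matrix of shape d_out x d_in is adapted by a low-rank update B·A with B of shape d_out x r and A of shape r x d_in. The layer size and rank below are hypothetical values chosen for illustration and are not taken from the linked resource.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters updated when fine-tuning a dense weight matrix directly."""
    return d_out * d_in

def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the low-rank factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)

# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8

full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
fraction = lora / full

print(f"full fine-tuning: {full:,} trainable parameters")  # 16,777,216
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")  # 65,536
print(f"fraction trained: {fraction:.4%}")  # 0.3906%
```

Under these assumptions, only the two small factors are trained and stored per task, while the original weight matrix stays frozen and can be shared across tasks.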
Women in HPC
Through collaboration and networking, WHPC strives to bring together women in HPC and technical computing while encouraging women to engage in outreach activities and improve the visibility of inspirational role models.
Fundamentals of Cloud Computing
An introduction to cloud computing.
CUDA Toolkit Documentation
If you work with GPUs in HPC, the NVIDIA CUDA Toolkit is essential. Its documentation, including programming guides and API references, is available at the linked website.
Set Up VSCode for Python and GitHub
VSCode is a popular IDE that runs on Windows, macOS, and Linux. This tutorial explains how to set up VSCode for coding in Python and how to set up GitHub integration within VSCode.
Introductory Python Lecture Series
A lecture and notes for teaching introductory Python, starting with how to download and start using Python, then expanding to basic syntax for lists, arrays, loops, and methods.