- NITRC: The Neuroimaging Tools and Resources Collaboratory (NITRC) is a neuroimaging informatics knowledge environment for MR, PET/SPECT, CT, EEG/MEG, optical imaging, clinical neuroinformatics, imaging genomics, and computational neuroscience tools and resources.
- Online Master's in Business Analytics Program Guide - TechGuide: A degree in business analytics looks different today than it did a decade ago. In its current form, business analytics draws on modern data science and machine learning (ML), and its value comes from applying these to strategic planning.
- Docker - Containerized, reproducible workflows: Docker allows for containerization of any task. A container is a lighter-weight, more scalable alternative to a virtual machine that shares the host's kernel. This is very useful when transferring work across computing environments, as it helps ensure reproducibility.
- Machine Learning in R online book: The free online book for the mlr3 machine learning framework for R. It gives a comprehensive overview of the package and ecosystem, suitable for beginners and experts alike. You'll learn how to build and evaluate machine learning models, build complex machine learning pipelines, tune their performance automatically, and explain how machine learning models arrive at their predictions.
- Open-Source Server Virtualization Platform: Proxmox Virtual Environment is open-source software for hyper-converged infrastructure. It is a hosted hypervisor that can run operating systems including Linux and Windows on x64 hardware.
- NERSC Training and Tutorials
  - NERSC Training and Tutorials Main Site
  - NERSC Upcoming and Recent Training Events
  - NERSC Archived Training and Tutorials

  A comprehensive collection of NERSC-developed training and tutorial events, offered on regular schedules. All sessions are archived, including slide decks, video recordings, and software examples where available. Some examples of past training and tutorial topics:
  - Deep Learning for Sciences Webinar Series
  - BerkeleyGW Tutorial Workshop
  - VASP Trainings
  - Timemory Software Monitoring Tutorial, April 2021
  - HPCToolkit to Measure and Analyze GPU Applications Performance Tutorial
  - Totalview Tutorial
  - NVidia HPCSDK - OpenMP Target Offload Training
  - Parallelware Training Series
  - ARM Debugging and Profiling Tools Tutorial
  - Roofline on NVIDIA GPUs
  - GPUs for Science events
  - 3-part OpenACC Training Series
  - 9-part CUDA Training Series
- Representation Learning in Deep Learning: Representation learning is a fundamental concept in machine learning and artificial intelligence, particularly in the field of deep learning. At its core, representation learning is the process of transforming raw data into a form that is more suitable for a specific task or learning objective. This transformation aims to extract meaningful and informative features or representations from the data, which can then be used for tasks such as classification, clustering, and regression.
- Why Mentoring Matters and How to Get Started: Describes effective mentorship for both mentors and mentees.
- Official Python Documentation: The official documentation for Python 3.11.5. Python comes with a lot of features built into the language, so it is worth taking a look as you code.
- Git Branching Workflow and Maneuvers: Two resources: (1) "A Successful Git Branching Model" presents and defends a git branching workflow for stable, collaborative git-based projects; (2) "Git Flight Rules" maps "What do you want to do?" to the commands necessary to accomplish it.
- Understanding LLM Fine-tuning: With the recent rise of LLMs, many businesses are looking for ways to adopt these models and fine-tune them on specific data sets to ensure accuracy. Fine-tuned models can be well suited to the specific needs of a company. This site explains when, how, and why models should be trained, and goes over various strategies for LLM fine-tuning.
- Navier-Stokes Cahn-Hilliard (NSCH) for MOOSE Framework: The MOOSE Navier-Stokes Cahn-Hilliard (NSCH) application is a library for implementing simulation tools that solve the Navier-Stokes Cahn-Hilliard equations with non-matching densities using Galerkin finite element methods with a residual-based stabilization scheme.
- Performance Engineering Of Software Systems: A class from MIT OpenCourseWare that gives a hands-on approach to building scalable and high-performance software systems. Topics include performance analysis, algorithmic techniques for high performance, instruction-level optimizations, caching optimizations, parallel programming, and building scalable systems.
- Python Tools for Data Science: Python has become a very popular programming language and software ecosystem for work in Data Science, integrating support for data access, data processing, modeling, machine learning, and visualization. In this webinar, we will describe some of the key Python packages that have been developed to support that work, and highlight some of their capabilities. This webinar will also serve as an introduction and overview of topics addressed in two Cornell Virtual Workshop tutorials, available at https://cvw.cac.cornell.edu/pydatasci1 and https://cvw.cac.cornell.edu/pydatasci2
- Spatial Data Science in the Cloud (Alpine HPC) using Python: Spatial Data Science is a growing field across a wide range of industries and disciplines. The open-source programming language Python has many libraries that support spatial analysis, but what do you do when your computer cannot handle the massive file sizes of high-resolution data or supply the computing power your analysis requires? These materials have been prepared to teach you spatial data science and how to run your analysis on a high-performance computer (HPC).
- Machine Learning in Astrophysics: Machine learning is becoming increasingly important in fields with large data sets, such as astrophysics. AstroML is a Python module for machine learning and data mining built on numpy, scipy, scikit-learn, matplotlib, and astropy, providing a range of statistical and machine learning routines for analyzing astronomical data in Python. In particular, it has loaders for many open astronomical datasets, with examples of how to visualize such large and complicated datasets.
- Rockfish at Johns Hopkins University: Resources and User Guide available at Rockfish
- AI for improved HPC research - Cursor and Termius - Powerpoint: These slides introduce Termius and Cursor, two new freemium apps that use AI to make work more efficient, and show how they can be used for faster HPC research.
- HPCwire: HPCwire is a prominent news and information source for the HPC community. Their website offers articles, analysis, and reports on HPC technologies, applications, and industry trends.
- Fairness and Machine Learning: The "Fairness and Machine Learning" book offers a rigorous exploration of fairness in ML and is suitable for researchers, practitioners, and anyone interested in understanding the complexities and implications of fairness in machine learning.
- Examples of code using JSON nlohmann header-only Library for C++: This code showcases how to work with the header-only nlohmann JSON library for C++. To compile, rename json_test.txt to json_test.cpp and test.txt to test.json. You must also download the header files from https://github.com/nlohmann/json. Compilation instructions are at the bottom of json_test. This code is very helpful for creating config files, for example.
- Introduction to GPU/Parallel Programming using OpenACC: Introduction to the basics of OpenACC.
- GIS: What is a Geodetic Datum?: When working with GIS or other spatial data, one often encounters the word "datum", and a GIS computation task may require you to choose one. Below is a short video from NOAA and UCAR explaining what datums are.
Knowledge Base Resources
These resources are contributed by researchers, facilitators, engineers, and HPC admins. Please upvote resources you find useful!