Knowledge Base Resources
These resources are contributed by researchers, facilitators, engineers, and HPC admins. Please upvote resources you find useful!
Open Storage Network
0
The Open Storage Network, a national resource available through the XSEDE resource allocation system, is high quality, sustainable, distributed storage cloud for the research community.
Reinforcement Learning For Beginners with Python
0
This course takes through the fundamentals required to get started with reinforcement learning with Python, OpenAI Gym and Stable Baselines. You'll be able to build deep learning powered agents to solve a varying number of RL problems including CartPole, Breakout and CarRacing as well as learning how to build your very own/custom environment!
MATLAB with other Programming Languages
0
MATLAB is a really useful tool for data analysis among other computational work. This tutorial takes you through using MATLAB with other programming languages including C, C++, Fortran, Java, and Python.
Harnessing the Power of Cloud and Machine Learning for Climate and Ocean Advances
0
- Harnessing the Power of Cloud and Machine Learning for Climate and Ocean Advances
- Github for Outputs of Presentation
Documentation and presentation on how to use machine learning and deep learning framework using TensorFlow, Keras and sci-kit learn for Climate and Ocean Advances
Ultimate guide to Unix
0
Unix is incredibly common and useful. This website provides all the common commands and explanations for one to get started with a unix system.
iOS CoreML + SwiftUI Image Classification Model
0
This tutorial will teach step-by-step how to create an image classification model using Core ML in XCode and integrate it into an iOS app that will use the user's iPhone camera to scan objects and predict based on the image classification model.
TensorFlow for Deep Neural Networks
0
TensorFlow is a powerful framework for Deep Learning, developed by google. This specifically is their python package, which is easy to use and can be used to train incredibly powerful models.
Working with Python on HPC Clusters
0
This tutorial series and documentation covers topics on using Python on HPC clusters. The specific steps are based on the HOPPER cluster at George Mason University in Fairfax, VA. They should be implementable on most HPC clusters that have the SLURM scheduler installed, the Environment Modules system for managing packages and Open onDemand for a web-based GUI to access the cluster resources.
How to use Rclone
0
Learn how to use Rclone to transfer data, specifically from your local drive to the Open Storage Network, vice versa.
Samtools Documentation
0
Samtools is a suite of programs for interacting with high-throughput sequencing data, especially in the SAM/BAM format. It offers various utilities for processing, analyzing, and managing sequence data generated from next-generation sequencing (NGS) experiments. Samtools is widely used in bioinformatics and genomics research for tasks such as read alignment, variant calling, and data manipulation.
Introduction to Parallel Computing Tutorial
0
The tutorial is intended to provide a brief overview of the extensive and broad topic of Parallel Computing. It covers the basics of parallel computing, and is intended for someone who is just becoming acquainted with the subject .
Introduction to Linux CLI for Researchers
0
The goal of this video is to help researchers and students recently given allocations to High Performance Compute resources a basic introduction to Linux commands to help them get started. These are a few of the most fundamental commands for navigating and getting started.
If you find this video helpful or would like me to continue this series let me know!
Charliecloud User Group
0
Announcements for for users and developers of Charliecloud, which provides lightweight user-defined software stacks for high-performance computing.
Neurostars
0
A question and answer forum for neuroscience researchers, infrastructure providers and software developers.
Introductory Tutorial to Numpy and Pandas for Data Analysis
0
In this tutorial, I present an overview with many examples of the use of Numpy and Pandas for data analysis. Beginners in the field of data analysis can find It incredibly helpful, and at the same time, anyone who already has experience in data analysis and needs a refresher can find value in it. I discuss the use of Numpy for analyzing 1D and 2D multidimensional data and an introduction on using Pandas to manipulate CSV files.
Installing Rocky Linux Operating System
0
Rocky Linux is an open-source enterprise operating system. It is compatible with Red Hat Enterprise Linux (RHEL). It is a community-driven project that provides a stable and reliable platform for production workloads. It is one of the best alternatives to Opensource CentOS, since Centos will be on end of life (EoL) soon in 2024 by shifting to CentOS Stream.
phenoACCESS-24 workshop program materials
0
phenoACCESS-24: Workshop on Research Computing and Plant Phenotyping
High-throughput plant phenotyping is computationally intensive, requiring data storage, data processing and analysis, research computing expertise, and mechanisms for data sharing. This workshop is aimed at research computing workforce development by addressing questions such as what is plant phenotyping; what types of data are collected; what are the preprocessing and analytical needs; what tools and platforms exist for data capture, management, analysis, and storage; and how best to collaborate and engage with phenotyping researchers. The full-day agenda will include speakers (scientists and research compute staff); panel discussions (how to work with research computing staff and facilities; how to engage with phenotyping scientists), and networking opportunities (meet-and-greet, ice breakers, small group discussions). The videos and slide decks for the talks are included on the linked page.
Slurm User Group Mailing List
0
A visual introduction to Gaussian Belief Propagation
0
This website is an interactive introduction to Gaussian Belief Propagation (GBP). A probabilistic inference algorithm that operates by passing messages between the nodes of arbitrarily structured factor graphs. A special case of loopy belief propagation, GBP updates rely only on local information and will converge independently of the message schedule. The key argument is that, given recent trends in computing hardware, GBP has the right computational properties to act as a scalable distributed probabilistic inference framework for future machine learning systems.
Bridges-2 Home Page
0
Landing Page for Bridges-2 information
Data Imputation Methods for Climate Data and Mortality Data
0
- Data Imputation Methods for Climate Data and Mortality Data - Slices
- Github repository
- Data Imputation Methods for Climate Data and Mortality Data - Full Tutorial
This slices and videos introduced how to use K-Nearest-Neighbors method to impute climate data and how to use Bayesian Spatio-Temporal models in R-INLA to impute mortality data. The demos will be added soon.
Quick and Robust Data Augmentation with Albumentations Library
0
Data augmentation is a crucial step in the pipeline for image classification with deep learning. Albumentations is an extremely versatile Python library that can be used to easily augment images. Transformations include rotations, flips, downscaling, distortions, blurs, and many more.
Citation:
Buslaev A, Iglovikov VI, Khvedchenya E, Parinov A, Druzhinin M, Kalinin AA. Albumentations: Fast and Flexible Image Augmentations. Information. 2020; 11(2):125. https://doi.org/10.3390/info11020125
Building Anaconda Navigator applications
0
This tutorial explains how to create an Anaconda Navigator Application (app) for JupyterLab. It is intended for users of Windows, macOS, and Linux who want to generate an Anaconda Navigator app conda package from a given recipe. Prior knowledge of conda-build or conda recipes is recommended.
Warewulf documentation
0
Warewulf is an operating system provisioning platform for Linux that is designed to produce secure, scalable, turnkey cluster deployments that maintain flexibility and simplicity. It can be used to setup a stateless provisioning in HPC environment.
Weka
0
Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rules mining, and visualization.