Knowledge Base Resources
Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!
Texas A&M HPRC Training Site
Training resources and courses offered by Texas A&M's High Performance Research Computing (HPRC) group.
Using Dask on HPC Systems
A tutorial on the effective use of Dask on HPC resources. The four-hour tutorial is split into two sections, with early topics focused on novice Dask users and later topics focused on intermediate usage on HPC and associated best practices. The knowledge areas covered include (but are not limited to) the following; a minimal cluster example appears after the list:
Beginner section
High-level collections including dask.array and dask.dataframe
Distributed Dask clusters using HPC job schedulers
Earth Science data analysis using Dask with Xarray
Using the Dask dashboard to understand your computation
Intermediate section
Optimizing the number of workers and memory allocation
Choosing appropriate chunk shapes and sizes for Dask collections
Querying resource usage and debugging errors
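As referenced above, here is a minimal sketch of that style of workflow, using the dask_jobqueue package against a Slurm scheduler; the partition name, resource sizes, and array shape are hypothetical placeholders, not values from the tutorial.

```python
# A minimal sketch: a Dask cluster launched through a Slurm scheduler,
# then a chunked array computation. Names and sizes are hypothetical.
import dask.array as da
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

# Describe what one Slurm job's worth of workers looks like...
cluster = SLURMCluster(
    queue="general",          # hypothetical partition name
    cores=8,
    memory="16GB",
    walltime="00:30:00",
)
cluster.scale(jobs=4)         # ...then ask the scheduler for four such jobs
client = Client(cluster)      # the dashboard URL is client.dashboard_link

# A 20 GB random array processed in 200 MB chunks, spread across workers.
x = da.random.random((50_000, 50_000), chunks=(5_000, 5_000))
print(x.mean().compute())
```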
Regular Expressions
Regular expressions (sometimes referred to as RegEx) are an incredibly powerful tool used to define string patterns for "find" or "find and replace" operations on strings, or for input validation. Regular expressions are used in search engines, in the search-and-replace dialogs of word processors and text editors, and in text-processing Linux utilities such as sed and awk. They are supported in many programming languages, including Python, R, Perl, Java, and others.
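As a quick illustration of those operations, here is a minimal Python sketch using the standard re module; the pattern and sample strings are made up for the example.

```python
# Minimal examples of "find", "find and replace", and input validation
# with Python's built-in re module.
import re

text = "Contact help@example.org or admin@example.org for access."
pattern = r"[\w.+-]+@[\w-]+\.[\w.]+"   # a simple email-like pattern

print(re.findall(pattern, text))         # find: both addresses
print(re.sub(pattern, "<email>", text))  # find and replace

# Validation: fullmatch() succeeds only if the whole string matches.
print(bool(re.fullmatch(r"\d{4}-\d{2}-\d{2}", "2023-05-01")))  # True
```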
Optimizing Research Workflows - Snakemake Documentation
Snakemake is a powerful and versatile workflow management system that simplifies the creation, execution, and management of data analysis pipelines. It uses a user-friendly, Python-based language to define workflows, making it particularly valuable for automating complex computational tasks in research and data analysis and for keeping them reproducible.
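As a small illustration of that rule-based syntax, here is a minimal hypothetical Snakefile (Snakemake's Python-based DSL) defining a two-step pipeline; all file names and commands are placeholders.

```
# A hypothetical two-rule Snakemake pipeline; file names are placeholders.
# Snakemake works backward from the target in "rule all" to build the DAG.
rule all:
    input:
        "results/summary.txt"

rule count_lines:
    input:
        "data/sample.txt"
    output:
        "results/counts.txt"
    shell:
        "wc -l {input} > {output}"

rule summarize:
    input:
        "results/counts.txt"
    output:
        "results/summary.txt"
    shell:
        "sort {input} > {output}"
```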
Slurm Scheduling Software Documentation
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. Slurm requires no kernel modifications for its operation and is relatively self-contained. As a cluster workload manager, Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.
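Slurm itself is driven from the command line, but as a rough sketch of those functions in action, the Python snippet below composes a batch script and submits it with sbatch (which accepts a script on standard input); the partition name and resource requests are hypothetical and site-specific.

```python
# A minimal sketch of submitting a Slurm batch job from Python.
# Assumes sbatch is on PATH; partition and resources are hypothetical.
import subprocess

job_script = """#!/bin/bash
#SBATCH --job-name=demo
#SBATCH --partition=general      # hypothetical partition name
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --time=00:10:00

srun hostname
"""

# sbatch reads the script from stdin when no file argument is given.
result = subprocess.run(
    ["sbatch"], input=job_script, text=True,
    capture_output=True, check=True,
)
print(result.stdout)  # e.g. "Submitted batch job 12345"
```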
Advanced Compilers: The Self-Guided Online Course
This is a self-guided online course on compilers. The topics covered throughout the course include universal compiler topics such as intermediate representations, data flow, and “classic” optimizations, as well as more research-focused topics such as parallelization, just-in-time compilation, and garbage collection.
HPCwire
HPCwire is a prominent news and information source for the HPC community. Their website offers articles, analysis, and reports on HPC technologies, applications, and industry trends.
Introduction to Parallel Computing Tutorial
The tutorial is intended to provide a brief overview of the extensive and broad topic of parallel computing. It covers the basics of parallel computing and is aimed at someone who is just becoming acquainted with the subject.
OnShape Documentation
This contains documentation for getting started with OnShape for CAD. OnShape is cloud-hosted CAD software that lets you collaborate with others in real time, much like a Google Doc, with the power and capabilities of desktop packages such as SolidWorks or Inventor.
Info about the retirement of the R GIS packages rgdal, rgeos, and maptools in 2023
The R GIS packages "rgdal", "rgeos", and "maptools" are set to be archived and no longer supported by the end of 2023. Many other R GIS packages are built on top of these packages, including "sp" and "raster". The recommended replacement for "sp" is "sf", and the recommended replacement for "raster" is "terra". Below are links to published articles about this transition, along with links to the documentation for the recommended replacement packages "sf" and "terra".
Introduction to Linux CLI for Researchers
The goal of this video is to give researchers and students who have recently received allocations on high-performance computing resources a basic introduction to Linux commands. It covers a few of the most fundamental commands for navigating the file system and getting started.
If you find this video helpful or would like me to continue this series, let me know!
Anvil Home Page
Open Storage Network
The Open Storage Network, a national resource available through the XSEDE resource allocation system, is a high-quality, sustainable, distributed storage cloud for the research community.
Charliecloud User Group
Announcements for users and developers of Charliecloud, which provides lightweight user-defined software stacks for high-performance computing.
Cyber Security
Learning cybersecurity is crucial for personal protection, safeguarding digital assets, financial security, and national security. It is also important for protecting consumer data in business and for building long-lasting relationships with customers.
Chameleon
Chameleon is an NSF-funded testbed system for computer science experimentation. It is designed to be deeply reconfigurable, with a wide variety of capabilities for research on systems, networking, distributed and cluster computing, and security.
High performance computing 101
An introductory guide to High Performance Computing.
Why 'N How: Martinos Center for Biomedical Imaging
The Why & How seminar series is designed to introduce research assistants, graduate students, and postdoctoral and clinical fellows – really, anyone who is interested – to the many tools used in medical imaging. These include software tools and most of the major imaging modalities wielded by investigators (MRI, PET, EEG, MEG, optical, TMS, and others). As the name of the series suggests, the talks cover both the reasons researchers might need a particular tool and the nuts and bolts of how to apply it. Videos of the overview talks are available on the series page.
How to use Rclone
Learn how to use Rclone to transfer data, specifically from your local drive to the Open Storage Network and vice versa.
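As a minimal sketch of the kind of transfer the guide covers, the Python snippet below shells out to the rclone CLI; it assumes rclone is installed and that a remote named "osn" has already been configured with rclone config, and all paths and names are hypothetical.

```python
# A minimal wrapper around the rclone CLI; assumes rclone is installed
# and a remote named "osn" is configured. Paths are hypothetical.
import subprocess

def rclone_copy(src: str, dst: str) -> None:
    """Copy src to dst with rclone, raising if the transfer fails."""
    subprocess.run(["rclone", "copy", src, dst, "--progress"], check=True)

rclone_copy("data/results", "osn:my-bucket/results")  # local -> OSN
rclone_copy("osn:my-bucket/results", "data/results")  # OSN -> local
```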
Examples of Thrust code for GPU Parallelization
Some examples of Thrust code for GPU parallelization. To compile them, download the CUDA compiler from NVIDIA. This code was tested with CUDA 9.2 but is likely compatible with other versions. Before compiling, change the extension from thrust_ex.txt to thrust_ex.cu. Any device (GPU) code that is run through a Thrust transform is automatically parallelized on the GPU; host (CPU) code is not. Thrust code can also be compiled to run on a CPU for practice.
Oak Ridge Leadership Computing Facility (OLCF) Training Events and Archive
Upcoming training events and archives of training materials detailing general HPC best practices as well as how to use OLCF resources and services.
Neural Networks in Julia
Making a neural network has never been easier! The following link directs users to the Flux.jl package, a straightforward way to program a neural network in the Julia programming language. Julia is one of the fastest-growing languages for AI/ML, and this package provides a fast alternative to Python's TensorFlow and PyTorch with 100% native Julia programming and GPU support.
Long Tales of Science: A podcast about women in HPC
A series of interviews with women in the HPC community.
NCSA HPC-Moodle
Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Some of the tutorials also offer digital badges.
Docker Tutorial for Beginners
This beginner's tutorial teaches the basics of Docker, a containerization platform that lets you package your application and its dependencies into a standardized unit for development, shipment, and deployment.
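As a small taste of what such a tutorial covers, here is a minimal sketch using the Docker SDK for Python (the docker package), the programmatic equivalent of docker run; it assumes a local Docker daemon is available.

```python
# A minimal sketch with the Docker SDK for Python (pip install docker);
# assumes a local Docker daemon is running.
import docker

client = docker.from_env()  # connect using the standard environment settings

# Pull (if needed) and run a tiny test image; logs are returned as bytes.
output = client.containers.run("hello-world", remove=True)
print(output.decode())
```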