Knowledge Base Resources
Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!
Rust Web Server Tutorial
2
This is a beginner-friendly tutorial on how to set up your web server using Rust!
Enhanced Sampling for MD simulations
1
PyTorch for Deep Learning and Natural Language Processing
1
PyTorch is a Python library that supports accelerated GPU processing for Machine Learning and Deep Learning. In this tutorial, I will teach the basics of PyTorch from scratch. I will then explore how to use it for some ML projects such as Neural Networks, Multi-layer perceptrons (MLPs), Sentiment analysis with RNN, and Image Classification with CNN.
GIS: Geocoding Services
1
Geocoding is the process of taking a street address and converting it into coordinates that can be plotted on a map. This conversion typically requires an API call to a remote server hosted by an organization/institution. The remote server will take the address attributes provided by you and the remote server will compare it to the data it contains and return a best estimate on the coordinates for that location.
There are many geocoding services available with different world coverages, quality of result, and set different rate limits for access. For R, a package called "tidygeocoder" provides an easy way to connect to these different services. As an additional benefit, their documentation provides a good summary of geocoding services available and links to their documentation. The link to the documentation for gecoding services accessible by "tidygeocoder" is provided below.
For Python, geopy package is a library that provides connection to various geocoding services. The link to the documentation for this package is also included below.
Open OnDemand Documentation Repository
1
This is the main documentation repo for the Open OnDemand Portal which enables researchers to access HPC resources from a familiar web interface.
ACCESS Pegasus Documentation
1
The documentation provides an overview of using Pegasus, a workflow management system, on ACCESS resources for high throughput computing (HTC) workloads, covering logging in, workflow creation, resource configuration, and monitoring options.
Data Visualization tools for Python
1
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It makes analyzing and presenting your data extremely easy and works with Python which many people already know.
DARWIN Documentation Pages
1
DARWIN (Delaware Advanced Research Workforce and Innovation Network) is a big data and high performance computing system designed to catalyze Delaware research and education
Introduction to Python for Digital Humanities and Computational Research
1
This documentation contains introductory material on Python Programming for Digital Humanities and Computational Research. This can be a go-to material for a beginner trying to learn Python programming and for anyone wanting a Python refresher.
Useful R Packages for Data Science and Statistics
1
This Udacity article listed the most frequently used R packages for data science and statistics. For each package, the article provided the link to its official documentation. It will be a great start point if you want to start your data science journey in R.
Singularity/Apptainer User Manuals
0
Singularity/Apptainer is a free and open-source container platform that allows users to build and run containers on high performance computing resources.
SingularityCE is the community edition of Singularity maintained by Sylabs, a company that also offers commercial Singularity products and services.
Apptainer is a fork of Singularity, maintained by the Linux foundation, a community of developers and users who are passionate about open source software.
Paraview UArizona HPC links (beginner)
0
These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant (rtdatavis.github.io). The following links are specific to the Paraview program and the workflows that have been used my researchers at the U of Arizona. Some of the pages linked are very beginner friendly: getting started, working with cameras and keyframes for rendering, visualizing external files (netcdf climate data), graphs and data exporting.
Many of the workflows involve using remote desktops via the Open On Demand interface, but if this isn't set up at your university you can use paraview locally on a desktop. Feel free to post on access ci https://ask.cyberinfrastructure.org/ if you need assistance getting a paraview gui open for your work on HPC.
Numpy - a Python Library
0
Numpy is a python package that leverages types and compiled C code to make many math operations in Python efficient. It is especially useful for matrix manipulation and operations.
Conda
0
Conda is a popular package management system. This tutorial introduces you to Conda and walks you through managing Python, your environment, and packages.
Big Data Research at the University of Colorado Boulder
0
Background: Big data, defined as having high volume, complexity or velocity, have the potential to greatly accelerate research discovery. Such data can be challenging to work with and require research support and training to address technical and ethical challenges surrounding big data collection, analysis, and publication.
Methods: The present study was conducted via a series of semi-structured interviews to assess big data methodologies employed by CU Boulder researchers across a broad sample of disciplines, with the goal of illuminating how they conduct their research; identifying challenges and needs; and providing recommendations for addressing them.
Findings: Key results and conclusions from the study indicate: gaps in awareness of existing big data services provided by CU Boulder; open questions surrounding big data ethics, security and privacy issues; a need for clarity on how to attribute credit for big data research; and a preference for a variety of training options to support big data research.
AHPCC documentary
0
This link is a documentary website to use AHPCC.
Numba: Compiler for Python
0
Numba is a Python compiler designed for accelerating numerical and array operations, enabling users to enhance their application's performance by writing high-performance functions in Python itself. It utilizes LLVM to transform pure Python code into optimized machine code, achieving speeds comparable to languages like C, C++, and Fortran. Noteworthy features include dynamic code generation during import or runtime, support for both CPU and GPU hardware, and seamless integration with the Python scientific software ecosystem, particularly Numpy.
How to Get the Most Out of a Mentoring Relationship by The Plank Center
0
Backed by collegiate white papers, top industry professionals, and researchers, The Plank Center’s Mentorship Guide offers basic tips and tricks on how to get the most out of a mentorship relationship. This easy-to-follow guide supplements mentorship programs, lesson plans, and professional relationships.
Quick and Robust Data Augmentation with Albumentations Library
0
Data augmentation is a crucial step in the pipeline for image classification with deep learning. Albumentations is an extremely versatile Python library that can be used to easily augment images. Transformations include rotations, flips, downscaling, distortions, blurs, and many more.
Citation:
Buslaev A, Iglovikov VI, Khvedchenya E, Parinov A, Druzhinin M, Kalinin AA. Albumentations: Fast and Flexible Image Augmentations. Information. 2020; 11(2):125. https://doi.org/10.3390/info11020125
Running Particle-in-Cell Simulations on HPC
0
WarpX is an advanced particle-in-cell code used to model particle accelerators, which needs to be run on HPC. This website contains the tutorial on how to build WarpX on various HPC systems such as NERSC along with examples on how to set up post-processing/visualization tools for different physics cases.
Rockfish at Johns Hopkins University
0
Resources and User Guide available at Rockfish
Docker Container Library
0
The Docker container library, commonly known as Docker Hub, is a vast repository that hosts a multitude of pre-configured container images, streamlining the deployment process. It can drastically speed up a workflow, and gives you a consistent starting point each time. Check it out, they might have exactly what you are looking for!
ACCESS Guide (originally given at Duke OIT)
0
A guide for Duke OIT on how to advise users on using ACCESS and allocation credits to jetstream 2 for Duke University members. This can be used for non Duke members. Assumes the reader has basic knowledge of ACCESS.
Building Anaconda Navigator applications
0
This tutorial explains how to create an Anaconda Navigator Application (app) for JupyterLab. It is intended for users of Windows, macOS, and Linux who want to generate an Anaconda Navigator app conda package from a given recipe. Prior knowledge of conda-build or conda recipes is recommended.
iOS CoreML + SwiftUI Image Classification Model
0
This tutorial will teach step-by-step how to create an image classification model using Core ML in XCode and integrate it into an iOS app that will use the user's iPhone camera to scan objects and predict based on the image classification model.