Knowledge Base Resources
These resources are contributed by researchers, facilitators, engineers, and HPC admins. Please upvote resources you find useful!
File management of Visual Studio Code on clusters
Visual Studio Code, commonly known as VSCode, is a popular tool used by programmers worldwide. It serves as a text editor and an Integrated Development Environment (IDE) that supports a wide variety of programming languages. One of its key features is its extensive library of extensions, which build on the basic functionality of VSCode to make coding more efficient and convenient.
However, there's a catch. When these extensions are installed and used frequently, they generate a multitude of files. These files are typically stored in a folder named .vscode-extension within your home directory. On a cluster computing facility such as the FASTER and Grace clusters at Texas A&M University, there's a limitation on how many files you can have in your home directory. For instance, the file number limit could be 10000, while the .vscode-extension directory can hold around 4000 temporary files even with just a few extensions. Thus, if the number of files in your home directory surpasses this limit due to VSCode extensions, you might face some issues. This restriction can discourage users from taking full advantage of the extensive features and extensions offered by the VSCode editor.
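Before moving anything, it can help to measure how many files are involved. The following is a minimal sketch: the function name count_files is hypothetical, and it counts regular files only. How a given cluster counts entries toward its limit (for example, whether directories and symlinks are included) is an assumption you should check against your site's documentation.

```shell
# count_files DIR
# Print the number of regular files under DIR (defaults to $HOME).
# Hypothetical helper for illustration; it is not part of VSCode or of
# any cluster tooling.
count_files() {
    find "${1:-$HOME}" -type f | wc -l | tr -d ' '
}

# Example usage (left commented out because scanning a large home
# directory can take a while):
#   count_files "$HOME"
#   count_files "$HOME/.vscode-extension"
```

Comparing the two numbers shows how much of the home directory's file count comes from VSCode extensions.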
To overcome this, we can shift the .vscode-extension directory to the scratch space. The scratch space is another area in the cluster where you can store files and it usually has a much higher limit on the number of files compared to the home directory. We can perform this shift smoothly using a feature called symbolic links (or symlinks for short). Think of a symlink as a shortcut or a reference that points to another file or directory located somewhere else.
Here's a step-by-step guide on how to move the .vscode-extension directory to the scratch space and create a symbolic link to it in your home directory:
1. Copy the .vscode-extension directory to the scratch space: Using the cp command, you can copy the .vscode-extension directory (along with all its contents) to the scratch space. Here's how:
cp -r ~/.vscode-extension /scratch/user
Don't forget to replace /scratch/user with the actual path to your scratch directory.
2. Remove the original .vscode-extension directory: Once you've confirmed that the directory has been copied successfully to the scratch space, you can remove the original directory from your home space. You can do this using the rm command:
rm -r ~/.vscode-extension
It's important to make sure that the directory has been copied to the scratch space successfully before deleting the original.
3. Create a symbolic link in the home directory: Lastly, you'll create a symbolic link in your home directory that points to the .vscode-extension directory in the scratch space. You can do this as follows:
ln -s /scratch/user/.vscode-extension ~/.vscode-extension
By following this process, all the files generated by VSCode extensions will be stored in the scratch space. This prevents your home directory from exceeding its file limit. Now, when you access ~/.vscode-extension, the system will automatically redirect you to the directory in the scratch space, thanks to the symlink. This method ensures that you can use VSCode and its various extensions without worrying about hitting the file limit in your home directory.
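The three steps above can also be wrapped in a small function that refuses to delete the original unless the copy checks out. This is only a sketch under stated assumptions: relocate_dir is a hypothetical name, /scratch/user is the same placeholder used above, and diff -r is used as a simple comparison of the two directory trees, which may not suit every case (for example, trees containing broken symlinks).

```shell
# relocate_dir SRC DEST_PARENT
# Copy SRC into DEST_PARENT, verify the copy, then replace SRC with a
# symlink to the copy. DEST_PARENT should be an absolute path so that
# the symlink resolves from anywhere.
# Hypothetical helper for illustration only.
relocate_dir() {
    src=$1
    dest_parent=$2
    dest="$dest_parent/$(basename "$src")"

    # Do nothing if SRC is already a symlink or is not a directory.
    if [ -L "$src" ] || [ ! -d "$src" ]; then
        echo "skip: $src is missing or is already a symlink" >&2
        return 1
    fi

    # Never overwrite an existing destination.
    if [ -e "$dest" ]; then
        echo "skip: $dest already exists" >&2
        return 1
    fi

    # Step 1: copy the directory and its contents.
    mkdir -p "$dest_parent" || return 1
    cp -r "$src" "$dest_parent" || return 1

    # Check that the copy matches the original before deleting anything.
    if ! diff -r "$src" "$dest" >/dev/null; then
        echo "error: copy at $dest differs from $src" >&2
        return 1
    fi

    # Steps 2 and 3: remove the original and create the symlink.
    rm -r "$src" && ln -s "$dest" "$src"
}

# Example usage (replace /scratch/user with your own scratch path):
#   relocate_dir "$HOME/.vscode-extension" /scratch/user
```

Running the function a second time is harmless, because it stops as soon as it sees that the source is already a symlink.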
Anvil Home Page
Purdue University is the home of Anvil, a powerful supercomputer that provides advanced computing capabilities to support a wide range of computational and data-intensive research spanning from traditional high-performance computing to modern artificial intelligence applications.
The Theory Behind Neural Networks (Very Simplified)
This video by the YouTube channel 3Blue1Brown provides a very simplified introduction to the theory behind neural networks. This tutorial is perfect for those who don't have much linear algebra or machine learning background and are eager to step into the realm of ML!
Geocomputation with R (Free Reference Book)
Below is a link to a book that focuses on how to use the "sf" and "terra" packages for GIS computations. As of 5/1/2023, this book is up to date and its examples are error-free. The book contains a lot of information but provides a good overview and example workflows showing how to use these tools.
Creating a Mobile Application
This resource explains in detail how to build an application that runs on Android and iOS devices, using Qt Creator to develop Qt Quick applications. It covers setup, creation, configuration, optimization, and deployment. It provides the fundamentals; browse the rest of the site for more specifics.
Intro to Machine Learning on HPC
This tutorial introduces machine learning on high performance computing (HPC) clusters. While it focuses on the HPC clusters at The University of Arizona, the content is generic enough that it can be used by students from other institutions.
Trusted CI Resources Page
A very helpful list of external resources from Trusted CI.
High Performance Computing (HPC) 101 - Cluster
High Performance Computing (HPC) Cluster
UNIX/command line basics tutorial
Introductory training materials for working on the UNIX command line.
DeepChem
DeepChem is an open-source library built on TensorFlow and PyTorch. It is helpful in applying machine learning algorithms to molecular data.
Introduction to Parallel Programming for GPUs with CUDA
This tutorial provides a comprehensive introduction to CUDA programming, focusing on essential concepts such as CUDA thread hierarchy, data parallel programming, host-device heterogeneous programming model, CUDA kernel syntax, GPU memory hierarchy, and memory optimization techniques like global memory coalescing and shared memory bank conflicts. Aimed at researchers, students, and practitioners, the tutorial equips participants with the skills needed to leverage GPU acceleration for scalable computation, particularly in the context of AI.
Running Particle-in-Cell Simulations on HPC
WarpX is an advanced particle-in-cell code used to model particle accelerators, and it needs to be run on HPC systems. This website contains tutorials on how to build WarpX on various HPC systems such as NERSC, along with examples of how to set up post-processing and visualization tools for different physics cases.
FreeSurfer Tutorials
The official MGH / Harvard tutorial page for FreeSurfer. The FreeSurfer group has provided and designed a series of tutorials for using FreeSurfer and for getting acquainted with the concepts needed to perform its various modes of analysis and processing of MRI data. The tutorials are designed to be followed along in a terminal window where commands can be copy/pasted instead of typed.
Trinity Tutorial for Transcriptome Assembly
Trinity is one of the most popular tools for assembling transcripts from RNA-Seq short reads. In this tutorial, we will cover the basic usage of Trinity, best practices, and common problems.
How to Get the Most Out of a Mentoring Relationship by The Plank Center
Backed by collegiate white papers, top industry professionals, and researchers, The Plank Center’s Mentorship Guide offers basic tips and tricks on how to get the most out of a mentorship relationship. This easy-to-follow guide supplements mentorship programs, lesson plans, and professional relationships.
Training an LSTM Model in PyTorch
This Google Colab notebook tutorial demonstrates how to create and train an LSTM model in PyTorch to predict time series data. An airline passenger dataset is used as an example.
Optimizing Research Workflows - A Documentation of Snakemake
Snakemake is a powerful and versatile workflow management system that simplifies the creation, execution, and management of data analysis pipelines. It uses a user-friendly, Python-based language to define workflows, making it particularly valuable for automating and reproducibly managing complex computational tasks in research and data analysis.
Singularity/Apptainer User Manuals
Singularity/Apptainer is a free and open-source container platform that allows users to build and run containers on high performance computing resources.
SingularityCE is the community edition of Singularity maintained by Sylabs, a company that also offers commercial Singularity products and services.
Apptainer is a fork of Singularity, maintained by the Linux Foundation, a community of developers and users who are passionate about open source software.
Horovod: Distributed deep learning training framework
Horovod is a distributed deep learning training framework. Using Horovod, a single-GPU training script can be scaled to train across many GPUs in parallel. The library supports popular deep learning frameworks such as TensorFlow, Keras, PyTorch, and Apache MXNet.
How-To Video: ACCESS Allocations
This video will walk you through the process of efficiently utilizing and managing your ACCESS project(s). Here, you’ll find instructions on how to request resources, extend the end date of a project, renew a request, and all the other necessary tasks to successfully manage your project.
Hour of CI
Hour of Cyberinfrastructure (Hour of CI) is a nationwide campaign to introduce undergraduate and graduate students to cyberinfrastructure and geographic information science (GIS).
OnShape Documentation
This resource contains documentation for getting started with OnShape for CAD. OnShape is cloud-hosted CAD software that lets you collaborate with others as you would in a Google Doc, with the power and capabilities of other software such as SolidWorks or Inventor.
OpenMP and Multithreaded Jobs in GRASS
Techniques and support for multithreaded geospatial data processing in GRASS.
Regular Expressions
Regular expressions (sometimes referred to as RegEx) are an incredibly powerful tool for defining string patterns used in "find" or "find and replace" operations on strings, or for input validation. Regular expressions are used in search engines, in the search and replace dialogs of word processors and text editors, and in text-processing Linux utilities such as sed and awk. They are supported in many programming languages, including Python, R, Perl, Java, and others.
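As a small illustration of the two uses mentioned above (input validation and find and replace), here is a sketch using grep -E and sed -E with extended regular expressions. The function names and the patterns are hypothetical examples chosen for this sketch, and the email pattern is deliberately simplified rather than a complete address validator.

```shell
# is_numeric_id STRING
# Input validation: succeed only if STRING is a single line consisting
# of 1 to 8 digits. Hypothetical helper for illustration.
is_numeric_id() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{1,8}$'
}

# redact_emails
# Find and replace: read text on standard input and replace anything
# that looks like an email address with the marker <email>.
# The pattern is a simplified approximation.
redact_emails() {
    sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/<email>/g'
}

# Example usage:
#   is_numeric_id 12345 && echo "valid"
#   printf 'mail jane.doe@example.edu now\n' | redact_emails
```

In the first pattern, ^ and $ anchor the match to the whole line, so a string such as 12a45 is rejected rather than partially matched.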