Knowledge Base Resources
Use these links “vetted” by the community. Additional CI links are always welcome.
High performance computing 101
0
An introductory guide to High Performance Computing.
GDAL Multi-threading
0
Multi-threading guidance when using GDAL.
Wiki for Onboarding onto the C3DDB Cluster at MGHPCC
0
This is a resource for researchers and students looking to on-board onto the c3ddb cluster at MGHPCC. In the code section, there are example job submission scripts for the different queues on c3ddb.
Probabilistic Semantic Data Association for Collaborative Human-Robot Sensing
0
Humans cannot always be treated as oracles for collaborative sensing. Robots thus need to maintain beliefs over unknown world states when receiving semantic data from humans, as well as account for possible discrepancies between human-provided data and these beliefs. To this end, this paper introduces the problem of semantic data association (SDA) in relation to conventional data association problems for sensor fusion. It then, develops a novel probabilistic semantic data association (PSDA) algorithm to rigorously address SDA in general settings. Simulations of a multi-object search task show that PSDA enables robust collaborative state estimation under a wide range of conditions.
Introduction to MP
0
Open Multi-Processing, is an API designed to simplify the integration of parallelism in software development, particularly for applications running on multi-core processors and shared-memory systems. It is an important resource as it goes over what openMP and ways to work with it. It is especially important because it provides a straightforward way to express parallelism in code through pragma directives, making it easier to create parallel regions, parallelize loops, and define critical sections. The key benefit of OpenMP lies in its ease of use, automatic thread management, and portability across various compilers and platforms. For app development, especially in the context of mobile or desktop applications, OpenMP can enhance performance by leveraging the capabilities of modern multi-core processors. By parallelizing computationally intensive tasks, such as image processing, data analysis, or simulations, apps can run faster and more efficiently, providing a smoother user experience and taking full advantage of the available hardware resources. OpenMP's scalability allows apps to adapt to different hardware configurations, making it a valuable tool for developers aiming to optimize their software for a range of devices and platforms.
Horovod: Distributed deep learning training framework
0
Horovod is a distributed deep learning training framework. Using horovod, a single-GPU training script can be scaled to train across many GPUs in parallel. The library supports popular deep learning framework such as TensorFlow, Keras, PyTorch, and Apache MXNet.
Implementing Markov Processes with Julia
0
The following link provides an easy method of implementing Markov Decision Processes (MDP) in the Julia computing language. MDPs are a class of algorithms designed to handle stochastic situations where the actor has some level of control. For example, used at a low level, MDPs can be used to control an inverted pendulum, but applied in higher level decision making the can also decide when to take evasive action in air traffic management. MDPs can also be extended to the partially observable domain to form the Partially Observable Markov Decision Process (POMDP). This link contains a wealth of information to show one can easily implement basic POMDP and MDP algorithms and apply well known online and offline solvers.
RRCoP Resources Page
0
Very helpful list of Regulated Research Community of Practice's collaborating communities.
Docker Tutorial for Beginners
0
A Docker tutorial for beginners is a course that teaches the basics of Docker, a containerization platform that allows you to package your application and its dependencies into a standardized unit for development, shipment, and deployment.
Use Windows Subsystem for Linux for HPC Command Line Access from Windows
0
Windows Subsystem for Linux (WSL) provides a Linux environment for Windows users to access HPC resources fast and efficiently.
What are LSTMs?
0
This reading will explain what a long short-term memory neural network is. LSTMs are a type of neural networks that rely on both past and present data to make decisions about future data. It relies on loops back to previous data to make such decisions. This makes LSTMs very good for predicting time-dependent behavior.
Texas A&M HPRC Training Site
0
Training Resources and Courses offered by Texas A&M's Research Computing Group
Metadata Systems
0
Metadata is a vital topic in libraries and librarianship, encompassing structured information used for accessing digital resources. The definition of metadata varies but is essentially data about data. It has evolved beyond simply describing metadata schemas and now focuses on topics like interoperability, non-descriptive metadata (administrative and preservation metadata), and the effective application of metadata schemas for user discovery. Interoperability, the ability to seamlessly exchange metadata between systems, is a major concern. Different levels of interoperability are examined, including schema-level, record-level, and repository-level. Challenges to interoperability include variations in standards, collaboration barriers, and costs.Metadata management is discussed in terms of the holistic management of metadata across an entire library. Steps include analyzing metadata requirements, adopting schema, creating metadata content, delivery/access, evaluation, and maintenance. Administrative metadata, which encompasses ownership and production information, is becoming more critical, particularly for electronic resource licensing. Preservation metadata is also gaining importance in ensuring the long-term viability of digital objects.
Research Software Engineering Training Materials
0
An ongoing collection of RSE training material, workshops, and resources. We are compiling this list as a starting point for future activities. We are especially seeking material that goes beyond basic research computing competency (e.g. what The Carpentries does so well) and is general enough to span multiple domains. Specific tools and technologies used only in one domain, or applicable to only one subset of computing (i.e. HPC) are typically too narrowly focused. When in doubt, submit it to be included or reach out and we’d be happy to discuss.
ACCESS Getting Started Quick-Guide
0
A step-by-step guide to getting your first allocation for Access computing and storage resources.
Applications of Machine Learning in Engineering and Parameter Tuning Tutorial
0
Slides for a tutorial on Machine Learning applications in Engineering and parameter tuning given at the RMACC conference 2019.
Conda
0
Conda is a popular package management system. This tutorial introduces you to Conda and walks you through managing Python, your environment, and packages.
Slurm Tutorials
0
Introduction to the Slurm Workload Manager for users and system administrators, plus some material for Slurm programmers.
Astronomy data analysis with astropy
0
Astropy is a community-driven package that offers core functionalities needed for astrophysical computations and data analysis. From coordinate transformations to time and date handling, unit conversions, and cosmological calculations, Astropy ensures that astronomers can focus on their research without getting bogged down by the intricacies of programming. This guide walks you through practical usage of astropy from CCD data reduction to computing galactic orbits of stars.
Recommended Libraries for Cyberinfrastructure Users Developing Jupyter Notebooks
0
This repository contains information about Jupyter Widgets and how they can be used to develop interactive workflows, data dashboards, and web applications that can be run on HPC systems and science gateways. Easy to build web applications are not only useful for scientists. They can also be used by software engineers and system admins who want to quickly create tools tools for file management and more!
Scikit-Learn: Easy Machine Learning and Modeling
0
Scikit-learn is free software machine learning library for Python. It has a variety of features you can use on data, from linear regression classifiers to xg-boost and random forests. It is very useful when you want to analyze small parts of data quickly.
Biopython Tutorial
0
The Biopython Tutorial and Cookbook website is a dedicated online resource for users in the field of computational biology and bioinformatics. It provides a collection of tutorials and practical examples focused on using the Biopython library.
The website offers a series of tutorials that cover various aspects of Biopython, catering to users with different levels of expertise. It also includes code snippets and examples, and common solutions to common challenges in computational biology.
ACCESS KB Guide - Expanse
0
Expanse at SDSC is a cluster designed by Dell and SDSC delivering 5.16 peak petaflops, and offers Composable Systems and Cloud Bursting.
Better Scientific Software (BSSw)
0
The Better Scientific Software (BSSw) project provides a community to collaborate and learn about best practices in scientific software development. Software—the foundation of discovery in computational science & engineering—faces increasing complexity in computational models and computer architectures. BSSw provides a central hub for the community to address pressing challenges in software productivity, quality, and sustainability.
Neocortex Documentation
0
Neocortex is a new supercomputing cluster at the Pittsburgh Supercomputing Center (PSC) that features groundbreaking AI hardware from Cerebras Systems.