Knowledge Base Resources
These resources have been contributed and “vetted” by the community of cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators) that are participating in programs such as this one, that are supported by the ConnectCI community management platform. Additional Knowledge Base Resources are always welcome!
Quick and Robust Data Augmentation with Albumentations Library
0
Data augmentation is a crucial step in the pipeline for image classification with deep learning. Albumentations is an extremely versatile Python library that can be used to easily augment images. Transformations include rotations, flips, downscaling, distortions, blurs, and many more.
Citation:
Buslaev A, Iglovikov VI, Khvedchenya E, Parinov A, Druzhinin M, Kalinin AA. Albumentations: Fast and Flexible Image Augmentations. Information. 2020; 11(2):125. https://doi.org/10.3390/info11020125
Machine Learning with sci-kit learn
0
In the realm of Python-based machine learning, Scikit-Learn stands out as one of the most powerful and versatile tools available. This introductory post serves as a gateway to understanding Scikit-Learn through explanations of introductory ML concepts along with implementations examples in Python.
FreeSurfer Tutorials
0
The official MGH / Harvard tutorial page for FreeSurfer. The FreeSurfer group has provided and designed a series of tutorials for using FreeSurfer and for getting acquainted with the concepts needed to perform its various modes of analysis and processing of MRI data. The tutorials are designed to be followed along in a terminal window where commands can be copy/pasted instead of typed.
Python Data and Viz Training (CCEP Program)
0
Science Gateway Tool/Web App Template (Jupyter Notebook + ipywidgets)
0
Use this template to turn any science gateway workflow into a web application!
Framework to help in scaling Machine Learning/Deep Learning/AI/NLP Models to Web Application level
0
This framework will help in scaling Machine Learning/Deep Learning/Artificial Intelligence/Natural Language Processing Models to Web Application level almost without any time.
Advanced Compilers: The Self-Guided Online Course
0
This is a self guided online course on compilers. The topics covered throughout the course include universal compilers topics like intermediate representations, data flow, and “classic” optimizations as well as more research focusedtopics such as parallelization, just-in-time compilation, and garbage collection.
Campus Champions Home Page
0
Campus Champions foster a dynamic environment for a diverse community of research computing and data professionals sharing knowledge and experience in digital research infrastructure.
OpenMP and Multithreaded Jobs in GRASS
0
Techniques and support for multithreaded geospatial data processing in GRASS.
Managing and Optimizing Your Jobs on HPC
0
An overview of tools and methods to manage and optimize jobs and HPC workflows
OpenMP Tutorial
0
OpenMP (Open Multi-Processing) is an API that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran on many platforms, instruction-set architectures and operating systems, including Solaris, AIX, FreeBSD, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.
Vulkan Support Survey across Systems
0
It's not uncommon to see beautiful visualizations in HPC center galleries, but the majority of these are either rendered off the HPC or created using programs that run on OpenGL or custom rasterization techniques. To put it simply the next generation of graphics provided by OpenGL's successor Vulkan is strangely absent in the super computing world. The aim of this survey of available resources is to determine the systems that can support Vulkan workflows and programs. This will assist users in getting past some of the first hurdles in using Vulkan in HPC contexts.
Jetstream Home
0
Jetstream2 makes cutting-edge high-performance computing and software easy to use for your research regardless of your project’s scale—even if you have limited experience with supercomputing systems.Cloud-based and on-demand, the 24/7 system includes discipline-specific apps. You can even create virtual machines that look and feel like your lab workstation or home machine, with thousands of times the computing power.
Regulated Research Community of Practice
0
The daily news clearly shows the increasing threat to safety and privacy of data, personal as well as intellectual property. While the requirements such as DFARS 7012, HIPAA, and Cybersecurity Maturity Model Certification (CMMC) improve the consistency of data handling between agencies and contractors and grantees, it leaves academic institutions to figure out how to meet such requirements in a cost-effective way that fits the research and education mission of the institution. Most institutions, agencies, and companies act in isolation with one-off contract language to address data security and safeguarding concerns. Even though cybersecurity has a clear and uniform goal of protecting data, a onesize solution does not fit all academic institutions.
By supporting this community with development of a community strategic roadmap, regular discussions and workshops, and a repository of generalized and specific resources for handling regulated research programs RRCoP lowers the barrier to entry for institutions handling new regulations.
High performance computing 101
0
An introductory guide to High Performance Computing.
Reinforcement Learning For Beginners with Python
0
This course takes through the fundamentals required to get started with reinforcement learning with Python, OpenAI Gym and Stable Baselines. You'll be able to build deep learning powered agents to solve a varying number of RL problems including CartPole, Breakout and CarRacing as well as learning how to build your very own/custom environment!
ACCESS KB Guide - DELTA
0
NCSA is the home of Delta, a computing and data resource that balances cutting-edge graphics processor and CPU architectures with a non-POSIX file system with a POSIX-like interface. Delta allows applications to reap the benefits of modern file systems without rewriting code.
Language models and using HPC resources
0
Documentation and research based on the latest NLP text generation detection methods for 2023.
Introduction to MP
0
Open Multi-Processing, is an API designed to simplify the integration of parallelism in software development, particularly for applications running on multi-core processors and shared-memory systems. It is an important resource as it goes over what openMP and ways to work with it. It is especially important because it provides a straightforward way to express parallelism in code through pragma directives, making it easier to create parallel regions, parallelize loops, and define critical sections. The key benefit of OpenMP lies in its ease of use, automatic thread management, and portability across various compilers and platforms. For app development, especially in the context of mobile or desktop applications, OpenMP can enhance performance by leveraging the capabilities of modern multi-core processors. By parallelizing computationally intensive tasks, such as image processing, data analysis, or simulations, apps can run faster and more efficiently, providing a smoother user experience and taking full advantage of the available hardware resources. OpenMP's scalability allows apps to adapt to different hardware configurations, making it a valuable tool for developers aiming to optimize their software for a range of devices and platforms.
Neurodesk
0
Neurodesk provides a containerised data analysis environment to facilitate reproducible analysis of neuroimaging data. Analysis pipelines for neuroimaging data typically rely on specific versions of packages and software, and are dependent on their native operating system. These dependencies mean that a working analysis pipeline may fail or produce different results on a new computer, or even on the same computer after a software update. Neurodesk provides a platform in which anyone, anywhere, using any computer can reproduce your original research findings given the original data and analysis code.
Metadata Systems
0
Metadata is a vital topic in libraries and librarianship, encompassing structured information used for accessing digital resources. The definition of metadata varies but is essentially data about data. It has evolved beyond simply describing metadata schemas and now focuses on topics like interoperability, non-descriptive metadata (administrative and preservation metadata), and the effective application of metadata schemas for user discovery. Interoperability, the ability to seamlessly exchange metadata between systems, is a major concern. Different levels of interoperability are examined, including schema-level, record-level, and repository-level. Challenges to interoperability include variations in standards, collaboration barriers, and costs.Metadata management is discussed in terms of the holistic management of metadata across an entire library. Steps include analyzing metadata requirements, adopting schema, creating metadata content, delivery/access, evaluation, and maintenance. Administrative metadata, which encompasses ownership and production information, is becoming more critical, particularly for electronic resource licensing. Preservation metadata is also gaining importance in ensuring the long-term viability of digital objects.
Implementing Markov Processes with Julia
0
The following link provides an easy method of implementing Markov Decision Processes (MDP) in the Julia computing language. MDPs are a class of algorithms designed to handle stochastic situations where the actor has some level of control. For example, used at a low level, MDPs can be used to control an inverted pendulum, but applied in higher level decision making the can also decide when to take evasive action in air traffic management. MDPs can also be extended to the partially observable domain to form the Partially Observable Markov Decision Process (POMDP). This link contains a wealth of information to show one can easily implement basic POMDP and MDP algorithms and apply well known online and offline solvers.
Horovod: Distributed deep learning training framework
0
Horovod is a distributed deep learning training framework. Using horovod, a single-GPU training script can be scaled to train across many GPUs in parallel. The library supports popular deep learning framework such as TensorFlow, Keras, PyTorch, and Apache MXNet.
Automated Machine Learning Book
0
The authoritative book on automated machine learning, which allows practitioners without ML expertise to develop and deploy state-of-the-art machine learning approaches. Describes the background of techniques used in detail, along with tools that are available for free.
Docker Tutorial for Beginners
0
A Docker tutorial for beginners is a course that teaches the basics of Docker, a containerization platform that allows you to package your application and its dependencies into a standardized unit for development, shipment, and deployment.