Knowledge Base Resources
These resources are contributed by researchers, facilitators, engineers, and HPC admins. Please upvote resources you find useful!
Attention, Transformers, and LLMs: a hands-on introduction in Pytorch
1
- Landing Page
- Preparing data for LLM training
- Small Language Models: an introduction to autoregressive language modeling
- Attention is all you need
- Other LLM Topics
This workshop focuses on developing an understanding of the fundamentals of attention and the transformer architecture so that you can understand how LLMs work and use them in your own projects.
Introduction to Deep Learning in Pytorch
1
- Landing Page
- Pytorch Quickstart
- Pytorch Basics
- Pytorch GPU Support
- Regression and Classification with Fully Connected Neural Networks
- High Dimensional Data
- Datasets and data loading
- Building the network
- Computer Vision and Convolutional Neural Networks
This workshop series introduces the essential concepts in deep learning and walks through the common steps in a deep learning workflow, from data loading and preprocessing to training and model evaluation. Throughout the sessions, students write and execute simple deep learning programs using PyTorch, a popular Python library for developing, training, and deploying deep learning models.
Leveraging AI in Generative Assets and Environments for Play: Insights from the English Department's Digital Media Lab
1
In this presentation, I will explore the recent advancements in AI-driven production of 3D-generative assets and environments, particularly focusing on their application in creating immersive, playful experiences. Platforms such as ChatGPT, Suno, and Speechify have ushered in a new era of digital creativity, facilitating the development of environments that not only entertain but also serve educational purposes. This session will delve into how these technologies are integrated into academic settings, specifically through a case study of the English Department's Digital Media Lab, known as Tech/Tech, which opened in 2022.
Useful R Packages for Data Science and Statistics
1
This Udacity article lists the most frequently used R packages for data science and statistics and links to the official documentation for each. It is a good starting point if you want to begin your data science journey in R.
PyTorch for Deep Learning and Natural Language Processing
1
PyTorch is a Python library that supports GPU-accelerated processing for machine learning and deep learning. In this tutorial, I will teach the basics of PyTorch from scratch. I will then explore how to use it for ML projects such as neural networks, multi-layer perceptrons (MLPs), sentiment analysis with RNNs, and image classification with CNNs.
ACCESS HPC Workshop Series
1
- ACCESS HPC Workshop Series
- MPI Workshop
- OpenMP Workshop
- GPU Programming Using OpenACC
- Summer Boot Camp
- Big Data and Machine Learning
Monthly workshops on a variety of HPC topics, sponsored by ACCESS and organized by the Pittsburgh Supercomputing Center (PSC). Each workshop is telecast to multiple satellite sites, and workshop materials are archived.
AI-powered VS Code Editor
0
**Cursor: The AI-Powered Code Editor**
Cursor is a cutting-edge, AI-first code editor designed to revolutionize the way developers write, debug, and understand code. Built upon the premise of pair-programming with artificial intelligence, Cursor harnesses the capabilities of advanced AI models to offer real-time coding assistance, bug detection, and code generation.
**How Cursor Benefits High-Performance Computing (HPC) Work:**
1. **Efficient Code Development:** With AI-assisted code generation, researchers and developers in the HPC realm can quickly write optimized code for simulations, data processing, or modeling tasks, reducing the time to deployment.
2. **Debugging Assistance:** Handling complex datasets and simulations often leads to intricate bugs. Cursor's capability to automatically investigate errors and determine root causes can save crucial time in the HPC workflow.
3. **Tailored Code Suggestions:** Cursor's AI provides context-specific code suggestions by understanding the entire codebase. For HPC applications where performance is paramount, this means receiving recommendations that align with optimization goals.
4. **Improved Code Quality:** With AI-driven bug scanning and linter checks, Cursor helps keep HPC codes not only fast but also robust and free of common errors.
5. **Easy Integration:** Being a fork of VSCode, Cursor allows seamless migration, ensuring that developers working in HPC can swiftly integrate their existing VSCode setups and extensions.
In essence, for HPC tasks that demand speed, precision, and robustness, Cursor acts as an invaluable co-pilot, guiding developers towards efficient and optimized coding solutions.
It is free if you provide your own OpenAI API key.
AI Institutes Cyberinfrastructure Documents: SAIL Meeting
0
Materials from the SAIL meeting (https://aiinstitutes.org/2023/06/21/sail-2023-summit-for-ai-leadership/). A space where AI researchers can learn about using ACCESS resources for AI applications and research.
Harnessing the Power of Cloud and Machine Learning for Climate and Ocean Advances
0
- Harnessing the Power of Cloud and Machine Learning for Climate and Ocean Advances
- Github for Outputs of Presentation
Documentation and presentation on how to use machine learning and deep learning frameworks (TensorFlow, Keras, and scikit-learn) for climate and ocean advances.
Scipy Lecture Notes
0
Comprehensive tutorials and lecture notes covering various aspects of scientific computing using Python and SciPy.
Machine Learning with scikit-learn
0
In the realm of Python-based machine learning, scikit-learn stands out as one of the most powerful and versatile tools available. This introductory post serves as a gateway to understanding scikit-learn through explanations of introductory ML concepts along with implementation examples in Python.
An Introduction to the Julia Programming Language
0
The Julia programming language is one of the fastest-growing languages for AI/ML development. Its syntax is similar to Python's, its speed is close to that of C++, and it is open source and reproducible across platforms and environments. The following link provides an introduction to using Julia, including the basic syntax, data structures, key functions, and a few key packages.
Framework to help in scaling Machine Learning/Deep Learning/AI/NLP Models to Web Application level
0
This framework helps scale Machine Learning/Deep Learning/Artificial Intelligence/Natural Language Processing models to the web application level in almost no time.
iOS CoreML + SwiftUI Image Classification Model
0
This tutorial teaches, step by step, how to create an image classification model using Core ML in Xcode and integrate it into an iOS app that uses the iPhone camera to scan objects and make predictions with the image classification model.
Fairness and Machine Learning
0
The "Fairness and Machine Learning" book offers a rigorous exploration of fairness in ML and is suitable for researchers, practitioners, and anyone interested in understanding the complexities and implications of fairness in machine learning.
PyTorch Introduction
0
This is a very barebones introduction to the PyTorch framework used to implement machine learning. This tutorial implements a feed-forward neural network and is taught completely asynchronously through Stanford University. A good start after learning the theory behind feed-forward neural networks.
fast.ai
0
Fastai offers many tools to people working with machine learning and artificial intelligence, including tutorials on PyTorch, their own library built on PyTorch, news articles, and other resources for diving into this realm.
Python Tools for Data Science
0
Python has become a very popular programming language and software ecosystem for work in Data Science, integrating support for data access, data processing, modeling, machine learning, and visualization. In this webinar, we will describe some of the key Python packages that have been developed to support that work, and highlight some of their capabilities. This webinar will also serve as an introduction and overview of topics addressed in two Cornell Virtual Workshop tutorials, available at https://cvw.cac.cornell.edu/pydatasci1 and https://cvw.cac.cornell.edu/pydatasci2
GPU Acceleration in Python
0
This tutorial explains how to use Python for GPU acceleration with libraries like CuPy, PyOpenCL, and PyCUDA. It shows how these libraries can speed up tasks like array operations and matrix multiplication by using the GPU. Examples include replacing NumPy with CuPy for large datasets and using PyOpenCL or PyCUDA for more control with custom GPU kernels. It focuses on practical steps to integrate GPU acceleration into Python programs.
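The NumPy-to-CuPy swap mentioned above can be sketched roughly as follows. This is a minimal illustration, not code from the tutorial: the helper names `get_array_module` and `gram_matrix` are hypothetical, and the sketch falls back to NumPy when CuPy or a usable GPU is not available, so it runs on any machine.

```python
import numpy as np


def get_array_module():
    """Return CuPy if it is installed and a GPU is visible, otherwise NumPy.

    Hypothetical helper for illustration; not part of either library.
    """
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp
    except Exception:
        # CuPy missing, or no working CUDA driver/device: use the CPU path.
        pass
    return np


xp = get_array_module()


def gram_matrix(n, seed=0):
    """Compute A @ A.T for a random n x n matrix on whichever backend xp is."""
    rng = np.random.default_rng(seed)
    a_host = rng.standard_normal((n, n))
    a = xp.asarray(a_host)   # under CuPy this copies the data to the GPU
    g = a @ a.T              # same expression for NumPy and CuPy arrays
    if xp is not np:
        g = xp.asnumpy(g)    # copy the result back to host memory
    return g


result = gram_matrix(8)
```

Because CuPy mirrors much of the NumPy interface, the array expression itself does not change; only the array creation and the final copy back to the host differ.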
Representation Learning in Deep Learning
0
Representation learning is a fundamental concept in machine learning and artificial intelligence, particularly in the field of deep learning. At its core, representation learning involves the process of transforming raw data into a form that is more suitable for a specific task or learning objective. This transformation aims to extract meaningful and informative features or representations from the data, which can then be used for various tasks like classification, clustering, regression, and more.
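As a small illustration of turning raw data into a more compact representation, the sketch below uses principal component analysis (PCA), a classical linear method rather than a deep network, implemented with NumPy's SVD. The toy data and the function name `pca_representation` are assumptions made for this example only.

```python
import numpy as np


def pca_representation(x, n_components):
    """Project the rows of x onto their top principal components."""
    x_centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(x_centered, full_matrices=False)
    components = vt[:n_components]
    return x_centered @ components.T, components


rng = np.random.default_rng(0)
# Made-up toy data: 200 points in 5 dimensions that vary along 2 hidden factors.
latent = rng.standard_normal((200, 2))
mixing = rng.standard_normal((2, 5))
raw = latent @ mixing + 0.01 * rng.standard_normal((200, 5))

# The learned 2-D features are the new representation of each 5-D point.
features, components = pca_representation(raw, n_components=2)

# Mapping the features back shows how little information was lost.
reconstruction = features @ components + raw.mean(axis=0)
relative_error = np.linalg.norm(raw - reconstruction) / np.linalg.norm(raw)
```

The resulting `features` array could then be passed to a classifier or clustering routine in place of the raw inputs; deep models learn nonlinear versions of the same idea.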
Machine Learning in Astrophysics
0
Machine learning is becoming increasingly important in fields with large datasets, such as astrophysics. AstroML is a Python module for machine learning and data mining built on NumPy, SciPy, scikit-learn, matplotlib, and astropy, providing a range of statistical and machine learning routines for analyzing astronomical data in Python. In particular, it has loaders for many open astronomical datasets, with examples of how to visualize such large and complicated datasets.
AI for improved HPC research - Cursor and Termius - Powerpoint
0
These slides introduce how Termius and Cursor, two new freemium apps that use AI to make work more efficient, can be used for faster HPC research.
Applications of Machine Learning in Engineering and Parameter Tuning Tutorial
0
Slides for a tutorial on machine learning applications in engineering and parameter tuning, given at the 2019 RMACC conference.
Training an LSTM Model in Pytorch
0
This Google Colab notebook tutorial demonstrates how to create and train an LSTM model in PyTorch to predict time series data. An airline passenger dataset is used as an example.
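The notebook's own preprocessing is not reproduced here. As a rough sketch of one common way to frame a time series for a sequence model such as an LSTM, the example below slices a series into fixed-length input windows, each paired with the value that follows it. The function name `make_windows`, the `lookback` parameter, and the sample numbers are all made up for illustration.

```python
def make_windows(series, lookback):
    """Split a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - lookback):
        inputs.append(series[start:start + lookback])   # lookback past values
        targets.append(series[start + lookback])        # value to predict
    return inputs, targets


# Made-up monthly counts, not taken from the airline passenger dataset.
monthly_counts = [10, 12, 15, 14, 18, 21]
inputs, targets = make_windows(monthly_counts, lookback=3)
# inputs  -> [[10, 12, 15], [12, 15, 14], [15, 14, 18]]
# targets -> [14, 18, 21]
```

Each input window would typically be converted to a tensor and fed to the model one time step at a time, with the matching target used to compute the training loss.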