Knowledge Base Resources
Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!
Useful R Packages for Data Science and Statistics
This Udacity article lists the most frequently used R packages for data science and statistics. For each package, it links to the official documentation. It is a good starting point if you want to begin your data science journey in R.
Data Imputation Methods for Climate Data and Mortality Data
These slides and videos introduce how to use the K-Nearest-Neighbors method to impute climate data and how to use Bayesian spatio-temporal models in R-INLA to impute mortality data. The demos will be added soon.
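The idea behind K-Nearest-Neighbors imputation can be sketched in a few lines. The example below is a minimal, standard-library illustration and is not taken from the slides: the function name `knn_impute`, the station readings, and the choice of averaging the `k` closest rows are all assumptions made for demonstration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Hypothetical helper for illustration; distance is a root-mean-square
    difference over the columns that both rows have observed.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no overlap, so the rows cannot be compared
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where column j is observed.
            candidates = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] != math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Made-up station readings: (temperature in C, relative humidity in %).
readings = [
    (10.0, 80.0),
    (11.0, 78.0),
    (25.0, 40.0),
    (10.5, None),  # humidity is missing for this station
]
completed = knn_impute(readings, k=2)
print(completed[3])
```

Here the missing humidity is filled from the two stations with the most similar temperature. Real climate data would usually call for feature scaling and a tuned `k`.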
Scikit-Learn: Easy Machine Learning and Modeling
Scikit-learn is a free, open-source machine learning library for Python. It offers a variety of models you can apply to data, from linear regression and classifiers to gradient boosting and random forests (XGBoost itself is a separate library that provides a scikit-learn-compatible interface). It is very useful when you want to analyze small portions of data quickly.
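As a rough illustration of that quick-analysis workflow, the sketch below trains a random forest on the small Iris dataset bundled with scikit-learn. The dataset, the 75/25 split, and the hyperparameters are assumptions chosen for demonstration, not recommendations from the resource.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load a small built-in dataset as feature matrix X and label vector y.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to estimate generalization.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a random forest; the settings here are illustrative only.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# score() reports mean accuracy on the held-out samples.
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators share this `fit` / `score` pattern, so swapping in another model is usually a one-line change.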
The Official Documentation of Pandas
Pandas is one of the most essential Python libraries for data analysis and manipulation. It provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. The official documentation serves as an in-depth guide to this powerful tool, including explanations and examples.
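For a flavor of those data structures, here is a minimal sketch that builds a `DataFrame` and aggregates it by group. The column names and values are made up for illustration.

```python
import pandas as pd

# A small table of hypothetical job runtimes at two sites.
df = pd.DataFrame(
    {
        "site": ["A", "A", "B", "B"],
        "runtime_s": [12.0, 14.0, 30.0, 34.0],
    }
)

# Group rows by site and compute the mean runtime for each group.
mean_runtime = df.groupby("site")["runtime_s"].mean()
print(mean_runtime)
```

The result is a `Series` indexed by site, which can be sorted, plotted, or joined back to other tables.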
marimo | a next generation python notebook
An introductory seminar on marimo, a new reactive Python notebook, presented by a marimo ambassador.
Data visualization with Matplotlib
Data visualization is a critical aspect of data analysis. It allows for a clear and concise representation of data, making it easier for users to understand and interpret complex datasets. One of the most popular libraries for data visualization in Python is Matplotlib. The included website aims to provide a brief overview of Matplotlib, its features, and examples/exercises to dive deeper into its functionalities.
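A minimal sketch of a Matplotlib line plot follows. The data, labels, and the choice of the non-interactive `Agg` backend (so the script runs without a display) are assumptions made for illustration, not material from the linked website.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt

# Illustrative data: the squares of 0..4.
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

# Create a figure with a single set of axes and draw a labeled line.
fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A minimal line plot")
ax.legend()

# Render to an in-memory PNG; pass a filename to savefig() to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()
```

In an interactive session, calling `plt.show()` instead of `savefig` would open the plot in a window.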
Data Visualization Tools for Julia
Plots.jl is the most widely used plotting library for the Julia programming language. It is known for its versatility and intuitiveness. Its limited set of dependencies and wide applicability across different graphics packages make it especially helpful for visualizing the results of your latest Julia implementation.
However, Julia programmers have several other options for visualizing their datasets. The second link compares a variety of Julia plotting packages.