Knowledge Base Resources
These resources have been contributed and “vetted” by the community of cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators) that are participating in programs such as this one, that are supported by the ConnectCI community management platform. Additional Knowledge Base Resources are always welcome!
Useful R Packages for Data Science and Statistics
1
This Udacity article listed the most frequently used R packages for data science and statistics. For each package, the article provided the link to its official documentation. It will be a great start point if you want to start your data science journey in R.
Data visualization with Matplotlib
0
Data visualization is a critical aspect of data analysis. It allows for a clear and concise representation of data, making it easier for users to understand and interpret complex datasets. One of the most popular libraries for data visualization in Python is Matplotlib. The included website aims to provide a brief overview of Matplotlib, its features, and examples/exercises to dive deeper into its functionalities.
Data Visualization Tools for Julia
0
Plots.jl is the most widely used plotting library for the Julia programming language. It's known for being especially powerful in its versatility and intuitiveness. It's limited set of dependencies and wide applicability across different graphics packages make it especially helpful in visualizing the results of your latest Julia implementation.
However, there are still multiple options available for Julia programmers to visualize their datasets. The second link details a comparison against a variety of Julia packages.
Data Imputation Methods for Climate Data and Mortality Data
0
This slices and videos introduced how to use K-Nearest-Neighbors method to impute climate data and how to use Bayesian Spatio-Temporal models in R-INLA to impute mortality data. The demos will be added soon.
Awesome Jupyter Widgets (for building interactive scientific workflows or science gateway tools)
0
A curated list of awesome Jupyter widget packages and projects for building interactive visualizations for Python code
Scikit-Learn: Easy Machine Learning and Modeling
0
Scikit-learn is free software machine learning library for Python. It has a variety of features you can use on data, from linear regression classifiers to xg-boost and random forests. It is very useful when you want to analyze small parts of data quickly.
The Official Documentation of Pandas
0
Pandas is one of the most essential Python libraries for data analysis and manipulation. It provides high-performance, easy-to-use data structures, and data analysis tools for the Python programming language. The official documentation serves as an in-depth guide to using this powerful tool including explanations and examples.