biology

Affinity Groups

There are no Affinity Groups associated with this topic. View All Affinity Groups.

Announcements

Title Date
Ookami Webinar 02/14/24
Open Call: Minisymposia for PASC24 10/05/23

Upcoming Events & Trainings

No events or trainings are currently scheduled.

Engagements

Run Markov Chain Monte Carlo (MCMC) in Parallel for Evolutionary Study
Texas Tech University

My ongoing project uses species trait values (as data matrices) and the corresponding phylogenetic relationships (as a distance matrix) to reconstruct the evolutionary history of the smoke-induced seed germination trait. The results are expected to improve predictions of which untested species could benefit from smoke treatment, which could promote germination success of native species in ecological restoration. The computational resources allocated for this project come from the high-memory partition of the Ivy cluster of our HPCC (CentOS 8, Slurm 20.11, 1.5 TB memory per node, 20 cores per node, 4 nodes). However, with over 1300 species to analyze, using the maximum amount of resources to speed up the analysis is a challenge for two reasons: (1) the ancestral state reconstruction (the evolutionary history of plant traits) relies on Markov Chain Monte Carlo (MCMC) sampling in a Bayesian framework, which runs for more than 10 million steps and, according to experienced evolutionary biologists, could take up to 6 months as a traditional single-core run; and (2) my data contain over 1300 native species with about 500 polymorphic points (phylogenetic uncertainty), which would require a large number of random simulations to give statistical strength. For instance, if I use 100 simulations for each of the 500 uncertainty points, I would have 50,000 simulated trees. Based on my previous experience with simulations, I could write code to analyze the 50,000 simulated trees in parallel, but even with this parallelization the long MCMC runs would still require 50,000 cores running for up to 6 months. Given this computational and evolutionary research challenge, my current work is focused on finding a suitable parallelization method for the MCMC steps. I would like to discuss my project with computational experts.
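
To make the scale of the problem concrete, the figures quoted in the description can be turned into a rough back-of-the-envelope estimate. The sketch below is illustrative only: the function name `workload` is hypothetical, and it assumes one MCMC chain per core, all 80 cores of the partition (4 nodes × 20 cores) dedicated to this project, and the 6-month upper estimate for every chain.

```python
import math

# Figures quoted in the project description.
UNCERTAINTY_POINTS = 500   # polymorphic points (phylogenetic uncertainty)
SIMS_PER_POINT = 100       # simulations per uncertainty point
MONTHS_PER_CHAIN = 6       # upper estimate for one single-core MCMC run
NODES = 4                  # nodes in the high-memory partition
CORES_PER_NODE = 20        # cores per node


def workload(points, sims_per_point, months_per_chain, cores):
    """Return (trees, core_months, wall_months) for the simulated trees.

    Assumptions (not from the source): one chain occupies exactly one
    core, and chains are scheduled in full "waves" of `cores` at a time.
    """
    trees = points * sims_per_point
    core_months = trees * months_per_chain
    waves = math.ceil(trees / cores)
    wall_months = waves * months_per_chain
    return trees, core_months, wall_months


trees, core_months, wall_months = workload(
    UNCERTAINTY_POINTS, SIMS_PER_POINT, MONTHS_PER_CHAIN,
    NODES * CORES_PER_NODE,
)
print(f"simulated trees: {trees}")
print(f"total work:      {core_months} core-months")
print(f"wall time on {NODES * CORES_PER_NODE} cores: {wall_months} months")
```

Under these assumptions, parallelizing across trees alone leaves a wall time far beyond any practical allocation, which is why the open question is how to shorten the individual MCMC runs themselves.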

Status: In Progress

People with Expertise

Xiaoqin Huang

Rice University

Programs

ACCESS CSSN

Roles

mentor, research computing facilitator, research software engineer, cssn

Expertise

Jason Wells

Harvard University

Programs

ACCESS CSSN, Campus Champions

Roles

research computing facilitator, cssn

Expertise

Diana Toups Dugas

New Mexico State University

Programs

RMACC, SWEETER, Campus Champions

Roles

mentor, researcher/educator, research computing facilitator

Expertise

People with Interest

Balamurugan Desinghu

Rutgers, the State University of New Jersey

Programs

ACCESS CSSN, Campus Champions, CAREERS, Northeast

Roles

mentor, researcher/educator, research computing facilitator, cssn, Consultant

Interests

Raul Gutierrez

University of Rhode Island Graduate School of Oceanography

Programs

CAREERS

Roles

student-facilitator, mentee

Interests

Gretta Kellogg

Pennsylvania State University

Programs

ACCESS CSSN

Roles

cssn

Interests
