SuperLearner Screens

1 minute read

Investigating the use of screens in Super Learner ensembles

Clinical research trials often generate “big data”

Examples: vaccine and cancer early detection research
Large number of participants = good
Large number of variables -> challenging Machine learning increasingly used in biomedical research
Helpful with a large number of variables
Can identify complex relationships with outcome of interest
Issue: performance can vary across algorithms Potential solution: use an ensemble of algorithms
Combine results across procedures
Allows enhanced flexibility
Issue: no formal guidelines for combining certain procedures Goal: establish guidelines around ensemble procedures Potential impact in vaccine, cancer research
Methodology of this research will be available upon peer review*

Lasso + SL not always a helpful combo; surprising result.

Next steps: Train researchers to follow our guidelines + practices. Conduct further research including:

Building a study to address why the lasso alone is not ideal for variable selection
Testing other algorithms that process biomedical data.
Building this experiment in other programming language such as C or Python for better performance

Following publication, results will be replicable by others with scientific computing tools.