Preface

Our research objective is to identify potential vaccine antigen candidates for the parasite Plasmodium vivax, the second most prevalent cause of malaria. The autologous P. vivax model is constrained by the species' limited proteome size and small set of labeled antigens, so we leverage heterologous data from Plasmodium falciparum to improve its performance. We train multiple models on various combinations of heterologous and autologous data using the positive-unlabeled random forest (PURF) algorithm. This research notebook contains the data and code generated in the study titled “Plasmodium vivax antigen candidate prediction improves with the addition of Plasmodium falciparum data.” It also provides guidance on extracting protein variables and assembling machine learning input from the database, along with code for conducting the experimental analyses and creating plots.

Creative Commons License
The notebook is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.