I am a Machine Learning and Computational Biology Ph.D. student in the Computer and Information Science department at the University of Pennsylvania. I am advised by Dr. Yoseph Barash. My research focuses on the application and development of interpretable machine learning methods for answering core biological questions. My research projects have involved predicting splicing differences between tissues and regulatory networks between RNA-binding proteins with deep learning and developing EIG, an interpretation method for splicing code and other deep learning models based on genomic data. In parallel, I have contributed to research towards reliable identification and quantification of splicing events from RNA-Seq data and understanding the role of RNA-binding proteins in post-transcriptional regulation. I received my MS from TU Munich in August 2014 with a specialization in AI and Software Engineering. In a previous life, I was a software engineer working on navigation systems in cars for two years.
Languages, systems, and tools
Proficient: Python, TensorFlow, High Performance Computing, UNIX
Competent: R, LaTeX
Past work experience: Perl, MATLAB, Java, Android, C#, .NET Framework, C++, Oracle, SQL
Relevant graduate coursework
Machine Learning, Bayesian Statistics, Mathematical Statistics, Deep Learning, Advanced Computational Biology, RNA World, High-throughput Datasets for Biologists, Interpretation of Deep Learning Models, Adversarial and Secure Machine Learning, Computational Linguistics.
Research and Professional Experience
CIS Ph.D. Student
Dr. Yoseph Barash, Biociphers Lab, University of Pennsylvania
My research projects have involved predicting splicing differences between tissues and regulatory networks between RNA-binding proteins with deep learning and developing EIG, an interpretation method for splicing code and other deep learning models based on genomic data. In parallel, I have contributed to research towards reliable identification and quantification of splicing events from RNA-Seq data and understanding the role of RNA-binding proteins in post-transcriptional regulation.
Master Thesis Student
Dr. Peter Struss, Model-Based Systems and Qualitative Reasoning Group, TU Munich
November 2013–August 2014 (10 months)
Conceptualization and implementation of a generic tool for selecting learning goals for a knowledge-based machine learning system, and application of the tool to the fitness training domain.
Graduate Research Assistant
Dr. Florian Röhrbein, Human Brain Project, Neurorobotics, TU Munich
November 2013–March 2014 (5 months)
Surveyed the availability of robot simulation tools and game engines for the neurorobotics platform of the Human Brain Project.
Graduate Research Assistant
Christian Vögele, fortiss GmbH, Munich
October 2013–April 2014 (7 months)
Project: PARO-Performance analysis for role-based applications. Created a tool for modeling the performance of an application and analyzing potential low-performance areas during the design phase.
Graduate Research Assistant
Dr. Tobias Hamp, Rostlab, TU Munich
March 2013–September 2013 (7 months)
Project: Prediction of Interaction sites in Proteins using only sequences. Developed neural network models for prediction of interaction hot spots in proteins using amino acid sequences and evolutionary information.
Infosys Limited, India and TomTom Int. BV., Netherlands
June 2010–September 2012 (2 years 3 months)
Responsibilities included development of the UI and back end for a GPS-enabled embedded navigation device with Android for TomTom Int. BV.
Trainings undertaken at Infosys: Android, ASP.NET, ADO.NET, C#, C, UNIX, Oracle, SQL
Certifications at Infosys: STAR, Geographical Information Systems, PERL
MOCCASIN: A method for correcting for known and unknown confounders in RNA splicing analysis
Barry Slaff, Caleb Matthew Radens, Paul Jewell, Anupama Jha, Nicholas Lahens, Gregory R Grant, Andrei Thomas-Tikhonenko, Kristen W Lynch, Yoseph Barash
Multi-trait association studies discover pleiotropic loci between Alzheimer’s disease and cardiometabolic traits
William P. Bone, Katherine M. Siewert, Anupama Jha, Derek Klarin, Scott M. Damrauer, the VA Million Veteran Project, Kyong-Mi Chang, Philip S. Tsao, Themistocles L. Assimes, Marylyn D. Ritchie, Benjamin F. Voight
RNA binding proteins PCBP1 and PCBP2 are critical determinants of murine erythropoiesis
Xinjun Ji, Anupama Jha, Jesse Humenik, Louis R. Ghanem, Andrew Kromer, Christopher Duncan-Lewis, Elizabeth Traxler, Mitchell J. Weiss, Yoseph Barash, Stephen A. Liebhaber
In submission, 2020
Enhanced Integrated Gradients: improving interpretability of deep learning models using splicing codes as a case study
Anupama Jha, Joseph K Aicher, Matthew R Gazzara, Deependra Singh, Yoseph Barash
Genome Biology, 2020
Ancient antagonism between CELF and RBFOX families tunes mRNA splicing outcomes
Matthew R Gazzara, Michael J Mallory, Renat Roytenberg, John P Lindberg, Anupama Jha, Kristen W Lynch, Yoseph Barash
Genome Research, 2017
RBP-Pokedex: Prediction of RBP knockdown effect via DNN experiment modeling
ISMB 2020, Virtual Conference
Integrative Deep Models for Alternative Splicing
ISMB/ECCB 2017, Prague, Czech Republic
Penn Research in Machine Learning (PRiML) Group, October 2017, Philadelphia, USA
RNA Discussion Group, April 2017, Philadelphia, USA
GCB 537 Guest Lecture: Support Vector Machine
April 2016-2019, UPenn, Philadelphia, USA
Panel: Is Graduate School for me?
CAPWIC 2019, Harrisonburg, USA
Guest Lecture with Dr. Yoseph Barash: AI and Computational Biology
WICS High School Day for Girls, Spring 2018-2019, Philadelphia, USA
GCB 537: Advanced Computational Biology
TA with Dr. Yoseph Barash, UPenn
Spring 2016, 2017, 2018 (3 terms)
Teaching Assistant for the Ph.D. level course with three components: statistical data analysis and machine learning techniques for computational biology, discussion of current topics in genomics and computational biology, and hands-on experience in data analysis, coding, and evaluation of computational biology tools/algorithms.
Deep Learning Reading Group
Co-organizer with Dr. Yoseph Barash, UPenn
Summer 2016, Spring 2018 (2 terms)
Co-organizer of reading groups covering the deep learning book and interpretation methods for deep learning models.
Service and Outreach
Co-reviewer with Prof. Yoseph Barash for NeurIPS 2015-2020, ISMB/ECCB 2016-2019, ICLR 2018-2020, ICML 2019-2020, PLOS Computational Biology Journal, Bioinformatics Journal, Nucleic Acids Research Journal. Independent reviewer for TEAMC-2018, Nature Scientific Reports.
Mentored undergraduate students at the University of Pennsylvania on their final-year projects in Fall 2017 and Spring 2018.