I am a UW Data Science Postdoctoral Scholar at the University of Washington with Prof. William Stafford Noble. I started my career as a software engineer; my undergraduate degree is in computer science. After working for two years in industry developing navigation systems for cars, I got a master's degree in Informatics at TU Munich. There I discovered my two passions: machine learning and computational biology. I went on to obtain a Ph.D. degree in these two areas at the University of Pennsylvania with Dr. Yoseph Barash. My research focus during my Ph.D. was the application and development of interpretable machine learning methods for answering core biological questions related to splicing differences between tissues and regulatory networks of RNA-binding proteins. For my postdoctoral studies, I am developing machine learning methods for imputing high-throughput genomic data sets, particularly with respect to 3D genome architecture.
Languages, systems, and tools
Proficient: Python, TensorFlow, High Performance Computing, UNIX
Competent: R, LaTeX
Past work experience: Perl, MATLAB, Java, Android, C#, .NET Framework, C++, Oracle, SQL
Relevant graduate coursework
Machine Learning, Bayesian Statistics, Mathematical Statistics, Deep Learning, Advanced Computational Biology, RNA World, High-throughput Datasets for Biologists, Interpretation of Deep Learning Models, Adversarial and Secure Machine Learning, Computational Linguistics.
Research and Professional Experience
UW Data Science Postdoctoral Scholar
Dr. William Stafford Noble, Department of Genome Sciences, University of Washington
As a postdoctoral fellow with Dr. William Stafford Noble at the University of Washington, I have been actively developing integrative and interpretable models for nuclear DNA architecture. These include a sequence-to-trans-contact prediction model that predicts inter-chromosomal Hi-C contacts from DNA sequence and a cross-species bidirectional translator between Hi-C and ATAC-seq data across six mammalian species.
CIS Ph.D. Student
Dr. Yoseph Barash, Biociphers Lab, University of Pennsylvania
My research projects have involved predicting splicing differences between tissues and regulatory networks between RNA-binding proteins with deep learning and developing EIG, an interpretation method for splicing code and other deep learning models based on genomic data. In parallel, I have contributed to research towards reliable identification and quantification of splicing events from RNA-Seq data and understanding the role of RNA-binding proteins in post-transcriptional regulation.
Master's Thesis Student
Dr. Peter Struss, Model-Based Systems and Qualitative Reasoning Group, TU Munich
November 2013–August 2014 (10 months)
Conceptualization and implementation of a generic tool for selection of learning goals for a knowledge-based machine learning system, application of the tool for the fitness training domain.
Graduate Research Assistant
Dr. Florian Röhrbein, Human Brain Project, Neurorobotics, TU Munich
November 2013–March 2014 (5 months)
Surveyed the availability of robot simulation tools and game engines for the neurorobotics platform of the Human Brain Project.
Graduate Research Assistant
Christian Vögele, fortiss GmbH, Munich
October 2013–April 2014 (7 months)
Project: PARO-Performance analysis for role-based applications. Created a tool for modeling the performance of an application and analyzing potential low-performance areas during the design phase.
Graduate Research Assistant
Dr. Tobias Hamp, Rostlab, TU Munich
March 2013–September 2013 (7 months)
Project: Prediction of Interaction sites in Proteins using only sequences. Developed neural network models for prediction of interaction hot spots in proteins using amino acid sequences and evolutionary information.
Infosys Limited, India and TomTom Int. BV., Netherlands
June 2010–September 2012 (2 years 3 months)
Responsibilities included development of the UI and back end for a GPS-enabled embedded navigation device with Android for TomTom Int. BV.
Trainings undertaken at Infosys: Android, ASP.NET, ADO.NET, C#, C, UNIX, Oracle, SQL
Certifications at Infosys: STAR, Geographical Information Systems, PERL
MOCCASIN: A method for correcting for known and unknown confounders in RNA splicing analysis
Barry Slaff, Caleb Matthew Radens, Paul Jewell, Anupama Jha, Nicholas Lahens, Gregory R Grant, Andrei Thomas-Tikhonenko, Kristen W Lynch, Yoseph Barash
Multi-trait association studies discover pleiotropic loci between Alzheimer’s disease and cardiometabolic traits
William P. Bone, Katherine M. Siewert, Anupama Jha, Derek Klarin, Scott M. Damrauer, the VA Million Veteran Project, Kyong-Mi Chang, Philip S. Tsao, Themistocles L. Assimes, Marylyn D. Ritchie, Benjamin F. Voight
RNA binding proteins PCBP1 and PCBP2 are critical determinants of murine erythropoiesis
Xinjun Ji, Anupama Jha, Jesse Humenik, Louis R. Ghanem, Andrew Kromer, Christopher Duncan-Lewis, Elizabeth Traxler, Mitchell J. Weiss, Yoseph Barash, Stephen A. Liebhaber
In submission, 2020
Enhanced Integrated Gradients: improving interpretability of deep learning models using splicing codes as a case study
Anupama Jha, Joseph K Aicher, Matthew R Gazzara, Deependra Singh, Yoseph Barash
Genome Biology, 2020
Ancient antagonism between CELF and RBFOX families tunes mRNA splicing outcomes
Matthew R Gazzara, Michael J Mallory, Renat Roytenberg, John P Lindberg, Anupama Jha, Kristen W Lynch, Yoseph Barash
Genome Research, 2017
RBP-Pokedex: Prediction of RBP knockdown effect via DNN experiment modeling
ISMB 2020, Virtual Conference
Integrative Deep Models for Alternative Splicing
ISMB/ECCB 2017, Prague, Czech Republic
Penn Research in Machine Learning (PRiML) Group, October 2017, Philadelphia, USA
RNA Discussion Group, April 2017, Philadelphia, USA
GCB 537 Guest Lecture: Support Vector Machine
April 2016-2019, UPenn, Philadelphia, USA
Panel: Is Graduate School for me?
CAPWIC 2019, Harrisonburg, USA
Guest Lecture with Dr. Yoseph Barash: AI and Computational Biology
WICS High School Day for Girls, Spring 2018-2019, Philadelphia, USA
GCB 537: Advanced Computational Biology
TA with Dr. Yoseph Barash, UPenn
Spring 2016, 2017, 2018 (3 terms)
Teaching Assistant for the Ph.D.-level course with three components: statistical data analysis and machine learning techniques for computational biology, discussion of current topics in genomics and computational biology, and hands-on experience in data analysis, coding, and evaluation of computational biology tools and algorithms.
Deep Learning Reading Group
Co-organizer with Dr. Yoseph Barash, UPenn
Summer 2016, Spring 2018 (2 terms)
Co-organizer of reading groups covering the deep learning book and interpretation methods for deep learning models.
Service and Outreach
Co-reviewer with Prof. Yoseph Barash for NeurIPS 2015-2020, ISMB/ECCB 2016-2019, ICLR 2018-2020, ICML 2019-2020, PLOS Computational Biology Journal, Bioinformatics Journal, Nucleic Acids Research Journal. Independent reviewer for TEAMC-2018, Nature Scientific Reports.
Mentored undergraduate students at the University of Pennsylvania on their final-year projects in Fall 2017 and Spring 2018.