Hi, I'm Susanna Morin.

A self-driven, quick-starting, passionate data scientist with 2 years of experience turning complex data into clear insights and actionable strategies to optimize patient care.

About

I hold a strong foundation in predictive and descriptive analytics, as well as statistical modeling applied to Electronic Health Records and Claims Data.

I've successfully led projects for government-funded population healthcare programs that have informed policy and resulted in additional funding at the state and federal levels.

My main tech stack relies on Python, R, and SQL to ingest, process, analyze, and visualize data.

  • Languages: Python, R, C++, HTML/CSS, Bash
  • Databases: MySQL, PostgreSQL
  • Libraries: NumPy, Pandas, Matplotlib, ggplot2, lm
  • Tools & Technologies: Git, JIRA, TortoiseSVN

I’m particularly drawn to opportunities that allow me to work collaboratively with key business stakeholders to identify areas of value, develop solutions, and deliver insights to reduce overall cost of care for members and improve their clinical outcomes.

Experience

Data Scientist
  • Built a survival analysis model using Cox proportional hazards regression to investigate the association between survival time in Severe and Persistently Mentally Ill (SPMI) patients and predictor variables of interest, with the potential to inform healthcare policy
  • Maintained and created tables used in downstream population healthcare analysis using Electronic Health Records (EHRs)
  • Conducted a total cost of care time series analysis on healthcare provider utilization trends
  • Developed and validated a Generalized Linear Mixed Effects Model for utilization care costs
  • Tools: Python, R, SQL, Azure, Tableau, TortoiseSVN
November 2022 - Present | Remote, USA
Clinical Informatics Analyst
  • Conducted data validation in collaboration with the Nima Aghaeepour Laboratory at Stanford University; focused research efforts on patient phenotyping and trajectory prediction in neonatal health and morbidity, resulting in a publication in Science Translational Medicine
  • Pre-processed data using RStudio; analyzed relationships between women’s health factors and offspring health outcomes in preterm labor using statistical methods
  • Generalized a CNN model measuring knee osteoarthritis and improved performance by changing the biomarker from bone shape to cartilage thickness
  • Tools: Python, R, OpenCV, Bash, TensorFlow, PyTorch, Git
July 2021 - December 2022 | San Francisco, CA
Algorithm Developer
  • Conducted research contributing to the development and advancement of technical standards for genomic data; focused on infrastructure for graph-based genomics
  • Developed a genotyper using a Markov chain Monte Carlo (MCMC) probabilistic model that supports standard variant calling formats; improved the genotyper's accuracy and performance by using the Min-Cut algorithm to break out of sampling bottlenecks, maximizing mixing efficiency
  • Established evaluation methods that compare accuracy metrics against gold-standard datasets
  • Tools: C++, Git, Bash, Statistical Inference
July 2019 - May 2021 | Santa Cruz, CA
Data Scientist Intern
  • Developed statistical methods used in comparative genomics analysis
  • Leveraged single-cell-resolution data to improve understanding of cell type-specific transcriptional responses
  • Investigated how single-cell RNA-seq and single-cell ATAC-seq data from mouse hearts correlated with each other across drug treatment and disease states to successfully predict enhancer activation due to heart stress
  • Established a level of correlation between the two datasets and built a support vector machine (SVM) model to predict enhancer activation (single-cell ATAC-seq) from expression data (single-cell RNA-seq)
  • Tools: Python, Scikit-learn, Support Vector Machines, CLI
June 2020 - November 2020 | San Francisco, CA

Projects

Music Player Web-App

A music streaming web app based on Django

Accomplishments
  • Tools: Django, HTML, CSS, Bootstrap, SQLite, AWS S3, Heroku
  • Register/login to the web app (with OAuth-based Google Sign-In).
  • Search and filter songs based on language and singer.
  • Create multiple playlists and add/remove songs to/from playlist.
  • Scroll through recently played/viewed songs.
Quiz Web-App

A quiz-playing web app based on Django

Accomplishments
  • Tools: Django, HTML, CSS, Bootstrap, SQLite, Heroku
  • Register/login to the web app (with OAuth-based Google Sign-In).
  • Play quizzes and view the leaderboard.
Blog Web-App

A simple and extensible blog web-app based on Flask.

Accomplishments
  • Tools: HTML, CSS, Bootstrap, Flask, SQLAlchemy, PostgreSQL, Python
  • Users can view posts and contact the admin via the Contact page.
  • Admin can add, delete, and update posts.
Visual Question Answering

An attention-based classification model that aims to generate an answer to a question about a given input image.

Accomplishments
  • Incorporated Convolutional Neural Networks (CNNs) for extracting image features and Long Short-Term Memory (LSTM) for extracting question embeddings.
  • Tested the model on COCO images and abstract scene images, achieving 69% overall accuracy on the VQA evaluation metric.
Video Summarizer

A Seq2Seq model that generates a short summary of the given input video.

Accomplishments
  • Incorporated a CNN to detect and classify objects in the video frames and a Long Short-Term Memory (LSTM) network to generate the summary.
  • Evaluated the model on the MSVD (Microsoft Video Description Corpus) dataset; achieved scores of 0.77, 0.71, and 0.52 on the ROUGE, BLEU, and METEOR evaluation metrics, respectively.
Image Generator

An image generator based on generative adversarial networks (GANs)

Accomplishments
  • Tested the system on a human-face database and calculated the loss by comparing the principal component analysis (PCA) representations of the generated and original images.
  • The calculated difference in PCA was less than 10%, indicating that the generator successfully produced an image.
Head Counting System

A system that calculates the attendance of the class from a panoramic image of a live classroom.

Accomplishments
  • Used Singular Value Decomposition for image compression; applied various image processing techniques and morphological operations to detect the number of heads.

Skills

Languages and Databases

Python
HTML5
CSS3
MySQL
PostgreSQL
Shell Scripting

Libraries

NumPy
Pandas
OpenCV
scikit-learn
matplotlib

Frameworks

Django
Flask
Bootstrap
Keras
TensorFlow
PyTorch

Other

Git
AWS
Heroku

Education

University of California, San Francisco

San Francisco, CA

Degree: Master of Science in Medical Informatics

    Relevant Coursework:

    • Machine Learning Algorithms
    • Biostatistics
    • Statistical Methods

University of California, Santa Cruz

Santa Cruz, CA

Degree: Bachelor of Science in Computer Science and Bioinformatics

    Relevant Coursework:

    • Data Structures and Algorithms
    • Database Management Systems
    • Operating Systems
    • Machine Learning
    • Ethical Algorithms

Contact