
WORK EXPERIENCE 



Data Science Intern

Geisinger Health System
June 2024 - Present
Houston, Texas


  • Designed and built an NLP pipeline to classify intracranial hemorrhage (ICH) from more than 30,000 radiology reports using BERT, while researching state-of-the-art Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG).

  • Developed a predictive model on AWS SageMaker that analyzes electronic health records (EHR) to identify unscreened women at high risk of breast cancer, supporting proactive health interventions.

  • Automated the extraction and integration of CMS Hospital Care Compare files and internal data from various vendors, saving about 2 weeks of one full-time employee's manual effort per quarterly cycle.

Data Scientist

RBL Bank

July 2021 - August 2023

Mumbai, India

  • Automation: Led a team of 4 to build ETL pipelines on Azure for data migration verification, cutting daily manual work by about 2 hours.

  • Customer Segmentation and Engagement: Implemented a clustering model to segment credit card customers by spending behavior. Through hyperparameter tuning, feature engineering, and evaluation, achieved a 12% increase in customer engagement.

  • Fraud Detection Model Optimization: Revamped a credit card fraud classification model, improving its AUC-ROC score from 0.82 to 0.89. Contributed to deploying the optimized model, strengthening fraud detection capabilities.

  • Interactive Data Visualizations: Took the initiative to convert various SQL- and Excel-based reports into automated, interactive, real-time Tableau dashboards, reducing preprocessing and query time and saving about 50 hours/month.

  • Data Modeling: Designed data models and schemas for relational databases, optimizing query performance and storage efficiency.

Machine Learning Intern

Kaashiv Infotech
Apr 2021 - June 2021
Pune, India
  • Improved a sales prediction model using a diverse ensemble of time series models, carefully selected features, and newly engineered inputs.

  • Selected the best inputs from the base models' features and from engineered features such as the median of all model predictions and the hour of day.

  • Used a Support Vector Regressor for aggregation and improved predictions by using two separate models for peak and off-peak hours.

  • Tested the model on past data and achieved an average 1% improvement across several accuracy metrics.

Subject Matter Expert Intern

Chegg
Nov 2020 - March 2021
Pune, India

  • Tutored high school and undergraduate students as an independent contractor on the Chegg platform, achieving a 95% satisfaction rate.

  • Taught 60+ students and conducted 80+ lessons through the platform.

  • Taught SQL, Database Management, and Python/C++ Programming, and guided students through projects and assignments.
