Shanay Ghag

I am a Data Scientist at Great American Insurance Group, optimising insurance processes using AI & ML. I received my M.S. in Computer Science from the University of Southern California and my B.E. in Information Technology from Pune Institute of Computer Technology.

My research focuses on efficient fine-tuning algorithms for multimodal large language models, inference optimisation methods, and systems for ML.

Experience

My journey in research and industry

Work Experience

Industry roles and internships

Data Scientist

Great American Insurance Group

May 2023 - Present

Developing tools for insurance automation using Machine Learning, Deep Learning, NLP, and Agentic AI. Specializing in deep research agents, fraud detection systems, and multimodal LLM fine-tuning

Machine Learning Engineer Intern

Ardent Privacy

Feb 2023 - May 2023

Engineered an ML/NLP pipeline for automated data privacy, achieving 80% faster sensitive-data discovery and 91% classification accuracy using ensemble methods (Char-LSTM + XGBoost) deployed on AWS

Data Science Intern

Veritas Technologies LLC

Aug 2021 - Jun 2022

Built a Dense Passage Retriever with GraphCodeBERT for intelligent code search (MRR: 0.272) and developed retrieval-augmented algorithms for automated code generation from agile user stories

Data Science Intern

FinSoftAI Solutions

Aug 2021 - Feb 2022

Developed a real-time NLP engine for financial sentiment analysis across news and social media using Target-Dependent Sentiment Analysis, supporting investment research, ESG scoring, and trading decisions

Data Science Intern

iMocha

Mar 2020 - Sep 2020

Improved the accuracy of the AI-EnglishPro assessment tool by 13% through advanced sentiment analysis and grammar detection. Built an internal code similarity detection tool for the developer evaluation platform

Research Experience

Academic research

Research Intern

University of South Carolina

Aug 2025 - Present

Working on a parameter-efficient fine-tuning algorithm using curvature-aware optimization and neural reprojection for faster convergence with fewer parameters

Research Assistant

University of Southern California - Institute for Creative Technologies

Sep 2022 - May 2023

Worked on a contrastive learning-based model for ensuring dialog consistency in negotiation dialog systems

Research Assistant

Pune Institute of Computer Technology - Computational Linguistics Lab

Sep 2020 - Aug 2021

Researched algorithms for generating Bloom's Taxonomy-aligned questions and generating low-level code from natural language

Projects

A selection of my recent work in machine learning, data science, and AI research

Geometric Reprojection Instruction Tuning (GRIT)

A parameter-efficient fine-tuning framework that combines LoRA with curvature-aware optimization and neural reprojection for efficient adaptation of vision-language models. Research work advised by Dr. Amitava Das

Python, PyTorch, Transformers, PEFT

Speculative Decoding

An implementation of speculative decoding based on the paper 'Accelerating Large Language Model Decoding with Speculative Sampling' (DeepMind, 2023)

PyTorch, LLMs

Genetic Algorithm for TSP

Advanced Genetic Algorithm for solving the Traveling Salesman Problem with k-nearest neighbor initialization, 2-opt mutation, and multi-tier breeding strategies

Python, genetic-algorithms

Get In Touch

I'm always open to discussing new projects, research opportunities, or collaborations. Feel free to reach out through any of the channels below.

ghag.shanay@gmail.com
github.com/shanayghag
linkedin.com/in/shanayghag