Sri Pranavi Bhamidipati

AI + Analytics | Products & Pipelines | Public Impact

Applied Data Scientist focused on classic ML + applied NLP (RAG, multimodal, evaluation) and mission-driven analytics, shipped across finance, healthcare, climate/energy, and criminal justice.

About

I’m an Applied Data Scientist focused on classic ML + applied NLP (RAG, multimodal, evaluation) and mission-driven analytics.

I’ve shipped work across finance, healthcare, climate/energy, and criminal justice — building products, pipelines, and decision support systems that bridge technical execution with real-world impact.

Education

Carnegie Mellon University, Heinz College — Pittsburgh, PA
M.S. Public Policy Management & Data Science (August 2024 – May 2026)
GPA: 3.63/4.00
Focus: Applied ML, NLP, decision systems, optimization, database management

Purdue University — West Lafayette, IN
B.S. in Data Science, Minor in Economics (August 2018 – May 2022)
Certificates in Entrepreneurship & Innovation and in Music Technology

Experience

Goldman Sachs

Business Intelligence Analyst II

August 2022 – August 2024 • Dallas, TX

Developed ML/analytics tooling spanning data engineering, software, and audit automation.

  • Developed time series dataset + Python clustering/regression models on client Gross Credits, accurately predicting 10 client rank declines in Q4 2023
  • Implemented Isolation Forest ML algorithm for anomaly detection, flagging 50+ critical PnL Net Revenue data points to COO
  • Built and launched "Management Hub" — analytics web application used by COO/CFO
  • Developed "Engagement Tracker" — web analytics tool monitoring senior leadership engagement with DEI initiatives
  • Contributed to Ariba Invoice Servicing Portal (ISP), recognized by Asia leadership and adopted by 10,000+ global users
  • Created and maintained secure SQL database environment for 20+ teams across Management & Strategy
  • Integrated multiple APIs (Operational Risk Events, Interactions, RTO metrics) into internal data systems
  • Automated data workflows with Autosys and Alteryx, reducing processing times by 25%
  • Organized quarterly data science events for 500+ data professionals; founded Dallas regional Data Science Meetup group

Carnegie Mellon University (SUDS)

Project Manager – Students Using Data for Social Good

September 2024 – May 2025 • Pittsburgh, PA
  • Led 5-person analytics team to visualize trail usage patterns for 25+ miles using heatmaps and flow diagrams
  • Developed predictive forecasting models with 85% accuracy integrating Eco-counter data, weather, holidays, and local events
  • Managed monthly stakeholder meetings and data access agreements (NDAs) supporting $50K+ in infrastructure investments

Carnegie Mellon University (Capstone)

Data Scientist – PA Department of Corrections VSG Analysis

August 2024 – May 2025 • Pittsburgh, PA
  • Built NLP pipeline to extract and structure override reasons from 254,208 VSG records (2016–2024)
  • Developed taxonomy of aggravating vs mitigating themes to support form redesign with structured capture
  • Ran regression and grid-cell analysis on 414,504 arrest records to measure compliance impact on recidivism (OR 1.08, p < 0.001)

Featured Projects

ML Policy Lab

Designed ML system to triage ~2,000 bills per session; ran 792 model variants with explicit fairness constraints. XGBoost captured 82% of passed bills.

Bias Audit | XGBoost | Fairness

ANLP RAG System

Built modular RAG pipeline with hybrid retrieval (BM25 + dense embeddings) and cross-encoder reranking. Best configuration (Qwen2.5-32B + BGE) reached 75.10 F1.

RAG | Retrieval | Reranking

ANLP VQA

Generated oracle trajectories for self-reflective fine-tuning on VQA. Improved Soft-VQA accuracy on OK-VQA: 64.97→66.77.

Multimodal | VQA | Fine-tuning

Capstone VSG System

Built NLP pipeline on 254K VSG records and ran regression analysis on 414K arrest records to measure compliance impact on recidivism.

NLP | Regression | Policy

Third Way Clean Energy

Mapped $13B+ in disclosed equity across Nuclear, LDES, Geothermal, CCS. Built investor taxonomy and policy narratives.

Climate Finance | Policy

Goldman Analytics Products

Built Management Hub and Engagement Tracker. Automated PnL anomaly detection with Isolation Forest.

Web Apps | SQL | APIs

All Projects

Explore all 19 projects organized by domain and application area

Third Way Clean Energy Finance Mapping

Mapped $13B+ disclosed equity across Nuclear (~$7.5B), LDES (~$3B), Geothermal (~$2B), CCS (~$0.4B). Built investor taxonomy (VC/PE/corporate) and produced infographic-style policy narratives.

Climate Finance | Investment Analysis | Policy

Smart Grid Policy Memo

Authored policy brief on U.S. Grid Modernization: regulatory responses, investment trends, public-private partnerships. Analyzed fiscal policy options for equitable AI-integrated smart infrastructure.

Energy Policy | Grid Modernization | Public-Private Partnerships

EnerVision – Energy Forecasting Model

Developed time series model forecasting U.S. energy consumption/production (1973-2021), with forecasts within 0.5 quadrillion Btu of actual values. Created investor pitch deck: $1M proposal, projected 10x ROI and $15M revenue by Year 5.

Time Series | Forecasting | ARIMA

Goldman Sachs – Management Hub & Engagement Tracker

Built Management Hub (COO/CFO analytics platform centralizing global markets/banking/ops data). Developed Engagement Tracker (DEI monitoring web app). Contributed to Ariba ISP (10K+ users, Asia award nomination).

Web Applications | Dashboards | SQL

Goldman Sachs – Data Pipelines & Automation

Created SQL database environment for 20+ teams. Integrated APIs (Operational Risk Events, Interactions, RTO metrics) via Alloy/Pure. Automated workflows with Autosys/Alteryx, reducing processing times by 25%.

Data Engineering | SQL | APIs | Automation

Dashboarding & Integration Work

Delivered Tableau dashboards for Marquee Sales tracking market performance and usage trends. Built anomaly detection (Isolation Forest) flagging 50+ critical PnL data points. Reduced audit review times by 25%.

Tableau | Dashboards | Anomaly Detection

RA-UPMC Clinical Workflow Mapping (Digital Pathology)

Produced end-to-end breast pathology workflow map. Identified bottlenecks: reporting consumes ~34% of pathologists' time. Framed AI opportunities: structured synoptic reporting, integration middleware.

Workflow Mapping | Healthcare | Digital Pathology

Heart Disease Mortality Data Analysis

Analyzed 472K+ CDC records (2013-2020) across 2,000+ counties. Applied ANOVA to identify demographic disparities (geography, gender, race). Built choropleth maps and time series visualizations.

Health Data | Statistical Analysis | ANOVA | Visualization

ANLP VQA – Self-Reflective Multimodal Reasoning

Generated oracle trajectories for self-reflective fine-tuning on knowledge-intensive VQA. Improved Soft-VQA accuracy: 64.97→66.77 (OK-VQA), 77.64→78.34 (A-OKVQA). Ran GPU workloads on AWS.

Multimodal | VQA | Fine-tuning | QLoRA | AWS

ANLP RAG System – Retrieval Pipeline

Built modular RAG pipeline: web scraping → hybrid retrieval (BM25 + dense embeddings) → cross-encoder reranking. Best config: Qwen2.5-32B + BGE achieving 75.10 F1, 58.24 EM (p = 0.0156).

RAG | Retrieval | Reranking | Web Scraping
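
The hybrid retrieval step merges a sparse (BM25) ranking with a dense-embedding ranking before reranking. How the two result lists were combined in this project is not stated here, so the sketch below assumes reciprocal rank fusion, one common choice. The function name, the constant k=60, and the document ids are all illustrative.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); k damps the top ranks.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

bm25_ranking = ["d3", "d1", "d2"]   # hypothetical sparse results
dense_ranking = ["d1", "d4", "d3"]  # hypothetical dense results
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

Documents that appear near the top of both lists rise in the fused order, and that fused list is what a cross-encoder would then rerank.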

Capstone NLP Override Reasons

Built NLP pipeline to extract and structure override reasons from 254K VSG records. Manually coded a dataset of ~200 forms. Created taxonomy (aggravating vs mitigating) for form redesign.

NLP | Text Classification | Qualitative Coding | Policy

Amazon Reviews Sentiment Pipeline (AzureML)

Engineered sentiment classification pipeline on 100K+ Amazon reviews. Applied TF-IDF (10K features) + logistic regression, achieving 76.4% accuracy. Deployed via AzureML.

NLP | TF-IDF | Sentiment Analysis | AzureML

From Data to Action – Pittsburgh Flood Risk (MIP vs Greedy)

Built decision-support framework for flood mitigation ($14.5M budget allocation). Designed composite risk score (FEMA + SVI equity). Compared MIP optimization vs greedy heuristic selection.

Optimization | MIP | Decision Support
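
The gap between exact optimization and a greedy heuristic can be shown on a toy budget problem. This sketch is not the project's model: the three projects, their costs, and their risk-reduction scores are hypothetical, and brute-force enumeration stands in for a MIP solver, which would reach the same optimum on an instance this small.

```python
from itertools import combinations

# (name, cost in $M, risk-reduction score); all values are hypothetical.
PROJECTS = [("A", 6, 9), ("B", 5, 7), ("C", 5, 7)]
BUDGET = 10

def greedy_select(projects, budget):
    """Pick projects by risk reduction per dollar until nothing else fits."""
    chosen, spent = [], 0
    ranked = sorted(projects, key=lambda p: p[2] / p[1], reverse=True)
    for name, cost, _benefit in ranked:
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen

def exact_select(projects, budget):
    """Enumerate every subset and keep the feasible one with the most benefit."""
    best, best_benefit = [], 0
    for size in range(1, len(projects) + 1):
        for subset in combinations(projects, size):
            cost = sum(p[1] for p in subset)
            benefit = sum(p[2] for p in subset)
            if cost <= budget and benefit > best_benefit:
                best, best_benefit = [p[0] for p in subset], benefit
    return best

greedy_choice = greedy_select(PROJECTS, BUDGET)
exact_choice = exact_select(PROJECTS, BUDGET)
```

On this instance the greedy rule funds only the project with the best ratio and strands the remaining budget, while the exact search funds the two cheaper projects for a higher total score.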

SUDS Trail Usage Forecasting

Developed predictive models with 85% accuracy for 25+ miles of trails. Integrated Eco-counter data + weather + holidays. Produced heatmaps and flow diagrams to guide $50K+ infrastructure investments.

Forecasting | Predictive Modeling | Geospatial

EnerVision Energy Forecasting

Built ARIMA time series model on nearly 50 years of U.S. energy data (1973-2021). Back-tested and cross-validated for long-term reliability. Supported policy planning, ESG risk mitigation, investment decisions.

Time Series | ARIMA | Forecasting

ML Policy Lab – Bias Audit & Model Selection

Designed ML system to triage ~2,000 bills/session into top 15% likely to pass. Evaluated 88 model types (792 variants) with temporal evaluation + fairness constraints. XGBoost captured 82% of passed bills.

Bias Audit | Temporal Evaluation | XGBoost | Model Selection
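
A figure like "captured 82% of passed bills" from a top-15% list is a recall-at-top-fraction metric. The sketch below shows one way such a metric could be computed. The function name and the toy scores and labels are illustrative, not the project's code or data.

```python
def recall_at_top_fraction(scores, labels, fraction=0.15):
    """Share of positive items that land in the top `fraction` ranked by score."""
    n_flagged = max(1, round(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    captured = sum(label for _score, label in ranked[:n_flagged])
    total_positive = sum(labels)
    return captured / total_positive if total_positive else 0.0

# 20 hypothetical bills with model scores in descending order.
toy_scores = [1.0 - 0.05 * i for i in range(20)]
# 1 = passed, 0 = did not pass; passed bills sit at positions 0, 2, 5, and 9.
toy_labels = [1 if i in (0, 2, 5, 9) else 0 for i in range(20)]
toy_recall = recall_at_top_fraction(toy_scores, toy_labels, fraction=0.15)
```

With 20 bills the top 15% is three bills, so the toy model captures two of the four passed bills.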

Capstone Compliance Measurement Framework

Ran regression + grid-cell analysis on 414K arrest records to measure VSG compliance impact on recidivism. Reported that non-compliance was associated with an odds ratio of 1.08 (p < 0.001), a 1.5 percentage point increase in predicted probability.

Regression | Stratified Analysis | Policy Measurement
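
An odds ratio turns into a percentage-point change only once a baseline probability is fixed. The sketch below shows that conversion. The baseline value is hypothetical and not taken from the study, and the reported figure may have been derived differently (for example, as an average over individual predictions).

```python
def apply_odds_ratio(baseline_prob, odds_ratio):
    """Return the probability implied by scaling the baseline odds by `odds_ratio`."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

baseline = 0.25  # hypothetical baseline probability, not from the study
shifted = apply_odds_ratio(baseline, 1.08)
change_pp = (shifted - baseline) * 100  # change in percentage points
```

The same odds ratio yields a smaller shift when the baseline is close to 0 or 1 and a larger one near 0.5, which is why an effect is often reported both as an odds ratio and as a change in predicted probability.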

Digital Twins Policy Memo (Transportation)

Co-authored policy report evaluating digital twin tech in urban infrastructure (Singapore, Boston, Barcelona). Assessed scalability, equity tradeoffs, predictive maintenance, disaster resilience.

Policy Research | Digital Twins | Urban Infrastructure

AI Governance & Fairness Research

Additional responsible AI governance work, fairness audits, and policy research projects focused on equitable AI deployment.

AI Governance | Fairness | Policy

Skills

Languages

  • Python
  • SQL
  • R
  • C#, .NET MVC
  • Java, HTML/CSS, JavaScript
  • SAS

Frameworks & Libraries

  • PyTorch, TensorFlow
  • scikit-learn, pandas, NumPy
  • NLTK, Hugging Face
  • LangChain
  • Flask, React, Node.js

Tools & Platforms

  • Git, Jupyter
  • Tableau, Power BI, Excel
  • Azure ML, Databricks
  • AWS, Docker
  • Alteryx, Autosys

ML & NLP Specializations

  • RAG Systems
  • Fine-tuning (QLoRA)
  • Multimodal AI
  • Time Series Forecasting
  • Clustering, Anomaly Detection
  • Sentiment Analysis

Leadership & Management

  • Project Management
  • Team Leadership
  • Stakeholder Communication
  • Cross-Functional Collaboration
  • Mentorship

Communication & Strategy

  • Technical Writing
  • Policy Analysis
  • Data Storytelling
  • Executive Presentations
  • Investor Pitching

Coursework

Applied ML, Statistics, & Decision Modeling

  • Optimization
  • Decision & Risk Modeling
  • Applied Econometrics
  • Time Series Forecasting
  • Exploratory Data Analysis

NLP, LLMs, & Generative AI

  • Advanced Natural Language Processing
  • Computational Data Science
  • Unstructured Data Analytics for Policy
  • Generative AI Lab

Systems, Data, & Production

  • Database Management
  • ML in Production
  • Systems Synthesis

Policy, Governance, & Communication

  • AI Governance
  • Writing for Public Policy
  • Critical Analysis for Policy Research
  • Strategic Presentation Skills
  • Organizational Design & Implementation

Urban / Global Context

  • Smart Cities
  • International Politics
  • From Data to Action
  • Accounting & Financial Analytics

Get In Touch

I'm always interested in new opportunities and collaborations. Feel free to reach out!

Email Me