Sri Pranavi Bhamidipati

AI + Analytics | Products & Pipelines | Public Impact

Applied data scientist focused on classical ML, applied NLP (RAG, multimodal, evaluation), and mission-driven analytics, with work shipped across finance, healthcare, climate/energy, and criminal justice.

About

Hi, I’m Sri! I’m a Carnegie Mellon University graduate student, and I build machine learning systems, data products and pipelines, and mission-driven analytics. My technical foundation spans natural language processing (retrieval-augmented generation, multimodal reasoning, fine-tuning), classical machine learning (time series forecasting, clustering, anomaly detection), and data engineering (SQL databases, API integration, workflow automation). I work at the intersection of technical execution and domain expertise, targeting real operational constraints. I design systems with deployment context, evaluation under temporal distribution shift, fairness metrics, and implementation feasibility in mind. I focus on domains where data science can support evidence-based decisions: climate and energy policy, finance, public health, urban infrastructure, and justice system reform.

Education

Carnegie Mellon University, Heinz College — Pittsburgh, PA
M.S. Public Policy Management & Data Science (August 2024 – May 2026)

Focus: Applied ML, NLP, decision systems, optimization, econometrics, AI safety

Purdue University — West Lafayette, IN
B.S. in Data Science, Minor in Economics (August 2018 – May 2022)
Certificates: Entrepreneurship & Innovation, Music Technology

Experience

Business Intelligence Analyst II

Goldman Sachs | Dallas, TX

August 2022 – August 2024

Developed analytics systems spanning data engineering, software development, and audit automation for the Management & Strategy team within the Global Banking & Markets division.

  • Built time series dataset and clustering models (Python, scikit-learn) analyzing client Gross Credits data, accurately predicting 10 client rank declines in Q4 2023 for proactive risk management
  • Implemented Isolation Forest algorithm for anomaly detection, flagging 50+ critical PnL Net Revenue data points to COO, reducing audit review times by 25%
  • Delivered Tableau dashboards tracking Marquee Sales market performance and usage trends to identify gaps and increase client retention

Research Analyst

Third Way – Climate & Clean Energy Finance | Washington, DC

June – August 2025

  • Mapped $13+ billion disclosed equity investments across Nuclear, Long Duration Energy Storage, Geothermal, and Carbon Capture
  • Created investor classification system (VC, PE, corporate, institutional) making capital flows interpretable across clean energy technologies
  • Diagnosed structural finance bottlenecks, including missing middle and low debt usage, translating findings into policy-relevant levers for de-risking and blended finance structures
  • Created infographic-style policy narratives grounded in quantified data for stakeholder communications and policy briefs

NLP Specialist

Carnegie Mellon University – PA Department of Corrections (Capstone) | Pittsburgh, PA

August 2025 – December 2025

  • Built NLP pipeline to extract and structure override reasons from VSG records (2016–2024)
  • Developed taxonomy of aggravating vs mitigating themes to support form redesign with structured capture
  • Ran regression and grid-cell analysis on arrest records to measure compliance impact on recidivism

Project Manager

Carnegie Mellon University – Students Using Data for Social Good (SUDS) | Pittsburgh, PA

September 2024 – May 2025

  • Led 5-person analytics team to visualize trail usage patterns for 25+ miles using heatmaps and flow diagrams
  • Developed predictive forecasting models with 85% accuracy integrating Eco-counter data, weather, holidays, and local events
  • Managed monthly stakeholder meetings and data access agreements (NDAs) for $50K+ infrastructure investments

Teaching Assistant

Carnegie Mellon University – Operations and Supply Chain Analytics | Pittsburgh, PA

March 2025 – May 2025

  • Supported instruction for an MBA-level course of 100+ students on linear programming, simulation, and decision modeling using Excel Solver and Python
  • Graded weekly assignments and case studies, led office hours providing academic support, and assisted with final project presentations

Internships

2019 – 2022

Early-career technical roles spanning data science, data engineering, and software development across manufacturing, energy, and logistics sectors.

Featured Projects

Legislative Bill Passage Prediction for Civil Liberties Advocacy

Designed fairness-aware machine learning system for ACLU to triage approximately 2,000 bills per legislative session. Evaluated 88 model families across hundreds of variants with temporal evaluation and integrated fairness audits. Final model captured large majority of passed bills within top-ranked subset.

ACLU | Fairness-Aware ML | Temporal Evaluation | XGBoost

Retrieval-Augmented Generation (RAG) Pipeline for Institutional Knowledge

Built modular RAG system with hybrid retrieval combining BM25 sparse retrieval and dense embeddings, plus cross-encoder reranking. Best configuration achieved 75.10 F1 and 58.24 EM with statistically significant improvements.

RAG | Hybrid Retrieval | Cross-Encoder Reranking | Evaluation
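For readers unfamiliar with the EM and F1 figures reported for this project, here is a minimal sketch of how such question-answering metrics are commonly computed. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); the function names are hypothetical, and this is not the project's actual evaluation code.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction, reference):
    """1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Corpus-level scores are usually the mean of these per-question values, multiplied by 100.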

Self-Reflective Multimodal Reasoning for Visual Question Answering (VQA)

Developed self-reflective training framework for knowledge-intensive VQA tasks. Generated oracle reasoning trajectories and applied QLoRA fine-tuning, yielding measurable accuracy improvements on OK-VQA and A-OKVQA benchmarks.

Multimodal NLP | QLoRA Fine-Tuning | Vision-Language Models | Error Analysis

Parole Violation Sanction Grid (VSG) Compliance & Recidivism Analysis

Built NLP pipeline extracting and analyzing override reasons from ~200 VSG records. Ran regression analysis on arrest records measuring compliance impact on recidivism (odds ratio 1.08, p < 0.001). Drafted a revised VSG for the Pennsylvania Department of Corrections.

Criminal Justice Policy | NLP Pipeline | Impact Evaluation

Clean Energy Investment Mapping

Mapped $13+ billion in disclosed equity across Nuclear (approximately $7.5B), Long Duration Energy Storage (approximately $3B), Geothermal (approximately $2B), and Carbon Capture (approximately $0.4B). Built investor taxonomy for Third Way policy analysis.

Energy Finance | Investment Mapping | Policy Translation

Web Application Platforms & Data Products

Built Management Hub and Engagement Tracker at Goldman Sachs serving COO and CFO-level stakeholders. Implemented Isolation Forest anomaly detection flagging 50+ critical PnL data points, reducing audit review times by 25%.

Enterprise Analytics | Anomaly Detection | SQL Pipelines

All Projects

Explore all 21 projects organized by domain and application area

Clean Energy Investment Mapping & Competitor Analysis

Led clean energy finance analysis as research intern at Third Way's Climate and Energy Program. Exported, cleaned, and merged Tracxn datasets covering companies, investors, and funding rounds. Built repeatable taxonomy classifying investor types (VC, corporate, institutional) and financing instruments (equity, grants, debt). Quantified disclosed equity across Nuclear (approximately $7.5B), Long Duration Energy Storage (approximately $3B), Geothermal (approximately $2B), and Carbon Capture (approximately $0.4B). Synthesized findings into policy narratives identifying structural financing gaps such as low debt utilization and the missing middle between early R&D and large-scale deployment, supporting internal policy strategy and external communications.

Climate Finance | Investment Mapping | Policy Translation | Tracxn | Data Visualization

Trail Usage Forecasting for Sustainable Infrastructure (SUDS)

Led five-person analytics team at Carnegie Mellon SUDS (Students Using Data for Social Good) to visualize trail usage patterns for 25+ miles using heatmaps and flow diagrams, identifying high-traffic areas to guide sustainable urban infrastructure investments. Developed predictive forecasting models with 85% accuracy integrating Eco-counter data with weather patterns, holiday impacts, and local events to support infrastructure grant applications totaling $50K+. Managed monthly stakeholder meetings with city officials and nonprofit leadership, coordinated data access agreements including NDAs, and delivered an actionable final presentation justifying parking spaces, signage, benches, and kiosks for sustainable urban mobility and recreation planning.

Sustainable Infrastructure | Forecasting | Geospatial Analysis | Urban Planning | Stakeholder Management

U.S. Grid Modernization Policy Analysis

Authored policy memo analyzing structural limitations of U.S. electric grid and pathways toward advanced smart grid modernization. Evaluated regulatory fragmentation, legacy infrastructure constraints, and public-private investment mechanisms shaping grid deployment. Assessed role of data-driven grid intelligence, reliability, and equity considerations in modernization efforts. Produced policy recommendations emphasizing phased implementation, federal coordination, and targeted investment in grid-enhancing technologies.

Energy Policy | Grid Modernization | Regulatory Analysis | Public-Private Partnerships

U.S. Energy Demand and Production Forecasting (EnerVision)

Developed time series forecasting models using nearly 50 years of U.S. energy consumption and production data (1973 to 2021). Conducted back-testing across multiple time windows to validate long-term reliability. Integrated forecasts into policy and investment narratives addressing energy transition risks, timing, and uncertainty. Contributed to investor-facing pitch materials translating model outputs into decision-relevant insights.

Time Series | ARIMA | Energy Forecasting | Policy Planning
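The back-testing across multiple time windows mentioned above can be illustrated with a rolling-origin (expanding window) loop. This is a minimal sketch under stated assumptions: the function names and the toy series are hypothetical, and a naive last-value forecaster stands in for the ARIMA models so the example runs with the standard library alone.

```python
def rolling_origin_backtest(series, min_train, horizon, forecaster):
    """Return the mean absolute error for each expanding training window."""
    errors = []
    for end in range(min_train, len(series) - horizon + 1):
        train = series[:end]                 # everything observed so far
        actual = series[end:end + horizon]   # the next `horizon` points
        predicted = forecaster(train, horizon)
        mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / horizon
        errors.append(mae)
    return errors


def naive_forecast(train, horizon):
    """Stand-in forecaster: repeat the last observed value."""
    return [train[-1]] * horizon


series = [10, 12, 13, 15, 14, 16]  # hypothetical annual values
errors = rolling_origin_backtest(series, min_train=3, horizon=2,
                                 forecaster=naive_forecast)
```

Any model that exposes the same `(train, horizon)` interface could be swapped in for `naive_forecast`; comparing the per-window errors shows whether accuracy holds up as the forecast origin moves forward in time.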

Costco vs. Target Financial Performance Analysis

Conducted comparative financial analysis of Costco and Target using financial statements and accounting ratios. Evaluated profitability, liquidity, leverage, and operational efficiency to assess relative business models and risk profiles. Interpreted quantitative results in context of retail strategy, supply chain structure, and pricing models to draw conclusions about long-term financial sustainability and competitive positioning.

Financial Analysis | Accounting | Corporate Strategy

Enterprise Analytics Platforms and Data Products (Goldman Sachs)

Built and maintained internal analytics platforms at Goldman Sachs supporting senior leadership across Global Markets, Banking, and Operations. Developed Management Hub to centralize cross-divisional metrics and Engagement Tracker to monitor leadership engagement with DEI initiatives. Designed SQL-backed pipelines and Tableau dashboards used by COO and CFO-level stakeholders for operational decision-making.

Enterprise Analytics | Web Applications | Dashboards | SQL Pipelines

Anomaly Detection and Data Quality Monitoring (Goldman Sachs)

Implemented Isolation Forest anomaly detection models to flag over 50 critical PnL outliers within financial reporting pipelines. Integrated anomaly alerts into executive-facing dashboards, reducing audit review time by approximately 25 percent and improving data quality oversight across Capital, Liquidity, and PnL datasets.

Anomaly Detection | Isolation Forest | Time Series | Data Quality

Oviact Startup Analytics and Product Strategy (Hack-A-Startup 2025, 2nd Place)

Contributed to data-driven startup pitch for Oviact, focusing on translating analytics into operational and investor value. Framed product narrative, data strategy, and scalability considerations for early-stage venture context. Team placed 2nd overall at Hack-A-Startup 2025, with judges highlighting clarity of problem framing and data-backed decision logic.

Startup Analytics | Product Strategy | Hackathon | 2nd Place Winner

Clinical Workflow Mapping for Digital Pathology (UPMC)

Mapped end-to-end breast pathology workflow at UPMC from detection through reporting and downstream clinical decisions. Identified operational bottlenecks, interoperability gaps, and documentation burdens, including evidence that reporting consumes substantial share of pathologists' time. Framed realistic AI opportunities centered on structured reporting, integration middleware, and local domain adaptation. Delivered final report and presentation connecting AI feasibility to clinical constraints and adoption realities.

Healthcare AI | Workflow Mapping | Digital Pathology | Systems Analysis

Heart Disease Mortality Disparities Analysis (CDC Data)

Analyzed over 472,000 observations across more than 2,000 U.S. counties using CDC heart disease datasets to examine geographic and demographic mortality disparities. Applied ANOVA and hypothesis testing to evaluate differences by county, race, and gender, identifying statistically significant inequities. Built choropleth maps and time series visualizations to communicate findings relevant to equity-driven public health interventions.

Public Health | Statistical Analysis | Health Equity | Data Visualization

Self-Reflective Multimodal Reasoning for Visual Question Answering

Developed self-reflective training framework for knowledge-intensive multimodal VQA (OK-VQA, A-OKVQA). Generated oracle reasoning trajectories linking visual observations, intermediate knowledge hypotheses, and final answers to make reasoning behavior trainable and auditable. Applied synthetic supervision and QLoRA fine-tuning, yielding measurable accuracy improvements. Conducted large-scale error analysis to identify dominant failure modes and guide iterative refinement.

Multimodal NLP | Vision-Language Models | QLoRA | Error Analysis

Retrieval-Augmented Generation Pipeline for Institutional Knowledge

Built modular RAG system answering institution-specific questions using scraped web content. Implemented and benchmarked sparse BM25 retrieval, dense embedding retrieval, and hybrid retrieval with cross-encoder reranking. Identified statistically significant performance gains and documented architectural tradeoffs between retrieval quality, latency, and system complexity.

RAG | Hybrid Retrieval | Cross-Encoder Reranking | Evaluation
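One common way to combine a sparse (BM25) ranking with a dense-embedding ranking before reranking is reciprocal rank fusion. The sketch below is illustrative only: the project description does not say which fusion method was used, so RRF, the constant `k=60`, and the document ids are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids; earlier positions contribute more."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_ranking = ["doc_a", "doc_b", "doc_c"]   # hypothetical sparse results
dense_ranking = ["doc_b", "doc_d", "doc_a"]  # hypothetical dense results
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

In a pipeline like the one described, the top of the fused list would then be passed to the cross-encoder for reranking, which trades extra latency for better ordering of the final candidates.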

Amazon Review Sentiment Classification Pipeline

Engineered sentiment classification pipeline on 100,000+ Amazon product reviews predicting star ratings based on cleaned and lemmatized review text. Applied TF-IDF with bigrams (10,000 features) and TextBlob sentiment polarity, creating a 10,001-dimension feature matrix with NLTK preprocessing. Trained and tuned logistic regression classifier on oversampled data correcting severe class imbalance, achieving 76.4% test accuracy. Conducted full exploratory data analysis visualizing label skew and interpreting feature-label relationships (review length, helpful votes, Vine participation). Configured and troubleshot scoring scripts, environment YAML files, and model paths, ensuring error-free deployment in AzureML real-time environments.

Sentiment Classification | TF-IDF | Logistic Regression | Class Imbalance | AzureML Deployment | NLTK

Building and Training a LLaMA-Style Language Model

Implemented transformer-based language model inspired by LLaMA architecture, focusing on core model components including tokenization, embeddings, attention mechanisms, and optimization. Analyzed training dynamics, convergence behavior, and compute tradeoffs to build systems-level understanding of large language models under constrained resources.

LLMs | Transformers | Model Training | Systems Understanding

Flood Risk Mitigation Optimization for Pittsburgh

Developed decision-support framework prioritizing flood mitigation investments under budget constraints. Constructed composite flood risk index combining hazard exposure, population vulnerability, critical infrastructure, and social vulnerability metrics. Compared Mixed Integer Programming optimization with heuristic approaches to show how prioritization shifts under different equity and cost assumptions. Delivered stakeholder-ready report connecting technical optimization to transparent public investment decisions.

Optimization | Mixed Integer Programming | Equity Analysis | Public Infrastructure
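To show why an exact optimization and a heuristic can prioritize differently under the same budget, here is a small sketch. Everything in it is hypothetical: the project names, costs, and risk-reduction scores are invented, and exhaustive enumeration stands in for a Mixed Integer Programming solver (it returns the same optimum on a problem this small).

```python
from itertools import combinations

# Hypothetical candidates: (name, cost in $M, composite risk-reduction score)
PROJECTS = [
    ("culvert_upgrade", 4, 10),
    ("green_roofs", 3, 6),
    ("retention_basin", 3, 6),
    ("pump_station", 5, 9),
]
BUDGET = 6  # hypothetical budget in $M


def exact_selection(projects, budget):
    """Enumerate every subset and keep the best one that fits the budget."""
    best, best_value = (), 0
    for size in range(len(projects) + 1):
        for subset in combinations(projects, size):
            cost = sum(p[1] for p in subset)
            value = sum(p[2] for p in subset)
            if cost <= budget and value > best_value:
                best, best_value = subset, value
    return [p[0] for p in best], best_value


def greedy_selection(projects, budget):
    """Pick projects by benefit-per-dollar until nothing else fits."""
    chosen, value, remaining = [], 0, budget
    ranked = sorted(projects, key=lambda p: p[2] / p[1], reverse=True)
    for name, cost, benefit in ranked:
        if cost <= remaining:
            chosen.append(name)
            value += benefit
            remaining -= cost
    return chosen, value


print(exact_selection(PROJECTS, BUDGET))
print(greedy_selection(PROJECTS, BUDGET))
```

With these made-up numbers the greedy rule spends most of the budget on the single highest-ratio project, while the exact search finds a pair of cheaper projects with a higher combined score. Changing the scores (for example, by weighting social vulnerability more heavily) changes which projects each method selects.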

311 Service Request Outcome Prediction

Built predictive models analyzing outcomes of 311 service requests using administrative city data. Conducted feature engineering and model evaluation to understand drivers of resolution time and service efficiency. Framed results to support operational planning and resource allocation decisions.

Forecasting | Civic Data | Operations Analytics

Legislative Bill Passage Prediction for Civil Liberties Advocacy (ACLU)

Designed fairness-aware machine learning system for the ACLU to triage approximately 2,000 bills per legislative session. Built deployment-realistic temporal datasets and evaluated 88 model families across hundreds of variants. Integrated fairness audits directly into model selection. Final recommended model captured large majority of passed bills within top-ranked subset, supporting more effective advocacy prioritization.

ACLU | Fairness-Aware ML | Temporal Evaluation | Policy Decision Support
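The idea of capturing most passed bills within a top-ranked subset corresponds to a recall-at-top-k style metric. The sketch below is a minimal illustration with invented scores and labels; the function name is hypothetical and the project's actual metric definition and cutoff are not stated here.

```python
def recall_at_top_k(scores, labels, k):
    """Share of all positives that fall inside the k highest-scored items."""
    if k <= 0:
        raise ValueError("k must be positive")
    total_positives = sum(labels)
    if total_positives == 0:
        return 0.0
    # Indices ordered from highest to lowest model score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    captured = sum(labels[i] for i in ranked[:k])
    return captured / total_positives


# Hypothetical model scores and outcomes (1 = bill passed).
scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.05]
passed = [1, 0, 1, 0, 0, 1]
print(recall_at_top_k(scores, passed, k=3))
```

For a temporal evaluation, a metric like this would be computed on bills from a session later than any used for training, so the score reflects how the ranking would have performed at deployment time.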

Parole Sanction Compliance and Recidivism Analysis (PA DOC Capstone)

Conducted multi-phase capstone research with the Pennsylvania Department of Corrections evaluating compliance with Violation Severity Guidelines (VSG). Built NLP pipelines extracting override rationales from over 250,000 records and paired them with regression analysis on 400,000+ arrest records. Found a statistically significant association between non-compliance and higher rearrest likelihood. Framed results to support actionable policy measurement and form-design improvements.

Criminal Justice Policy | Administrative Data | Impact Evaluation

Digital Twins for Smart Cities Infrastructure

Co-authored policy analysis examining deployment of digital twin technologies in global cities including Singapore, Boston, and Barcelona. Evaluated applications in predictive maintenance, disaster resilience, and traffic optimization. Assessed adoption barriers such as interoperability, cost, and governance, and proposed policy recommendations emphasizing phased rollout and public-private collaboration.

Smart Cities | Digital Twins | Urban Policy

Human Rights Conditions in Russia

Authored political analysis examining human rights conditions in Russia, integrating legal frameworks, enforcement mechanisms, and international responses. Assessed implications for global advocacy and policy pressure.

Human Rights | Political Analysis | International Policy

Geopolitical Analysis of India, China, and Russia

Analyzed strategic relationships among India, China, and Russia, focusing on economic ties, security dynamics, and geopolitical alignment. Integrated historical context with contemporary developments to assess regional power balances and global implications.

Geopolitics | International Relations | Policy Analysis

Skills

Languages

  • Python
  • SQL
  • R
  • C#, .NET MVC
  • Java, HTML/CSS, JavaScript
  • SAS

Frameworks & Libraries

  • PyTorch, TensorFlow
  • scikit-learn, pandas, NumPy
  • NLTK, Hugging Face
  • LangChain
  • Flask, React, Node.js

Tools & Platforms

  • Git, Jupyter
  • Tableau, Power BI, Excel
  • Azure ML, Databricks
  • AWS, Docker
  • Alteryx
  • API Integration

ML & NLP Specializations

  • RAG Systems
  • Fine-tuning (QLoRA)
  • Multimodal NLP
  • Time Series Forecasting
  • Clustering, Anomaly Detection
  • Supervised & Unsupervised ML Algorithms
  • Sentiment Analysis

Leadership & Management

  • Project Management
  • Team Leadership
  • Stakeholder Communication
  • Cross-Functional Collaboration
  • Mentorship

Communication & Strategy

  • Technical Writing
  • Policy Analysis
  • Data Storytelling
  • Executive Presentations
  • Investor Pitching

Coursework

Applied ML, Statistics, & Decision Modeling

  • Optimization
  • Decision & Risk Modeling
  • Applied Econometrics
  • Time Series Forecasting
  • Exploratory Data Analysis

NLP, LLMs, & Generative AI

  • Advanced Natural Language Processing
  • Computational Data Science
  • Unstructured Data Analytics for Policy
  • Generative AI Lab

Systems, Data, & Production

  • Database Management
  • ML in Production
  • Systems Synthesis Capstone

Policy, Governance, & Communication

  • AI Governance
  • Writing for Public Policy
  • Critical Analysis for Policy Research
  • Strategic Presentation Skills
  • Organizational Design & Implementation

Urban & Global Context

  • Smart Cities
  • International Policy & Politics
  • From Data to Action
  • Accounting & Financial Analytics
  • International Crisis Negotiation

Get In Touch

I'm always interested in new opportunities and collaborations. Feel free to reach out!

Email Me