
Hi, I’m Prakhar Dungarwal 👋

🎓 MS in Data Science @ Columbia University.
📊 I turn data into intelligence, models into products, and AI research into scalable systems, grounded in solid ML fundamentals: experimental design and A/B testing, feature engineering, NLP, time-series forecasting, and model evaluation.
💡 Passionate about LLMs, agentic AI, RAG, and applied data science that drives measurable real-world impact.

“Data science isn’t just about prediction — it’s about understanding the story the data wants to tell.”

Data Science • Machine Learning • NLP • LLMs • Agentic AI • RAG

New York, NY • (917) 217-8276 • pd2782@columbia.edu

LinkedIn • Portfolio • GitHub

Data Science Projects


CineGraphRAG — MCP Agent for Movie Recommendations

A graph-aware reasoning agent that translates natural language into Cypher, queries Neo4j, and returns explainable subgraphs. With LangGraph orchestration and MCP tools, it delivers more relevant, faster recommendations users can trust.


CART — Context-Aware Rendition with Transformers

A context-sensitive QA assistant that blends fuzzy matching with vector similarity for retrieval, then routes to transformer-based QA. The result is crisp, on-topic answers and noticeably snappier responses for operational search.
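The retrieval blend described above can be illustrated with a minimal sketch. It is not the project's actual code: `fuzzy_score`, `vector_score`, `retrieve`, and the `alpha` weight are hypothetical names, and a bag-of-words cosine stands in for real embedding similarity so the example runs with the standard library only. The transformer QA step that would consume the retrieved passage is left out.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def fuzzy_score(query: str, text: str) -> float:
    """Character-level similarity in [0, 1]; tolerant of typos."""
    return SequenceMatcher(None, query.lower(), text.lower()).ratio()


def vector_score(query: str, text: str) -> float:
    """Cosine similarity of word counts (assumed stand-in for embeddings)."""
    a = Counter(query.lower().split())
    b = Counter(text.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: list[str], alpha: float = 0.5, top_k: int = 1):
    """Rank docs by a weighted blend of fuzzy and vector similarity.

    alpha is a hypothetical mixing weight: 1.0 is fuzzy-only, 0.0 is vector-only.
    Returns the top_k (score, doc) pairs, best first.
    """
    scored = [
        (alpha * fuzzy_score(query, d) + (1 - alpha) * vector_score(query, d), d)
        for d in docs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    docs = [
        "How do I reset my VPN password?",
        "Cafeteria opening hours",
        "Printer setup guide",
    ]
    # The misspelled query still surfaces the VPN passage.
    print(retrieve("reset vpn pasword", docs))
```

Blending the two signals lets the fuzzy term absorb misspellings while the vector term rewards shared vocabulary; in a real system the weight would be tuned on labeled queries.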


BRSR Parser — ESG Reporting Toolkit

An NLP pipeline that parses BRSR sustainability reports, extracts tables and key signals, and standardizes ESG metrics. It reduces manual review, improves consistency, and enables faster analytics for listed companies.


Handwritten Character Recognition (CRNN-LSTM)

A production-ready CRNN-LSTM model trained on the IAM handwriting dataset to transcribe handwriting with high accuracy. Deployed as an auto-grading helper that speeds up evaluation while outperforming classical HMM baselines.

Experience

Adobe Research

New York, NY, USA • Generative AI Researcher
Aug 2025 – Present
  • Developing personalized multimodal LLM frameworks using GPT-4o-v and LLaVA, defining 10+ evaluation factors for image-edit assessment with 95% consistency.
  • Building automated MLLM-as-a-Judge pipelines with RAG, reasoning traces, and agentic tool binding, improving judgment accuracy by 20%.

Intuit

Mountain View, CA, USA • AI & Data Science Intern
May 2025 – Present
  • Built an agentic AI framework using LangGraph and ReAct multi-agent orchestration; reduced ground-truth annotation time by 50%.
  • Used GPT-4.5 and Claude-3.5-Sonnet with CoT and LLM-as-a-Judge to extract quality markers, improving recall by 30% vs. traditional NLP baselines.

Columbia University, Columbia Climate School, AC4 Lab

New York, NY, USA • AI & Machine Learning Researcher
Aug 2024 – Present
  • Fine-tuned RoBERTa on GoEmotions to classify video emotion (respect vs. contempt), integrating DeepSeek-R1 & GPT-4o to target a 30% drop in non-peaceful watch time.
  • Built a cloud backend for a Chrome extension using AWS (S3, EC2), LangChain, and YouTube APIs.

Morgan Stanley

Bengaluru, IN • Senior Data Scientist
Jan 2022 – Aug 2024
  • Shipped a RAG chatbot with LLaMA-3.1-8B-Instruct and all-mpnet-base-v2, cutting query resolution time by 60% and annual ops costs by 30%.
  • Built a hybrid forecasting stack (CNN + Exponential Smoothing + Prophet) improving accuracy by 20% and enabling $1M savings.
  • Led DS/ML ops for enterprise data centers; drove algorithmic improvements and reduced annual spend by 20%.

Skills & Interests

Programming & Tools

Python • SQL • R • C • C++ • Go • Git • Docker • Streamlit • AWS • Azure • Snowpark • Pandas • NumPy • Prophet • Neo4j

Specialized Techniques

ML • DL • NLP • CV • AI Agents • RL • RAG • A/B Testing • Time Series • Hybrid Forecasting

LLM Tools & Frameworks

LangChain • LlamaIndex • LangGraph • ReAct • OpenAI • HF Transformers • ChromaDB • FAISS • MLLMs • LLaMA • mpnet

ML Frameworks

PyTorch • TensorFlow • scikit-learn • Keras • Random Forest • XGBoost • PySpark • JAX • MLflow • MLOps • Caffe • CRNN • RoBERTa

Education

Columbia University — M.S. in Data Science

Aug 2024 – Dec 2025 • New York City, NY, USA • GPA: 3.8/4

Coursework: Applied Machine Learning, Deep Learning for NLP, Probability & Statistics, LLM-Based Generative AI, Causal Inference

Vellore Institute of Technology — B.Tech CSE

Jul 2018 – Jul 2022 • Vellore, IN • GPA: 3.96/4

Coursework: Data Structures & Algorithms, Computer Architecture, Parallel & Distributed Computing, Artificial Intelligence, Robotics

Contact

Open to opportunities in Data Science, LLMs, RAG & Agentic AI.
New York, NY • pd2782@columbia.edu • (917) 217-8276

I love watching F1 races and playing cricket.
