Open to new opportunities

Devashree Buch

Data Engineer · Data Analyst · AI/ML Engineer

I build cloud-native data pipelines that turn raw telemetry into decisions. 5+ years across Capital One, FinTech and Cybersecurity — from NLP vulnerability prediction engines and ML forecasting models to interactive Tableau dashboards published on Tableau Public.

5+
Years experience
85%
NLP model precision
30%+
Runtime reductions
3
Cloud platforms
Based in
San Diego, CA
01
About & Skills

Turning data into
decisions

→ View my visualisations on Tableau Public ↗

I am a Data Engineer with a strong analytics background who specialises in building reliable, automated data systems at enterprise scale. My work sits at the intersection of engineering and insight: I don't just move data, I make it useful.

At Capital One I engineered Databricks and PySpark pipelines for cybersecurity controls monitoring, cutting execution time by 25% and replacing manual audit workflows with real-time dashboards. At ThreatModeler I built a first-of-its-kind NLP engine that achieved 85% precision in vulnerability prediction. I have also completed IBM's Apache Spark ML certification, building end-to-end distributed ML pipelines with MLlib, Prophet time-series forecasting and Random Forest models.

Outside work I build data visualisations published on Tableau Public, solve SQL challenges on LeetCode, and ship open-source data projects on GitHub. I'm currently open to full-time Data Analyst, Data Engineer and AI/ML Engineer roles — remote, hybrid or onsite anywhere in the US.

Cloud & Warehousing
Snowflake, Databricks, AWS Redshift, AWS Glue, S3 · Lambda, SageMaker
Languages & Frameworks
Python, SQL, PySpark, Java, Pandas · Regex, Hive · Hadoop
ML & AI
NLP, Random Forest, Decision Trees, Time-Series (Prophet), K-Means Clustering, Linear / Logistic Regression, LLM · Generative AI, Prompt Engineering
Visualisation & BI
Tableau, AWS QuickSight, Power BI, Jupyter Notebook
AI Tools
Cursor, Claude, ChatGPT, Gemini, GitHub Copilot
02
Experience

Where I've
worked

Oct 2024 – May 2025
Security Data Analyst
Capital One · Contractor via Pyramid Consulting · Hybrid, McLean VA
  • Pioneered automation of the Cybersecurity Controls Monitoring framework via end-to-end Databricks + PySpark ETL, transitioning manual audits to real-time data-driven assessments
  • Architected Snowflake schemas and reduced Spark execution time by 25% by migrating CSV ingestion to Parquet with predicate pushdown
  • Built Green-Yellow-Red health matrix in SQL visualised in QuickSight with automated owner notifications
  • Delivered the org's first leadership-level security dashboard in collaboration with product and AWS teams
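As an illustration of the Green-Yellow-Red health matrix idea from the bullets above, here is a minimal sketch that classifies controls with a SQL `CASE` expression and lists the owners who would be notified. It is not the Capital One implementation: the `control_status` table, its columns, the sample rows and the 95% / 80% thresholds are all hypothetical, and SQLite stands in for the real warehouse so the example runs anywhere.

```python
import sqlite3

# Hypothetical sample data: (control_id, owner, compliant_assets, total_assets)
ROWS = [
    ("CTRL-001", "team-a", 98, 100),
    ("CTRL-002", "team-b", 85, 100),
    ("CTRL-003", "team-c", 60, 100),
]

# Thresholds (95% and 80%) are assumptions chosen for illustration only.
HEALTH_SQL = """
SELECT control_id,
       owner,
       ROUND(100.0 * compliant_assets / total_assets, 1) AS pct_compliant,
       CASE
           WHEN 100.0 * compliant_assets / total_assets >= 95 THEN 'GREEN'
           WHEN 100.0 * compliant_assets / total_assets >= 80 THEN 'YELLOW'
           ELSE 'RED'
       END AS health
FROM control_status
ORDER BY control_id
"""


def build_health_matrix(rows):
    """Load rows into an in-memory table and return the classified matrix."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE control_status ("
            "control_id TEXT, owner TEXT, "
            "compliant_assets INTEGER, total_assets INTEGER)"
        )
        conn.executemany("INSERT INTO control_status VALUES (?, ?, ?, ?)", rows)
        return conn.execute(HEALTH_SQL).fetchall()
    finally:
        conn.close()


def owners_to_notify(matrix):
    """Return the sorted owners of every control that is not GREEN."""
    return sorted({owner for _, owner, _, health in matrix if health != "GREEN"})


matrix = build_health_matrix(ROWS)
for row in matrix:
    print(row)
print("Notify:", owners_to_notify(matrix))
```

Keeping the classification in SQL means a BI tool can read the `health` column directly, while the notification step only needs the non-green rows.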
Sep 2023 – Sep 2024
Independent Data Consultant
Freelance · Remote in USA
  • Engineered statistical models for private clients to forecast financial profitability with 80%+ accuracy and identified key market risks through deep-dive portfolio analysis
  • Completed 20+ hours of hands-on Apache Spark and ML projects during professional transition — building proficiency in distributed data architecture and MLlib pipelines
Feb 2022 – Jun 2023
Data Analyst
Capital One · Contractor via Pyramid Consulting · Remote
  • Built cross-functional BI solutions in Tableau and QuickSight integrating Snowflake, Hive and ACBS data
  • Automated ETL in Databricks, eliminating 20+ hours of manual weekly processing and improving quarterly forecasting accuracy by 10%
  • Optimised Hive via partitioning and bucketing — 30% reduction in query runtime and cloud-compute cost savings
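To show why partitioning and bucketing can shorten a query, here is a small pure-Python sketch of the idea rather than actual Hive code. It assumes a hypothetical table partitioned by `event_date` and bucketed by `customer_id`, with a simplified modulo function standing in for Hive's hash-based bucket assignment. The point is only to compare how many rows each access path has to read.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count


def bucket_for(customer_id):
    """Simplified stand-in for hash(bucketing column) mod number of buckets."""
    return customer_id % NUM_BUCKETS


def build_table(rows):
    """Lay rows out as partition (event_date) -> bucket -> list of rows."""
    table = defaultdict(lambda: defaultdict(list))
    for row in rows:
        table[row["event_date"]][bucket_for(row["customer_id"])].append(row)
    return table


def full_scan(rows, event_date, customer_id):
    """Unpartitioned layout: every row is read to answer the query."""
    hits = [
        r for r in rows
        if r["event_date"] == event_date and r["customer_id"] == customer_id
    ]
    return hits, len(rows)


def pruned_scan(table, event_date, customer_id):
    """Partitioned and bucketed layout: only one bucket of one partition is read."""
    bucket = table.get(event_date, {}).get(bucket_for(customer_id), [])
    hits = [r for r in bucket if r["customer_id"] == customer_id]
    return hits, len(bucket)


# Hypothetical data: 3 dates x 8 customers = 24 rows.
rows = [
    {"event_date": d, "customer_id": c, "amount": c * 10}
    for d in ("2023-01-01", "2023-01-02", "2023-01-03")
    for c in range(8)
]
table = build_table(rows)

full_hits, full_rows_read = full_scan(rows, "2023-01-02", 5)
pruned_hits, pruned_rows_read = pruned_scan(table, "2023-01-02", 5)
print(full_rows_read, pruned_rows_read)
```

Both paths return the same row, but the pruned path reads far fewer rows, which is the mechanism behind the runtime and compute savings described above.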
Sep 2020 – Oct 2021
Data Analyst — RevOps
Navigate360 via Vetro Technologies
  • Architected AWS Redshift Data Warehouse, migrating Salesforce and Google Analytics data into a single source of truth
  • Engineered automated ETL pipelines using AWS Glue, S3, AppFlow and Lambda for 24/7 data availability
  • Delivered Tableau KPI dashboards connected directly to Redshift, streamlining executive reporting
Aug 2019 – Sep 2020
Junior Data Scientist
ThreatModeler Software Inc · Jersey City, NJ
  • Built a first-of-its-kind NLP engine to interpret data flow diagrams (DFDs), reaching 85% precision in vulnerability prediction and a 40% reduction in manual review cycles
  • Presented technical POC to the CEO — work directly cited in a company patent filing for ML-driven threat detection; received the Outspoken Award for technical partnership and model integrity
2016 – 2018
Early Career & Internships
LTIMindtree · iCreate Technologies
  • ML Intern, LTIMindtree (NJ, 2018) — Engineered a food-recognition prototype using AWS DeepLens, improving model accuracy by 13%
  • Software Intern, iCreate Technologies (India, 2016–17) — Developed Java-based back-end services for an online veterinary e-commerce platform
05
Projects

Things I've
built

01 / Featured
Dark Web Price Index 2020–2025

Interactive Tableau dashboard visualising 5 years of dark web market pricing data: stolen credentials, identity documents, financial accounts and cybercrime services. Built with a custom Python scraper and a 2024 interpolation pipeline. "Your SSN costs $1. A corporate server key costs $200,000. Same market."

Tableau, Python, BeautifulSoup, Pandas, Data Storytelling, openpyxl
02
Apache Spark for ML — IBM

End-to-end machine learning pipeline built on Apache Spark as part of IBM's professional certification. Covers distributed data processing, feature engineering, model training and evaluation at scale using PySpark MLlib.

PySpark, MLlib, Apache Spark, IBM
03
JPMorgan Chase Forage

Virtual experience program from J.P. Morgan's Software Engineering track via Forage. Implemented financial data feeds, fixed broken visualisation output and used JPMorgan's Perspective library to create live trading dashboards.

Python, Perspective, React, FinTech
04
Python Problems & Practice

Collection of Python problem solutions, algorithms, data structures and scripting exercises. Covers everything from string manipulation and list comprehensions to file I/O and API integrations — a living reference of daily Python practice.

Python, Algorithms, Data Structures
05
Natural Gas Price Interpolation

Statistical interpolation model for forecasting natural gas prices using time-series techniques. Applied Prophet forecasting, linear regression and seasonal decomposition to identify price trends and generate forward-looking estimates from historical commodity data.

Python, Prophet, Time-Series, Pandas, Forecasting
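For readers unfamiliar with price interpolation, here is a minimal sketch of the simplest version of the idea: estimating a price on a date that falls between two observations. It uses plain linear interpolation only, not the Prophet, regression or seasonal-decomposition steps named in the project description, and the function name and sample prices are hypothetical.

```python
from datetime import date


def interpolate_price(observations, target):
    """Linearly interpolate a price for `target` from (date, price) pairs.

    `observations` may be unsorted; `target` must lie within the observed range.
    """
    points = sorted(observations)
    if not points[0][0] <= target <= points[-1][0]:
        raise ValueError("target date is outside the observed range")
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if d0 <= target <= d1:
            span = (d1 - d0).days
            if span == 0:  # duplicate dates: nothing to interpolate
                return p0
            weight = (target - d0).days / span
            return p0 + weight * (p1 - p0)
    return points[-1][1]  # only one observation, and target matches it


# Hypothetical observations, not real market data.
observations = [
    (date(2024, 1, 11), 12.0),
    (date(2024, 1, 1), 10.0),
]
estimate = interpolate_price(observations, date(2024, 1, 6))
print(estimate)
```

Dates outside the observed range raise an error on purpose: extending prices past the last observation is a forecasting problem, which needs a model rather than interpolation.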
06
Predicting Probability of Default

Credit risk model predicting the probability that a borrower defaults on a loan. Built using logistic regression, decision trees and random forest classifiers on financial features — evaluating model performance via AUC-ROC, precision-recall and confusion matrices.

Python, Logistic Regression, Random Forest, scikit-learn, Credit Risk
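The evaluation metrics named above can be sketched in a few lines. This is a dependency-free illustration of what a confusion matrix, precision, recall and AUC-ROC measure, not the project's scikit-learn code. The labels, predicted probabilities and the 0.5 threshold are hypothetical.

```python
def confusion_matrix(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a probability threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["fn" if actual == 1 else "tn"] += 1
    return counts


def precision_recall(cm):
    """Precision = TP / predicted positives; recall = TP / actual positives."""
    predicted_pos = cm["tp"] + cm["fp"]
    actual_pos = cm["tp"] + cm["fn"]
    precision = cm["tp"] / predicted_pos if predicted_pos else 0.0
    recall = cm["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


def auc_roc(y_true, y_prob):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    pos = [p for a, p in zip(y_true, y_prob) if a == 1]
    neg = [p for a, p in zip(y_true, y_prob) if a == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = default) and model probabilities.
y_true = [1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]

cm = confusion_matrix(y_true, y_prob)
precision, recall = precision_recall(cm)
auc = auc_roc(y_true, y_prob)
print(cm, precision, recall, auc)
```

The pairwise AUC is quadratic in the number of rows, which is fine for a sketch; on real data a library routine would be the practical choice.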
07
LeetCode Solutions

Personal collection of LeetCode problem solutions across SQL, Python and algorithms — covering arrays, strings, dynamic programming, joins, window functions and more. Built as a living reference of problem-solving practice for coding interviews.

SQL, Python, Algorithms, Data Structures, LeetCode
06
Education

Where I've
studied

2022 – 2023
MS in Project Management
University of the Cumberlands
GPA: 3.64 / 4.0
2017 – 2019
MS in Computer & Information Sciences
New York Institute of Technology
GPA: 3.49 / 4.0
2013 – 2017
BE in Computer Engineering
Gujarat Technological University
CGPA: 7.55 / 10.0
07
Certifications & Awards

Credentials &
recognition

Award
Outspoken Award
ThreatModeler Software Inc
Awarded for exceptional technical partnership and professional candor — recognised for driving model integrity through rigorous peer review.
IBM Certificate · BD0231EN  ↗
Apache Spark for Data Engineering & ML
End-to-end ML pipelines using PySpark, MLlib, distributed feature engineering and model evaluation at scale.
View certificate ↗
HarvardX · edX  ↗
Using Python for Research
Applied Python for data analysis, statistical computing and research workflows — delivered by Harvard University faculty.
View certificate ↗
Forage · J.P. Morgan  ↗
Quantitative Research Simulation
Modelled natural gas prices, estimated loan default probabilities and applied financial data analysis in a realistic quant context.
View certificate ↗
Forage · Deloitte  ↗
Data & Analytics Job Simulation
Worked through real Deloitte client data scenarios — data cleaning, analysis, visualisation and insight presentation for business decision-making.
View certificate ↗
Continuing education
More in progress…
Always learning
08
Writing

Thoughts on
data

Writing about data engineering, analytics, ML and the messy realities of building real-world pipelines. Watch this space.

LinkedIn Article · 2025
Will Vibe Coding Change the Way Software Engineers Are Hired?

Exploring how AI-assisted "vibe coding" is reshaping the software hiring landscape — what it means for engineers, hiring managers, and the skills that actually matter now.

Read on LinkedIn ↗
Work in progress
More posts coming soon
Follow on LinkedIn for updates
Let's
work
together

Open to Data Analyst, Data Engineer and AI/ML Engineer roles — remote, hybrid or onsite anywhere across the US.

buchdevashree17@gmail.com