Name: Aishwarya Subrahmanya

Profile: Data Engineer / Data Analyst

Email: belakavadisubrahma.a@northeastern.edu
aishwaryabs6@gmail.com

Phone: (857) 396-4932

About me

I am a graduate of Northeastern University with a Master’s in Data Analytics Engineering, and I hold a B.Tech in Computer Science from PES University. I work with data end to end: cleaning, processing, and transforming datasets, building ETL pipelines, modeling databases, and developing dashboards and reports that support decision-making.

I am strong in Python and SQL, proficient with Tableau, Excel, Google Sheets, Spark, and statistics, and I use AWS and Snowflake for scalable data solutions. I am continuously upskilling in machine learning and stay curious and focused as I grow in this field. As an AI and ML enthusiast, I enjoy working on projects and research, and I aim to deliver meaningful results through data.

SKILLS

Data Analysis • Data Engineering • Programming • Cloud & Big Data • Machine Learning
Python
C
Java
R
Shell
HTML
CSS
JavaScript
PHP
MATLAB
React.js
SQL
MySQL
PostgreSQL
MongoDB
Pandas
NumPy
scikit-learn
TensorFlow
PyTorch
Keras
Matplotlib
Plotly
Excel
Tableau
Power BI
Hadoop
Spark
PySpark
Databricks
Kafka
AWS
Docker
Kubernetes
Git

Resume

Education

Northeastern University

Sep 2023 – May 2025 • Boston, MA

Master of Science — Data Analytics Engineering

  • Coursework: Deep Learning, Neural Networks, NLP, Reinforcement Learning, Data Analytics, Data Management, Data Mining, Algorithms, Data Visualization & Computation

PES University

Aug 2019 – May 2023 • Bangalore, India

Bachelor of Technology — Computer Science & Engineering

  • Coursework: Statistics, Big Data, Machine Learning, DBMS, Information Retrieval, Linear Algebra, Data Structures, Operating Systems, Cloud Computing, Software Testing

Professional Experience

Unthink Inc

Jun 2024 – Dec 2024 • Dallas, TX

Data Engineering Intern

  • Built ETL pipelines in Python to clean and aggregate high-volume order data from MongoDB and load it into AWS S3, reducing dashboard load time by 40%.
  • Collaborated with frontend engineers on real-time API integrations, cutting sync delays by 30%.
  • Optimized MongoDB and Snowflake query performance through indexing and schema design, improving API response times by 25%.

Tech Mahindra

Jan 2023 – May 2023 • Bangalore, India

Data Analyst Intern

  • Designed Tableau dashboards for monthly operational metrics.
  • Performed churn analysis with Python models, reducing customer attrition by 4%.
  • Cleaned and validated datasets to enhance reporting accuracy.
  • Conducted statistical analysis on 2019–2023 data, presenting insights for strategy planning.

Maruthi Technics

Jul 2022 – Dec 2022 • Bangalore, India

Data Analyst Intern

  • Analyzed defect data with QC teams, identifying root causes and reducing defects by 12%.
  • Forecasted raw material demand with time-series models, reducing inventory costs by 20%.

Projects

Medical Named Entity Recognition using NLP

Extract diseases, symptoms, medications, and treatments from clinical text with transformer models (BERT/BioBERT).

Python, Transformers (BioBERT), PyTorch, spaCy, scikit-learn, Pandas

Object Detection using Neural Networks

Real-time detection pipeline with YOLOv5 and Faster R-CNN to identify and classify objects accurately.

Python, PyTorch, YOLOv5, Faster R-CNN, OpenCV

Intoxication Detection using Speech

MFCC feature extraction + transformer-based models applied to 1000+ audio samples. Published at IEEE.

Python, Librosa (MFCC), PyTorch/Transformers, NumPy, scikit-learn

Fake News Detection and Sentiment Analysis

NLP + classification models achieving ~85% accuracy on tweet-level misinformation detection.

Python, NLP, scikit-learn, Pandas, NLTK/spaCy

Warehouse Management Analytics

Workflows for inventory and operations KPIs; demand and stock movement insights.

Python, SQL, Pandas, Tableau/Power BI

Volunteer Matchmaking

Matching volunteers to opportunities using profile features and ranking logic.

Python, Flask/Node, SQL, Ranking/Scoring

Customer Segmentation (RFM)

RFM scoring to create customer cohorts for retention and targeted marketing.

Python, Pandas, RFM, Matplotlib/Seaborn
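
For illustration, a minimal sketch of rank-based RFM scoring in plain Python. The sample orders, the function names, and the 4-bin scoring rule are assumptions made for this example, not the project's actual code or data.

```python
from datetime import date

# Hypothetical sample orders: (customer_id, order_date, amount).
ORDERS = [
    ("c1", date(2024, 1, 5), 120.0),
    ("c1", date(2024, 3, 20), 80.0),
    ("c2", date(2023, 11, 2), 40.0),
    ("c3", date(2024, 3, 28), 300.0),
    ("c3", date(2024, 2, 14), 150.0),
    ("c3", date(2024, 1, 9), 50.0),
    ("c4", date(2023, 9, 30), 20.0),
]


def rfm_table(orders, as_of):
    """Aggregate orders into recency (days since last order), frequency, monetary."""
    table = {}
    for customer, day, amount in orders:
        row = table.setdefault(
            customer, {"recency": None, "frequency": 0, "monetary": 0.0}
        )
        days_ago = (as_of - day).days
        # Recency is the gap to the most recent order, so keep the minimum.
        if row["recency"] is None or days_ago < row["recency"]:
            row["recency"] = days_ago
        row["frequency"] += 1
        row["monetary"] += amount
    return table


def rank_scores(values, bins=4, higher_is_better=True):
    """Score each customer 1..bins by how many customers are strictly worse.

    Ties share a score. The pairwise count is quadratic, which is fine for a
    sketch but not for large customer bases.
    """
    sign = 1 if higher_is_better else -1
    goodness = {c: sign * v for c, v in values.items()}
    n = len(goodness)
    return {
        c: 1 + (sum(1 for other in goodness.values() if other < g) * bins) // n
        for c, g in goodness.items()
    }


def rfm_scores(orders, as_of, bins=4):
    """Return {customer_id: (R, F, M)} with each score in 1..bins."""
    table = rfm_table(orders, as_of)
    # A smaller recency (a more recent purchase) is better, so flip the order.
    r = rank_scores({c: v["recency"] for c, v in table.items()}, bins, False)
    f = rank_scores({c: v["frequency"] for c, v in table.items()}, bins)
    m = rank_scores({c: v["monetary"] for c, v in table.items()}, bins)
    return {c: (r[c], f[c], m[c]) for c in table}


if __name__ == "__main__":
    for customer, score in sorted(rfm_scores(ORDERS, date(2024, 4, 1)).items()):
        print(customer, score)
```

Customers with high scores on all three dimensions would form a "best customers" cohort, while a low recency score paired with high frequency and monetary scores would flag a retention target.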

EEG Classification Model

Signal processing + ML pipeline for EEG time-series classification.

Python, SciPy, scikit-learn, TensorFlow/PyTorch

Crime Analysis

Exploratory data analysis and visualization of crime trends and hotspots.

Python, Pandas, Geo/Time-series Viz, Matplotlib/Plotly

Health Conditions Among Children

Notebook analysis of pediatric health indicators and risk factors.

Python, Pandas, EDA, Visualization

Earthquake Analysis (Python)

Time-series and geospatial exploration of earthquake events.

Python, Pandas, Time-series, Geospatial Viz

Face Mask Detection

Computer vision model to detect mask usage from images/video frames.

Python, OpenCV, TensorFlow/Keras, CNNs

Blood Bank Management (DB)

Database design + UI for managing donors, inventory, and requests.

SQL, Database Design (ERD), HTML/CSS/JS

Stroke Prediction

Feature engineering + classification models for stroke risk prediction.

Python, Feature Engineering, scikit-learn, XGBoost