Work Experience

Data Analyst
FHI 360, Washington, DC
Jun 2023 – Nov 2023

  • Reduced the patient dropout rate from 72% to 51% by identifying patients at high risk of treatment interruption with a KNN model (65% accuracy) built with the HIV team.
  • Developed analysis dashboards by consolidating data from multiple sources into a data warehouse, driving cost-effective decisions and broadening the scope of project viability studies.
  • Engineered a Python framework to extract Congressional Budget Justification data, providing a strategic forecast of project outlook and funding allocations for upcoming fiscal years in specific regions and countries through dashboards.

Senior Automation Engineer
Acronotics, Bengaluru, India
Jul 2019 – Sep 2022

  • Cut recruitment time by 40% through shortlisting candidates using Gaussian Process Regression based on partner data.
  • Improved user experience by leading a team of engineers as the Solution Architect and adding features such as auto-ticket assignment and auto-rescheduling to RadiumAi, a tool that monitors RPA processes across platforms.
  • Increased memory utilization and decreased training runtime through data and model parallelization techniques, enhancing the efficiency of machine learning workflows.
  • Reduced the payment term from three days to one by implementing OCR to automate invoice processing, thereby accelerating the payment cycle and improving supply chain efficiency.
  • Eliminated dependency on users and ensured consistency by automating data collection, manipulation and report generation.

Data Engineer
Tata Consultancy Services, Bengaluru, India
Jun 2018 – Jul 2019

  • Streamlined data flow for analysis by designing and deploying pipelines that ingest data from AWS S3 buckets into Snowflake and automate transformations through Streams and Tasks, resulting in consistent and efficient data processing.
  • Deployed scripts to automate data extraction, transformation, and visualization from over a hundred data sources, ensuring real-time updates and enhancing data accessibility.

Projects

Brain Tumor Classification – Deployment-ready project

  • Developed a brain-tumor classifier using a CNN and managed model versioning and artifacts with MLflow.
  • Automated model training and retraining with Airflow DAGs, triggered by the feedback received.
  • Deployed a scalable Streamlit application backed by RESTful API endpoints on Kubernetes, using the latest Docker image in the artifact registry, which GitHub Actions pushes on every change, thereby achieving CI/CD.
  • Monitored the model for confidence and prediction distribution, and the data for drift and skew, to extend the model's lifecycle.

Body Composition Scanner

  • A tool that calculates body composition metrics from images, inspired by Prof. John Verzani's research paper on human proportions, which showed a linear relationship between body measurements, such as the neck, and fat composition.
  • Predicted neck and wrist measurements from images using a ResNet model and used a linear model to estimate body composition metrics.

Penalty Analysis and Prediction

  • Created a new dataset with parameters such as isFansSide and established through EDA that a complex relationship exists between them and the penalty direction for one test player, Harry Kane.
  • Developed a KNN model and improved its accuracy from 44% to 89% through feature engineering, basis expansion, and hyperparameter tuning.
  • Applied various ML techniques and settled on a distance-based KNN model which, when tuned to seven neighbours, achieved an accuracy of 86%.

Classical Machine Learning Algorithms

  • Implemented classical machine learning algorithms (Linear Regression, Logistic Regression, and SVM) from scratch using Python, NumPy, and SciPy, demonstrating a strong foundational understanding of model optimization, regularization, and gradient descent.
  • Developed comprehensive evaluation metrics (Accuracy, RMSE, SSE, Precision, Recall) to assess model performance, improving the interpretability and robustness of predictions.

Formula 1 – Battle for the Drivers’ Championship Analysis and Dashboard

  • Extracted and integrated data from various sources and APIs, used Python for data wrangling, and built a comprehensive Tableau dashboard displaying insights from the 2021 season.
  • Used Exploratory Data Analysis techniques to uncover factors influencing the 2021 championship battle.

Health Centre Database and Data Warehouse for Analysis

  • Designed and modeled a database covering the various operations of a healthcare center.
  • Created a multidimensional model with several facts, such as Consultations, Tests Conducted, and Operations Performed, to analyze health center metrics and identify potential hazards.
  • Implemented OLTP and OLAP databases using PostgreSQL and used Talend Jobs for ETL operations.

MBTA – Machine learning model for predicting bus load

  • Improved user satisfaction by reducing bus load by 30% with scheduling strategies derived from a random forest model (84.8% accuracy) developed with the MBTA Data Science team.

European Football/Soccer Database Management System

  • Architected and implemented a scalable and robust database system for the European football/soccer league using MySQL and MongoDB.
  • Integrated MySQL database with Python using SQLAlchemy for data analysis.

Education

Master's in Data Analytics
Northeastern University, Boston, MA
Expected Graduation: December 2024
GPA: 3.9/4.0

Relevant Coursework:

  • Machine Learning
  • Machine Learning Operations (MLOps)
  • Natural Language Processing
  • Data Mining
  • Data Warehouse and Business Intelligence

Bachelor of Science in Information Science
Dr Ambedkar Institute of Technology, Bengaluru, Karnataka
Graduation: June 2018
GPA: 3.46/4.0

Relevant Coursework:

  • Design and Analysis of Algorithms
  • Object-Oriented Design
  • Reliability, Queueing Theory and Probability
  • Cloud Computing

Skills

  • Frameworks: Pandas, NumPy, TensorFlow, PyTorch, Scikit-learn, BeautifulSoup, Scrapy
  • Databases and Data Warehouse Tools: MySQL, PostgreSQL, MongoDB, Snowflake, Talend
  • Cloud and Visualization Tools: GCP, AWS, Tableau, Power BI
  • MLOps and CI/CD: Airflow, MLflow, TFX, DVC, Docker, Kubernetes, GitHub Actions
  • Programming Languages: Python, Java, C++, C#
  • Automation Tools: Automation Anywhere, Power Automate, UiPath, RulAI, VBScript
  • Project Management Tools: Jira, Workzone, Monday.com

Contact Information