Technical Projects

Hands-on implementations focusing on ML systems, NLP, and production deployment

CS Career Pathway Explorer
Independent Project | Jan 2026 - May 2026 | Completed
Code

Problem Definition

CS students struggle to figure out their ideal career path based on their skills, leading to confusion about which roles to pursue and what skills to develop.

Approach

A full-stack AI-powered career guidance platform that semantically matches user skills to 244+ tech roles using BERT embeddings, identifies skill gaps, and generates personalized learning roadmaps with course recommendations.

Implementation Details

  • User inputs their skills → BERT model (sentence-transformers) semantically matches them to 244+ tech roles from O*NET data
  • Skill gap analysis engine calculates what skills are missing for target roles
  • Interactive career roadmap visualization using React Flow showing progression paths
  • Claude API integration for plain-English explanations of recommendations using RAG (Retrieval-Augmented Generation)
  • YouTube API integration to recommend relevant courses for filling skill gaps
  • Resume upload feature with PDF parsing to auto-extract user skills
  • FastAPI backend with SQLite database for data persistence
  • React frontend with 8 pages for seamless user experience
  • Comprehensive test suite with 42 unit and integration tests
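
The matching and gap-analysis steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: a toy bag-of-words vector stands in for the BERT sentence embeddings, and the `ROLES` table, `embed`, `rank_roles`, and `skill_gap` are hypothetical names invented for the example. The project itself draws its roles from O*NET data.

```python
import math
from collections import Counter

# Hypothetical role -> required-skills table (assumption for illustration only).
ROLES = {
    "Data Scientist": {"python", "statistics", "machine learning", "sql"},
    "Frontend Developer": {"javascript", "react", "css", "html"},
    "DevOps Engineer": {"linux", "docker", "kubernetes", "python"},
}


def embed(skills):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(word for skill in skills for word in skill.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_roles(user_skills):
    """Return (role, similarity) pairs, best match first."""
    user_vec = embed(user_skills)
    scores = {role: cosine(user_vec, embed(req)) for role, req in ROLES.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def skill_gap(user_skills, role):
    """Skills the role requires that the user has not listed."""
    have = {s.lower() for s in user_skills}
    return sorted(ROLES[role] - have)
```

With real embeddings, `embed` would return dense vectors from the model, while the ranking and set-difference logic would keep the same shape.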

Technology Stack

Python
FastAPI
SQLite
React.js
BERT (sentence-transformers)
Claude API
YouTube API
React Flow
PDF Parsing
O*NET Database

Results & Evaluation

The platform supports 244 O*NET roles plus 15 custom modern roles (AI Engineer, DevOps Engineer, Cloud Architect, etc.). The frontend has 8 React pages, and the test suite has 42 unit and integration tests. It provides skill-to-role matching, gap analysis, and personalized learning paths.

Key Learnings

Deep dive into semantic similarity matching with BERT embeddings, RAG implementation with Claude API for contextual explanations, building interactive data visualizations with React Flow, and creating scalable career recommendation systems.

ATE Log Analyzer & PCB Yield Tracker
Independent Project | April 2026 - Present | In Development
Code

Problem Definition

Manufacturing teams need to analyze ATE test logs and track PCB yield across multiple board types, but manual analysis is time-consuming and error-prone.

Approach

A Python-based ATE log analysis and PCB yield tracking tool that automates log parsing, classifies hardware failure modes, and generates comprehensive defect reports with trend analysis.

Implementation Details

  • Engineered Python tool to parse ATE-style test logs in CSV and JSON format
  • Extracted per-board pass/fail records and computed yield KPIs across test steps
  • Built SMT data analytics engine that classifies hardware failure modes (opens, shorts, parametric deviations) by board type and test step
  • Generated ranked defect reports with trend charts for actionable insights
  • Integrated SQLite traceability store logging board ID, test step, result, and timestamp for audit trail
  • Implemented failure trend query system for historical analysis
  • Validated with 18 pytest cases covering log parser accuracy, failure classification logic, yield computation, and PDF report generation
  • All test cases currently pass, with no regressions
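
As a rough sketch of the parsing, yield, and traceability steps above, using only the standard library: the CSV column names, the `test_results` table, and all function names are assumptions made for this example, and real ATE logs may be laid out differently.

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Hypothetical log layout; real ATE column names may differ.
SAMPLE_LOG = """board_id,board_type,test_step,result,timestamp
B001,TypeA,continuity,PASS,2026-04-01T09:00:00
B002,TypeA,continuity,FAIL,2026-04-01T09:01:00
B003,TypeB,continuity,PASS,2026-04-01T09:02:00
B001,TypeA,power_rail,PASS,2026-04-01T09:03:00
B003,TypeB,power_rail,FAIL,2026-04-01T09:04:00
"""


def parse_log(text):
    """Parse CSV log text into a list of per-test record dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def yield_by_step(records):
    """Yield per test step: passed tests divided by total tests at that step."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for rec in records:
        totals[rec["test_step"]] += 1
        if rec["result"].strip().upper() == "PASS":
            passes[rec["test_step"]] += 1
    return {step: passes[step] / totals[step] for step in totals}


def store_records(conn, records):
    """Write records to a SQLite table that serves as the audit trail."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_results ("
        "board_id TEXT, board_type TEXT, test_step TEXT, "
        "result TEXT, timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO test_results VALUES "
        "(:board_id, :board_type, :test_step, :result, :timestamp)",
        records,
    )
    conn.commit()


def failures_by_board_type(conn):
    """Count failures per board type, most failures first."""
    return conn.execute(
        "SELECT board_type, COUNT(*) FROM test_results "
        "WHERE result = 'FAIL' GROUP BY board_type "
        "ORDER BY COUNT(*) DESC, board_type"
    ).fetchall()
```

Passing `sqlite3.connect(":memory:")` as `conn` is enough to try the sketch without creating a database file.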

Technology Stack

Python
SQLite
pytest
pandas
PDF Generation
CSV Parsing
JSON Parsing
Data Analytics

Results & Evaluation

Analyzed test logs for 150 boards across three board types. Defect classification and yield tracking are automated, which reduces manual analysis time. The 18 pytest cases cover log parsing, failure classification, yield computation, and PDF report generation.

Key Learnings

Gained expertise in ATE log parsing, SMT data analytics, hardware failure mode classification, yield analysis, defect trending, and building robust test automation with pytest.

Flipkart Product Data Extraction Bot
Independent Project | Aug 2025 - Sept 2025 | Completed

Problem Definition

Manual product data extraction from e-commerce platforms is time-consuming, error-prone, and difficult to scale for competitive analysis and market research.

Approach

Designed an unattended UiPath RPA bot for end-to-end automated product data extraction from Flipkart with dynamic user input handling and structured data output.

Implementation Details

  • Built unattended UiPath RPA bot accepting dynamic user input for flexible product searches
  • Implemented web UI navigation automation to scrape structured product listings
  • Extracted comprehensive product data including name, price, discount, and specifications
  • Validated extracted data output against live Flipkart results for accuracy verification
  • Automated export to formatted Excel report with proper data structure
  • Confirmed reliability and accuracy of full automation workflow through end-to-end testing

Technology Stack

UiPath
RPA
Excel Automation
Web Scraping

Results & Evaluation

Successfully automated complete product data extraction workflow from Flipkart. Bot reliably handles dynamic inputs, validates data accuracy, and generates formatted Excel reports, eliminating manual extraction efforts.

Key Learnings

Gained expertise in unattended RPA bot development, web UI automation, dynamic data scraping, Excel automation, and building reliable data validation workflows with UiPath.

AI-Driven Sentiment Analysis on Warzone Tweets
Group Project | Dec 2024 - Mar 2025 | Completed
Code

Problem Definition

Public opinion on the Russia-Ukraine conflict shifts quickly, and tracking its sentiment trends in real time requires automated analysis of social media posts.

Approach

Built a multi-model sentiment classifier using TF-IDF vectorization and an ensemble of classical ML algorithms to analyze tweet sentiment polarity.

Implementation Details

  • Collected and preprocessed tweets using the Twitter API
  • Applied VADER sentiment analyzer for initial labeling
  • Engineered features using TF-IDF vectorization
  • Trained multiple classifiers: Logistic Regression, Random Forest, Naive Bayes
  • Performed hyperparameter tuning to optimize model performance
  • Deployed interactive Streamlit dashboard for real-time sentiment tracking
  • Serialized models using Pickle for production deployment
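
To make the TF-IDF feature step above concrete, here is a small pure-Python sketch of the classic weighting, tf(t, d) * log(N / df(t)). It is illustrative only: the whitespace tokenizer and the `tokenize` and `tfidf` helpers are assumptions, and a library vectorizer would normally be used in practice. scikit-learn's `TfidfVectorizer`, for instance, applies a smoothed idf and L2 normalization by default, so its numbers would differ from these.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive whitespace tokenizer; real tweets would need more cleaning."""
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * log(N / document frequency)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {
                term: (count / total) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return vectors
```

A term that appears in every document gets weight zero under this formula, which is why common words contribute little to the resulting features.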

Technology Stack

Twitter API
VADER
TF-IDF
Logistic Regression
Random Forest
Naive Bayes
Streamlit
Pickle
Python

Results & Evaluation

Achieved 94% accuracy on the test set. The dashboard tracks sentiment polarity trends over time, enabling real-time public opinion analysis.

Key Learnings

Gained experience with social media data pipelines, handling noisy text data, model comparison and selection, and building production-ready ML dashboards.

Anomaly Detection System for CCTV Footage
Independent Project | Nov 2025 - Dec 2025 | Completed
Code

Problem Definition

Manual monitoring of CCTV footage for suspicious activities like shoplifting and loitering is time-consuming and error-prone.

Approach

Developed an LSTM-based deep learning model to automatically detect temporal anomalies in surveillance video streams.

Implementation Details

  • Collected and annotated surveillance video dataset
  • Extracted temporal features from video frame sequences
  • Designed LSTM architecture to capture sequential patterns
  • Trained model to distinguish normal behavior from anomalies (shoplifting, loitering)
  • Optimized inference pipeline for real-time processing
  • Deployed system via Streamlit with alert mechanism
  • Reached an average processing time of 2.5 seconds per video segment
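
An LSTM consumes fixed-length sequences, so per-frame features are typically grouped into overlapping windows before training or inference. The sketch below shows one way that windowing might look. The `make_windows` helper, the window length, and the stride are hypothetical choices for illustration, not values taken from the project.

```python
def make_windows(frame_features, window, stride=1):
    """Split a list of per-frame feature vectors into fixed-length sequences.

    Each returned window holds `window` consecutive frames. Consecutive
    windows start `stride` frames apart. Trailing frames that cannot fill
    a complete window are dropped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(frame_features) - window
    return [
        frame_features[start:start + window]
        for start in range(0, last_start + 1, stride)
    ]
```

Stacking the windows gives the (samples, timesteps, features) layout that Keras LSTM layers expect as input.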

Technology Stack

Python
LSTM
TensorFlow/Keras
OpenCV
Streamlit
Pickle
NumPy

Results & Evaluation

Achieved 92% accuracy with 2.5-second average processing time. System successfully deployed with real-time anomaly alerts.

Key Learnings

Deep dive into temporal modeling with LSTMs, video processing pipelines, real-time inference optimization, and handling imbalanced datasets.

Roshni Kobula Raja

AI Engineer pursuing MS in Computer Science at Binghamton University


© 2026 Roshni Kobula Raja. Built with React, FastAPI, and MongoDB.
