Technical Projects
Hands-on implementations focusing on ML systems, NLP, and production deployment
Problem Definition
CS students struggle to figure out their ideal career path based on their skills, leading to confusion about which roles to pursue and what skills to develop.
Approach
A full-stack AI-powered career guidance platform that semantically matches user skills to 244+ tech roles using BERT embeddings, identifies skill gaps, and generates personalized learning roadmaps with course recommendations.
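As a rough illustration of the semantic matching step, the sketch below ranks roles by cosine similarity between a user vector and role vectors. The source does not state which similarity measure the platform uses; cosine similarity is assumed here because it is a common choice for sentence embeddings. The three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity`, `rank_roles`, and `ROLE_VECS` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as "no match"
    return dot / (norm_a * norm_b)


def rank_roles(user_vec, role_vecs, top_k=3):
    """Return the top_k (role, score) pairs, highest similarity first."""
    scored = [(role, cosine_similarity(user_vec, vec))
              for role, vec in role_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real sentence embeddings (hypothetical values).
ROLE_VECS = {
    "Data Scientist": [0.9, 0.1, 0.2],
    "DevOps Engineer": [0.1, 0.9, 0.3],
    "Frontend Developer": [0.2, 0.2, 0.9],
}
user_vec = [0.8, 0.2, 0.1]
ranking = rank_roles(user_vec, ROLE_VECS, top_k=2)
print(ranking)
```

In a real pipeline the vectors would come from an embedding model rather than being typed in by hand; the ranking logic stays the same.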
Implementation Details
- User inputs their skills → BERT model (sentence-transformers) semantically matches them to 244+ tech roles from O*NET data
- Skill gap analysis engine calculates what skills are missing for target roles
- Interactive career roadmap visualization using React Flow showing progression paths
- Claude API integration for plain-English explanations of recommendations using RAG (Retrieval-Augmented Generation)
- YouTube API integration to recommend relevant courses for filling skill gaps
- Resume upload feature with PDF parsing to auto-extract user skills
- FastAPI backend with SQLite database for data persistence
- React frontend with 8 pages for seamless user experience
- Comprehensive test suite with 42 unit and integration tests
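The skill gap analysis in the list above can be sketched with plain set operations. This is a minimal illustration, assuming skills are compared as case-insensitive exact strings; the actual engine may well use semantic matching instead. The function name `skill_gap` and the sample skills are hypothetical.

```python
def skill_gap(user_skills, role_skills):
    """Split a role's required skills into those the user has and those missing."""
    have = {s.strip().lower() for s in user_skills}
    required = {s.strip().lower() for s in role_skills}
    matched = sorted(required & have)
    missing = sorted(required - have)
    # Coverage is the share of required skills the user already has.
    coverage = len(matched) / len(required) if required else 1.0
    return {"matched": matched, "missing": missing, "coverage": coverage}


report = skill_gap(
    ["Python", "SQL", "Docker"],
    ["python", "sql", "statistics", "machine learning"],
)
print(report)
```

The `missing` list is the natural input for a course recommendation step, and `coverage` gives a single number for ranking target roles by readiness.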
Technology Stack
Results & Evaluation
The platform covers 244 O*NET roles plus 15 custom modern roles (AI Engineer, DevOps Engineer, Cloud Architect, etc.). It provides skill-to-role matching, gap analysis, and personalized learning paths through an 8-page React frontend, backed by a suite of 42 unit and integration tests.
Key Learnings
Deep dive into semantic similarity matching with BERT embeddings, RAG implementation with Claude API for contextual explanations, building interactive data visualizations with React Flow, and creating scalable career recommendation systems.
Problem Definition
Manufacturing teams need to analyze ATE test logs and track PCB yield across multiple board types, but manual analysis is time-consuming and error-prone.
Approach
A Python-based ATE log analysis and PCB yield tracking tool that automates log parsing, classifies hardware failure modes, and generates comprehensive defect reports with trend analysis.
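To make the log parsing and yield tracking concrete, here is a minimal sketch that reads a CSV log and computes per-step yield and first-pass yield. The column names (`board_id`, `board_type`, `test_step`, `result`) and the sample rows are assumptions for illustration; the tool's real log schema is not given in the source.

```python
import csv
import io
from collections import defaultdict

# Hypothetical ATE-style log; real logs would be read from files.
SAMPLE_LOG = """board_id,board_type,test_step,result
B001,TypeA,continuity,PASS
B001,TypeA,power_rail,PASS
B002,TypeA,continuity,FAIL
B003,TypeB,continuity,PASS
B003,TypeB,power_rail,FAIL
"""


def step_yield(log_text):
    """Return {test_step: fraction of records at that step that passed}."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for row in csv.DictReader(io.StringIO(log_text)):
        step = row["test_step"]
        total[step] += 1
        if row["result"].strip().upper() == "PASS":
            passed[step] += 1
    return {step: passed[step] / total[step] for step in total}


def first_pass_yield(log_text):
    """Fraction of boards that passed every step recorded for them."""
    board_ok = {}
    for row in csv.DictReader(io.StringIO(log_text)):
        ok = row["result"].strip().upper() == "PASS"
        board_ok[row["board_id"]] = board_ok.get(row["board_id"], True) and ok
    return sum(board_ok.values()) / len(board_ok) if board_ok else 0.0


print(step_yield(SAMPLE_LOG))
print(first_pass_yield(SAMPLE_LOG))
```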
Implementation Details
- Engineered Python tool to parse ATE-style test logs in CSV and JSON format
- Extracted per-board pass/fail records and computed yield KPIs across test steps
- Built SMT data analytics engine that classifies hardware failure modes (opens, shorts, parametric deviations) by board type and test step
- Generated ranked defect reports with trend charts for actionable insights
- Integrated SQLite traceability store logging board ID, test step, result, and timestamp for audit trail
- Implemented failure trend query system for historical analysis
- Validated with 18 pytest cases covering log parser accuracy, failure classification logic, yield computation, and PDF report generation
- All 18 test cases pass with no regressions
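The traceability store and failure trend query from the list above might look roughly like the following. The table name, column names, and helper functions are hypothetical; only the four logged fields (board ID, test step, result, timestamp) come from the description. Timestamps are assumed to be ISO 8601 strings so that the first ten characters give the calendar day.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the traceability database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS test_results (
               board_id  TEXT NOT NULL,
               test_step TEXT NOT NULL,
               result    TEXT NOT NULL,
               tested_at TEXT NOT NULL
           )"""
    )
    return conn


def log_result(conn, board_id, test_step, result, tested_at):
    """Append one audit-trail record; parameters are bound, not interpolated."""
    conn.execute(
        "INSERT INTO test_results VALUES (?, ?, ?, ?)",
        (board_id, test_step, result, tested_at),
    )
    conn.commit()


def failures_per_day(conn, test_step):
    """Return [(day, failure_count), ...] for one test step, oldest day first."""
    rows = conn.execute(
        """SELECT substr(tested_at, 1, 10) AS day, COUNT(*)
           FROM test_results
           WHERE test_step = ? AND result = 'FAIL'
           GROUP BY day
           ORDER BY day""",
        (test_step,),
    )
    return rows.fetchall()


conn = open_store()
log_result(conn, "B001", "continuity", "FAIL", "2024-01-01T09:00:00")
log_result(conn, "B002", "continuity", "PASS", "2024-01-01T09:05:00")
log_result(conn, "B003", "continuity", "FAIL", "2024-01-02T10:00:00")
log_result(conn, "B004", "continuity", "FAIL", "2024-01-02T10:05:00")
trend = failures_per_day(conn, "continuity")
print(trend)
```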
Technology Stack
Results & Evaluation
Analyzed test logs for 150 boards across three board types. Defect classification and yield tracking are automated, reducing the time spent on manual analysis. A suite of 18 pytest cases covers the log parser, failure classification, yield computation, and report generation.
Key Learnings
Gained expertise in ATE log parsing, SMT data analytics, hardware failure mode classification, yield analysis, defect trending, and building robust test automation with pytest.
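A ranked defect report of the kind described for this project can be approximated by counting failures per (board type, test step, failure mode) and sorting by frequency. This is a sketch under assumed field names (`board_type`, `test_step`, `mode`) and sample data; it does not reproduce the tool's actual report logic.

```python
from collections import Counter


def rank_defects(failures, top_n=3):
    """Return the top_n most frequent (board_type, test_step, mode) groups."""
    counts = Counter(
        (f["board_type"], f["test_step"], f["mode"]) for f in failures
    )
    return counts.most_common(top_n)


# Hypothetical classified failure records.
failures = [
    {"board_type": "TypeA", "test_step": "continuity", "mode": "open"},
    {"board_type": "TypeA", "test_step": "continuity", "mode": "open"},
    {"board_type": "TypeB", "test_step": "power_rail", "mode": "parametric"},
    {"board_type": "TypeA", "test_step": "continuity", "mode": "open"},
    {"board_type": "TypeB", "test_step": "power_rail", "mode": "parametric"},
    {"board_type": "TypeC", "test_step": "isolation", "mode": "short"},
]
ranked = rank_defects(failures, top_n=2)
print(ranked)
```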
Problem Definition
Manual product data extraction from e-commerce platforms is time-consuming, error-prone, and difficult to scale for competitive analysis and market research.
Approach
Designed an unattended UiPath RPA bot for end-to-end automated product data extraction from Flipkart with dynamic user input handling and structured data output.
Implementation Details
- Built unattended UiPath RPA bot accepting dynamic user input for flexible product searches
- Implemented web UI navigation automation to scrape structured product listings
- Extracted comprehensive product data including name, price, discount, and specifications
- Validated extracted data output against live Flipkart results for accuracy verification
- Automated export to formatted Excel report with proper data structure
- Confirmed reliability and accuracy of full automation workflow through end-to-end testing
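The bot itself is built in UiPath, so the following Python is only an illustration of the kind of field validation the steps above describe, not the workflow's actual logic. The price and discount text formats, the record keys, and all function names are assumptions.

```python
import re


def parse_price(text):
    """Extract the first number from a price string such as '₹12,999'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    return float(match.group(0).replace(",", "")) if match else None


def parse_discount(text):
    """Extract the percentage from a string such as '15% off'."""
    match = re.search(r"(\d{1,3})\s*%", text)
    return int(match.group(1)) if match else None


def validate_record(record):
    """Return a list of problems found in one scraped product record."""
    problems = []
    if not record.get("name", "").strip():
        problems.append("missing name")
    price = parse_price(record.get("price", ""))
    if price is None or price <= 0:
        problems.append("invalid price")
    discount = parse_discount(record.get("discount", ""))
    # A missing discount is allowed; an out-of-range one is not.
    if discount is not None and not 0 <= discount <= 100:
        problems.append("discount out of range")
    return problems


good = {"name": "Phone X", "price": "₹12,999", "discount": "15% off"}
bad = {"name": "", "price": "N/A", "discount": ""}
print(validate_record(good))
print(validate_record(bad))
```

Records with a non-empty problem list could be flagged in the Excel export instead of being silently dropped.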
Technology Stack
Results & Evaluation
Automated the complete product data extraction workflow from Flipkart. The bot handles dynamic inputs, validates extracted data against live results, and generates formatted Excel reports, eliminating manual extraction.
Key Learnings
Gained expertise in unattended RPA bot development, web UI automation, dynamic data scraping, Excel automation, and building reliable data validation workflows with UiPath.
Problem Definition
Public opinion and sentiment trends around the Russia-Ukraine conflict need to be tracked in real time through social media analysis.
Approach
Built a multi-model sentiment classifier using TF-IDF vectorization and ensemble of classical ML algorithms to analyze tweet sentiment polarity.
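For readers unfamiliar with TF-IDF, the sketch below computes it from scratch using the textbook definition: term frequency within a document multiplied by the log of inverse document frequency. It is illustrative only. Library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed IDF and normalize each vector, so their numbers differ. Whitespace tokenization and the sample sentences are assumptions.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


docs = [
    "peace talks resume",
    "talks collapse again",
    "peace rally today",
]
vectors = tfidf(docs)
print(vectors[0])
```

Terms that appear in fewer documents ("resume") get a higher weight than terms shared across documents ("peace", "talks"), which is what makes the features useful to a downstream classifier.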
Implementation Details
- Collected and preprocessed tweets using Twitter API
- Applied VADER sentiment analyzer for initial labeling
- Engineered features using TF-IDF vectorization
- Trained multiple classifiers: Logistic Regression, Random Forest, Naive Bayes
- Performed hyperparameter tuning to optimize model performance
- Deployed interactive Streamlit dashboard for real-time sentiment tracking
- Serialized models using Pickle for production deployment
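The initial labeling step above relies on VADER, which produces a compound score between -1 and 1 for each text. A common convention is to treat scores of 0.05 or higher as positive and -0.05 or lower as negative; whether this project used those cutoffs is not stated, so the threshold below is an assumption, and `label_from_compound` is a hypothetical helper that takes an already computed score.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label."""
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must be between -1 and 1")
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical compound scores for three tweets.
scores = [0.62, -0.31, 0.0]
labels = [label_from_compound(s) for s in scores]
print(labels)
```

Because these labels come from a rule-based analyzer, any accuracy measured against them reflects agreement with VADER rather than with human annotation.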
Technology Stack
Results & Evaluation
Achieved 94% accuracy on the test set. The dashboard tracks sentiment polarity trends over time, enabling real-time analysis of public opinion.
Key Learnings
Gained experience with social media data pipelines, handling noisy text data, model comparison and selection, and building production-ready ML dashboards.
Problem Definition
Manual monitoring of CCTV footage for suspicious activities like shoplifting and loitering is time-consuming and error-prone.
Approach
Developed an LSTM-based deep learning model to automatically detect temporal anomalies in surveillance video streams.
Implementation Details
- Collected and annotated surveillance video dataset
- Extracted temporal features from video frame sequences
- Designed LSTM architecture to capture sequential patterns
- Trained model to distinguish normal behavior from anomalies (shoplifting, loitering)
- Optimized inference pipeline for real-time processing
- Deployed system via Streamlit with alert mechanism
- Achieved an average processing time of 2.5 seconds per video segment
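Sequence models such as LSTMs consume fixed-length sequences, so per-frame features are typically grouped into overlapping windows before training or inference. The sketch below shows one way to do that; the window length, stride, and the integer stand-ins for frame feature vectors are assumptions, not values from the project.

```python
def make_sequences(frame_features, window, stride):
    """Slice a list of per-frame features into fixed-length, overlapping windows."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [
        frame_features[start:start + window]
        for start in range(0, len(frame_features) - window + 1, stride)
    ]


# Ten frames, each represented by a single integer for readability.
frames = list(range(10))
sequences = make_sequences(frames, window=4, stride=2)
print(sequences)
```

A smaller stride yields more overlapping sequences and finer temporal resolution for alerts, at the cost of more inference calls per second of video.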
Technology Stack
Results & Evaluation
Achieved 92% accuracy with an average processing time of 2.5 seconds per video segment. The system was deployed with real-time anomaly alerts.
Key Learnings
Deep dive into temporal modeling with LSTMs, video processing pipelines, real-time inference optimization, and handling imbalanced datasets.
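One widely used way to handle imbalanced datasets, where anomalies are far rarer than normal footage, is to weight each class inversely to its frequency in the loss function. The source does not say which technique this project used, so the following is only an example of the general idea; the 90/10 split is made up.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical label distribution: anomalies are the minority class.
labels = ["normal"] * 90 + ["anomaly"] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, each misclassified anomaly contributes about nine times as much to the loss as a misclassified normal segment, which offsets the 9:1 class ratio.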