Internship Experience

Deep dive into the systems and architectures I built during my internships

🚀

Amazon

Software Development Engineer Intern

May 2025 - August 2025

Seattle, USA

End-to-End Search Query Classification System

Implemented an end-to-end search query classification system: fetched and normalized 50,000 search queries from a database, applied LLM prompt templates via a Python script to assign each query to one of several distinct categories, and stored the results in Amazon S3.

Python, LLM, Amazon S3, Java, Spring Boot, SQL

Key Highlights

▸ Fetched and normalized 50,000+ search queries from a database using SQL
▸ Applied LLM prompt templates to categorize queries
▸ Stored classifications in Amazon S3 for scalable access
▸ Implemented a HashMap-based cache so that query execution can look up each query's classification in memory
▸ Achieved precise category-specific query results using the Java Spring Framework
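
The fetch, normalize, and classify steps listed above can be sketched roughly as follows. This is a minimal illustration, not the internship code: the category names, the prompt wording, the `llm` callable, and the normalization rules (lowercasing and whitespace collapsing) are all assumptions made for the example.

```python
import re
from typing import Callable, Dict, Iterable

# Hypothetical category set and prompt wording; the real ones are not given here.
CATEGORIES = ["navigational", "informational", "transactional"]

PROMPT_TEMPLATE = (
    "Classify the search query into exactly one of these categories: {categories}.\n"
    "Query: {query}\n"
    "Answer with the category name only."
)


def normalize_query(raw: str) -> str:
    """Collapse runs of whitespace, trim the ends, and lowercase."""
    return re.sub(r"\s+", " ", raw).strip().lower()


def build_prompt(query: str) -> str:
    """Fill the prompt template for a single normalized query."""
    return PROMPT_TEMPLATE.format(categories=", ".join(CATEGORIES), query=query)


def classify_queries(
    raw_queries: Iterable[str],
    llm: Callable[[str], str],
    fallback: str = "unknown",
) -> Dict[str, str]:
    """Map each distinct normalized query to a category.

    `llm` is any callable that takes a prompt and returns the model's text
    reply (an assumed interface). Replies outside CATEGORIES map to `fallback`.
    """
    results: Dict[str, str] = {}
    for raw in raw_queries:
        query = normalize_query(raw)
        if not query or query in results:
            continue  # skip empty strings and duplicates after normalization
        answer = llm(build_prompt(query)).strip().lower()
        results[query] = answer if answer in CATEGORIES else fallback
    return results
```

Deduplicating after normalization keeps the number of LLM calls down when many raw queries differ only in case or spacing.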

System Components

Query Fetcher

Fetches and normalizes search queries from the database

SQL

LLM Classifier

Applies LLM to categorize queries

Python, LLM API

S3 Storage

Stores query classifications

Amazon S3, JSON

Cache Layer

In-memory cache using HashMap data structure

HashMap, Java

Query Executor

Executes category-specific queries

Java, Spring Boot
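
To illustrate how the Cache Layer and Query Executor might fit together, here is a small sketch. The original cache used a Java HashMap; this version swaps in a Python dict (the closest equivalent) to keep the examples in one language. The class and method names are hypothetical.

```python
from typing import Dict, List, Optional


class ClassificationCache:
    """In-memory lookup of query classifications (dict standing in for HashMap)."""

    def __init__(self, classifications: Dict[str, str]) -> None:
        # Forward index: query -> category.
        self._category_by_query: Dict[str, str] = dict(classifications)
        # Reverse index: category -> queries, built once at load time.
        self._queries_by_category: Dict[str, List[str]] = {}
        for query, category in self._category_by_query.items():
            self._queries_by_category.setdefault(category, []).append(query)

    def category_of(self, query: str) -> Optional[str]:
        """Return the cached category, or None if the query was never classified."""
        return self._category_by_query.get(query)

    def queries_in(self, category: str) -> List[str]:
        """Return a copy of the queries assigned to `category`."""
        return list(self._queries_by_category.get(category, []))
```

Both lookups are average-case constant time, which is the usual reason to put a hash map in front of a remote store such as S3.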

Data Flow

Query Fetcher → LLM Classifier

Sends normalized queries

50K search queries

LLM Classifier → S3 Storage

Stores categorized queries

Query + Category mappings

S3 Storage → Cache Layer

Loads classifications

Category data

Cache Layer → Query Executor

Provides cached classifications

Classification results
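
The "Query + Category mappings" hand-off to S3 could look something like the sketch below. The JSON record layout, the function names, and the bucket and key arguments are assumptions. `s3_client` is expected to behave like a boto3 S3 client, and the only call used is `put_object`.

```python
import json
from typing import Any, Dict


def serialize_classifications(classifications: Dict[str, str]) -> bytes:
    """Encode query -> category mappings as a UTF-8 JSON array of records."""
    records = [
        {"query": query, "category": category}
        for query, category in sorted(classifications.items())
    ]
    return json.dumps(records, ensure_ascii=False).encode("utf-8")


def upload_classifications(
    s3_client: Any, bucket: str, key: str, classifications: Dict[str, str]
) -> None:
    """Write the serialized mappings to S3 as a single JSON object.

    `s3_client` would typically be created with boto3.client("s3").
    """
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=serialize_classifications(classifications),
        ContentType="application/json",
    )


def load_classifications(body: bytes) -> Dict[str, str]:
    """Decode the JSON payload back into a query -> category dict."""
    return {r["query"]: r["category"] for r in json.loads(body.decode("utf-8"))}
```

Passing the client in as a parameter keeps the functions easy to exercise with a stub, without touching a real bucket.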

Architecture Flow

Query Fetcher → LLM Classifier → S3 Storage → Cache Layer → Query Executor


📊

Innomatics Research Labs

Data Science Intern

January 2024 - April 2024

Hyderabad, India

Product Review Sentiment Analysis System

Engineered a system to analyze and classify over 8,500 product reviews, using Prefect for ETL pipeline automation and scheduling. Trained multiple sentiment analysis models, with the BERT model achieving an F1-score of 0.92.

Python, Prefect, BERT, TF-IDF, Word2Vec, MLflow, AWS

Key Highlights

▸ Analyzed and classified 8,500+ product reviews
▸ Implemented Prefect for ETL pipeline automation and scheduling
▸ Trained models using BoW, TF-IDF, Word2Vec, and BERT
▸ Achieved an F1-score of 0.92 with the BERT model
▸ Used MLflow for model management and experiment tracking
▸ Deployed a sentiment analysis web application on AWS
▸ Enabled real-time customer feedback insights
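
For readers unfamiliar with the metric quoted above, F1 is the harmonic mean of precision and recall. The sketch below computes it for a single positive class. Whether the reported 0.92 was a binary, macro, or weighted F1 is not stated, so the binary form here is an assumption, as are the label names.

```python
from typing import Sequence


def f1_score(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "positive"
) -> float:
    """Binary F1 for the `positive` label: 2 * P * R / (P + R)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Unlike plain accuracy, this score is not inflated by a model that always predicts the majority class, which makes it a sensible headline metric when review sentiment is imbalanced.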

System Components

Data Ingestion

Fetches product reviews from multiple sources

Python, Prefect, APIs

ETL Pipeline

Automated data transformation and loading

Prefect, Python, Pandas

Feature Engineering

Creates features using BoW, TF-IDF, Word2Vec

Scikit-learn, NLTK

Model Training

Trains multiple sentiment models

BERT, Python

MLflow Tracker

Tracks experiments and model versions

MLflow, Python

Inference API

Real-time sentiment prediction

Flask, REST API

AWS Deployment

Cloud hosting and scaling on an ECS cluster

AWS ECS
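
As a rough illustration of what the Feature Engineering component produces, the sketch below builds bag-of-words counts and TF-IDF weights using only the standard library. The project itself lists Scikit-learn and NLTK. This version uses the textbook formula tf × log(N / df), which differs from scikit-learn's smoothed variant, and the tokenizer is a simplifying assumption.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split into runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(docs: List[str]) -> List[Counter]:
    """One term-count vector (as a Counter) per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """Sparse TF-IDF vectors: (count / doc length) * log(N / document frequency)."""
    counts = bag_of_words(docs)
    n_docs = len(counts)
    doc_freq: Counter = Counter()
    for doc_counts in counts:
        doc_freq.update(doc_counts.keys())  # each term counted once per document
    vectors = []
    for doc_counts in counts:
        total = sum(doc_counts.values())
        vectors.append({
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in doc_counts.items()
        })
    return vectors
```

A term that appears in every review gets weight zero under this formula, which is the intended effect: words common to all reviews carry no signal for telling sentiments apart.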

Data Flow

Data Ingestion → ETL Pipeline

Raw review data

8,500+ reviews

ETL Pipeline → Feature Engineering

Cleaned and preprocessed text

Processed reviews

Feature Engineering → Model Training

Feature vectors

BoW, TF-IDF, Word2Vec, BERT embeddings

Model Training → MLflow Tracker

Model metrics and artifacts

F1-Score: 0.92

Model Training → Inference API

Trained model

Best BERT model

Inference API → AWS Deployment

Deploys the model to an ECS cluster

Production model