Amazon
Software Development Engineer Intern
End-to-End Search Query Classification System
Implemented a comprehensive search query classification system to fetch and normalize 50,000 search queries from database, applied LLM prompt templates via Python script to categorize queries into distinct categories, and stored results in Amazon S3.
📋 Project Overview
Implemented a comprehensive search query classification system to fetch and normalize 50,000 search queries from database, applied LLM prompt templates via Python script to categorize queries into distinct categories, and stored results in Amazon S3.
✨ Key Highlights
Fetched and normalized 50,000+ search queries from database using SQL
Applied LLM prompt templates for intelligent query categorization
Stored classifications in Amazon S3 for scalable access
Implemented caching mechanism using HashMap data structure for classification-aligned query execution
Achieved precise category-specific query results using Java Spring Framework
🛠️ Technology Stack
🏗️ System Architecture
System Components
Query Fetcher
Fetches and normalizes search queries from database
LLM Classifier
Applies LLM to categorize queries
S3 Storage
Stores query classifications
Cache Layer
In-memory cache using HashMap data structure
Query Executor
Executes category-specific queries
Data Flow
Sends normalized queries
50K search queriesStores categorized queries
Query + Category mappingsLoads classifications
Category dataProvides cached classifications
Classification resultsArchitecture Flow
Query Fetcher
LLM Classifier
S3 Storage
Cache Layer
Query Executor
🔄 End-to-End Flow
Sends normalized queries
50K search queriesStores categorized queries
Query + Category mappingsLoads classifications
Category dataProvides cached classifications
Classification results📊 Component Details
Query Fetcher
Fetches and normalizes search queries from database
LLM Classifier
Applies LLM to categorize queries
S3 Storage
Stores query classifications
Cache Layer
In-memory cache using HashMap data structure
Query Executor
Executes category-specific queries
🎯 Impact & Results
Fetched and normalized 50,000+ search queries from database using SQL
Applied LLM prompt templates for intelligent query categorization
Stored classifications in Amazon S3 for scalable access