Health Orchestrator | FastAPI, LangGraph, MCP, PostgreSQL, WebSockets, Android, LLMs:
Architected a multi-agent orchestration layer handling requests via WebSockets, reducing response latency by 30% while eliminating agent collisions through deterministic sequencing.
Engineered PostgreSQL-backed session memory to persist context across 20+ turn interactions, improving clinical instruction recall by 40% and enforcing strict agent boundaries.
Deployed MCP-governed guardrails with custom hallucination checks, blocking 100% of non-compliant tool calls and securing tool-chain execution for safety-critical queries.

Samsung Personal Health Records (PHR) | Triton, EC2, SQS, S3, Auto Scaling, Docker, LLMs:
Orchestrated a scalable LLM inference service on AWS EC2 via Triton Inference Server, using custom AMIs and Auto Scaling groups to reduce compute costs by 20%.
Engineered an asynchronous processing pipeline using SQS and S3, decoupling data ingestion to reduce end-to-end latency by 70% and increase throughput by 3x.

Samsung Internet Browser Semantic Search | Python, ML, RAG, NLP, TensorFlow, FastAPI, LLMs:
Engineered a hybrid semantic-search and autocorrect system by ensembling BERT contextual embeddings with Word2Vec and Doc2Vec, achieving 97% top-k accuracy on complex queries.
Productionized a 94-language polyglot NLU model served via FastAPI and integrated with Android (Retrofit), optimizing inference for real-time multilingual support.
Architected a RAG pipeline using dense vector embeddings and GPT for contextual query generation, improving intent resolution rates by 30% via semantic grounding.

Smart Vibration and Ringtone Adjustment System | TensorFlow Lite, CNN, Android On-device ML:
Engineered an ultra-compact 719 KB on-device CNN for surface classification (98.8% accuracy), optimizing inference to run within 610 KB of RAM and 10 MB of ROM to drive adaptive haptic feedback.