Machine Learning Engineer
Eli Lilly and Company
Aug 2023 - Dec 2025 (2 yr 4 months)
Architected and deployed an end-to-end enterprise RAG system over 10,000+ complex clinical study documents using LangChain, OpenAI GPT-4o, and OpenSearch on AWS, enabling contextual Q&A, document comparison, and reference-backed summarization to accelerate internal research workflows (evaluated using RAGAs).
Reduced clinical documentation time by 60% by developing an LLM-powered Validation Plan Generator (Azure OpenAI, Claude 3.5), cutting PSR (Periodic Safety Report) generation time from 11 weeks to 4 weeks.
Developed a time series forecasting model to project Verzenio sales trends and integrated it with a live dashboard used by leadership for strategic planning and market analysis.
Built and deployed end-to-end machine learning models on Microsoft Fabric, using Dataflows Gen2 and Lakehouse for centralized data ingestion and transformation, and Synapse Data Science notebooks for feature engineering and model training with Python, pandas, scikit-learn, and PyTorch.
Used MLflow for experiment tracking, model versioning, and registration, and deployed models as Fabric endpoints that serve REST APIs and feed Power BI dashboards for real-time analytics.
Implemented automated retraining pipelines with Fabric Data Pipelines to keep models current and maintain accuracy in production.
Won Lilly's Best Project Award for a team project and received the Best Individual Achiever Award for Q1 2024.