Principal Engineer AI/ML
Aviso AI
Feb 2024 - Present (2 yr 2 months)
Designed and deployed an agentic AI-based intelligent document processing system for an automated loan approval platform, processing over 2M construction loan documents per month. Achieved a 45% reduction in manual review time through a comprehensive MDLC implementation built on a hybrid OCR + LLM + ML model architecture with multimodal validation and intelligent workflow automation.

Implemented robust data collection pipelines that aggregate multi-format document sources (PDFs, images, invoices, certificates), with automated data quality validation and preprocessing workflows. Engineered feature extraction from unstructured documents using a hybrid OCR-LLM pipeline that handles missing values, noise reduction, and format standardization at this volume.

Developed a hybrid neural network architecture combining CNN layers for visual document analysis with transformer-based LLMs for semantic understanding. Trained the network with the Adam optimizer, using ReLU activations in the hidden layers and a sigmoid output for binary classification. Applied advanced feature engineering techniques including document embedding generation, semantic similarity scoring, and multimodal feature fusion.

Established a comprehensive model evaluation framework using cross-validation, precision-recall metrics, and confidence scoring thresholds. Achieved >90% alignment accuracy through iterative model tuning, hyperparameter optimization, and ensemble methods. Implemented A/B testing protocols to validate model performance against baseline OCR systems.

Built a production-ready ML pipeline using FastAPI for real-time inference with a distributed agent controller architecture. Implemented MLOps best practices including model versioning, automated retraining pipelines, and performance monitoring dashboards. The deployed system sustains the 2M+ documents/month load with <2s latency and 99.9% uptime.
Auto-generated missing line-item descriptions using multimodal analysis of inspection photos and architectural plans, achieving >90% semantic accuracy through feature-engineered document embeddings. Developed a contextual recommendation engine using NLP-based user profiling and policy matching algorithms, reducing approval delays through intelligent SOP automation.

Implemented semantic validation layers that check coherence between budget data and supporting documents via transformer-based similarity scoring. Built a semantic similarity matching engine with custom-trained embeddings, supporting multi-vendor document formats through advanced feature engineering and fuzzy matching algorithms with confidence-based threshold tuning. Developed a content recommendation engine that clusters inspection photos using CNN-based visual feature extraction and cosine similarity scoring for contextually relevant photo-to-line-item matching.

Implemented ensemble models that combine structured and unstructured data matching with anomaly detection algorithms for over-budget prediction and claim verification. Engineered fallback heuristics with probabilistic confidence scoring, validated through systematic model evaluation, to keep performance robust across edge cases.

Built an audit trail system with explainable AI features that provide granular ML/LLM decision transparency for compliance reviews and model interpretability. Implemented real-time feedback loops with active learning capabilities, enabling continuous model improvement through human-in-the-loop validation and automated retraining pipelines.