
Nikhil Sai Chigullapalli

At Kore.ai, I contribute as a Software Engineer, focusing on the application and development of Large Language Models (LLMs), Generative AI, and Retrieval-Augmented Generation (RAG). My work involves implementing innovative solutions for conversational AI and enterprise search.

With a B.Tech in Electronics and Communication Engineering from NIT Jamshedpur, I bring strong technical expertise to the development and deployment of scalable AI-driven systems. I am committed to advancing AI technologies that address complex challenges and improve user experiences.

  • Role

    RAG Engineer (AI/ML)

  • Years of Experience

    3.6 years

  • Professional Portfolio

    View here

Skillsets

  • Model evaluation
  • TensorFlow
  • Text embeddings
  • Transformers
  • Weaviate
  • Data pipeline design
  • Feature engineering
  • Few-shot learning
  • Hugging Face
  • SQL
  • Multi-agent systems
  • OpenSearch
  • Prompt engineering
  • Semantic search
  • Text2SQL
  • Toxicity filtering
  • Response moderation
  • aiohttp
  • REST APIs
  • Redis
  • RAG
  • PyTorch
  • Python
  • MongoDB
  • Microservices
  • LLM fine-tuning
  • LangGraph
  • LangChain
  • Knowledge graphs
  • JavaScript
  • Hallucination detection
  • Elasticsearch
  • asyncio

Professional Summary

3.6 Years
  • Jul, 2022 - Present · 3 yr 8 months

    Software Engineer (AI/ML)

    Kore.ai

Work History

3.6 Years

Software Engineer (AI/ML)

Kore.ai
Jul, 2022 - Present · 3 yr 8 months
Excel/CSV RAG with Autonomous Query Planning

  • Designed a goal-oriented agentic orchestrator employing plan-execute-replan cycles for complex analytical queries over structured data sources.
  • Developed an intelligent query router in which an LLM autonomously selects between Text2SQL and semantic RAG based on qualified chunk analysis and table metadata, optimizing for aggregation vs. semantic query patterns.
  • Implemented multi-hop reasoning with goal decomposition, enabling agents to break complex questions into executable sub-tasks, validate intermediate results, and dynamically replan execution strategies.
  • Achieved seamless handling of hybrid queries requiring both structured SQL operations and unstructured semantic retrieval through autonomous decision-making.

Intelligent Document Extraction with Adaptive Strategies

  • Built a context-aware extraction pipeline that autonomously selects optimal chunking strategies based on document structure, content type, and embedded media.
  • Developed adaptive processing logic for documents containing images, tables, and mixed content.
  • Improved chunk retrieval quality by 40% through intelligent preprocessing and context-aware segmentation, directly enhancing downstream RAG system performance.

Agentic RAG System with Multi-Agent Orchestration

  • Built an autonomous RAG framework using LangChain and LangGraph with full agentic capabilities, including planning, reflection, and self-correction mechanisms, to handle complex multi-step queries.
  • Implemented hybrid tool-selection logic enabling both rule-based and LLM-driven autonomous decision-making, allowing developers to configure agent behavior based on use-case requirements.
  • Built an intelligent orchestrator managing multi-agent collaboration for enterprise connector applications, enabling autonomous execution of cross-platform tasks (email automation, ticket management, CRM integration) through dynamic API orchestration.
  • Integrated memory management, context tracking, and adaptive reasoning loops to improve agent decision quality across conversation sessions.
  • Deployed as a production microservice handling real-time agentic workflows with sub-second latency requirements.

Enhanced NLP Pipeline with LLM Fine-Tuning

  • Designed an end-to-end NLP system using domain-adapted open-source LLMs fine-tuned on proprietary conversational data for intent recognition and entity extraction.
  • Integrated multi-layer AI-safety guardrails, including toxicity filtering, hallucination detection, and response moderation, to ensure compliant and trustworthy outputs.
  • Deployed as a scalable microservice on Kubernetes infrastructure, serving real-time inference with optimized latency and resource utilization.
  • Collaborated with enterprise clients to improve model behavior through iterative feedback on labeling strategies, domain adaptation techniques, and fallback mechanisms.

High-Performance Async ML Client

  • Engineered an asynchronous Python client using aiohttp and asyncio for distributed ML inference workloads, supporting parallel processing of agent tool calls and API integrations.
  • Implemented intelligent retry logic, timeout management, and request batching to optimize resource utilization and reduce end-to-end latency in agentic workflows.

Production ML Microservices & Optimization

  • Contributed to rebuilding optimized Python-based ML microservices, significantly reducing inference latency and enabling the system to scale to high-throughput, low-latency scenarios supporting thousands of concurrent requests in conversational AI platforms.
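The query-routing idea above (Text2SQL for aggregation-style questions, semantic RAG otherwise) can be sketched as follows. This is an illustrative toy, not the production router: in the real system an LLM makes the decision from chunk analysis and table metadata, so the keyword heuristic and the `Route` type here are stand-in assumptions.

```python
from dataclasses import dataclass

# Stand-in for the LLM's aggregation-pattern detection (hypothetical).
AGGREGATION_HINTS = {"sum", "average", "count", "total", "max", "min", "group"}

@dataclass
class Route:
    strategy: str   # "text2sql" or "semantic_rag"
    reason: str

def route_query(query: str, has_table_metadata: bool) -> Route:
    """Choose an execution strategy for a query over Excel/CSV data."""
    tokens = set(query.lower().split())
    # Aggregation pattern + usable table schema -> structured SQL path.
    if has_table_metadata and tokens & AGGREGATION_HINTS:
        return Route("text2sql", "aggregation keywords and table schema present")
    # Otherwise fall back to unstructured semantic retrieval.
    return Route("semantic_rag", "no aggregation pattern detected")

print(route_query("What is the total revenue by region?", True).strategy)   # text2sql
print(route_query("Summarize the feedback comments", True).strategy)        # semantic_rag
```

A hybrid query would be decomposed into sub-tasks first, with each sub-task routed independently through logic like this.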
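The retry logic in the async ML client boils down to the pattern below: exponential backoff around an awaitable call. This is a minimal stdlib-only sketch, not the aiohttp client itself; `flaky_fetch` is a hypothetical stand-in for an HTTP request to an inference endpoint.

```python
import asyncio

async def with_retries(coro_factory, attempts=3, base_delay=0.01):
    """Run coro_factory(), retrying on failure with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await coro_factory()
        except Exception:
            if attempt == attempts - 1:
                raise  # exhausted retries: surface the error to the caller
            await asyncio.sleep(base_delay * (2 ** attempt))

async def flaky_fetch(state):
    # Simulates a transient network error: fails twice, then succeeds.
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient failure")
    return {"status": 200}

state = {"calls": 0}
result = asyncio.run(with_retries(lambda: flaky_fetch(state)))
print(result)  # {'status': 200} after two retried failures
```

In the production setting the same wrapper would sit around `aiohttp` session calls, combined with per-request timeouts and batching of concurrent tool calls via `asyncio.gather`.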
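The multi-layer guardrail idea (toxicity filtering, hallucination detection, response moderation) can be illustrated with a chain of independent checks, each able to block a response before it reaches the user. Everything here is a hypothetical sketch: the word list stands in for a toxicity model, and the word-overlap test stands in for a real grounding/hallucination detector.

```python
def toxicity_check(text: str) -> bool:
    banned = {"idiot", "stupid"}  # stand-in for a learned toxicity classifier
    return not any(word in text.lower() for word in banned)

def grounding_check(text: str, sources: list) -> bool:
    # Crude hallucination heuristic: response must share vocabulary
    # with at least one retrieved source document.
    words = set(text.lower().split())
    return any(words & set(src.lower().split()) for src in sources)

def moderate(response: str, sources: list) -> str:
    """Run the response through each guardrail layer in order."""
    checks = [
        ("toxicity", toxicity_check(response)),
        ("grounding", grounding_check(response, sources)),
    ]
    for name, passed in checks:
        if not passed:
            return f"[blocked: failed {name} check]"
    return response

sources = ["Quarterly revenue grew 12% year over year"]
print(moderate("Revenue grew 12% last quarter", sources))
```

Structuring the layers as independent predicates makes it easy to add, reorder, or tune checks per deployment without touching the pipeline itself.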

Education

  • Bachelor of Technology in Electronics and Communication Engineering

    National Institute of Technology, Jamshedpur (2022)