
Competitor Intelligence Agent

An AI-powered agent that ingests unstructured competitor documents (PDFs), extracts structured insights using LLM + RAG, stores them in a SQL database, and enables natural language querying — built with FastAPI, LangChain, ChromaDB, Gemini API, and Docker.

  • 3 core AI components
  • 129 pages processed
  • Docker, production-ready


The Challenge

Companies dealing with large volumes of competitor research documents (PDFs, reports) struggle to extract structured insights at scale — manual review is slow, inconsistent, and impossible to query programmatically.


Our Solution

Built a three-component AI agent system: a RAG ingestion pipeline (LangChain + ChromaDB) for semantic document retrieval, an LLM extraction agent (Gemini API + Pydantic) for structured data extraction, and a Natural Language to SQL interface for querying extracted data.

  • RAG pipeline: PDF ingestion → text chunking → vector embedding → ChromaDB indexing
  • LLM extraction agent: Gemini API reads retrieved context → extracts structured JSON → Pydantic validation
  • NL-to-SQL interface: natural language question → Gemini generates SQL → SQLite executes → natural language answer
  • FastAPI REST endpoints: /ingest, /query, /rag-query, /competitors with filter support
  • Full Docker + docker-compose deployment for production-ready containerization
  • Safeguards: SQL-injection prevention, SELECT-only query enforcement, and prompting designed to mitigate hallucination
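The validation and safeguard steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the production code: the `CompetitorInsight` schema and its fields are hypothetical, and where the real agent has Gemini generate the SQL, this sketch hard-codes a query string to show only the guard.

```python
import re
import sqlite3
from pydantic import BaseModel


class CompetitorInsight(BaseModel):
    # Hypothetical schema; the real field set depends on the report contents.
    name: str
    country: str
    downloads: int


def validate_llm_json(raw: str) -> CompetitorInsight:
    """Reject malformed or hallucinated LLM output before it reaches SQL."""
    return CompetitorInsight.model_validate_json(raw)


SELECT_ONLY = re.compile(r"^\s*SELECT\b", re.IGNORECASE)


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Execute LLM-generated SQL only if it is a single SELECT statement."""
    stripped = sql.rstrip().rstrip(";")
    if not SELECT_ONLY.match(sql) or ";" in stripped:
        raise ValueError("Only single SELECT statements are allowed")
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    # Simulated LLM extraction output, validated through the Pydantic schema.
    insight = validate_llm_json(
        '{"name": "Game A", "country": "Indonesia", "downloads": 1200000}'
    )
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE competitors (name TEXT, country TEXT, downloads INTEGER)")
    # Parameterized insert: validated values never get interpolated into SQL text.
    conn.execute(
        "INSERT INTO competitors VALUES (?, ?, ?)",
        (insight.name, insight.country, insight.downloads),
    )
    rows = run_generated_sql(
        conn, "SELECT name FROM competitors WHERE country = 'Indonesia'"
    )
    print(rows)  # [('Game A',)]
```

A statement like `DROP TABLE competitors` fails both checks and raises before it can execute, which is the point of the SELECT-only gate.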

Tech Stack

Python · FastAPI · LangChain · ChromaDB · Gemini API · Pydantic · SQLite · Docker · sentence-transformers

The Result

Delivered a fully containerized AI agent capable of processing 129-page competitor reports, extracting structured competitive intelligence, and answering natural language queries like 'Which games lead downloads in Indonesia?' — all via a clean REST API.

Got a project in mind?

Drop us a message and we'll get back to you within 24 hours.

Tell Us About It