
Competitor Intelligence Agent

An AI-powered agent that ingests unstructured competitor documents (PDFs), extracts structured insights using LLM + RAG, stores them in a SQL database, and enables natural language querying — built with FastAPI, LangChain, ChromaDB, Gemini API, and Docker.

  • 3 core AI components
  • 129 pages processed
  • Docker, production-ready


The Challenge

Companies dealing with large volumes of competitor research documents (PDFs, reports) struggle to extract structured insights at scale — manual review is slow, inconsistent, and impossible to query programmatically.


Our Solution

Built a three-component AI agent system: a RAG ingestion pipeline (LangChain + ChromaDB) for semantic document retrieval, an LLM extraction agent (Gemini API + Pydantic) for structured data extraction, and a Natural Language to SQL interface for querying extracted data.

  • RAG pipeline: PDF ingestion → text chunking → vector embedding → ChromaDB indexing
  • LLM extraction agent: Gemini API reads retrieved context → extracts structured JSON → Pydantic validation
  • NL-to-SQL interface: natural language question → Gemini generates SQL → SQLite executes → natural language answer
  • FastAPI REST endpoints: /ingest, /query, /rag-query, /competitors with filter support
  • Full Docker + docker-compose deployment for production-ready containerization
  • Safeguards: SQL-injection prevention, SELECT-only query enforcement, and prompting designed to mitigate hallucination
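The validation and safeguard steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the production code: the `CompetitorInsight` schema and its fields are hypothetical, and where the real agent has Gemini generate the SQL, this sketch hard-codes a query string to show only the guard.

```python
import re
import sqlite3
from pydantic import BaseModel


class CompetitorInsight(BaseModel):
    # Hypothetical schema; the real field set depends on the report contents.
    name: str
    country: str
    downloads: int


def validate_llm_json(raw: str) -> CompetitorInsight:
    """Reject malformed or hallucinated LLM output before it reaches SQL."""
    return CompetitorInsight.model_validate_json(raw)


SELECT_ONLY = re.compile(r"^\s*SELECT\b", re.IGNORECASE)


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Execute LLM-generated SQL only if it is a single SELECT statement."""
    stripped = sql.rstrip().rstrip(";")
    if not SELECT_ONLY.match(sql) or ";" in stripped:
        raise ValueError("Only single SELECT statements are allowed")
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    # Simulated LLM extraction output, validated through the Pydantic schema.
    insight = validate_llm_json(
        '{"name": "Game A", "country": "Indonesia", "downloads": 1200000}'
    )
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE competitors (name TEXT, country TEXT, downloads INTEGER)")
    # Parameterized insert: validated values never get interpolated into SQL text.
    conn.execute(
        "INSERT INTO competitors VALUES (?, ?, ?)",
        (insight.name, insight.country, insight.downloads),
    )
    rows = run_generated_sql(
        conn, "SELECT name FROM competitors WHERE country = 'Indonesia'"
    )
    print(rows)  # [('Game A',)]
```

A statement like `DROP TABLE competitors` fails both checks and raises before it can execute, which is the point of the SELECT-only gate.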

Tech Stack

Python · FastAPI · LangChain · ChromaDB · Gemini API · Pydantic · SQLite · Docker · sentence-transformers

The Result

Delivered a fully containerized AI agent capable of processing 129-page competitor reports, extracting structured competitive intelligence, and answering natural language queries like 'Which games lead downloads in Indonesia?' — all via a clean REST API.

Got a project in mind?

Drop us a message and we'll get back to you within 24 hours.

Tell Us About It