Agentic chat
Agentic chat is an AI workspace that combines document search, memory, tools, and research flows so it can answer with richer context and reduce the cost of repeated queries.
Overview
Agentic chat routes requests through different execution paths based on intent: simple queries use cached responses, document questions trigger RAG, complex tasks use LangGraph research flows, and tool calls go through Google Workspace integration.
The stack includes OpenAI for generation, Mem0 for long-term memory, PostgreSQL for vector storage, LangGraph for multi-step planning, and a custom router that selects the optimal path per request.
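The routing idea above can be sketched as a small dispatcher: classify the request, then pick one of the four execution paths. This is a minimal illustration, not the product's actual classifier; the rule-based signals, path names, and function names are all assumptions.

```python
# Hypothetical sketch of the per-request router: classify intent, then
# dispatch to one of four execution paths. The keyword rules below are
# placeholders for a real intent classifier.
from enum import Enum

class Path(Enum):
    CACHED = "cached"        # serve a previously computed answer
    RAG = "rag"              # retrieve document chunks, then generate
    RESEARCH = "research"    # multi-step LangGraph-style agent
    TOOL = "tool"            # direct Google Workspace action

def classify_intent(query: str, cache: dict) -> Path:
    """Pick an execution path from cheap surface signals (placeholder logic)."""
    q = query.lower().strip()
    if q in cache:                                      # exact repeat
        return Path.CACHED
    if any(w in q for w in ("schedule", "email", "calendar")):
        return Path.TOOL                                # action verbs
    if any(w in q for w in ("document", "report", "pdf")):
        return Path.RAG                                 # doc-grounded question
    return Path.RESEARCH                                # everything else

cache = {"what is agentic chat": "An AI workspace that ..."}
print(classify_intent("What is agentic chat", cache))   # Path.CACHED
```

In a production router the keyword checks would be replaced by a learned classifier, but the dispatch shape stays the same: one cheap decision up front so that expensive paths only run when they are needed.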
How It Was Built
The main technical choices behind the product, from system design to the parts that make it work day to day.
- Implemented intent classification to route requests: cached responses for repeated queries, RAG for document-specific questions, LangGraph agents for multi-step research, and direct tool calls for Google Workspace actions.
- Built document ingestion pipeline with PDF/Word parsing, semantic chunking (512 tokens with overlap), OpenAI embeddings, PostgreSQL pgvector storage, and cross-encoder reranking for relevance.
- Created LangGraph research agents with explicit planning phases, parallel task execution, result synthesis, and self-correction loops for complex queries.
- Added conversation branching, export to markdown, and Google Workspace integration (Gmail, Calendar, Docs) so users can take action on AI-generated insights.
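The chunking step in the ingestion pipeline (512-token windows with overlap) can be sketched as below. Whitespace tokens stand in for a real tokenizer, and the 64-token overlap is an assumption; the point is that a sentence straddling a boundary lands in two adjacent chunks.

```python
# Illustrative sketch of fixed-size chunking with overlap, as used in the
# ingestion pipeline described above. Function name and the overlap value
# are assumptions; a real pipeline would count model tokens, not words.
def chunk_tokens(tokens: list[str], size: int = 512, overlap: int = 64) -> list[list[str]]:
    """Split a token list into overlapping windows of `size` tokens."""
    if size <= overlap:
        raise ValueError("chunk size must exceed overlap")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):   # last window reached the end
            break
    return chunks

tokens = [f"t{i}" for i in range(1200)]
chunks = chunk_tokens(tokens)
# the tail of each chunk repeats at the head of the next, so boundary
# sentences are embedded in both windows
assert chunks[0][-64:] == chunks[1][:64]
```

Each chunk would then be embedded and stored in pgvector; the overlap trades a little extra storage for retrieval that does not miss boundary-spanning content.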
Impact
- Semantic caching reduced API costs by 40% because similar requests stopped repeating the same expensive work.
- Answers improved because the product chooses the right context for each request instead of pushing everything through one path.
- Memory, branching, and bring-your-own-key support made the product easier to use for ongoing work instead of one-off demos.
Highlights
- Semantic caching reduced API costs by 40% by avoiding redundant embeddings and model calls.
- Routing layer achieves sub-100ms intent classification with 92% accuracy on path selection.
- LangGraph research agents handle 8-step planning with parallel execution in under 30 seconds.
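The plan-then-parallel-execute-then-synthesize loop behind the research agents can be sketched with plain asyncio. This is only the control flow, not LangGraph's actual API; `plan()`, `run_task()`, and `synthesize()` are illustrative stand-ins.

```python
# Control-flow sketch of the research agent: an explicit planning phase,
# parallel subtask execution, then synthesis into one answer. Uses plain
# asyncio rather than LangGraph; all function names are hypothetical.
import asyncio

def plan(query: str) -> list[str]:
    """Break a query into independent subtasks (placeholder planner)."""
    return [f"research: {query} (angle {i})" for i in range(1, 4)]

async def run_task(task: str) -> str:
    await asyncio.sleep(0.01)            # stands in for a model or tool call
    return f"result for {task}"

def synthesize(results: list[str]) -> str:
    """Merge subtask results into a single answer (placeholder)."""
    return " | ".join(results)

async def research(query: str) -> str:
    tasks = plan(query)                                            # plan
    results = await asyncio.gather(*(run_task(t) for t in tasks))  # parallel
    return synthesize(results)                                     # synthesize

print(asyncio.run(research("compare vector stores")))
```

Running subtasks concurrently is what keeps multi-step plans within the latency budget: wall-clock time is bounded by the slowest subtask rather than the sum of all of them.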
Tech Stack
OpenAI, Mem0, PostgreSQL (pgvector), LangGraph
More Projects
Additional work across AI products, developer tooling, and full-stack systems.
- Edward (Next.js 16): An AI coding workspace where developers can describe apps in plain language, generate production-ready code, inspect and edit files in real time, run projects in isolated Docker environments, publish live previews, and sync everything directly to GitHub without leaving the product.
- Bonkers by Foyer (Next.js): Creative production system for making and reusing high-quality visual assets.
- DeployNinja (AWS: ECS, ECR, S3): GitHub-native deployment platform for automated builds, live logs, and repeatable releases.