CHATTY
theunpartychatty
Location: unparty-app/theunpartychatty
Status: Active Development
Primary Purpose: RAG-powered conversational AI chatbot with web crawling capabilities
Overview
theunpartychatty is an intelligent chatbot application built with Next.js that uses Retrieval Augmented Generation (RAG) to provide accurate, contextually relevant responses. It combines Pinecone's vector database with OpenAI's language models to create a context-aware conversational experience that can understand and respond to queries based on crawled web content.
Tech Stack
- Framework: Next.js 14 (App Router)
- Language: TypeScript
- AI/ML: OpenAI GPT-3.5-turbo (via openai-edge)
- Vector Database: Pinecone
- UI Framework: React 18
- Styling: Tailwind CSS 3.3
- AI SDK: Vercel AI SDK (ai package)
- Testing: Playwright
- Runtime: Edge-compatible
- Web Scraping: Cheerio + node-html-markdown
- Additional: Firebase, React Icons, React Markdown
Key Features
Core Capabilities
- RAG-Powered Responses: Retrieval Augmented Generation grounds answers in retrieved source content, which reduces the risk of hallucination
- Web Crawling System: Automated crawler that indexes website content for knowledge base
- Real-time Streaming: Uses Vercel AI SDK for efficient streaming responses in edge environments
- Context Panel: Visual display of source material used to generate responses
- Vector Embeddings: Content is chunked, embedded with OpenAI, and stored in Pinecone for semantic search
- Interactive Chat UI: Terminal-style interface with message history and real-time updates
Technical Features
- Edge Runtime: Optimized for Vercel edge functions for low latency
- Document Splitting: Multiple strategies (RecursiveCharacterTextSplitter, MarkdownTextSplitter)
- Semantic Search: Vector similarity matching with configurable score thresholds
- Dynamic Context Retrieval: Fetches relevant document chunks based on query embeddings
- Responsive Design: Mobile-friendly interface with collapsible context panel
- Modal Instructions: Built-in help system for user guidance
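The effect of chunk size and overlap on document splitting can be illustrated with a minimal sketch. This is a simplified fixed-size splitter, not the RecursiveCharacterTextSplitter or MarkdownTextSplitter named above, and the names `Chunk` and `splitWithOverlap` are hypothetical rather than taken from the repository.

```typescript
// Simplified fixed-size splitter with overlap (illustration only).
// Assumption: chunks are measured in characters, not tokens.
interface Chunk {
  text: string;
  start: number; // character offset of the chunk within the source text
}

function splitWithOverlap(text: string, chunkSize: number, overlap: number): Chunk[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must satisfy 0 <= overlap < chunkSize");
  }
  const step = chunkSize - overlap; // how far each new chunk advances
  const chunks: Chunk[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push({ text: text.slice(start, start + chunkSize), start });
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

const exampleChunks = splitWithOverlap("abcdefghij", 4, 1);
console.log(exampleChunks.map((c) => c.text)); // [ 'abcd', 'defg', 'ghij' ]
```

Each chunk repeats the last character of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.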
Architecture
Code
RAG Chatbot (Next.js)
├── Frontend (React + Tailwind)
│   ├── Chat Interface (useChat hook)
│   ├── Context Panel (Source display)
│   ├── Messages Component
│   └── Terminal Header
├── API Routes (Edge Functions)
│   ├── /api/chat (OpenAI streaming)
│   ├── /api/context (Retrieve sources)
│   └── /api/crawl (Seed knowledge base)
├── RAG Pipeline
│   ├── Crawler (Web scraping)
│   ├── Document Splitter (Chunking)
│   ├── Embeddings (OpenAI)
│   └── Vector Store (Pinecone)
└── Utilities
    ├── getContext (Semantic search)
    ├── getMatchesFromEmbeddings (Vector query)
    └── seed (Index initialization)
Data Flow
- Crawling Phase: User provides URL → Crawler fetches content → Content split into chunks → Embeddings generated → Vectors stored in Pinecone
- Query Phase: User sends message → Message embedded → Similar vectors retrieved from Pinecone → Context injected into prompt → OpenAI generates response → Response streamed to client
- Context Display: After response completion → Context endpoint queries Pinecone → Source chunks displayed in panel
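The query phase can be sketched with an in-memory stand-in for the vector store. In the application the embedding comes from OpenAI and the similarity search runs in Pinecone; neither call is shown here. The names `StoredChunk`, `retrieveContext`, and `buildPrompt`, the prompt wording, and the example URLs are all hypothetical.

```typescript
// In-memory illustration of "embed -> retrieve similar vectors -> inject context".
interface StoredChunk {
  text: string;
  url: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return up to topK chunks whose similarity meets minScore, best match first.
function retrieveContext(query: number[], store: StoredChunk[], topK: number, minScore: number) {
  return store
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .filter((match) => match.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Place the retrieved text in a system prompt that precedes the user message.
function buildPrompt(contextTexts: string[]): string {
  return ["Answer using only the context below.", "START CONTEXT", ...contextTexts, "END CONTEXT"].join("\n");
}

const exampleStore: StoredChunk[] = [
  { text: "Pricing page", url: "https://example.com/pricing", vector: [1, 0] },
  { text: "Team page", url: "https://example.com/team", vector: [0, 1] },
];
const exampleMatches = retrieveContext([0.9, 0.1], exampleStore, 3, 0.7);
console.log(buildPrompt(exampleMatches.map((m) => m.chunk.text)));
```

With a threshold of 0.7 only the "Pricing page" chunk is kept, so the prompt contains one context entry. Raising the threshold trades recall for precision; lowering it admits weaker matches into the prompt.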
Integration Points
External Services
- OpenAI API: GPT-3.5-turbo for chat completion, plus an OpenAI embedding model for text embeddings
- Pinecone: Vector database for semantic search and RAG
- Vercel: Deployment platform with edge runtime support
- Firebase: Additional backend services (configured but usage TBD)
Internal Dependencies
- Vercel AI SDK: Streaming chat interface and message handling
- Shadcn UI: Component library for consistent UI elements
- Radix UI: Headless UI components (Avatar, Icons)
Business Value
ABOUT → BUILD → CONNECT
ABOUT (Understanding):
- Provides accurate, source-backed answers to user questions
- Helps users understand complex topics through contextual responses
- Reduces information overload by surfacing relevant content
- Transparency through context panel showing information sources
BUILD (Creation):
- Enables rapid knowledge base creation from any website
- Allows organizations to build custom AI assistants trained on their content
- Supports different document splitting strategies for optimized retrieval
- Configurable parameters (chunk size, overlap, similarity threshold)
CONNECT (Sharing):
- Real-time conversational interface for interactive learning
- Streaming responses for immediate engagement
- Visual context display helps users verify and explore sources
- Terminal-style UI creates familiar, accessible chat experience
Relationship to UNPARTY Ecosystem
theunpartychatty serves as a conversational AI interface within the UNPARTY ecosystem, focusing on intelligent information retrieval and context-aware interactions.
Ecosystem Position
- Complementary to theunpartyapp: Could provide AI-powered chat support for web platform users
- Content Intelligence: RAG capabilities align with content-focused nature of UNPARTY products
- Conversation Analysis: Could integrate with theunpartycrawler for conversation analytics
- Knowledge Management: Crawling and indexing capabilities support ecosystem documentation
Potential Integrations
- theunpartyapp: Embed as customer support chatbot or content discovery tool
- theunpartyunppp: Provide conversational interface for journal entries or story suggestions
- theunpartycrawler: Analyze chatbot conversations for usage patterns and insights
- theunpartyrunway: Use for developer documentation assistant and workflow guidance
Getting Started
Prerequisites
bash
# Environment variables required
OPENAI_API_KEY=your_openai_key
PINECONE_API_KEY=your_pinecone_key
PINECONE_CLOUD=your_cloud_provider # e.g., 'aws', 'gcp', or 'azure'
PINECONE_REGION=your_region # e.g., 'us-east-1', 'us-west-2', 'eu-west-1'
PINECONE_INDEX=your_index_name
Installation
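A variable missing from the Prerequisites list typically surfaces only when a request fails. A minimal sketch of a startup check follows; the variable names come from the Prerequisites list, while `missingEnvVars` and `EnvSource` are hypothetical and not part of the repository.

```typescript
// Hypothetical startup check for the required environment variables.
const REQUIRED_ENV_VARS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_CLOUD",
  "PINECONE_REGION",
  "PINECONE_INDEX",
] as const;

type EnvSource = Record<string, string | undefined>;

// Return the names of required variables that are unset or blank.
function missingEnvVars(env: EnvSource): string[] {
  return REQUIRED_ENV_VARS.filter((name) => (env[name] ?? "").trim() === "");
}

const exampleMissing = missingEnvVars({ OPENAI_API_KEY: "sk-test", PINECONE_INDEX: "" });
console.log(exampleMissing);
// [ 'PINECONE_API_KEY', 'PINECONE_CLOUD', 'PINECONE_REGION', 'PINECONE_INDEX' ]
```

In a Node.js process the function could be called with `process.env`, failing fast with a readable message when the returned list is non-empty.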
bash
# Clone and install dependencies
npm install
# Run development server
npm run dev
# Build for production
npm run build
# Start production server
npm start
Testing
bash
# Run end-to-end tests with Playwright
npm run test:e2e
# Display test reports
npm run test:show
Usage
- Start the application: Navigate to http://localhost:3000
- Crawl a website: Use the crawl interface to index content from a URL
- Ask questions: Chat with the AI about the crawled content
- View context: Check the context panel to see which sources were used
Development Notes
Code Structure
- src/app/page.tsx: Main chat interface page
- src/app/api/chat/route.ts: OpenAI streaming endpoint
- src/app/api/context/route.ts: Context retrieval endpoint
- src/app/api/crawl/route.ts: Web crawling and indexing endpoint
- src/app/components/: React components (Chat, Context, Messages, Header)
- src/app/utils/: Helper functions (crawler, embeddings, vector queries)
Key Technologies
- Edge Runtime: All API routes use export const runtime = "edge" for optimal performance
- useChat Hook: Vercel AI SDK hook manages chat state and streaming
- Pinecone Queries: Vector similarity search with metadata filtering
- Document Processing: Cheerio for HTML parsing, node-html-markdown for conversion
Alignment with UNPARTY Principles
Creator Ownership
- Knowledge bases remain under user control
- Self-hosted deployment options via Next.js
- No vendor lock-in for content indexing
Privacy
- Edge runtime ensures minimal data exposure
- Configurable similarity thresholds for privacy-conscious retrieval
- Source transparency through context panel
Cost-Sensitivity
- Efficient edge functions reduce compute costs
- Streaming responses improve perceived latency; token usage depends on prompt and retrieved-context size
- Configurable chunk sizes optimize embedding costs
- Pinecone serverless option available
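The link between chunk settings and embedding volume can be made concrete with a small calculation. The sketch below assumes a fixed-size character splitter that advances by chunk size minus overlap, which is simpler than the splitters the project uses, and it counts characters rather than billable tokens. `estimateChunks` and `ChunkEstimate` are hypothetical names.

```typescript
// Hypothetical helper: chunk count and total embedded characters for a
// fixed-size splitter that advances by (chunkSize - overlap) per chunk.
interface ChunkEstimate {
  chunkCount: number;
  embeddedChars: number;
}

function estimateChunks(textLength: number, chunkSize: number, overlap: number): ChunkEstimate {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  if (textLength <= 0) return { chunkCount: 0, embeddedChars: 0 };
  const step = chunkSize - overlap;
  const chunkCount =
    textLength <= chunkSize ? 1 : Math.ceil((textLength - chunkSize) / step) + 1;
  // Every chunk after the first re-embeds `overlap` characters.
  const embeddedChars = textLength + overlap * (chunkCount - 1);
  return { chunkCount, embeddedChars };
}

console.log(estimateChunks(10000, 1000, 200)); // { chunkCount: 13, embeddedChars: 12400 }
console.log(estimateChunks(10000, 1000, 0)); // { chunkCount: 10, embeddedChars: 10000 }
```

For a 10,000-character page, a 200-character overlap raises the embedded volume by 24% compared with no overlap, which is the kind of trade-off the chunk size and overlap settings control.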
Future Enhancements
Potential improvements for ecosystem integration:
- Multi-modal support (images, PDFs)
- Integration with theunpartyapp CMS for content Q&A
- Conversation export for theunpartycrawler analysis
- Cost tracking integration with theunpartyrunway
- Journal entry suggestions for theunpartyunppp
- Multi-language support
- Custom embedding models
- Advanced citation system
Documentation
- Setup Guide: See .env.example for configuration
- API Documentation: Check individual route files in src/app/api/
- Component Docs: Inline documentation in component files
- Testing Strategy: Playwright configuration in playwright.config.ts
Contributing
This repository follows UNPARTY's development standards:
- TypeScript strict mode
- ESLint for code quality
- Playwright for E2E testing
- Next.js best practices
- Edge-first architecture
License
Part of the UNPARTY ecosystem.
Last Updated: 2025-10-29
Maintained By: UNPARTY Development Team
UNPARTY Vision: Measurable user progress through ABOUT → BUILD → CONNECT while protecting creator ownership, privacy, and cost-sensitivity.