CORE
theunpartycore
Location: github.com/unparty-app/theunpartycore
Status: Active Development
Primary Purpose: Intelligent repository classification and metadata management system
Overview
theunpartycore is a developer-focused tool that automatically classifies and organizes code repositories using machine learning. It provides an eight-type taxonomy, automated metadata generation, and consistent documentation management across the UNPARTY ecosystem.
Classification Types
The system intelligently categorizes repositories into 8 distinct types:
- Tool - Command-line utilities and standalone tools
- App - Applications with user interfaces (web, desktop, mobile)
- Assistant - AI-powered assistants and conversational agents
- Agent - Autonomous agents and automation systems
- Bot - Platform-specific bots (Discord, Telegram, Slack)
- Library - Code libraries and packages for developers
- Framework - Development frameworks and platforms
- Service - Backend services, APIs, and microservices
Tech Stack
- Framework: Swift Package Manager (SPM)
- Language: Swift 5.9+, Python 3.11+
- Platform: iOS 16+, macOS 13+, Linux (CLI only)
- Machine Learning: Core ML (Apple platforms)
- Database: JSON-based training data (JSONL format)
- Build Tool: Xcode 15+, Swift CLI
- Automation: GitHub Actions
- Key Dependencies:
  - Swift Argument Parser (CLI interface)
  - python-frontmatter (metadata processing)
Key Features
Intelligent Classification System
- Core ML Integration: Native machine learning on Apple platforms with text classification models
- 8 Repository Types: Comprehensive taxonomy covering tools, apps, bots, libraries, frameworks, services, assistants, and agents
- Confidence Scoring: Probability-based classification with transparent decision-making
- Rule-based Fallback: Handles edge cases without trained models
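How confidence scoring and the rule-based fallback fit together can be sketched in Python. This is an illustration only, not the project's ClassifierEngine: the `classify` function, the 0.6 threshold, and the keyword rules are all hypothetical, and the model's probabilities are passed in as a plain dictionary.

```python
import re
from typing import Optional

# Hypothetical keyword rules for the fallback; the project's real rules are not shown here.
FALLBACK_RULES = {
    "bot": ("discord", "telegram", "slack"),
    "library": ("library", "package", "sdk"),
    "service": ("api", "microservice", "backend"),
    "tool": ("cli", "command-line", "utility"),
}

CONFIDENCE_THRESHOLD = 0.6  # assumed cut-off, chosen only for illustration


def rule_based_type(description: str) -> str:
    """Return the first type whose keywords appear in the description."""
    tokens = set(re.findall(r"[a-z0-9-]+", description.lower()))
    for repo_type, keywords in FALLBACK_RULES.items():
        if any(keyword in tokens for keyword in keywords):
            return repo_type
    return "tool"  # arbitrary default when no rule matches


def classify(description: str,
             probabilities: Optional[dict[str, float]] = None) -> tuple[str, float, str]:
    """Return (type, confidence, source), preferring the model when it is confident."""
    if probabilities:
        best_type = max(probabilities, key=probabilities.get)
        confidence = probabilities[best_type]
        if confidence >= CONFIDENCE_THRESHOLD:
            return best_type, confidence, "model"
    # No model output, or the model was not confident enough: use the rules.
    return rule_based_type(description), 0.0, "rules"
```

Returning the source alongside the type keeps the decision transparent: a caller can tell whether a label came from the model or from the fallback rules.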
Automated Metadata Management
- Frontmatter Automation: Automated slug generation from titles with checksum-based change detection
- GitHub Actions Integration: CI/CD workflows for automated validation and synchronization
- Content Validation: Ensures consistency across documentation and markdown files
- Multi-field Processing: Handles tags, titles, descriptions, categories, authors, and dates
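Slug generation and checksum-based change detection can be sketched with the Python standard library alone. The function names, the tracked field list, and the choice of SHA-256 are assumptions made for illustration; the project's own scripts (`generate_slug.py`, `slug_utils.py`) and its `checksum_fields.csv` may work differently.

```python
import hashlib
import json
import re
import unicodedata

# Assumed set of fields that feed the checksum; the project keeps its own
# list in checksum_fields.csv, whose contents are not shown in this document.
CHECKSUM_FIELDS = ("title", "description", "tags")


def slugify(title: str) -> str:
    """Lowercase the title, strip accents, and join words with hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def frontmatter_checksum(metadata: dict) -> str:
    """Hash only the tracked fields, serialized deterministically."""
    tracked = {field: metadata.get(field) for field in CHECKSUM_FIELDS}
    payload = json.dumps(tracked, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def has_changed(metadata: dict) -> bool:
    """Compare the stored checksum with a freshly computed one."""
    return metadata.get("checksum") != frontmatter_checksum(metadata)


meta = {"title": "Héllo, World! Getting Started", "tags": ["guide"]}
meta["slug"] = slugify(meta["title"])
meta["checksum"] = frontmatter_checksum(meta)
```

Hashing only a fixed set of fields means that edits to untracked fields, or to the stored checksum itself, do not register as a change.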
Multi-Platform Architecture
- CLI Tool: macOS/Linux command-line interface for batch processing
- iOS App: Native iOS interface for on-device classification
- macOS App: Native macOS interface with SwiftUI
- Python Scripts: Cross-platform automation for metadata processing
Developer-Focused Workflow
- Swift Package Manager: Modular architecture with a shared ClassifyCore library
- Comprehensive Testing: Unit tests for core classification logic
- Training Data Pipeline: Structured JSONL format for model training and validation
- Documentation: Extensive guides for architecture, features, and workflows
Architecture
Code
Classification System (Swift + Python)
├── Core ML Pipeline
│   ├── Data Collection → Extract metadata via GitHub API
│   ├── Feature Engineering → Transform text/metadata to ML features
│   ├── Model Training → Train Core ML text classifier
│   ├── Inference → Classify with confidence scores
│   └── Fallback → Rule-based classification
│
├── Swift Components (Sources/)
│   ├── ClassifyCore/                    # Shared Swift logic
│   │   ├── MetadataParser.swift         # JSON → Swift model
│   │   ├── FeatureExtractor.swift       # Model → ML features
│   │   ├── ClassifierEngine.swift       # ML prediction logic
│   │   └── FrontmatterChecksum.swift    # Content validation
│   ├── CLI/                             # Command-line interface
│   │   └── main.swift                   # CLI entrypoint
│   └── App/                             # iOS/macOS SwiftUI app
│       ├── App.swift
│       ├── ContentView.swift
│       └── ClassifyViewModel.swift
│
├── Python Automation (.github/scripts/)
│   ├── generate_slug.py                 # Slug generation
│   ├── generate_checksum_checker.py     # Validation script generator
│   └── slug_utils.py                    # Utility functions
│
├── Data Pipeline (data/)
│   ├── training/                        # ML training data
│   │   ├── repo_samples.jsonl           # Labeled samples
│   │   └── taxonomy.json                # Classification definitions
│   ├── processed/                       # Processed repository data
│   └── workflows/                       # Workflow configuration
│       └── checksum_fields.csv          # Checksum field definitions
│
└── Automation (GitHub Actions)
    ├── generate-checksum-checker.yml    # Checksum validation
    └── sync-frontmatter.yml             # Metadata synchronization
Integration Points
External Services
- GitHub API: Repository metadata extraction and automation
- Core ML: On-device machine learning inference (Apple platforms)
- Swift Package Manager: Dependency management
- GitHub Actions: CI/CD automation and validation
Internal Dependencies
- Swift Argument Parser: CLI interface and command parsing
- python-frontmatter: YAML frontmatter parsing and manipulation
- Foundation: Apple's base framework for data processing (separate from the Swift standard library)
Business Value
ABOUT: Understanding Repository Taxonomy
- Intelligent Organization: Automatically categorizes repositories by purpose and function
- Consistent Classification: Provides standardized taxonomy across the entire UNPARTY ecosystem
- Metadata Insights: Extracts and analyzes repository characteristics for better understanding
- Discovery: Helps developers and users understand what each repository does at a glance
BUILD: Enabling Automated Development
- Automated Workflows: Reduces manual documentation work through intelligent automation
- Quality Assurance: Validates metadata consistency and completeness across projects
- Scalable Architecture: Modular design enables easy extension and customization
- Multi-Platform: Supports development across iOS, macOS, and command-line environments
CONNECT: Facilitating Ecosystem Integration
- Cross-Repository Standards: Enforces consistent documentation patterns
- API Integration: Enables programmatic access to classification services
- Shared Knowledge: Training data and models can be shared across teams
- Ecosystem Mapping: Provides foundation for understanding repository relationships
Relationship to Ecosystem
theunpartycore serves as the classification and organization brain of the UNPARTY ecosystem:
Dependencies
- Used by theunpartyapp for repository organization and discovery
- Leveraged by theunpartyrunway for development workflow automation
- Integrated with theunpartycrawler for analytics and intelligence
Data Flow
Note: The diagram below uses Mermaid syntax. GitHub and most modern markdown viewers will render it automatically. If you don't see a diagram, view this file on GitHub or use a Mermaid-compatible viewer.
Text Representation:
- GitHub Repositories → theunpartycore (provides metadata)
- theunpartycore → theunpartyapp (classification results)
- theunpartycore → theunpartyrunway (automation support)
- theunpartycore → theunpartycrawler (analytics data)
- theunpartycore ↔ Machine Learning Model (training/prediction cycle)
Ecosystem Position
- Type: Developer Tool
- Audience: Internal development team, automation systems
- Purpose: Repository classification, metadata management, documentation automation
- Integration Level: Core infrastructure component used by multiple repositories
Quick Start
Prerequisites
- Swift 5.9+ (for Swift components)
- Python 3.11+ (for automation scripts)
- Xcode 15+ (for iOS/macOS development)
Installation
bash
# Clone the repository
git clone https://github.com/unparty-app/theunpartycore.git
cd theunpartycore
# Build Swift components
swift build
# Install Python dependencies
pip install python-frontmatter
Usage
Command Line Interface
bash
# Classify a repository from JSON metadata
swift run classify data/processed/theunpartycore.processed.json
# Use custom model and JSON output
swift run classify --model model/RepoClassifier.mlmodel --format json data.json
# Show detailed probabilities
swift run classify --verbose --show-probabilities data.json
iOS/macOS App
- Open Package.swift in Xcode
- Select the App scheme
- Build and run on your device/simulator
- Tap "Classify Sample Repo" to see the classification in action
Python Automation
bash
# Generate slugs for markdown files
python .github/scripts/generate_slug.py content/
# Check if frontmatter has changed (used by CI)
python .github/scripts/check_frontmatter_checksum.py content/example.md
# Generate the checksum checker script
python .github/scripts/generate_checksum_checker.py
Features
Core ML Integration
- Native machine learning on Apple platforms
- Supports both text classification and rule-based fallback
- Optimized for iOS and macOS deployment
Swift Package Manager
- Modular architecture with the ClassifyCore shared library
- Separate CLI and App targets
- Comprehensive test coverage
Frontmatter Automation
- Automated slug generation from titles
- Checksum-based change detection
- GitHub Actions integration for CI/CD
Multi-Platform Support
- CLI: macOS/Linux command-line tool
- iOS App: Native iOS interface
- macOS App: Native macOS interface
- Python Scripts: Cross-platform automation
Training Data & Machine Learning
The system learns from labeled repository samples in data/training/repo_samples.jsonl:
json
{"label": "theunpartycore", "type": "tool", "description": "CLI utility to classify machine-executable systems by role using Core ML", "readme_text": "theunpartycore is a tool for classification of machine-executable-systems. Uses machine-learning to assign each machine-executable-system a core-type.", "topics": ["coreml", "classifier", "audit", "repo-metadata", "swift"], "languages": ["Swift", "Python"], "file_signals": ["scripts/", "Sources/", "Package.swift", "README.md"], "custom_terms": ["machine-executable-system", "core-type", "classification"]}
{"label": "lodash", "type": "library", "description": "A modern JavaScript utility library delivering modularity, performance & extras", "readme_text": "Lodash makes JavaScript easier by taking the hassle out of working with arrays, numbers, objects, strings, etc.", "topics": ["javascript", "utility", "library", "functional"], "languages": ["JavaScript"], "file_signals": ["lib/", "index.js", "package.json"], "custom_terms": ["utility", "functional-programming"]}
Training Sample Components
- Repository Metadata: Stars, languages, topics, file structure
- README Analysis: Content extraction and pattern recognition
- File Signals: Directory structure indicators (e.g., src/, lib/, bot/)
- Custom Terms: Domain-specific vocabulary and classification hints
- Manual Labels: Ground truth classifications for supervised learning
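The sample format above can be checked before training with a short validation pass. The sketch below is an assumption-laden illustration: the required field names are taken from the two sample records, the lowercase type strings for the other six types are inferred from "tool" and "library", and `validate_sample` and `load_samples` are hypothetical helpers, not part of the repository.

```python
import json

# Field names taken from the two sample records above.
REQUIRED_FIELDS = {"label", "type", "description", "readme_text",
                   "topics", "languages", "file_signals", "custom_terms"}
LIST_FIELDS = ("topics", "languages", "file_signals", "custom_terms")
# Assumed spellings, mirroring the eight classification types in this document.
ALLOWED_TYPES = {"tool", "app", "assistant", "agent",
                 "bot", "library", "framework", "service"}


def validate_sample(sample: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the sample is valid."""
    errors = []
    missing = REQUIRED_FIELDS - sample.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if sample.get("type") not in ALLOWED_TYPES:
        errors.append(f"unknown type: {sample.get('type')!r}")
    for field in LIST_FIELDS:
        if field in sample and not isinstance(sample[field], list):
            errors.append(f"{field} must be a list")
    return errors


def load_samples(lines):
    """Parse JSONL lines, yielding (line_number, sample, errors) and skipping blanks."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        sample = json.loads(line)
        yield number, sample, validate_sample(sample)


# Usage with an in-memory line; a real run would iterate over the open JSONL file.
for number, sample, errors in load_samples(['{"label": "demo", "type": "plugin", "topics": "swift"}']):
    for error in errors:
        print(f"line {number}: {error}")
```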
ML Pipeline Stages
- Data Collection → Extract repository metadata via GitHub API
- Feature Engineering → Transform text and metadata into ML features
- Model Training → Train Core ML text classifier from labeled samples
- Inference → Classify new repositories with confidence scores
- Rule-based Fallback → Handle cases without trained models
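As one possible reading of the feature engineering stage, a labeled sample can be flattened into a single text string paired with its label, which is the general shape text classifier trainers consume. The helper names and the `text`/`label` column names below are assumptions; the project's FeatureExtractor may build features quite differently.

```python
def sample_to_text(sample: dict) -> str:
    """Concatenate free text and list-valued signals into one string."""
    parts = [sample.get("description", ""), sample.get("readme_text", "")]
    for field in ("topics", "languages", "file_signals", "custom_terms"):
        parts.extend(sample.get(field, []))
    # Drop empty pieces so the result has no doubled spaces.
    return " ".join(part.strip() for part in parts if part.strip())


def to_training_row(sample: dict) -> dict:
    """Pair the flattened text with the sample's type as the training label."""
    return {"text": sample_to_text(sample), "label": sample["type"]}


row = to_training_row({
    "type": "library",
    "description": "A utility library",
    "readme_text": "",
    "topics": ["javascript"],
    "languages": ["JavaScript"],
})
```

Keeping topics, languages, and file signals in the same string lets a plain text classifier pick up on them as ordinary tokens, at the cost of losing which field each token came from.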
Documentation
- Architecture Overview - Detailed system design and component breakdown
- Classification Types - Complete guide to classification taxonomy
- Frontmatter Automation - Automated metadata management workflows
- Training Data Format - Structure and format of ML training data
- Getting Started Guide - Step-by-step setup and usage instructions
Project Dependencies
| Component | Dependencies | Purpose |
|---|---|---|
| Swift Core | Core ML, Foundation | Native Apple frameworks for ML and data processing |
| CLI Tool | ArgumentParser | Swift package for command-line argument parsing |
| Python Scripts | python-frontmatter | Minimal external dependencies for metadata processing |
| GitHub Actions | Standard runners | CI/CD automation without custom requirements |
| ML Models | Core ML format | Apple's native ML framework for on-device inference |
Contributing to UNPARTY Ecosystem Documentation
This repository follows the UNPARTY Repository Template standards. When documenting:
Repository Analysis Checklist
- Primary purpose identified
- Tech stack documented
- Key features listed
- Architecture diagram created
- Integration points mapped
- Business value alignment explained (ABOUT → BUILD → CONNECT)
- Relationship to ecosystem explained
- External dependencies documented
Ecosystem Alignment
- Protects Creator Ownership: Classification system respects repository autonomy
- Privacy-Focused: Classification can run locally with on-device inference; only metadata collection uses the GitHub API
- Cost-Sensitive: Minimal dependencies reduce operational costs
- User Progress: Enables measurable organization and discovery (ABOUT → BUILD → CONNECT)
Maintenance
Last Updated: 2025-10-29
Maintained By: UNPARTY Development Team
Review Frequency: Updated with major feature releases
Status: Active Development
Ecosystem Role: Core infrastructure for repository classification and metadata management
Related Documentation
Note: Links below assume UNPARTY repositories are in the same parent directory. For absolute paths, visit the repositories directly on GitHub.
- theunpartyapp/CLAUDE.md - Web app development guide
- theunpartyrunway/CLAUDE.md - Automation framework guide
- theunpartycrawler/CLAUDE.md - Analytics intelligence guide
- PARTY.md - One-sentence value and current state summary
Focus: Intelligent repository classification enabling measurable user progress through ABOUT → BUILD → CONNECT while protecting creator ownership, privacy, and cost-sensitivity.