CORE

the tool for labels

theunpartycore

Location: github.com/unparty-app/theunpartycore
Status: Active Development
Primary Purpose: Intelligent repository classification and metadata management system



Overview

theunpartycore is a developer-focused tool that automatically classifies and organizes code repositories using machine learning. It provides intelligent taxonomy, automated metadata generation, and consistent documentation management across the UNPARTY ecosystem.

Classification Types

The system intelligently categorizes repositories into 8 distinct types:

  • 🔧 Tool - Command-line utilities and standalone tools
  • 📱 App - Applications with user interfaces (web, desktop, mobile)
  • 🤖 Assistant - AI-powered assistants and conversational agents
  • ⚙️ Agent - Autonomous agents and automation systems
  • 🤖 Bot - Platform-specific bots (Discord, Telegram, Slack)
  • 📚 Library - Code libraries and packages for developers
  • 🏗️ Framework - Development frameworks and platforms
  • 🌐 Service - Backend services, APIs, and microservices
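
As a minimal sketch, the eight types listed above can be modeled as an enumeration so that labels are validated in one place. The names `RepoType` and `parse_repo_type` are hypothetical, and the lowercase string values are an assumption based on the `"type"` values seen in the training samples; the contents of `taxonomy.json` are not shown in this README.

```python
from enum import Enum


class RepoType(str, Enum):
    """The eight repository types named in the list above (values are assumed)."""
    TOOL = "tool"
    APP = "app"
    ASSISTANT = "assistant"
    AGENT = "agent"
    BOT = "bot"
    LIBRARY = "library"
    FRAMEWORK = "framework"
    SERVICE = "service"


def parse_repo_type(raw: str) -> RepoType:
    """Normalize a label such as ' Library ' and reject anything outside the taxonomy."""
    try:
        return RepoType(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(t.value for t in RepoType)
        raise ValueError(
            f"unknown repository type {raw!r}; expected one of: {allowed}"
        ) from None
```

Rejecting unknown labels early keeps typos such as `"libary"` out of training data and reports.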

Tech Stack

  • Framework: Swift Package Manager (SPM)
  • Language: Swift 5.9+, Python 3.11+
  • Platform: iOS 16+, macOS 13+, Linux (CLI only)
  • Machine Learning: Core ML (Apple platforms)
  • Database: JSON-based training data (JSONL format)
  • Build Tool: Xcode 15+, Swift CLI
  • Automation: GitHub Actions
  • Key Dependencies:
    • Swift Argument Parser (CLI interface)
    • python-frontmatter (metadata processing)

Key Features

✅ Intelligent Classification System

  • Core ML Integration: Native machine learning on Apple platforms with text classification models
  • 8 Repository Types: Comprehensive taxonomy covering tools, apps, bots, libraries, frameworks, services, assistants, and agents
  • Confidence Scoring: Probability-based classification with transparent decision-making
  • Rule-based Fallback: Handles edge cases without trained models
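
One way confidence scoring and the rule-based fallback could fit together is sketched here. This is an illustration, not the project's `ClassifierEngine` logic: the function name `pick_label`, the `0.6` threshold, and the choice to report `0.0` confidence for rule-based results are all assumptions.

```python
from typing import Callable, Mapping, Optional


def pick_label(
    probabilities: Optional[Mapping[str, float]],
    fallback: Callable[[], str],
    threshold: float = 0.6,  # assumed cut-off, not taken from the project
) -> tuple[str, float, str]:
    """Return (label, confidence, source).

    `probabilities` is None (or empty) when no trained model is available.
    """
    if probabilities:
        # Take the most probable label and keep its probability as the confidence.
        label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
        if confidence >= threshold:
            return label, confidence, "model"
    # No model, or the model was not confident enough: use the rules instead.
    return fallback(), 0.0, "rules"
```

Returning the source alongside the label keeps the decision transparent: a caller can tell whether a result came from the model or from the rules.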

✅ Automated Metadata Management

  • Frontmatter Automation: Automated slug generation from titles with checksum-based change detection
  • GitHub Actions Integration: CI/CD workflows for automated validation and synchronization
  • Content Validation: Ensures consistency across documentation and markdown files
  • Multi-field Processing: Handles tags, titles, descriptions, categories, authors, and dates
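
Slug generation from a title can be sketched with the standard library alone. The function name `slugify` and the exact normalization rules (ASCII folding, hyphen separators) are assumptions for illustration; the behavior of the project's `generate_slug.py` is not shown in this README.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated, ASCII-only slug."""
    # Decompose accented characters, then drop the non-ASCII combining marks.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of characters other than a-z and 0-9 with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    # Remove hyphens left over at either end.
    return slug.strip("-")
```

For example, `slugify("Hello, World!")` gives `hello-world`.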

✅ Multi-Platform Architecture

  • CLI Tool: macOS/Linux command-line interface for batch processing
  • iOS App: Native iOS interface for on-device classification
  • macOS App: Native macOS interface with SwiftUI
  • Python Scripts: Cross-platform automation for metadata processing

✅ Developer-Focused Workflow

  • Swift Package Manager: Modular architecture with shared ClassifyCore library
  • Comprehensive Testing: Unit tests for core classification logic
  • Training Data Pipeline: Structured JSONL format for model training and validation
  • Documentation: Extensive guides for architecture, features, and workflows

Architecture

Code

Classification System (Swift + Python)
├── Core ML Pipeline
│   ├── Data Collection → Extract metadata via GitHub API
│   ├── Feature Engineering → Transform text/metadata to ML features
│   ├── Model Training → Train Core ML text classifier
│   ├── Inference → Classify with confidence scores
│   └── Fallback → Rule-based classification
│
├── Swift Components (Sources/)
│   ├── ClassifyCore/              # Shared Swift logic
│   │   ├── MetadataParser.swift        # JSON → Swift model
│   │   ├── FeatureExtractor.swift      # Model → ML features
│   │   ├── ClassifierEngine.swift      # ML prediction logic
│   │   └── FrontmatterChecksum.swift   # Content validation
│   ├── CLI/                       # Command-line interface
│   │   └── main.swift                  # CLI entrypoint
│   └── App/                       # iOS/macOS SwiftUI app
│       ├── App.swift
│       ├── ContentView.swift
│       └── ClassifyViewModel.swift
│
├── Python Automation (.github/scripts/)
│   ├── generate_slug.py               # Slug generation
│   ├── generate_checksum_checker.py   # Validation script generator
│   └── slug_utils.py                  # Utility functions
│
├── Data Pipeline (data/)
│   ├── training/                  # ML training data
│   │   ├── repo_samples.jsonl          # Labeled samples
│   │   └── taxonomy.json               # Classification definitions
│   ├── processed/                 # Processed repository data
│   └── workflows/                 # Workflow configuration
│       └── checksum_fields.csv         # Checksum field definitions
│
└── Automation (GitHub Actions)
    ├── generate-checksum-checker.yml   # Checksum validation
    └── sync-frontmatter.yml            # Metadata synchronization

Integration Points

External Services

  • GitHub API: Repository metadata extraction and automation
  • Core ML: On-device machine learning inference (Apple platforms)
  • Swift Package Manager: Dependency management
  • GitHub Actions: CI/CD automation and validation

Internal Dependencies

  • Swift Argument Parser: CLI interface and command parsing
  • python-frontmatter: YAML frontmatter parsing and manipulation
  • Foundation: Apple's base framework for data processing (JSON, strings, files)

Business Value

ABOUT: Understanding Repository Taxonomy

  • Intelligent Organization: Automatically categorizes repositories by purpose and function
  • Consistent Classification: Provides standardized taxonomy across the entire UNPARTY ecosystem
  • Metadata Insights: Extracts and analyzes repository characteristics for better understanding
  • Discovery: Helps developers and users understand what each repository does at a glance

BUILD: Enabling Automated Development

  • Automated Workflows: Reduces manual documentation work through intelligent automation
  • Quality Assurance: Validates metadata consistency and completeness across projects
  • Scalable Architecture: Modular design enables easy extension and customization
  • Multi-Platform: Supports development across iOS, macOS, and command-line environments

CONNECT: Facilitating Ecosystem Integration

  • Cross-Repository Standards: Enforces consistent documentation patterns
  • API Integration: Enables programmatic access to classification services
  • Shared Knowledge: Training data and models can be shared across teams
  • Ecosystem Mapping: Provides foundation for understanding repository relationships

Relationship to Ecosystem

theunpartycore serves as the classification and organization brain of the UNPARTY ecosystem:

Dependencies

  • Used by theunpartyapp for repository organization and discovery
  • Leveraged by theunpartyrunway for development workflow automation
  • Integrated with theunpartycrawler for analytics and intelligence

Data Flow

Note: The diagram below uses Mermaid syntax. GitHub and most modern markdown viewers will render it automatically. If you don't see a diagram, view this file on GitHub or use a Mermaid-compatible viewer.

Text Representation:

  • GitHub Repositories → theunpartycore (provides metadata)
  • theunpartycore → theunpartyapp (classification results)
  • theunpartycore → theunpartyrunway (automation support)
  • theunpartycore → theunpartycrawler (analytics data)
  • theunpartycore ↔ Machine Learning Model (training/prediction cycle)

Ecosystem Position

  • Type: Developer Tool
  • Audience: Internal development team, automation systems
  • Purpose: Repository classification, metadata management, documentation automation
  • Integration Level: Core infrastructure component used by multiple repositories

Quick Start

Prerequisites

  • Swift 5.9+ (for Swift components)
  • Python 3.11+ (for automation scripts)
  • Xcode 15+ (for iOS/macOS development)

Installation

bash

# Clone the repository
git clone https://github.com/unparty-app/theunpartycore.git
cd theunpartycore

# Build Swift components
swift build

# Install Python dependencies
pip install python-frontmatter

Usage

🖥️ Command Line Interface

bash

# Classify a repository from JSON metadata
swift run classify data/processed/theunpartycore.processed.json

# Use custom model and JSON output
swift run classify --model model/RepoClassifier.mlmodel --format json data.json

# Show detailed probabilities
swift run classify --verbose --show-probabilities data.json

📱 iOS/macOS App

  1. Open Package.swift in Xcode
  2. Select the App scheme
  3. Build and run on your device/simulator
  4. Tap "Classify Sample Repo" to see the classification in action

🐍 Python Automation

bash

# Generate slugs for markdown files
python .github/scripts/generate_slug.py content/

# Check if frontmatter has changed (used by CI)
python .github/scripts/check_frontmatter_checksum.py content/example.md

# Generate the checksum checker script
python .github/scripts/generate_checksum_checker.py

🧪 Features

✅ Core ML Integration

  • Native machine learning on Apple platforms
  • Supports both text classification and rule-based fallback
  • Optimized for iOS and macOS deployment

✅ Swift Package Manager

  • Modular architecture with ClassifyCore shared library
  • Separate CLI and App targets
  • Comprehensive test coverage

✅ Frontmatter Automation

  • Automated slug generation from titles
  • Checksum-based change detection
  • GitHub Actions integration for CI/CD

✅ Multi-Platform Support

  • CLI: macOS/Linux command-line tool
  • iOS App: Native iOS interface
  • macOS App: Native macOS interface
  • Python Scripts: Cross-platform automation

Training Data & Machine Learning

The system learns from labeled repository samples in data/training/repo_samples.jsonl:

json

{"label": "theunpartycore", "type": "tool", "description": "CLI utility to classify machine-executable systems by role using Core ML", "readme_text": "theunpartycore is a tool for classification of machine-executable-systems. Uses machine-learning to assign each machine-executable-system a core-type.", "topics": ["coreml", "classifier", "audit", "repo-metadata", "swift"], "languages": ["Swift", "Python"], "file_signals": ["scripts/", "Sources/", "Package.swift", "README.md"], "custom_terms": ["machine-executable-system", "core-type", "classification"]}
{"label": "lodash", "type": "library", "description": "A modern JavaScript utility library delivering modularity, performance & extras", "readme_text": "Lodash makes JavaScript easier by taking the hassle out of working with arrays, numbers, objects, strings, etc.", "topics": ["javascript", "utility", "library", "functional"], "languages": ["JavaScript"], "file_signals": ["lib/", "index.js", "package.json"], "custom_terms": ["utility", "functional-programming"]}
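
A small loader can check that each JSONL line carries the keys seen in the two samples above. This is a sketch: the name `parse_samples` is hypothetical, and treating all eight keys as required is an assumption, since the README does not say which fields are optional.

```python
import json
from typing import Iterable, Iterator

# Keys that appear in the sample records above (assumed to be required).
REQUIRED_KEYS = {
    "label", "type", "description", "readme_text",
    "topics", "languages", "file_signals", "custom_terms",
}


def parse_samples(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-blank JSONL line, failing fast on missing keys."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        sample = json.loads(line)
        missing = REQUIRED_KEYS - sample.keys()
        if missing:
            raise ValueError(f"line {number}: missing keys {sorted(missing)}")
        yield sample
```

Because an open file is an iterable of lines, the function can be called as `list(parse_samples(open("data/training/repo_samples.jsonl", encoding="utf-8")))`.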

Training Sample Components

  • Repository Metadata: Stars, languages, topics, file structure
  • README Analysis: Content extraction and pattern recognition
  • File Signals: Directory structure indicators (e.g., src/, lib/, bot/)
  • Custom Terms: Domain-specific vocabulary and classification hints
  • Manual Labels: Ground truth classifications for supervised learning
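
For a text classifier, the components above are typically flattened into one input string per sample. The sketch below shows one way to do that; `build_text_feature` is a hypothetical name, and the real `FeatureExtractor.swift` may weight or combine fields differently.

```python
def build_text_feature(sample: dict) -> str:
    """Concatenate the descriptive fields of a sample into one lowercase string."""
    parts = [
        sample.get("description", ""),
        sample.get("readme_text", ""),
        " ".join(sample.get("topics", [])),
        " ".join(sample.get("languages", [])),
        " ".join(sample.get("file_signals", [])),
        " ".join(sample.get("custom_terms", [])),
    ]
    # Skip empty fields so missing data does not leave stray spaces.
    return " ".join(part.strip() for part in parts if part.strip()).lower()
```

The manual label (`"type"`) is deliberately left out of the string, since it is the target the classifier learns to predict.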

ML Pipeline Stages

  1. Data Collection → Extract repository metadata via GitHub API
  2. Feature Engineering → Transform text and metadata into ML features
  3. Model Training → Train Core ML text classifier from labeled samples
  4. Inference → Classify new repositories with confidence scores
  5. Rule-based Fallback → Handle cases without trained models
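
The final stage, rule-based fallback, could be as simple as keyword counting. The keyword table here is hypothetical and covers only four of the eight types for brevity; the project's actual rules are not shown in this README.

```python
import re

# Hypothetical keyword rules, for illustration only.
RULES: dict[str, tuple[str, ...]] = {
    "bot": ("discord", "telegram", "slack"),
    "library": ("library", "package", "sdk"),
    "service": ("api", "microservice", "backend"),
    "tool": ("cli", "command-line", "utility"),
}


def classify_by_rules(text: str, default: str = "tool") -> tuple[str, dict[str, int]]:
    """Return the label with the most keyword hits, plus the per-label counts."""
    # Whole-word tokens, so "api" does not match inside "rapid".
    tokens = re.findall(r"[a-z0-9-]+", text.lower())
    scores = {
        label: sum(tokens.count(keyword) for keyword in keywords)
        for label, keywords in RULES.items()
    }
    # Ties resolve to the label listed first in RULES.
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else default), scores
```

Returning the raw counts alongside the label makes it possible to inspect why a rule fired, which helps when tuning the keyword lists.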

Documentation


Project Dependencies

| Component | Dependencies | Purpose |
| --- | --- | --- |
| Swift Core | Core ML, Foundation | Native Apple frameworks for ML and data processing |
| CLI Tool | ArgumentParser | Swift package (via Swift Package Manager) for the command-line interface |
| Python Scripts | python-frontmatter | Minimal external dependencies for metadata processing |
| GitHub Actions | Standard runners | CI/CD automation without custom requirements |
| ML Models | Core ML format | Apple's native ML framework for on-device inference |

Contributing to UNPARTY Ecosystem Documentation

This repository follows the UNPARTY Repository Template standards. When documenting:

Repository Analysis Checklist

  • Primary purpose identified
  • Tech stack documented
  • Key features listed
  • Architecture diagram created
  • Integration points mapped
  • Business value alignment explained (ABOUT → BUILD → CONNECT)
  • Relationship to ecosystem explained
  • External dependencies documented

Ecosystem Alignment

  • Protects Creator Ownership: Classification system respects repository autonomy
  • Privacy-Focused: Classification runs locally (on-device with Core ML); only metadata collection calls the GitHub API
  • Cost-Sensitive: Minimal dependencies reduce operational costs
  • User Progress: Enables measurable organization and discovery (ABOUT → BUILD → CONNECT)

Maintenance

Last Updated: 2025-10-29
Maintained By: UNPARTY Development Team
Review Frequency: Updated with major feature releases

Status: ✅ Active Development
Ecosystem Role: Core infrastructure for repository classification and metadata management


Note: Links below assume UNPARTY repositories are in the same parent directory. For absolute paths, visit the repositories directly on GitHub.


Focus: Intelligent repository classification enabling measurable user progress through ABOUT → BUILD → CONNECT while protecting creator ownership, privacy, and cost-sensitivity.