Master Project Task List

Document Purpose

This document is the authoritative overview of the Voice by aiConnected platform build. It gives Claude Code complete context on the project’s goals, architecture, infrastructure decisions, and the sequence of work required to deliver a production-ready Voice AI contact center platform. Read it before beginning any implementation work.

Project Overview

What We Are Building

Voice by aiConnected is a white-label Voice AI contact center platform that enables businesses to deploy autonomous AI agents capable of handling inbound and outbound phone calls. The platform integrates with existing phone infrastructure (GoToConnect), leverages real-time audio processing (LiveKit), and delivers hyper-realistic conversational AI through a streaming pipeline of Speech-to-Text, Large Language Model, and Text-to-Speech services.

Business Context

Target Market: Small to medium-sized businesses needing 24/7 phone coverage, lead response, appointment scheduling, and customer service automation
Pricing Model: Fixed credit buckets plus per-minute overages
Competitive Advantage: 50-75% lower cost than competitors (Vapi, Retell, Bland AI) through infrastructure ownership and optimized provider selection
Parent Company: Oxford Pierpont Corporation (business development and digital marketing)

Core Capabilities

Inbound Call Handling — AI answers calls, converses naturally, resolves inquiries or transfers to humans
Outbound Call Automation — AI initiates calls for lead follow-up, appointment reminders, reactivation campaigns
Human Handoff — Seamless transfer to live agents via blind transfer, warm transfer, or conference
Tool Calling — AI executes business logic (CRM updates, calendar booking, data lookup) via webhooks/n8n
Knowledge Base Integration — AI responses informed by client-specific business context (already built)
Multi-Tenant Architecture — Single platform serves multiple clients with isolated configurations

Architecture Summary

Voice Pipeline

┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│   PSTN ←→ GoToConnect PBX ←→ WebRTC Bridge ←→ LiveKit Room                 │
│                                    (aiortc)         │                       │
│                                                     ├── Deepgram STT        │
│                                                     │   (streaming)         │
│                                                     │                       │
│                                                     ├── Claude LLM          │
│                                                     │   (streaming)         │
│                                                     │                       │
│                                                     ├── Chatterbox TTS      │
│                                                     │   (streaming)         │
│                                                     │                       │
│                                                     └── Tool Webhooks       │
│                                                         (async, n8n)        │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
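
The streaming hand-off in the diagram above can be sketched as a chain of async generators, where each stage consumes partial output from the previous one instead of waiting for a complete result. This is only an illustration under stated assumptions: `fake_stt`, `fake_llm`, `fake_tts`, and `caller_audio` are hypothetical stubs, not the Deepgram, Anthropic, Chatterbox, or LiveKit APIs, and real turn-taking (waiting for the end of an utterance) is omitted.

```python
import asyncio
from typing import AsyncIterator


async def fake_stt(audio_chunks: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Hypothetical STT stub: emit one transcript fragment per audio chunk."""
    async for chunk in audio_chunks:
        yield chunk.decode("utf-8")


async def fake_llm(transcripts: AsyncIterator[str]) -> AsyncIterator[str]:
    """Hypothetical LLM stub: emit one reply token per transcript fragment."""
    async for text in transcripts:
        yield f"reply({text})"


async def fake_tts(tokens: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Hypothetical TTS stub: turn each token into an 'audio' frame."""
    async for token in tokens:
        yield token.encode("utf-8")


async def caller_audio() -> AsyncIterator[bytes]:
    """Hypothetical audio source; a real bridge would read frames from the room."""
    for word in (b"hello", b"world"):
        await asyncio.sleep(0)  # yield control, as real network I/O would
        yield word


async def run_pipeline() -> list[bytes]:
    """Pull frames through the whole chain; each stage runs as data arrives."""
    frames = []
    async for frame in fake_tts(fake_llm(fake_stt(caller_audio()))):
        frames.append(frame)
    return frames


if __name__ == "__main__":
    print(asyncio.run(run_pipeline()))
```

Because every stage is a generator, the first output frame is produced as soon as the first input chunk has passed through all three stages, which is the property the latency budget depends on.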

Latency Budget (Target: <1000ms mouth-to-ear)

Stage	Target	Notes
Audio capture → STT	~100ms	Streaming VAD
STT processing	~300ms	Deepgram interim results
LLM time-to-first-token	~350ms	Claude streaming
TTS time-to-first-byte	~150ms	Chatterbox streaming
Return audio path	~70ms	LiveKit → GoTo → PSTN
Total	~970ms	Achievable with optimization
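
The budget arithmetic can be checked with a few lines of Python. The stage values are copied from the table above; the dictionary keys and the `budget_headroom_ms` helper are illustrative names, not part of any specification.

```python
# Stage targets in milliseconds, copied from the latency budget table.
LATENCY_BUDGET_MS = {
    "audio_capture_to_stt": 100,
    "stt_processing": 300,
    "llm_time_to_first_token": 350,
    "tts_time_to_first_byte": 150,
    "return_audio_path": 70,
}
TARGET_MS = 1000


def budget_headroom_ms(budget: dict[str, int], target_ms: int) -> int:
    """Return how many milliseconds remain under the target (negative if over)."""
    return target_ms - sum(budget.values())


total_ms = sum(LATENCY_BUDGET_MS.values())
headroom_ms = budget_headroom_ms(LATENCY_BUDGET_MS, TARGET_MS)
print(f"total={total_ms}ms headroom={headroom_ms}ms")
```

The stages sum to 970 ms, which leaves 30 ms of headroom under the 1000 ms target, so any single stage that overruns its target by more than that pushes the total over budget.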

Infrastructure Topology

┌─────────────────────────────────────────────────────────────────────────────┐
│ EXTERNAL SERVICES                                                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   GoToConnect          LiveKit Cloud        RunPod                          │
│   (Telephony)          (Real-time Audio)    (Chatterbox GPU)                │
│        │                     │                   │                          │
│   Deepgram             Anthropic API        n8n Cloud/Self-hosted           │
│   (STT)                (Claude LLM)         (Webhooks)                      │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ DIGITALOCEAN / DOKPLOY                                                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐       │
│   │   API       │  │   WebRTC    │  │   Agent     │  │   Worker    │       │
│   │   Gateway   │  │   Bridge    │  │   Service   │  │   Service   │       │
│   └─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘       │
│                                                                             │
│   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                        │
│   │ PostgreSQL  │  │   Redis     │  │ DO Spaces   │                        │
│   │ (Database)  │  │   (Cache)   │  │ (Storage)   │                        │
│   └─────────────┘  └─────────────┘  └─────────────┘                        │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Infrastructure Decisions (Finalized)

These decisions have been made and should not be revisited during implementation:

Component	Decision	Rationale
Telephony	GoToConnect	Grandfathered $17/user unlimited plan; full call control API
Real-time Audio	LiveKit Cloud	Industry standard; Agents SDK for voice AI
STT	Deepgram Nova-2	Low latency streaming; phone audio optimized
LLM	Anthropic Claude (Sonnet)	Best reasoning; streaming support
TTS	Chatterbox-Turbo on RunPod	Zero per-minute cost; MIT license; paralinguistics
GPU	RunPod RTX A5000	Best value ($0.27/hr); 24GB VRAM sufficient
Platform Hosting	DigitalOcean + Dokploy	Existing infrastructure; container orchestration
Database	PostgreSQL	Relational; proven; existing expertise
Cache/State	Redis	Session state; call state machine
Object Storage	DO Spaces	Voice samples; call recordings
Webhooks	n8n	Tool calling; existing expertise
Knowledge Base	Existing system	Already built and integrated
Admin Dashboard	Existing system	Add service config page; UI polish is last priority

Cost Structure

Per-Minute Breakdown (at 50k min/month scale)

Component	Cost
LiveKit Agent Session	$0.0100/min
GoToConnect Telephony	$0.0000/min (unlimited)
Deepgram STT	$0.0043/min
Claude Sonnet LLM	$0.0080/min (estimated)
Chatterbox TTS (amortized)	$0.0040/min
Total	~$0.026/min
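
As a worked check of the breakdown above, the component rates sum to $0.0263 per minute. The sketch below uses `Decimal` to avoid floating-point drift; the dictionary keys are illustrative names, and the 50,000-minute volume is the scale stated in the heading.

```python
from decimal import Decimal

# Per-minute rates in USD, copied from the table above.
PER_MINUTE_USD = {
    "livekit_agent_session": Decimal("0.0100"),
    "gotoconnect_telephony": Decimal("0.0000"),
    "deepgram_stt": Decimal("0.0043"),
    "claude_sonnet_llm": Decimal("0.0080"),  # estimated in the source table
    "chatterbox_tts_amortized": Decimal("0.0040"),
}
MINUTES_PER_MONTH = 50_000

per_minute_total = sum(PER_MINUTE_USD.values(), Decimal("0"))
monthly_variable_cost = per_minute_total * MINUTES_PER_MONTH
print(f"${per_minute_total}/min, ${monthly_variable_cost:.2f}/month at 50k minutes")
```

At 50,000 minutes per month this works out to $1,315.00 in per-minute costs.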

Monthly Infrastructure

Service	Est. Cost
GoToConnect	$17/user
LiveKit Cloud	~$50-100
RunPod A5000	~$197
Deepgram	~$50-100
Anthropic API	~$100-300
DigitalOcean	~$50-100
Total	~$500-800/mo starting
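
The monthly estimate can be reproduced as follows. This sketch assumes a single GoToConnect user seat and a 730-hour month for the always-on GPU; both are assumptions made here for the arithmetic, not figures stated in the table.

```python
HOURS_PER_MONTH = 730  # assumption: average month length (8760 h / 12)
RUNPOD_USD_PER_HOUR = 0.27  # A5000 rate from the infrastructure decisions table

# (low, high) monthly estimates in USD, copied from the table above.
MONTHLY_RANGES = {
    "gotoconnect_one_user": (17, 17),  # assumption: one user seat
    "livekit_cloud": (50, 100),
    "runpod_a5000": (197, 197),
    "deepgram": (50, 100),
    "anthropic_api": (100, 300),
    "digitalocean": (50, 100),
}

runpod_monthly = RUNPOD_USD_PER_HOUR * HOURS_PER_MONTH
low = sum(lo for lo, _ in MONTHLY_RANGES.values())
high = sum(hi for _, hi in MONTHLY_RANGES.values())
print(f"RunPod: ${runpod_monthly:.2f}/mo, total: ${low}-${high}/mo")
```

Under these assumptions the range is $464 to $814 per month, consistent with the rounded ~$500-800 figure, and the RunPod line item matches $0.27/hr running continuously.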

Build Phases

Phase 1: Foundation (Documents 1-6)

Goal: Development environment ready, architecture fully documented

#	Document	Purpose
1	System Architecture Overview	Complete technical blueprint
2	GoToConnect Integration Specification	Telephony API details
3	Voice Pipeline Architecture	STT→LLM→TTS streaming design
4	WebRTC Bridge Technical Design	GoTo↔LiveKit audio bridging
5	Development Environment Setup Guide	Local dev stack
6	Codebase Structure & Conventions	Repo organization

Deliverables:

Architecture diagrams finalized
All API contracts documented
Local development environment functional
Repository structure established

Phase 2: Core Infrastructure (Documents 7-11)

Goal: Database, state management, and service skeleton operational

#	Document	Purpose
7	Database Schema Design	PostgreSQL tables, migrations
8	State Management Specification	Call state machine, Redis structures
9	Message Queue & Event Bus Design	Async communication patterns
10	Error Handling & Recovery Patterns	Resilience patterns
11	Core Services Implementation Guide	Service implementations

Deliverables:

Database migrations created and tested
Redis state management implemented
Event bus operational
Core services running (API gateway, bridge, agent, worker)
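
A call state machine of the kind planned for Document 8 could look like the sketch below. The state names and allowed transitions are hypothetical placeholders until the State Management Specification defines the real ones, and state is held in a local variable here rather than in Redis.

```python
from enum import Enum


class CallState(str, Enum):
    """Hypothetical call states; the real set belongs to Document 8."""
    RINGING = "ringing"
    CONNECTED = "connected"
    TRANSFERRING = "transferring"
    ENDED = "ended"


# Allowed transitions; anything not listed is rejected.
TRANSITIONS: dict[CallState, set[CallState]] = {
    CallState.RINGING: {CallState.CONNECTED, CallState.ENDED},
    CallState.CONNECTED: {CallState.TRANSFERRING, CallState.ENDED},
    CallState.TRANSFERRING: {CallState.CONNECTED, CallState.ENDED},
    CallState.ENDED: set(),
}


class InvalidTransition(Exception):
    """Raised when a transition is not in the TRANSITIONS table."""


def advance(current: CallState, target: CallState) -> CallState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target


state = CallState.RINGING
state = advance(state, CallState.CONNECTED)
state = advance(state, CallState.TRANSFERRING)
state = advance(state, CallState.ENDED)
print(state.value)
```

Because `CallState` subclasses `str`, `state.value` can be stored directly as a Redis string keyed by call ID, and an explicit transition table keeps illegal jumps (for example, reviving an ended call) from silently corrupting session state.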

Phase 3: Provider Integrations (Documents 12-16)

Goal: All external services connected and functional

#	Document	Purpose
12	LiveKit Integration Specification	Agents SDK, room management
13	Deepgram STT Integration Guide	Streaming transcription
14	Anthropic Claude Integration Guide	LLM streaming, tools
15	Chatterbox TTS Integration Guide	RunPod deployment, synthesis
16	Tool Calling & Webhook Specification	n8n integration

Deliverables:

LiveKit Agents pipeline functional
Deepgram streaming STT working
Claude streaming responses working
Chatterbox deployed on RunPod
Tool calling via webhooks operational
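
Tool calling via webhooks amounts to posting a JSON description of the tool call to an n8n workflow. The sketch below only builds the request and does not send it. The URL, the payload field names, and the `build_tool_request` helper are all assumptions for illustration; the actual contract is the subject of Document 16.

```python
import json
import urllib.request

# Hypothetical webhook URL; the real n8n instance URL is still an open question.
WEBHOOK_URL = "https://n8n.example.com/webhook/voice-tools"


def build_tool_request(
    tenant_id: str, call_id: str, tool: str, arguments: dict
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST describing one tool call."""
    payload = {
        "tenant_id": tenant_id,  # keeps tool calls scoped to one client
        "call_id": call_id,
        "tool": tool,
        "arguments": arguments,
    }
    return urllib.request.Request(
        WEBHOOK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_tool_request(
    "tenant-a", "call-123", "calendar_check", {"date": "2026-02-01"}
)
print(request.get_method(), request.full_url)
```

Sending would be a call such as `urllib.request.urlopen(request, timeout=5)`. In the live pipeline that call would need to run off the audio path (for example in a worker thread) so that a slow webhook does not stall speech.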

Phase 4: Call Features (Documents 17-20)

Goal: Complete call handling capabilities

#	Document	Purpose
17	Inbound Call Flow Specification	Answer, converse, resolve
18	Outbound Call Flow Specification	Dial, converse, resolve
19	Human Handoff Specification	Transfer patterns
20	Knowledge Base Integration Guide	Context injection

Deliverables:

Inbound calls answered by AI
Outbound calls initiated by AI
Transfers to human agents working
Knowledge base context in AI responses

Phase 5: Platform (Documents 21-23)

Goal: Multi-tenant API complete

#	Document	Purpose
21	Tenant Configuration API Specification	Agent/voice/number management
22	Usage Metering & Billing Integration	Credit tracking, overages
23	API Specification (OpenAPI)	Public API documentation

Deliverables:

Tenant CRUD operations
Usage tracking per tenant
Billing hooks implemented
API documented and versioned
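
The pricing model named under Business Context (fixed credit buckets plus per-minute overages) reduces to a small calculation. The plan values below are hypothetical examples, not actual pricing, and `monthly_charge` is an illustrative helper name.

```python
from decimal import Decimal


def monthly_charge(
    minutes_used: int,
    bucket_minutes: int,
    bucket_price: Decimal,
    overage_rate: Decimal,
) -> Decimal:
    """Fixed bucket price plus a per-minute charge for usage beyond the bucket."""
    overage_minutes = max(0, minutes_used - bucket_minutes)
    return bucket_price + overage_minutes * overage_rate


# Hypothetical plan: 1,000 included minutes for $200, then $0.25 per extra minute.
plan = {
    "bucket_minutes": 1000,
    "bucket_price": Decimal("200"),
    "overage_rate": Decimal("0.25"),
}

print(monthly_charge(800, **plan))   # usage within the bucket
print(monthly_charge(1200, **plan))  # 200 minutes over the bucket
```

With this example plan, 800 minutes costs the flat $200 and 1,200 minutes costs $250. Per-tenant usage tracking only has to supply `minutes_used` for each billing period.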

Phase 6: Operations (Documents 24-27)

Goal: Production deployment with observability

#	Document	Purpose
24	Infrastructure Architecture	DO/Dokploy/RunPod topology
25	Deployment Runbook	Step-by-step production deploy
26	CI/CD Pipeline Specification	Automated build/deploy
27	Monitoring & Observability Guide	Metrics, logs, alerts

Deliverables:

Production environment provisioned
Deployment automated
Monitoring dashboards operational
Alerting configured

Phase 7: Hardening (Documents 28-30)

Goal: Secure, tested, resilient system

#	Document	Purpose
28	Security Architecture Document	Auth, encryption, trust boundaries
29	Testing Strategy Document	Test coverage plan
30	Failure Mode Handling Guide	Failovers, fallbacks

Deliverables:

Security audit passed
Test suite comprehensive
Failure scenarios handled gracefully

Skills (Provider API Reference)

In addition to the 30 build documents, the following skills provide API reference material:

/mnt/skills/user/voice-platform/
├── SKILL.md                     # Overview, when to use each sub-skill
├── gotoconnect/
│   ├── SKILL.md                 # Auth, endpoints, code patterns
│   └── postman_collection.json  # Full API collection
├── livekit/
│   └── SKILL.md                 # Agents SDK, room management
├── deepgram/
│   └── SKILL.md                 # Streaming STT configuration
├── anthropic/
│   └── SKILL.md                 # Streaming, tool calling
├── chatterbox/
│   └── SKILL.md                 # RunPod, API wrapper, voice cloning
└── n8n/
    └── SKILL.md                 # Webhook patterns

Success Criteria

MVP Definition

The minimum viable product is achieved when:

Inbound Call: A call to a GoToConnect number is answered by the AI, which holds a natural conversation and either resolves the inquiry or transfers to a human
Outbound Call: The platform initiates a call via an API trigger, and the AI converses with the recipient
Human Handoff: AI successfully transfers a call (blind or warm) to a live agent
Tool Execution: AI executes at least one tool call (e.g., CRM update, calendar check) during a conversation
Multi-Tenant: Two separate clients can operate independent AI agents simultaneously
Latency: Mouth-to-ear response time under 1.5 seconds for 90% of interactions

Quality Gates

Metric	Target
Call completion rate	>95%
Transfer success rate	>99%
STT accuracy	>90%
Average latency	<1000ms
Concurrent calls (per tenant)	10+
Uptime	99.5%
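
The two latency criteria (90% of interactions under 1.5 seconds for the MVP, and an average under 1000 ms as a quality gate) can be evaluated from measured samples as sketched below. The nearest-rank percentile method and the sample values are assumptions chosen for illustration; the document does not specify how the percentile is computed.

```python
import math


def percentile_nearest_rank(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_gates_pass(samples_ms: list[float]) -> bool:
    """MVP gate: p90 under 1500 ms. Quality gate: mean under 1000 ms."""
    p90 = percentile_nearest_rank(samples_ms, 90)
    mean = sum(samples_ms) / len(samples_ms)
    return p90 < 1500 and mean < 1000


# Hypothetical measurements for ten interactions, in milliseconds.
samples = [820, 900, 950, 870, 910, 990, 1040, 880, 930, 1600]
print(percentile_nearest_rank(samples, 90), latency_gates_pass(samples))
```

In this sample set one interaction takes 1600 ms, yet both gates still pass: the 90th percentile is 1040 ms and the mean is 989 ms. This shows why the two criteria are tracked separately.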

Constraints & Requirements

Technical Constraints

Python preferred for WebRTC bridge (aiortc ecosystem)
LiveKit Agents SDK is Python-native
PostgreSQL for relational data (existing expertise)
Redis for ephemeral state (call sessions)
Docker/Dokploy for container orchestration

Business Constraints

Timeline: MVP within 6-10 weeks
Budget: Minimize upfront costs; scale with usage
Team: Development via Claude Code with human oversight
Existing Systems: Must integrate with existing Knowledge Base and Admin Dashboard

Non-Goals (Out of Scope for MVP)

Custom voice cloning per client (use pre-set voices initially)
Multi-language support (English only for MVP)
SMS/chat channels (voice only)
Compliance certifications (SOC 2, HIPAA) — plan for later
Mobile app
Analytics dashboard beyond basic usage metrics

Document Dependency Map

┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 1: FOUNDATION                                                         │
│                                                                             │
│   [1] System Architecture ─────┬─────────────────────────────────────────┐ │
│            │                   │                                         │ │
│            ▼                   ▼                                         │ │
│   [2] GoToConnect    [3] Voice Pipeline    [5] Dev Environment          │ │
│            │                   │                     │                   │ │
│            └─────────┬─────────┘                     │                   │ │
│                      ▼                               │                   │ │
│            [4] WebRTC Bridge ◄───────────────────────┘                   │ │
│                      │                                                   │ │
│                      ▼                                                   │ │
│            [6] Codebase Structure                                        │ │
└─────────────────────────────────────────────────────────────────────────────┘
                                        │
                                        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 2: CORE INFRASTRUCTURE                                                │
│                                                                             │
│   [7] Database Schema ◄─── [8] State Management ◄─── [9] Event Bus         │
│            │                        │                       │              │
│            └────────────────────────┼───────────────────────┘              │
│                                     ▼                                      │
│                          [10] Error Handling                               │
│                                     │                                      │
│                                     ▼                                      │
│                      [11] Core Services Implementation                     │
└─────────────────────────────────────────────────────────────────────────────┘
                                        │
                                        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 3: PROVIDER INTEGRATIONS                                              │
│                                                                             │
│   [12] LiveKit ──┬── [13] Deepgram ──┬── [14] Claude ──┬── [15] Chatterbox │
│                  │                   │                 │                   │
│                  └───────────────────┴─────────────────┘                   │
│                                      │                                     │
│                                      ▼                                     │
│                            [16] Tool Calling                               │
└─────────────────────────────────────────────────────────────────────────────┘
                                        │
                                        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 4: CALL FEATURES                                                      │
│                                                                             │
│   [17] Inbound ────┬──── [18] Outbound                                     │
│         │          │            │                                          │
│         │          ▼            │                                          │
│         │    [19] Human Handoff │                                          │
│         │          │            │                                          │
│         └──────────┼────────────┘                                          │
│                    ▼                                                       │
│         [20] Knowledge Base Integration                                    │
└─────────────────────────────────────────────────────────────────────────────┘
                                        │
                                        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 5: PLATFORM                                                           │
│                                                                             │
│   [21] Tenant Config API ──── [22] Usage Metering ──── [23] OpenAPI Spec   │
└─────────────────────────────────────────────────────────────────────────────┘
                                        │
                                        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 6: OPERATIONS                                                         │
│                                                                             │
│   [24] Infrastructure ── [25] Deployment ── [26] CI/CD ── [27] Monitoring  │
└─────────────────────────────────────────────────────────────────────────────┘
                                        │
                                        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 7: HARDENING                                                          │
│                                                                             │
│   [28] Security ──── [29] Testing ──── [30] Failure Modes                  │
└─────────────────────────────────────────────────────────────────────────────┘
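
A dependency map like the one above can be turned into a valid working order with a topological sort. The sketch below encodes one reading of the Phase 1 box only (the arrows in the diagram are open to interpretation) and uses the standard-library `graphlib` module, available from Python 3.9.

```python
from graphlib import TopologicalSorter

# Document number -> documents it depends on, for Phase 1 only.
# This mapping is one interpretation of the diagram, not a confirmed spec.
PHASE_1_DEPENDENCIES = {
    1: set(),      # System Architecture
    2: {1},        # GoToConnect
    3: {1},        # Voice Pipeline
    5: {1},        # Dev Environment
    4: {2, 3, 5},  # WebRTC Bridge
    6: {4},        # Codebase Structure
}

order = list(TopologicalSorter(PHASE_1_DEPENDENCIES).static_order())
print(order)
```

Any order it produces starts with Document 1 and ends with Document 6, with Document 4 placed after Documents 2, 3, and 5. `TopologicalSorter` raises `graphlib.CycleError` if a circular dependency is introduced by mistake.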

How to Use This Document

For Claude Code

Read this document completely before starting any implementation
Follow the phase order — each phase builds on the previous
Consult skills for API-specific implementation details
Reference individual documents for detailed specifications
Check deliverables at the end of each phase before proceeding

For Human Oversight

Review completed phases before approving progression
Test deliverables against success criteria
Provide credentials and access as needed per phase
Clarify requirements when documents reference “TBD” items

Open Questions (To Be Resolved)

Question	Owner	Status
GoToConnect OAuth credentials for dev environment	Human	Pending
LiveKit Cloud project setup	Human	Pending
Deepgram API key	Human	Pending
Anthropic API key	Human	Pending
RunPod account and A5000 provisioning	Human	Pending
DigitalOcean/Dokploy access	Human	Pending
n8n instance URL and credentials	Human	Pending
Knowledge Base API endpoint	Human	Pending
Existing Admin Dashboard repo access	Human	Pending

Version History

Version	Date	Author	Changes
1.0	2026-01-16	Claude	Initial document

Next Steps

Human reviews and approves this Master Project Task List
Human provides access credentials for open questions
Claude Code proceeds to Document #1: System Architecture Overview
Build proceeds phase by phase with human checkpoints

This document is the single source of truth for the Voice by aiConnected project. All implementation decisions should align with the specifications herein.

Pre-Build Checklist

✅ Infrastructure (Covered)

GoToConnect (telephony)
LiveKit Cloud (real-time audio)
RunPod A5000 (Chatterbox TTS)
Deepgram (STT)
Anthropic Claude (LLM)
DigitalOcean/Dokploy (platform)

⚠️ Technical (Needs Planning)

Item	Status	Notes
WebRTC Bridge	To build	Python/aiortc service connecting GoTo ↔ LiveKit
Database	Needed	PostgreSQL for tenants, configs, logs
Redis	Needed	Call state machine, session cache
Object Storage	Needed	Voice samples, call recordings
Knowledge Base	Existing	Already built (see Infrastructure Decisions); confirm how clients upload business context for their agents
n8n / Webhooks	Needed	Tool calling (CRM, calendar, etc.)
Monitoring	Needed	Grafana/Prometheus or Datadog

⚠️ Business Logic (Needs Planning)

Item	Question to Answer
Billing/Metering	How do you charge clients? Per minute? Per seat? Flat rate?
Usage Tracking	How do you track minutes per tenant for billing?
Admin Dashboard	What can clients configure themselves?
Onboarding Flow	How do clients set up their first agent?
Voice Management	How do clients provide/record their brand voice?
Human Handoff	How do live agents get notified and take over?
Call Recording	Store recordings? How long? Client access?
Rate Limits	Max concurrent calls per client tier?

⚠️ Compliance/Legal (Critical)

Item	Why It Matters
AI Disclosure	Some states (CA, WA, etc.) require disclosing that the caller is speaking to an AI
TCPA Compliance	Outbound calling rules, consent requirements
Call Recording Consent	Two-party consent states
Data Retention Policy	How long do you keep call data?
Privacy Policy	Required for handling caller PII
Terms of Service	Liability, acceptable use
DPA (Data Processing Agreement)	For B2B clients

⚠️ Failure Modes (Needs Planning)

Scenario	Fallback Plan
LLM times out	Graceful “one moment please” + retry?
TTS fails	Pre-recorded fallback audio?
STT fails	Ask caller to repeat?
RunPod goes down	Failover to Resemble API?
Call volume spike	Queue management? Auto-scale?
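
One possible shape for the "LLM times out" fallback is a timeout wrapper that speaks a filler phrase and retries. This is a sketch under assumptions, not a decided design: the timeout value, the attempt count, the final handoff phrase, and the `flaky_llm` stub are all hypothetical.

```python
import asyncio

FILLER_PHRASE = "One moment please."
HANDOFF_PHRASE = "Let me transfer you to a team member."  # assumed last resort


async def respond_with_retry(generate, timeout_s: float, max_attempts: int = 2) -> list[str]:
    """Return the phrases to speak: a filler per timed-out attempt, then the reply.

    `generate` is any zero-argument coroutine function that returns reply text.
    """
    spoken = []
    for _ in range(max_attempts):
        try:
            reply = await asyncio.wait_for(generate(), timeout=timeout_s)
        except asyncio.TimeoutError:
            spoken.append(FILLER_PHRASE)
            continue
        spoken.append(reply)
        return spoken
    # Every attempt timed out: fall back to a human handoff message.
    spoken.append(HANDOFF_PHRASE)
    return spoken


# Hypothetical LLM stub: the first call is slow, the second answers quickly.
calls = {"count": 0}


async def flaky_llm() -> str:
    calls["count"] += 1
    await asyncio.sleep(0.2 if calls["count"] == 1 else 0)
    return "Your appointment is confirmed."


result = asyncio.run(respond_with_retry(flaky_llm, timeout_s=0.05))
print(result)
```

Here the first attempt exceeds the 50 ms timeout, so the caller hears the filler phrase followed by the real reply from the second attempt. The same wrapper pattern could guard TTS or STT calls, with a pre-recorded clip or a request to repeat as the fallback.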
