Serve ↔ Actor Architecture Overview

This document describes the SeedCore system architecture, focusing on how Ray Serve applications map to Ray Actors and how the two work together to form a distributed, intelligent organism.

System Overview
Serve Applications (Logical Apps)
Serve Deployments → Ray Actors
Control Plane Actors
System Relationships
Architecture Diagram
Health Indicators
Memory Architecture
Performance and Energy Tracking
Scaling and Distribution

System Overview

SeedCore implements a distributed, intelligent organism architecture using Ray Serve for service orchestration and Ray Actors for distributed computation. The system is designed to be scalable, fault-tolerant, and capable of handling complex cognitive workloads.

The architecture follows a microservices pattern where each logical service is deployed as a Ray Serve application, which in turn spawns one or more Ray Actors to handle the actual computation and state management.

Serve Applications (Logical Apps)

According to the output of serve status, the system runs six main applications:

Application	Purpose	Status	Replicas
ml_service	ML inference and model serving	Running	1
cognitive	Reasoning, planning, and cognitive tasks	Running	2
coordinator	Global coordination, routing, and escalation	Running	1
state	Centralized state aggregation and memory management	Running	1
energy	Energy tracking and performance metrics	Running	1
organism	Local organ management and task execution	Running	1

Each application represents a logical service namespace that can contain multiple deployments and replicas. The cognitive service runs with 2 replicas for distributed reasoning capabilities, while the coordinator handles global task routing and escalation decisions.

Serve Deployments → Ray Actors

Each deployment runs as one or more ServeReplica actors. Serve additionally runs global actors for control and proxying (see Control Plane Actors):

1. ML Service → MLService

ServeDeployment: MLService
Actor: ServeReplica:ml_service:MLService
Replicas: 1 RUNNING
Ray Actor ID: Actor 2
Purpose: Dedicated ML inference service for model serving

2. Cognitive → CognitiveService

ServeDeployment: CognitiveService
Actors: ServeReplica:cognitive:CognitiveService
Replicas: 2 RUNNING
Ray Actor IDs: Actor 1 and Actor 4
Purpose: Parallel workers for reasoning and planning tasks
Distribution: Replicas are distributed across different Ray nodes for redundancy

3. Coordinator → Coordinator

ServeDeployment: Coordinator
Replicas: 1 RUNNING
Ray Actor ID: Actor 5
Purpose: Global coordination, task routing, OCPS valve, and HGNN escalation
Route Prefix: /pipeline

4. State → StateService

ServeDeployment: StateService
Replicas: 1 RUNNING
Ray Actor ID: Actor 9
Purpose: Centralized state aggregation and memory management
Memory: 1GB allocated for state collection and caching

5. Energy → EnergyService

ServeDeployment: EnergyService
Replicas: 1 RUNNING
Ray Actor ID: Actor 10
Purpose: Energy tracking and performance metrics collection
Memory: 1GB allocated for energy state management

6. Organism → OrganismManager

ServeDeployment: OrganismManager
Replicas: 1 RUNNING
Ray Actor ID: Actor 8
Purpose: Local organ management, agent distribution, and direct task execution
Memory: 2GB allocated for organism state management
Route Prefix: /organism
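
The deployment list above can be summarized as a structure shaped like a Serve config file. The following is a minimal sketch, not the project's actual configuration: it assumes that "1GB" means 1 GiB, it leaves out the import_path field that a real Serve config also needs (the module layout is not given in this document), and it only sets values that the list above states.

```python
GIB = 1024 ** 3  # assumption: "1GB" means 1 GiB; Ray's memory option is in bytes

# (application, deployment, replicas, memory in GiB or None, route prefix or None),
# taken from the list above. Values the document does not state are left as None.
DEPLOYMENTS = [
    ("ml_service", "MLService", 1, None, None),
    ("cognitive", "CognitiveService", 2, None, None),
    ("coordinator", "Coordinator", 1, None, "/pipeline"),
    ("state", "StateService", 1, 1, None),
    ("energy", "EnergyService", 1, 1, None),
    ("organism", "OrganismManager", 1, 2, "/organism"),
]


def build_serve_config(deployments):
    """Build a dict shaped like a Serve config file (without import_path)."""
    applications = []
    for app, name, replicas, mem_gib, prefix in deployments:
        deployment = {"name": name, "num_replicas": replicas}
        if mem_gib is not None:
            deployment["ray_actor_options"] = {"memory": mem_gib * GIB}
        application = {"name": app, "deployments": [deployment]}
        if prefix is not None:
            application["route_prefix"] = prefix
        applications.append(application)
    return {"applications": applications}


def reserved_memory_bytes(config):
    """Total memory reserved by all replicas that declare a memory option."""
    total = 0
    for application in config["applications"]:
        for deployment in application["deployments"]:
            per_replica = deployment.get("ray_actor_options", {}).get("memory", 0)
            total += per_replica * deployment["num_replicas"]
    return total


config = build_serve_config(DEPLOYMENTS)
print(reserved_memory_bytes(config) // GIB, "GiB reserved")
```

Keeping replica counts and memory reservations in one table makes it easy to check the total reservation against cluster capacity before deploying.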

Control Plane Actors

Besides service replicas, Serve maintains several control plane actors:

Actor	Purpose	Status
ServeController (Actor 3)	Central brain of Serve, manages deployments and replicas	Running
ProxyActors (Actor 6 & 7)	Handle HTTP/gRPC ingress, route to correct deployment	Running
StatusActor (Actor 0)	Ray Dashboard integration, tracks job/actor status	Running

System Relationships

1. Serve → Actors Mapping

Each Serve deployment represents one logical service
It owns N ServeReplica actors (workers), distributed across Ray nodes
Scale is reflected in replica_states.RUNNING
Replicas can be scaled independently based on workload requirements
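
As a rough illustration of reading scale from replica_states.RUNNING, the sketch below counts running replicas per deployment. It assumes the applications section of the serve status output has already been parsed into a dict of the shape shown. The sample values mirror the tables in this document rather than captured output, and the HEALTHY deployment status is an assumption.

```python
def _app(deployment, running):
    """Build one application entry in the assumed `serve status` shape."""
    return {
        "status": "RUNNING",
        "deployments": {
            deployment: {"status": "HEALTHY", "replica_states": {"RUNNING": running}}
        },
    }


# Sample data mirroring this document; not captured cluster output.
STATUS = {
    "ml_service": _app("MLService", 1),
    "cognitive": _app("CognitiveService", 2),
    "coordinator": _app("Coordinator", 1),
    "state": _app("StateService", 1),
    "energy": _app("EnergyService", 1),
    "organism": _app("OrganismManager", 1),
}


def running_replicas(applications):
    """Map (application, deployment) to its number of RUNNING replicas."""
    counts = {}
    for app_name, app in applications.items():
        for dep_name, dep in app.get("deployments", {}).items():
            counts[(app_name, dep_name)] = dep.get("replica_states", {}).get("RUNNING", 0)
    return counts


for (app_name, dep_name), count in running_replicas(STATUS).items():
    print(f"{app_name}/{dep_name}: {count} running")
```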

2. Actors → Nodes Distribution

Replicas are distributed across nodes for fault tolerance
Example: CognitiveService has 2 replicas on different nodes:
- Actor 1 on node f7d...
- Actor 4 on node 34d...
This provides redundancy and parallelism
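
The placement property described above can be checked mechanically. This sketch works on plain records of (actor, deployment, node). The record layout is hypothetical, and the node IDs are the truncated ones quoted above, used only as opaque labels. On a live cluster, similar information would have to be collected from Ray's state tooling first.

```python
from collections import defaultdict

# Hypothetical records built from the example above; node IDs are the
# truncated values quoted in this document and serve only as labels.
REPLICAS = [
    {"actor": "Actor 1", "deployment": "CognitiveService", "node": "f7d..."},
    {"actor": "Actor 4", "deployment": "CognitiveService", "node": "34d..."},
]


def nodes_by_deployment(replicas):
    """Group the set of node labels used by each deployment."""
    nodes = defaultdict(set)
    for replica in replicas:
        nodes[replica["deployment"]].add(replica["node"])
    return dict(nodes)


def is_spread(replicas, deployment):
    """True when every replica of the deployment sits on a different node."""
    own = [r for r in replicas if r["deployment"] == deployment]
    return len({r["node"] for r in own}) == len(own)


print(is_spread(REPLICAS, "CognitiveService"))
```

If two replicas shared a node, is_spread would return False, signalling that a single node failure could take out the whole deployment.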

3. Core Service Interactions

Coordinator & OrganismManager

Coordinator (Actor 5) handles global task routing and coordination
Uses OCPS valve for drift detection and escalation decisions
Routes tasks to OrganismManager for local execution
OrganismManager (Actor 8) manages organ lifecycle and agent distribution

CognitiveService Integration

CognitiveService (Actors 1 & 4) provides reasoning and planning capabilities
Called by Coordinator for HGNN decomposition and escalation
Serves as the "Cognitive organ" for complex task planning
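
The interaction between the three services can be pictured as a two-way routing decision. The sketch below is illustrative only. The drift_score field and the 0.5 threshold are hypothetical stand-ins for the OCPS valve, whose actual logic is not described in this document, and the returned strings simply name the Serve applications listed earlier.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    drift_score: float  # hypothetical drift signal; not the real OCPS input


ESCALATION_THRESHOLD = 0.5  # hypothetical value chosen for illustration


def route(task, threshold=ESCALATION_THRESHOLD):
    """Return the application that should handle the task.

    A plain threshold comparison stands in for the OCPS valve: tasks with
    high drift are escalated to the cognitive service, all others go to
    the organism service for local execution.
    """
    if task.drift_score > threshold:
        return "cognitive"
    return "organism"


for task in (Task("t-1", 0.1), Task("t-2", 0.9)):
    print(task.task_id, "->", route(task))
```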

Architecture Diagram

┌───────────────────────────────────────────────────────────────────────────────────────────┐
│                                        Ray Cluster                                        │
├───────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                           │
│  ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐                        │
│  │   Serve Proxy   │    │   Serve Proxy   │    │ ServeController │                        │
│  │    (Actor 6)    │    │    (Actor 7)    │    │    (Actor 3)    │                        │
│  └────────┬────────┘    └────────┬────────┘    └────────┬────────┘                        │
│           │                      │                      │                                 │
│           └──────────────────────┼──────────────────────┘                                 │
│                                  │                                                        │
│  ┌───────────────────────────────┴─────────────────────────────────────────────────────┐  │
│  │                                 Serve Applications                                  │  │
│  │                                                                                     │  │
│  │ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │  │
│  │ │ml_service │ │ cognitive │ │coordinator│ │   state   │ │  energy   │ │ organism  │ │  │
│  │ │           │ │           │ │           │ │           │ │           │ │           │ │  │
│  │ │ MLService │ │ Cognitive │ │Coordinator│ │ StateSvc  │ │ EnergySvc │ │OrganismMgr│ │  │
│  │ │ (Actor 2) │ │(Actor 1,4)│ │ (Actor 5) │ │ (Actor 9) │ │(Actor 10) │ │ (Actor 8) │ │  │
│  │ └───────────┘ └───────────┘ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │  │
│  └─────────────────────────────────────────────────────────────────────────────────────┘  │
│                                                                                           │
│  ┌─────────────────────────────────────────────────────────────────────────────────────┐  │
│  │                                     Ray Actors                                      │  │
│  │                                                                                     │  │
│  │ ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐      │  │
│  │ │ A 0  │  │ A 1  │  │ A 2  │  │ A 3  │  │ A 4  │  │ A 5  │  │ A 6  │  │ A 7  │      │  │
│  │ │Status│  │ Cogn │  │  ML  │  │ Ctrl │  │ Cogn │  │Coord │  │Proxy │  │Proxy │      │  │
│  │ └──────┘  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘      │  │
│  │                                                                                     │  │
│  │ ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐                                              │  │
│  │ │ A 8  │  │ A 9  │  │ A 10 │  │ A 11 │                                              │  │
│  │ │ Org  │  │State │  │Energy│  │Tier0 │                                              │  │
│  │ │ Mgr  │  │ Svc  │  │ Svc  │  │ Mem  │                                              │  │
│  │ └──────┘  └──────┘  └──────┘  └──────┘                                              │  │
│  └─────────────────────────────────────────────────────────────────────────────────────┘  │
│                                                                                           │
│  ┌─────────────────────────────────────────────────────────────────────────────────────┐  │
│  │                                 Memory Architecture                                 │  │
│  │                                                                                     │  │
│  │ ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐              │  │
│  │ │      Mw      │  │     Mlt      │  │     Mfb      │  │      Ma      │              │  │
│  │ │   Working    │  │  Long-Term   │  │  Flashbulb   │  │   Private    │              │  │
│  │ │    Memory    │  │    Memory    │  │    Memory    │  │    Memory    │              │  │
│  │ │  (Volatile)  │  │ (Persistent) │  │    (Rare)    │  │   (128-D)    │              │  │
│  │ └──────────────┘  └──────────────┘  └──────────────┘  └──────────────┘              │  │
│  └─────────────────────────────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────────────────────────────┘

Health Indicators

The current system shows healthy status across all components:

✅ All deployments are RUNNING
✅ Cognitive has 2 replicas for distributed reasoning
✅ OrganismManager is alive and ready to manage organs
✅ No DEAD actors left hanging
✅ Proper replica distribution across nodes

Memory Architecture

SeedCore implements a four-tier memory architecture that supports both real-time processing and long-term knowledge retention. Fast, volatile tiers serve recent information and agent state, while persistent tiers hold durable knowledge and high-salience events.

Memory Tiers Overview

Tier	Name	Type	Purpose	Characteristics
Mw	Working Memory	Volatile	Fast access to recent information	High-speed cache, limited capacity
Mlt	Long-Term Memory	Persistent	Durable knowledge storage	Large capacity, slower access
Mfb	Flashbulb Memory	Persistent	High-salience events	Rare, critical events only
Ma	Agent Private Memory	Volatile	Agent state representation	128-D embedding vector

Memory Access Flow

Task Query
    ↓
1. Check Mw (Working Memory) - Fast lookup
    ↓ (cache miss)
2. Query Mlt (Long-Term Memory) - Persistent storage
    ↓ (success)
3. Cache result back to Mw for future access
    ↓ (high-salience event)
4. Optionally log to Mfb (Flashbulb Memory)
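
The access flow above is a read-through cache with an optional event log. The following sketch models it with in-process containers. The class name, the salience parameter, and its 0.9 threshold are hypothetical, and plain dicts and lists stand in for whatever stores back Mw, Mlt, and Mfb in the real system.

```python
class TieredMemory:
    """Toy model of the Mw -> Mlt -> Mfb access flow."""

    def __init__(self, long_term):
        self.mw = {}            # Working Memory: volatile cache
        self.mlt = long_term    # Long-Term Memory: dict standing in for a persistent store
        self.mfb = []           # Flashbulb Memory: log of high-salience events
        self.hits = 0
        self.misses = 0

    def get(self, key, salience=0.0, salience_threshold=0.9):
        # 1. Check Mw first (fast lookup).
        if key in self.mw:
            self.hits += 1
            return self.mw[key]
        self.misses += 1
        # 2. On a cache miss, query Mlt.
        if key not in self.mlt:
            return None
        value = self.mlt[key]
        # 3. Cache the result back into Mw for future access.
        self.mw[key] = value
        # 4. Optionally log high-salience events to Mfb.
        if salience >= salience_threshold:
            self.mfb.append((key, value))
        return value


memory = TieredMemory({"fact": 42})
memory.get("fact")   # miss in Mw, served from Mlt, then cached
memory.get("fact")   # hit in Mw
print(memory.hits, memory.misses)
```

The hit and miss counters kept here are the inputs to the Memory Hit Rate metric, Hit/(Hit+Miss).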

Performance and Energy Tracking

Core Performance Indicators

Metric	Description	Measurement	Target
Task Success Rate	Percentage of successfully completed tasks	EWMA over time window	>95%
Response Time	End-to-end task execution latency	P50, P95, P99 percentiles	<1000ms (SLO)
Memory Hit Rate	Cache hit ratio in working memory	Hit/(Hit+Miss) ratio	>80%
Throughput	Tasks processed per unit time	Tasks/second	Variable by service
Resource Utilization	CPU, memory, energy consumption	Percentage of allocated resources	<80%
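
The measurements named in the table can be computed with a few lines of standard-library Python. This is a sketch under assumptions: the smoothing factor alpha=0.2 and the nearest-rank percentile method are illustrative choices, not values taken from SeedCore.

```python
import math


def ewma(values, alpha=0.2):
    """Exponentially weighted moving average; recent values weigh more."""
    average = values[0]
    for value in values[1:]:
        average = alpha * value + (1 - alpha) * average
    return average


def hit_rate(hits, misses):
    """Hit/(Hit+Miss); defined as 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0


def percentile(samples, p):
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


latencies_ms = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
outcomes = [1, 1, 0, 1, 1]  # 1 = task succeeded, 0 = task failed

print("success EWMA:", round(ewma(outcomes), 3))
print("hit rate:", hit_rate(80, 20))
print("P50/P95/P99:", [percentile(latencies_ms, p) for p in (50, 95, 99)])
```

Each result can then be compared with its target from the table, for example a hit rate above 0.8 or a latency percentile below the 1000 ms SLO.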

Energy Management System

Expected Contribution

Predicted value based on capabilities, used for load balancing and scheduling

Entropy

Measure of uncertainty and learning potential, used to guide learning and adaptation

Consumption

Actual resource usage, used for cost optimization

Balance

System-wide energy distribution, used for fair resource allocation

Scaling and Distribution

Horizontal Scaling

CognitiveService: Currently 2 replicas, can scale based on reasoning workload
Other services: Single replica, can be scaled as needed
Load balancing: Automatic across replicas via Serve proxy

Fault Tolerance

Multi-node distribution: Replicas spread across different Ray nodes
Automatic failover: Serve handles replica failures and restarts
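State management: Actor state survives failures only if it is persisted outside the actor process; Ray restarts a failed replica with fresh in-memory state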

Performance Characteristics

Parallel processing: Multiple CognitiveService replicas handle concurrent reasoning tasks
Low latency: Direct actor-to-actor communication within the Ray cluster
High throughput: Distributed processing across multiple nodes

Note: This document provides a comprehensive overview of the SeedCore architecture. For detailed implementation specifics, refer to the individual component documentation in the docs/architecture/components/ directory.