This document describes the SeedCore system architecture, focusing on how Ray Serve applications map to Ray Actors and how the two work together to form a distributed, intelligent organism.
SeedCore implements a distributed, intelligent organism architecture using Ray Serve for service orchestration and Ray Actors for distributed computation. The system is designed to be scalable, fault-tolerant, and capable of handling complex cognitive workloads.
The architecture follows a microservices pattern where each logical service is deployed as a Ray Serve application, which in turn spawns one or more Ray Actors to handle the actual computation and state management.
According to `serve status`, the system runs six main applications:
Application | Purpose | Status | Replicas |
---|---|---|---|
ml_service | ML inference and model serving | Running | 1 |
cognitive | Reasoning, planning, and cognitive tasks | Running | 2 |
coordinator | Global coordination, routing, and escalation | Running | 1 |
state | Centralized state aggregation and memory management | Running | 1 |
energy | Energy tracking and performance metrics | Running | 1 |
organism | Local organ management and task execution | Running | 1 |
Each application represents a logical service namespace that can contain multiple deployments and replicas. The cognitive service runs with 2 replicas for distributed reasoning capabilities, while the coordinator handles global task routing and escalation decisions.
Every deployment spawns one or more ServeReplica actors, plus global actors for control and proxying:
- ServeDeployment: MLService
  - Actor: `ServeReplica:ml_service:MLService`
  - Replicas: 1 RUNNING
  - Ray Actor ID: Actor 2
  - Purpose: Dedicated ML inference service for model serving
- ServeDeployment: CognitiveService
  - Actors: `ServeReplica:cognitive:CognitiveService`
  - Replicas: 2 RUNNING
  - Ray Actor IDs: Actor 1 and Actor 4
  - Purpose: Parallel workers for reasoning and planning tasks
  - Distribution: Replicas are distributed across different Ray nodes for redundancy
- ServeDeployment: Coordinator
  - Replicas: 1 RUNNING
  - Ray Actor ID: Actor 5
  - Purpose: Global coordination, task routing, OCPS valve, and HGNN escalation
  - Route Prefix: `/pipeline`
- ServeDeployment: StateService
  - Replicas: 1 RUNNING
  - Ray Actor ID: Actor 9
  - Purpose: Centralized state aggregation and memory management
  - Memory: 1GB allocated for state collection and caching
- ServeDeployment: EnergyService
  - Replicas: 1 RUNNING
  - Ray Actor ID: Actor 10
  - Purpose: Energy tracking and performance metrics collection
  - Memory: 1GB allocated for energy state management
- ServeDeployment: OrganismManager
  - Replicas: 1 RUNNING
  - Ray Actor ID: Actor 8
  - Purpose: Local organ management, agent distribution, and direct task execution
  - Memory: 2GB allocated for organism state management
  - Route Prefix: `/organism`
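
Two of the deployments above expose route prefixes. As a rough illustration of how an ingress layer could choose a target application from a request path, the sketch below does longest-prefix matching over the two prefixes listed in this document. The `ROUTES` table and the `resolve_app` helper are hypothetical names introduced here for illustration; this is not Ray Serve's own routing code.

```python
from typing import Optional

# Route prefixes taken from the deployment list above. The mapping itself is
# an illustrative assumption, not Ray Serve's internal routing table.
ROUTES = {
    "/pipeline": "coordinator",
    "/organism": "organism",
}


def resolve_app(path: str, routes: dict = ROUTES) -> Optional[str]:
    """Return the application whose route prefix matches `path`, or None."""
    best = None
    for prefix in routes:
        # A prefix matches the exact path or any path below it
        # ("/pipeline/x"), but not a sibling such as "/pipelines".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else None
```

Under these assumptions, any path below `/pipeline` resolves to the coordinator application and any path below `/organism` resolves to the organism application, while unknown paths resolve to `None`.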
Besides service replicas, Serve maintains several control plane actors:
Actor | Purpose | Status |
---|---|---|
ServeController (Actor 3) | Central brain of Serve, manages deployments and replicas | Running |
ProxyActors (Actors 6 and 7) | Handle HTTP/gRPC ingress, route to correct deployment | Running |
StatusActor (Actor 0) | Ray Dashboard integration, tracks job/actor status | Running |
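
The actor numbers quoted in this document can be cross-checked mechanically: seven service replicas plus four control plane actors should account for Actors 0 through 10 with no overlaps. The sketch below is a minimal consistency check over those numbers. The pairing of application names with deployment names follows the order of the tables in this document, and the `ServeReplica:<app>:<deployment>` name is only confirmed here for `ml_service` and `cognitive`; applying the same pattern to the other deployments is an assumption. All helper names are hypothetical.

```python
# Actor numbers as quoted in this document (dashboard labels, not Ray actor IDs).
SERVICE_ACTORS = {
    ("ml_service", "MLService"): [2],
    ("cognitive", "CognitiveService"): [1, 4],
    ("coordinator", "Coordinator"): [5],
    ("state", "StateService"): [9],
    ("energy", "EnergyService"): [10],
    ("organism", "OrganismManager"): [8],
}
CONTROL_ACTORS = {
    "ServeController": [3],
    "ProxyActor": [6, 7],
    "StatusActor": [0],
}


def replica_actor_name(app: str, deployment: str) -> str:
    """Build a replica actor name in the pattern shown for ml_service and cognitive."""
    return f"ServeReplica:{app}:{deployment}"


def build_inventory() -> dict:
    """Map each actor number to the list of names claiming it."""
    inventory = {}
    for (app, deployment), actor_ids in SERVICE_ACTORS.items():
        for actor_id in actor_ids:
            inventory.setdefault(actor_id, []).append(replica_actor_name(app, deployment))
    for name, actor_ids in CONTROL_ACTORS.items():
        for actor_id in actor_ids:
            inventory.setdefault(actor_id, []).append(name)
    return inventory


def check_inventory(inventory: dict) -> tuple:
    """Return (duplicates, missing): numbers claimed twice, and gaps in 0..max."""
    duplicates = sorted(i for i, names in inventory.items() if len(names) > 1)
    missing = sorted(set(range(max(inventory) + 1)) - set(inventory))
    return duplicates, missing
```

Calling `check_inventory(build_inventory())` returns two empty lists when every actor number from 0 to the highest listed one is claimed exactly once, which is the case for the numbers given above.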
replica_states.RUNNING
f7d...
34d...
All components currently report a healthy status.
SeedCore implements a multi-tier memory architecture that supports both real-time processing and long-term knowledge retention. Each tier trades capacity against access speed, so frequently used data stays in fast memory while durable knowledge lives in slower, larger storage.
Tier | Name | Type | Purpose | Characteristics |
---|---|---|---|---|
Mw | Working Memory | Volatile | Fast access to recent information | High-speed cache, limited capacity |
Mlt | Long-Term Memory | Persistent | Durable knowledge storage | Large capacity, slower access |
Mfb | Flashbulb Memory | Persistent | High-salience events | Rare, critical events only |
Ma | Agent Private Memory | Volatile | Agent state representation | 128-D embedding vector |
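
To make the relationship between the Mw and Mlt tiers concrete, here is a minimal sketch of a read-through lookup: reads try working memory first and fall back to long-term memory on a miss, promoting the value into Mw. The `TieredMemory` class, the LRU eviction policy, the write-through behaviour, and the default capacity are all assumptions made for illustration; the document does not specify how SeedCore moves data between tiers.

```python
from collections import OrderedDict
from typing import Any, Optional


class TieredMemory:
    """Toy read-through lookup across working (Mw) and long-term (Mlt) memory."""

    def __init__(self, mw_capacity: int = 2) -> None:
        self.mw = OrderedDict()  # Mw: small, fast, volatile cache
        self.mlt = {}            # Mlt: stand-in for large persistent storage
        self.mw_capacity = mw_capacity
        self.hits = 0
        self.misses = 0

    def write(self, key: str, value: Any) -> None:
        # Assumed write-through: persist to Mlt, then cache in Mw.
        self.mlt[key] = value
        self._cache(key, value)

    def read(self, key: str) -> Optional[Any]:
        if key in self.mw:
            self.hits += 1
            self.mw.move_to_end(key)  # mark as most recently used
            return self.mw[key]
        self.misses += 1
        value = self.mlt.get(key)
        if value is not None:
            self._cache(key, value)   # promote into working memory
        return value

    def _cache(self, key: str, value: Any) -> None:
        self.mw[key] = value
        self.mw.move_to_end(key)
        # Evict least recently used entries once Mw exceeds its capacity.
        while len(self.mw) > self.mw_capacity:
            self.mw.popitem(last=False)
```

The `hits` and `misses` counters are kept so that a working-memory hit ratio can be derived from the same object.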
Metric | Description | Measurement | Target |
---|---|---|---|
Task Success Rate | Percentage of successfully completed tasks | EWMA over time window | >95% |
Response Time | End-to-end task execution latency | P50, P95, P99 percentiles | <1000ms (SLO) |
Memory Hit Rate | Cache hit ratio in working memory | Hit/(Hit+Miss) ratio | >80% |
Throughput | Tasks processed per unit time | Tasks/second | Variable by service |
Resource Utilization | CPU, memory, energy consumption | Percentage of allocated resources | <80% |
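
The measurements in the table can be sketched in a few lines. The helpers below show one way to compute an EWMA update, the Hit/(Hit+Miss) ratio, and latency percentiles. The function names, the smoothing factor `alpha`, and the nearest-rank percentile method are assumptions chosen for illustration; the document does not state which variants SeedCore uses.

```python
import math


def ewma_update(previous: float, observation: float, alpha: float = 0.1) -> float:
    """Blend a new observation (e.g. 1.0 for success, 0.0 for failure) into the average."""
    return alpha * observation + (1.0 - alpha) * previous


def hit_rate(hits: int, misses: int) -> float:
    """Hit/(Hit+Miss); returns 0.0 when nothing has been recorded yet."""
    total = hits + misses
    return hits / total if total else 0.0


def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile of `samples` for 0 < p <= 100."""
    if not samples:
        raise ValueError("percentile() requires at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]
```

For example, with ten latency samples of 100, 200, ..., 1000 ms, `percentile(samples, 50)` returns 500 and `percentile(samples, 95)` returns 1000. The table does not say which percentile the <1000ms SLO applies to, so any such check would need to pick one explicitly.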
- Predicted value based on capabilities, used for load balancing and scheduling
- Measure of uncertainty and learning potential, used for learning and adaptation
- Actual resource usage, used for cost optimization
- System-wide energy distribution, used for fair resource allocation
Note: This document provides a comprehensive overview of the SeedCore architecture. For detailed implementation specifics, refer to the individual component documentation in the `docs/architecture/components/` directory.