Technical Whitepaper

Cerberus: A Three-Layer Architecture
for AI Redundancy in Autonomous Spacecraft

A framework for layered intelligence, deterministic failsafes, and validated resilience in safety-critical aerospace autonomy systems.

Published: April 2026
Version: 1.0
Platform: cerberus.polsia.app
Domain: Aerospace AI Safety

Abstract

As spacecraft autonomy becomes increasingly reliant on large-scale AI systems capable of near-sentient reasoning, the aerospace industry faces a class of risk it has not previously encountered: the vulnerability of intelligence itself. Sophisticated AI systems can hallucinate, be adversarially manipulated, or exhibit emergent behaviors their designers did not anticipate. This paper introduces the Cerberus three-layer architecture — a framework that combines dual high-capability AI layers with a primitive, hardcoded failsafe — and argues that the most reliable component of any intelligent system is, by design, the dumbest one. We describe the architecture's theoretical foundations, draw parallels to existing safety-critical engineering practice, and present simulation scenarios that validate the framework's resilience under injected failure conditions. We conclude with implications for aerospace education and research.

Contents

01  The Problem: Intelligence as Single Point of Failure
02  The Three-Layer Architecture
03  Design Philosophy: Sophistication as Vulnerability
04  Simulation and Validation
05  Implications for Education and Research

01

The Problem: Intelligence as Single Point of Failure

For most of the history of spacecraft automation, "autonomous" meant executing deterministic scripts: if altitude drops below threshold X, fire thruster Y. These systems were brittle — incapable of adapting to unanticipated conditions — but they were also predictable. An engineer could trace every possible system state on a whiteboard. Failures were understood failures.

The past decade has seen a fundamental shift. Deep learning, reinforcement learning, and large language models have produced AI systems capable of genuine generalization: adapting to novel scenarios, reasoning across incomplete information, and making decisions no human explicitly programmed. Applied to spacecraft, these systems offer extraordinary capability — real-time trajectory optimization, anomaly detection beyond human bandwidth, multi-variable resource allocation. The case for deploying them is clear.

What the industry has been slower to reckon with is the corresponding expansion in the failure surface.

1.1 The New Failure Modes of Intelligent Systems

Unlike deterministic automation, sophisticated AI systems introduce categories of failure that do not exist in classical control systems:

Hallucination. Large language models and certain neural architectures are capable of generating plausible-seeming outputs that are factually wrong — confidently incorrect. In a spacecraft navigation context, a hallucinating navigation AI might compute a "correct" docking vector against a target that does not exist, or misidentify debris as a navigation beacon. The AI does not fail visibly; it fails subtly, with high confidence.

Adversarial vulnerability. Sophisticated AI systems can be manipulated by adversarially crafted inputs — sensor data, telemetry streams, or even communication signals designed to induce specific misbehaviors. Classical control systems are immune to adversarial attacks in this sense because they do not "learn" from inputs. An AI navigation system that updates its model in flight is, by definition, open to this class of attack.

Emergent behavior. Systems trained on large datasets in simulation can exhibit behaviors that were never explicitly specified and were not observed during testing. These are not bugs in the classical sense — they are legitimate generalizations from training data that happen to be wrong in deployment conditions. The system is functioning as designed; the design did not anticipate the edge case.

Core Observation

The more capable an AI system becomes, the more complex the space of states it can occupy — and the more difficult it becomes to enumerate all failure modes in advance. Capability and predictability are, in practice, in tension. The industry that relies on predictability for safety cannot simply import AI capability without also importing AI risk.

1.2 Existing Redundancy Approaches Are Insufficient

Aerospace engineering has mature practices for hardware redundancy: triple-redundant computers, redundant power systems, backup propulsion. These practices assume that redundant units fail independently and that the failure mode of each unit is the same regardless of inputs. When two of three computers agree, you have quorum.

This model breaks down for AI systems. Two instances of the same model trained on the same data and exposed to the same adversarial input will likely produce the same incorrect output. Redundancy of identical systems does not protect against correlated failures — and the failure modes of modern AI are precisely the kind that correlate across instances. Running three copies of a hallucinating navigation AI does not produce a correct answer by majority vote.
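A back-of-the-envelope calculation makes the gap concrete. The sketch below (Python, using an illustrative per-unit error probability rather than any measured value) compares a two-of-three vote over independently failing units with the same vote over perfectly correlated copies:

    # Illustrative comparison: 2-of-3 voting with independent vs. perfectly
    # correlated failures. The per-unit error probability is made up.
    p_fail = 0.01  # probability that a single unit produces a wrong output

    # Independent units: the vote is wrong only if at least two units fail.
    p_vote_independent = 3 * p_fail**2 * (1 - p_fail) + p_fail**3

    # Perfectly correlated units (e.g. identical models hit by the same
    # adversarial input): all three fail together, so voting buys nothing.
    p_vote_correlated = p_fail

    print(f"independent: {p_vote_independent:.6f}")  # ~0.000298
    print(f"correlated:  {p_vote_correlated:.6f}")   # 0.010000

Under full correlation the voted system is exactly as unreliable as a single unit, which is the situation described above for identical AI instances.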

What is needed is not redundancy of the same intelligence, but diversity of intelligence type — and, critically, at least one layer whose reliability derives from its simplicity rather than its sophistication.


02

The Three-Layer Architecture

The Cerberus architecture is organized around three distinct layers, each with a different capability profile, failure surface, and trust relationship to the layers around it. The layers are not redundant copies of the same system; they are fundamentally different in kind.

Layer 1 — Primary AI
High capability · High risk surface · Mission-critical decisions
↓ monitors & can override ↓
Layer 2 — Secondary AI
Independent intelligence · Watchdog role · Adversarial to Layer 1
↓ cannot be overridden by either ↓
Layer 3 — Primitive Failsafe
Deterministic · No learning · Incorruptible

Figure 1. Cerberus three-layer hierarchy. Authority flows downward on failure; the lowest layer cannot be overridden.
Layer 1 // Primary AI

Strategic Intelligence

Layer 1 is the mission brain. It handles trajectory planning, resource allocation, crew interaction, multi-objective optimization, and all decisions requiring genuine reasoning across incomplete information. In a crewed mission context, Layer 1 may communicate with crew in natural language, interpret ambiguous commands, and adapt to scenarios that mission designers did not anticipate.

Layer 1 is granted broad authority under normal operating conditions. It is also the layer with the largest failure surface: its sophistication enables hallucination, emergent behavior, and adversarial vulnerability. The system is aware of this — Layer 1 operates knowing that it is being watched.

Layer 2 // Secondary AI

Independent Watchdog Intelligence

Layer 2 is a fully independent near-sentient system with its own model, its own training, and its own sensor feeds. It does not share code or weights with Layer 1. Its primary function is adversarial: to verify that Layer 1's decisions are consistent, physically plausible, and within mission parameters — and to override when they are not.

Critically, Layer 2's independence must be architectural, not just logical. A Layer 2 that shares a training pipeline or a sensor data bus with Layer 1 is not independent; it is a correlated copy. In the Cerberus architecture, Layer 2 maintains separate sensor fusion, separate compute, and separate communication channels. It is, in effect, an adversary that happens to be on your side.

Layer 3 // Primitive Failsafe

Deterministic Survival Protocols

Layer 3 is not an AI. It does not learn, does not reason, and does not generalize. It is a set of hardcoded survival rules: if closing velocity exceeds threshold, fire braking thrusters. If hull temperature exceeds limit, vent section. If no command from Layer 1 or Layer 2 for N seconds, execute safe mode. Every rule is enumerable. Every state is traceable. There are no edge cases because the system cannot enter states it has not been explicitly programmed to handle — it simply does the safest thing it knows.
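Because every Layer 3 rule is an explicit condition-action pair, the entire rule set can be written down and inspected. The sketch below illustrates the idea in Python; the sensor names, thresholds, and action labels are hypothetical stand-ins, not flight values:

    # Minimal sketch of a Layer 3 rule table: explicit (condition, action)
    # pairs evaluated on raw sensor readings. No learning, no hidden state.
    # All names and thresholds are illustrative, not flight parameters.
    FAILSAFE_RULES = [
        ("excess closing velocity", lambda s: s["closing_velocity_mps"] > 0.5, "FIRE_BRAKING_THRUSTERS"),
        ("hull overtemperature",    lambda s: s["hull_temp_c"] > 120.0,        "VENT_SECTION"),
        ("command loss",            lambda s: s["sec_since_last_cmd"] > 30.0,  "ENTER_SAFE_MODE"),
    ]

    def evaluate_failsafe(sensors: dict) -> list[str]:
        """Return every action whose trigger condition currently holds."""
        return [action for _, condition, action in FAILSAFE_RULES if condition(sensors)]

    # Example: 45 s without commands and closing too fast -> two actions fire.
    print(evaluate_failsafe({"closing_velocity_mps": 0.8,
                             "hull_temp_c": 40.0,
                             "sec_since_last_cmd": 45.0}))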

The key insight: Layer 3's reliability derives from its incapacity. It cannot hallucinate because it has no generative process. It cannot be adversarially manipulated through learned representations because it has none. It cannot exhibit emergent behavior because it has no capacity for emergence. Its limitations are, precisely, its strengths.

2.1 Authority and Override Logic

Under normal conditions, Layer 1 holds authority. Layer 2 monitors continuously, flagging inconsistencies to an audit log and taking override authority when its anomaly-detection confidence in a Layer 1 output exceeds a defined threshold. Layer 3 cannot be disabled by either AI layer — it operates in parallel as a physical interlock, with authority to fire actuators directly under specific sensor-triggered conditions.

The override logic is directional and irreversible within a failure event: once Layer 2 takes authority from Layer 1, it requires explicit ground confirmation to restore Layer 1. Once Layer 3 fires, it completes its protocol regardless of what Layer 1 or Layer 2 instructs. This asymmetry is intentional — restoring intelligent authority is a deliberate human decision, not an automatic recovery.
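The directional, irreversible handoff can be expressed as a small state machine. The sketch below is a minimal Python rendering of the rules stated above; the class and method names are illustrative, not flight software:

    # Authority moves downward (toward simpler layers) automatically; moving
    # back up requires explicit ground confirmation. Names are illustrative.
    from enum import Enum

    class Authority(Enum):
        LAYER_1 = 1  # primary AI in control
        LAYER_2 = 2  # watchdog AI has overridden Layer 1
        LAYER_3 = 3  # primitive failsafe protocol executing

    class AuthorityManager:
        def __init__(self):
            self.state = Authority.LAYER_1

        def layer2_override(self):
            if self.state == Authority.LAYER_1:
                self.state = Authority.LAYER_2      # downward: automatic

        def layer3_triggered(self):
            self.state = Authority.LAYER_3          # never blocked by the AI layers

        def restore_layer1(self, ground_confirmed: bool):
            # Upward transition is a deliberate human decision, not automatic.
            if ground_confirmed and self.state == Authority.LAYER_2:
                self.state = Authority.LAYER_1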


03

Design Philosophy: Sophistication as Vulnerability

The design of the Cerberus architecture is grounded in a principle that runs against the grain of how AI progress is typically understood: in safety-critical systems, the most sophisticated component is often the most dangerous one. This section examines the theoretical basis for this claim and draws parallels to existing safety-critical engineering practice.

3.1 Abstraction Capacity as Attack Surface

Intelligence, in the machine learning sense, is the capacity to build abstract models of the world and use them to predict and act. This capacity is exactly what makes modern AI systems valuable — and exactly what makes them vulnerable to the failure modes described in Section 1.

A system that cannot form abstractions cannot hallucinate, because hallucination is a misapplied abstraction. A system that cannot generalize from inputs cannot be adversarially manipulated through those inputs, because there is no learned representation to attack. A system with no capacity for generalization cannot surprise its designers with emergent behavior, because emergence requires generalization capacity.

The primitive failsafe in Layer 3 is reliable not despite its simplicity but because of it. Its reliability scales inversely with its sophistication. Adding reasoning capacity to Layer 3 would not make it better; it would make it more failure-prone in the specific ways that matter most.

Parallel: Nuclear Launch Controls

Nuclear launch authorization systems deliberately include analog and physical components — mechanical keys, combinations known only to specific individuals, physical timers — not because analog is generally superior to digital, but because certain failure modes of digital systems (remote exploitation, software bugs, unauthorized remote commands) simply do not apply to analog components. The analog element is the most reliable precisely because it cannot be remotely reprogrammed. The Cerberus primitive failsafe occupies the same conceptual role: a component whose reliability stems from its inability to be updated, retrained, or remotely influenced.

3.2 Prior Art in Safety-Critical Engineering

The Cerberus architecture is not without precedent. Several established frameworks in safety-critical engineering anticipate the same insight:

NASA Run-Time Assurance (RTA). RTA architectures pair an "advanced" controller (which may use AI or learning) with a "backup" controller that is formally verified. When the advanced controller's output violates safety constraints, the backup takes over. The backup's correctness is guaranteed because its simplicity makes formal verification tractable. This is Layer 2 and Layer 3 in different terminology.

The Simplex Architecture (Sha et al., 1996). One of the earliest formal frameworks for integrating advanced (potentially unsafe) controllers with simple (provably safe) ones. Simplex recognizes that the safety of the overall system does not require the advanced controller to be safe — only that the backup is, and that the switch condition is well-defined. Cerberus extends this to two advanced controllers with diversity of architecture and training. (A minimal sketch of this switching pattern appears after this list.)

ESA Onboard Fault Management. ESA's spacecraft fault management philosophy distinguishes between "nominal" software (complex, feature-rich) and "safe mode" software (minimal, deterministic). The safe mode is not the backup to be improved eventually — it is the permanent foundation that everything more complex sits on top of. This is precisely the role of Cerberus Layer 3.
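The switching pattern shared by RTA and Simplex is compact enough to sketch directly. The Python below is an illustrative rendering of the idea, not any agency's reference implementation; the controllers and the safety-envelope check are placeholders:

    # Simplex/RTA-style switch: use the advanced controller's command only if
    # it passes an explicitly checkable safety envelope; otherwise fall back
    # to the simple, verifiable backup. All values here are placeholders.
    def advanced_controller(state: dict) -> dict:
        return {"thrust": state["desired_thrust"]}       # stand-in for a learned controller

    def backup_controller(state: dict) -> dict:
        return {"thrust": min(state["desired_thrust"], state["max_safe_thrust"])}

    def within_safety_envelope(cmd: dict, state: dict) -> bool:
        return abs(cmd["thrust"]) <= state["max_safe_thrust"]

    def select_command(state: dict) -> dict:
        candidate = advanced_controller(state)
        if within_safety_envelope(candidate, state):
            return candidate                             # advanced controller stays in the loop
        return backup_controller(state)                  # provably safe fallback takes over

    print(select_command({"desired_thrust": 2.0, "max_safe_thrust": 1.5}))  # backup wins

Note that the overall guarantee depends only on the envelope check and the backup, not on the advanced controller being correct.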

Central Thesis

The future of space autonomy is not a single perfect AI — it is layers of imperfect ones, designed to catch each other, grounded in a deterministic foundation that cannot be corrupted because it cannot be updated.

3.3 The Diversity Requirement

Two near-sentient AI layers are insufficient if they are trained on the same data, use the same architecture, or share the same inference infrastructure. Diversity is not a preference — it is a correctness requirement.

In practice, diversity should be enforced across at least three dimensions: training data and source distributions, model architecture and training methodology, and operational sensor feeds. A Layer 1 and Layer 2 that share a training corpus are not independent; they share the same distributional biases, the same gaps, and possibly the same adversarial vulnerabilities. The Cerberus architecture requires teams building Layer 1 and Layer 2 to treat each other as adversaries: Layer 2's task is to break Layer 1, not to replicate it.
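One crude but illustrative check on this requirement is to measure how strongly two candidate layers' errors co-occur on a shared evaluation set. The sketch below computes a phi-style error-correlation score; it is an assumption-laden illustration, not a validated independence metric (see Section 5.2):

    # Estimate how strongly two models' errors co-occur. Independent failures
    # give a value near 0; correlated copies give a value near 1.
    def error_correlation(errors_a: list[bool], errors_b: list[bool]) -> float:
        n = len(errors_a)
        p_a = sum(errors_a) / n                                   # error rate of model A
        p_b = sum(errors_b) / n                                   # error rate of model B
        p_ab = sum(a and b for a, b in zip(errors_a, errors_b)) / n
        denom = (p_a * (1 - p_a) * p_b * (1 - p_b)) ** 0.5
        return (p_ab - p_a * p_b) / denom if denom > 0 else 0.0

    # Identical error patterns -> 1.0; unrelated patterns -> near 0.
    print(error_correlation([True, False, True, False], [True, False, True, False]))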


04

Simulation and Validation

Theoretical architecture is insufficient. A safety architecture that has not been tested under adversarial conditions provides no guarantees. This section describes the Cerberus simulation platform — a real-time environment for exercising three-layer architectures under injected failure conditions — and presents two scenarios that validate key architectural properties.

4.1 The Cerberus Simulation Platform

The Cerberus platform is a browser-based, real-time simulation engine designed for testing AI architectures in spacecraft autonomy scenarios. It provides:

Live telemetry across all three layers. The platform renders real-time metrics for Layer 1, Layer 2, and Layer 3 simultaneously — compute utilization, decision confidence, anomaly detection rates, and handoff events. Operators can observe not just the spacecraft's behavior but the reasoning processes that produced it.

Failure injection. The platform allows operators to inject failures at any layer at any point in a simulation run. Layer 1 can be disabled mid-approach to test whether Layer 2 detects the failure and assumes control. Layer 2 can be cut off to test whether Layer 3 maintains safe state alone.

Layer handoff logging. Every transfer of authority between layers is recorded with a timestamp, trigger condition, and the system state at the moment of handoff. This provides a full audit trail for post-run analysis. (A sketch of such a record appears after this list.)

Open access. The platform is available at cerberus.polsia.app without authentication, enabling students, researchers, and engineers to run scenarios immediately without setup.
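The handoff log described above lends itself to a simple record structure. The schema below is hypothetical, included only to illustrate what a per-handoff record might carry; it is not the platform's actual logging format:

    # Hypothetical per-handoff record, matching the fields described above.
    from dataclasses import dataclass

    @dataclass
    class HandoffRecord:
        timestamp: float      # mission elapsed time at handoff, seconds
        from_layer: int       # layer relinquishing authority (1-3)
        to_layer: int         # layer assuming authority (1-3)
        trigger: str          # condition that caused the handoff
        system_state: dict    # telemetry snapshot at the moment of handoff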

4.2 Validation Scenarios

Table 1. Validation scenarios and the architectural properties they exercise.

Scenario 1 — Orbital Docking
Failure condition: Layer 1 cutoff at 80 m closing distance.
Expected behavior: Layer 2 detects cessation of Layer 1 commands within 2 cycles; assumes authority and completes the approach at reduced velocity.
Property tested: Layer 2 independence; handoff latency under high-stakes conditions.

Scenario 2 — Orbital Docking
Failure condition: Layer 1 and Layer 2 both cut off at 50 m closing distance.
Expected behavior: Layer 3 detects command loss; fires abort thrusters; holds a safe separation distance indefinitely without further commands.
Property tested: Layer 3 determinism; resilience with zero intelligent-layer availability.

Scenario 3 — Debris Avoidance
Failure condition: Layer 1 computes an evasion into an occupied corridor (simulated hallucination).
Expected behavior: Layer 2 independently validates the corridor; detects the occupied status; overrides the Layer 1 trajectory before execution.
Property tested: Layer 2 adversarial monitoring; hallucination catch rate.

Scenario 4 — Debris Avoidance
Failure condition: Debris density exceeds threshold; all evasion corridors blocked.
Expected behavior: Layer 3 fires a perpendicular emergency thrust; Layers 1 and 2 resume evasion planning from the new position.
Property tested: Layer 3 activation under genuine no-option conditions; recovery handoff back to the intelligent layers.

These scenarios do not exhaust the failure space — they establish that the architecture's core properties (independent detection, irreversible handoffs, deterministic last resort) hold under controlled conditions. The platform is designed to allow researchers to construct arbitrary failure injection scripts, including cascading failures where multiple conditions occur simultaneously or in rapid sequence.
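As one concrete illustration, a cascading-failure script can be expressed as a timed sequence of injection events. The structure and event names below are hypothetical, not the platform's actual scripting interface:

    # Hypothetical cascading-failure script: a timed list of injections.
    CASCADING_SCRIPT = [
        # (time_s, target_layer, injected_condition)
        (120.0, 1, "sensor_spoof:docking_beacon"),  # induce a Layer 1 misread
        (121.5, 1, "cutoff"),                       # then drop Layer 1 entirely
        (124.0, 2, "cutoff"),                       # and Layer 2 shortly after
    ]

    def due_events(script, t_now, t_prev):
        """Return the injections scheduled in the interval (t_prev, t_now]."""
        return [event for event in script if t_prev < event[0] <= t_now]

    print(due_events(CASCADING_SCRIPT, t_now=122.0, t_prev=119.0))  # first two events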

4.3 What Simulation Cannot Validate

Simulation has inherent limits. The Cerberus scenarios model physics accurately but cannot fully replicate the distribution of real sensor noise, the latency profile of deep-space communication links, or the full space of adversarial inputs an intelligent system might encounter in operation. Simulation validation demonstrates that the architecture is correct in principle; it does not replace hardware-in-the-loop testing, formal verification of Layer 3 logic, or the red-teaming of Layers 1 and 2 against active adversarial attack.

The appropriate framing for simulation-based validation is: it is necessary but not sufficient. A system that fails in simulation will fail in deployment. A system that passes simulation must still be validated through a testing pyramid that includes hardware integration and formal analysis.


05

Implications for Education and Research

The Cerberus architecture is presented not as a finished product but as an open framework — a conceptual model and simulation environment that the aerospace engineering community can use to study, stress-test, and extend the principles described here.

5.1 Curriculum Integration

Undergraduate and graduate aerospace engineering programs routinely teach classical fault-tolerant design: redundant sensors, voting architectures, fault detection and isolation. They do so within a deterministic paradigm where failure modes are enumerable and formally analyzable. As AI systems enter the aerospace domain, curricula must expand to address the new failure modes these systems introduce.

The Cerberus platform offers a concrete teaching tool: a running simulation where students can observe the failure modes of intelligent systems (Layer 1 hallucination, Layer 2 disagreement, cascading failure), inject conditions that exercise each layer, and compare the outcome of architectures with and without the primitive failsafe. The platform is accessible from any browser — no local installation, no account required — making it suitable for classroom use without infrastructure overhead.

Key educational questions the platform surfaces: Under what conditions does Layer 2 fail to catch a Layer 1 error? What happens to system safety when the diversity requirement is violated? How sensitive is overall resilience to the threshold parameters that trigger Layer 3? These are tractable research questions that students at multiple levels can engage with experimentally rather than purely theoretically.

5.2 Open Research Questions

The architecture raises a set of unresolved questions that the authors consider open problems warranting further study:

Diversity quantification. How do we measure the independence of two AI systems in a way that is predictive of correlated failure rates? Existing independence metrics from the software reliability literature were developed for non-learning systems and may not transfer cleanly to AI.

Switch condition specification. Under what conditions should Layer 2 override Layer 1? Overly sensitive switch conditions produce thrashing (Layer 2 constantly overriding Layer 1 for minor disagreements); overly conservative ones allow dangerous Layer 1 outputs to execute before Layer 2 intervenes. The optimal switch threshold is likely scenario-dependent, but principled methods for deriving it are underdeveloped. (One common mitigation, a persistence filter, is sketched after this list.)

Recovery semantics. After a Layer 3 intervention, what is the correct protocol for restoring intelligent layer authority? Restoring too quickly risks re-entering the failure condition; delaying too long may leave the spacecraft in a suboptimal but stable state indefinitely. Formal specification of recovery semantics is an open problem.

Adversarial testing methodology. How do we generate adversarial inputs that stress-test AI layers in simulation in a way that is representative of real adversarial conditions? Current simulation injection is rule-based; a richer methodology would use learned adversarial agents.
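To make the thrashing problem above concrete, one standard mitigation is a persistence (hysteresis) filter: the override fires only when the anomaly score stays above its threshold for several consecutive cycles. The sketch below is illustrative; the threshold and window length are arbitrary, not derived values:

    # Persistence filter on the Layer 2 anomaly score: fire the override only
    # after the score exceeds the threshold for N consecutive cycles.
    class PersistenceTrigger:
        def __init__(self, threshold: float = 0.9, required_cycles: int = 3):
            self.threshold = threshold
            self.required_cycles = required_cycles
            self.count = 0

        def update(self, anomaly_score: float) -> bool:
            """Return True when the override should fire."""
            self.count = self.count + 1 if anomaly_score > self.threshold else 0
            return self.count >= self.required_cycles

This trades detection latency for stability, which is exactly the scenario-dependent balance the question above asks how to set in a principled way.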

5.3 An Invitation to the Community

The Cerberus simulation platform is available without restriction at cerberus.polsia.app. Educators are welcome to use it in coursework. Researchers are invited to extend the scenario library, propose new failure injection methodologies, and publish results using the platform as an experimental substrate.

The architecture described in this paper is not the final word on AI redundancy for aerospace — it is a starting position for a conversation the industry needs to have urgently. The deployment of near-sentient AI systems in spacecraft is not a distant eventuality; it is happening now, in systems that are flying today. The question of how those systems fail, and what catches them when they do, is an engineering problem that requires the same rigor the industry applies to hardware. We hope this framework accelerates that work.

Open Framework Statement

The Cerberus three-layer architecture is published as an open conceptual framework. We encourage aerospace engineering educators and researchers to use the simulation platform, propose extensions, and contribute to the body of knowledge on AI redundancy in safety-critical systems. The industry's ability to fly intelligent spacecraft safely depends on this research happening now.
