Architecture Overview
Apache Ignite 3 is a distributed database built on a modular component architecture. Each node runs the same set of services, enabling any node to handle client requests and participate in data storage. This document provides a deeper look at the system architecture introduced in What is Apache Ignite 3?.
System Architecture
Node Components
Every Ignite node runs the same component set. There is no distinction between "server" and "coordinator" nodes at the software level. Node roles emerge from RAFT group membership and placement driver lease assignments.
Core Services
| Component | Responsibility |
|---|---|
| Cluster Service | Network communication, node discovery, message routing |
| Vault Manager | Local persistent storage for node-specific data |
| Hybrid Clock | Distributed timestamp generation for MVCC |
| Failure Manager | Node failure detection and handling |
Cluster Coordination
| Component | Responsibility |
|---|---|
| Cluster Management Group (CMG) | Cluster initialization, node admission, logical topology |
| MetaStorage | Distributed metadata (schemas, configurations, leases) |
| Placement Driver | Primary replica selection via time-bounded leases |
Data Management
| Component | Responsibility |
|---|---|
| Table Manager | Table lifecycle and distributed operations |
| Replica Manager | Partition replica lifecycle and request routing |
| Distribution Zone Manager | Data placement policies and partition assignments |
| Index Manager | Index creation, maintenance, and async building |
Query and Transaction Processing
| Component | Responsibility |
|---|---|
| SQL Query Processor | Query parsing, optimization, distributed execution |
| Transaction Manager | ACID transaction coordination, 2PC protocol |
| Compute Component | Distributed job execution |
Cluster Coordination
Cluster Management Group (CMG)
The CMG is a dedicated RAFT group responsible for cluster-wide coordination: it initializes the cluster, admits new nodes, and maintains the logical topology.
CMG membership is established during cluster initialization. Typically, 3 to 5 nodes are selected as voting members, while the remaining nodes participate as learners. The CMG validates new nodes before admitting them to the cluster.
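The admission step can be pictured as a validation check run before a node is added to the logical topology. The sketch below is purely illustrative: `NodeDescriptor`, `ClusterManagementGroupSketch`, and the specific checks are hypothetical names, not Ignite 3 classes.

```java
// Illustrative admission check (hypothetical types, not the Ignite 3 CMG implementation).
import java.util.HashSet;
import java.util.Set;

record NodeDescriptor(String name, String productVersion, String clusterTag) {}

class ClusterManagementGroupSketch {
    private final String expectedVersion;
    private final String clusterTag;
    private final Set<String> logicalTopology = new HashSet<>();

    ClusterManagementGroupSketch(String expectedVersion, String clusterTag) {
        this.expectedVersion = expectedVersion;
        this.clusterTag = clusterTag;
    }

    /** Validates a joining node before it is added to the logical topology. */
    boolean validateJoin(NodeDescriptor node) {
        if (!expectedVersion.equals(node.productVersion())) {
            return false; // reject nodes built for a different product version
        }
        if (!clusterTag.equals(node.clusterTag())) {
            return false; // reject nodes initialized for a different cluster
        }
        return logicalTopology.add(node.name()); // admit: node joins the logical topology
    }
}
```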
MetaStorage
MetaStorage is a distributed key-value store replicated across all nodes via RAFT.
All cluster metadata changes flow through MetaStorage: nodes maintain local replicas and receive updates via RAFT log replication, and the watch mechanism delivers real-time notifications of metadata changes.
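The watch mechanism can be thought of as prefix-based listeners attached to the replicated key space. The following is a minimal sketch under that assumption; `WatchListener` and `MetaStorageSketch` are hypothetical illustration types, not the actual Ignite 3 interfaces.

```java
// Illustrative MetaStorage-style watch (hypothetical types, not the Ignite 3 API).
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

interface WatchListener {
    void onUpdate(String key, byte[] newValue, long revision);
}

class MetaStorageSketch {
    private final Map<String, byte[]> entries = new ConcurrentHashMap<>();
    private final Map<String, List<WatchListener>> watches = new ConcurrentHashMap<>();
    private long revision;

    /** Registers a listener that fires whenever a key with the given prefix changes. */
    void watchPrefix(String prefix, WatchListener listener) {
        watches.computeIfAbsent(prefix, p -> new CopyOnWriteArrayList<>()).add(listener);
    }

    /** Applies a replicated update and notifies matching watchers with the new revision. */
    synchronized void put(String key, byte[] value) {
        revision++;
        entries.put(key, value);
        watches.forEach((prefix, listeners) -> {
            if (key.startsWith(prefix)) {
                listeners.forEach(l -> l.onUpdate(key, value, revision));
            }
        });
    }
}
```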
Placement Driver
The Placement Driver manages primary replica selection through time-bounded leases.
Leases have the following properties; a minimal sketch of the renewal and holder-selection logic follows the list:
- Duration: Configurable, default 5 seconds
- Renewal: Automatic, renewed at half of the expiration interval
- Selection priority: Current holder, then RAFT leader, then stable assignments
- Persistence: Stored in MetaStorage for cluster-wide visibility
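The sketch below illustrates these rules with hypothetical types (`Lease`, `PlacementDriverSketch`) rather than Ignite 3's internal classes.

```java
// Illustrative lease-based primary selection (hypothetical types, not the Ignite 3 API).
import java.util.List;
import java.util.Optional;

record Lease(String holder, long expirationMillis) {}

class PlacementDriverSketch {
    static final long LEASE_DURATION_MS = 5_000;                  // default 5 seconds
    static final long RENEW_INTERVAL_MS = LEASE_DURATION_MS / 2;  // renew at half the interval

    /** Picks the next primary: current holder first, then the RAFT leader, then stable assignments. */
    Optional<String> chooseHolder(Lease current, long now, String raftLeader, List<String> stableAssignments) {
        if (current != null && now < current.expirationMillis()) {
            return Optional.of(current.holder());      // current holder keeps the lease
        }
        if (raftLeader != null) {
            return Optional.of(raftLeader);            // otherwise prefer the partition's RAFT leader
        }
        return stableAssignments.stream().findFirst(); // otherwise fall back to stable assignments
    }

    /** Grants or renews a lease; in the real system the result is persisted in MetaStorage. */
    Lease grant(String holder, long now) {
        return new Lease(holder, now + LEASE_DURATION_MS);
    }
}
```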
RAFT Replication
Apache Ignite 3 uses multiple RAFT groups for different purposes:
| RAFT Group | Purpose | Members |
|---|---|---|
| CMG | Cluster management | 3-5 voting nodes |
| MetaStorage | Cluster metadata | 3-5 voting + learner nodes |
| Placement Driver | Lease management | Placement driver nodes |
| Partition Groups | Data replication | Nodes in partition assignments |
Partition RAFT Groups
Each table partition forms its own RAFT group whose members are the nodes in the partition's assignment.
Partition RAFT groups provide the following; a minimal sketch of the leader write path follows the list:
- Write linearization: All writes go through the leader
- Replication: Log entries replicated to followers
- Automatic failover: New leader elected on failure
- State machine: Multi-version storage with B+ trees
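The leader write path can be sketched as below, with hypothetical types standing in for the replication layer; this is not the Ignite 3 RAFT implementation.

```java
// Illustrative leader write path for a partition RAFT group
// (hypothetical types, not the Ignite 3 replication layer).
import java.util.ArrayList;
import java.util.List;

class PartitionRaftGroupSketch {
    private final List<String> followers;
    private final List<byte[]> log = new ArrayList<>();

    PartitionRaftGroupSketch(List<String> followers) {
        this.followers = followers;
    }

    /** All writes pass through the leader: append, replicate, commit on a majority of acks. */
    boolean leaderWrite(byte[] command) {
        log.add(command);                         // 1. leader appends the entry to its local log
        int acks = 1;                             // the leader's own copy counts as one ack
        for (String follower : followers) {
            if (replicateTo(follower, command)) { // 2. replicate the entry to each follower
                acks++;
            }
        }
        int majority = (followers.size() + 1) / 2 + 1;
        return acks >= majority;                  // 3. commit and apply to the state machine
    }

    private boolean replicateTo(String follower, byte[] command) {
        return true; // network send elided; assume the follower accepted the entry
    }
}
```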
Transaction Processing
Transaction Coordination
The Transaction Manager coordinates distributed transactions using the two-phase commit (2PC) protocol: participants are first prepared, then committed or rolled back as a unit.
Concurrency Control
Apache Ignite 3 uses a hybrid concurrency control model:
| Transaction Type | Concurrency Control | Isolation |
|---|---|---|
| Read-Write | Lock-based with MVCC | Serializable |
| Read-Only | Timestamp-based snapshot | Snapshot |
Read-write transactions acquire locks through the primary replica. Read-only transactions use timestamp-based reads against MVCC version chains without acquiring locks.
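The two paths can be sketched as follows, with a hypothetical `MvccRowSketch` standing in for a row's version chain. Locking in the real system happens on the primary replica and versions are keyed by hybrid timestamps; both are simplified here.

```java
// Illustrative row version chain (hypothetical type, simplified timestamps).
import java.util.Map;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantLock;

class MvccRowSketch {
    private final NavigableMap<Long, String> versions = new ConcurrentSkipListMap<>(); // commit ts -> value
    private final ReentrantLock writeLock = new ReentrantLock(); // stands in for the primary-replica row lock

    /** Read-write path: acquire the row lock, then install a new version at the commit timestamp. */
    void readWriteUpdate(String newValue, long commitTimestamp) {
        writeLock.lock();
        try {
            versions.put(commitTimestamp, newValue);
        } finally {
            writeLock.unlock();
        }
    }

    /** Read-only path: no locks; return the latest version at or before the read timestamp. */
    String readOnlyGet(long readTimestamp) {
        Map.Entry<Long, String> entry = versions.floorEntry(readTimestamp);
        return entry == null ? null : entry.getValue();
    }
}
```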
Query Execution
SQL Processing Pipeline
The SQL engine is built on Apache Calcite and executes queries in stages; a sketch of the first stages, using the Calcite planner directly, follows the list:
1. Parsing: SQL text converted to an abstract syntax tree
2. Validation: Semantic validation against the system catalog
3. Optimization: Cost-based optimization using Calcite rules
4. Physical Planning: Conversion to physical operators (IgniteRel)
5. Mapping: Plan fragments assigned to cluster nodes
6. Execution: Distributed execution with inter-node data exchange
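Because the engine is built on Calcite, the first three stages can be demonstrated with the Calcite planner API directly. This is plain Calcite, not Ignite's internal wrappers; physical planning, mapping, and distributed execution are Ignite-specific and omitted here.

```java
// Parse -> validate -> relational plan with the Apache Calcite planner (calcite-core on the classpath).
import org.apache.calcite.plan.RelOptUtil;
import org.apache.calcite.rel.RelRoot;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.Planner;

public class CalcitePipelineSketch {
    public static void main(String[] args) throws Exception {
        // An empty root schema stands in for the system catalog in this demo.
        FrameworkConfig config = Frameworks.newConfigBuilder()
                .defaultSchema(Frameworks.createRootSchema(true))
                .build();
        Planner planner = Frameworks.getPlanner(config);

        SqlNode ast = planner.parse("SELECT 1 + 1");   // 1. Parsing: SQL text -> abstract syntax tree
        SqlNode validated = planner.validate(ast);     // 2. Validation against the catalog
        RelRoot root = planner.rel(validated);         // 3. Relational plan: the input to cost-based optimization

        System.out.println(RelOptUtil.toString(root.rel));
        planner.close();
    }
}
```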
Fragment Distribution
Query plans are split into fragments that execute on different nodes.
Exchange operators (Sender/Receiver) handle data movement between fragments, and colocation optimization reduces exchanges when joined data resides on the same nodes.
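A minimal sketch of what an exchange does, with hypothetical `ExchangeSketch` helpers (not Ignite 3 classes): the sender hashes rows to target fragments, and the receiver merges incoming batches.

```java
// Illustrative exchange between plan fragments (hypothetical helpers, not Ignite 3 classes).
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExchangeSketch {
    /** Sender side: route each row to the target fragment that owns its hash bucket. */
    static Map<Integer, List<Object[]>> send(List<Object[]> rows, int targetCount, int keyColumn) {
        Map<Integer, List<Object[]>> outboxes = new HashMap<>();
        for (Object[] row : rows) {
            int target = Math.floorMod(row[keyColumn].hashCode(), targetCount);
            outboxes.computeIfAbsent(target, t -> new ArrayList<>()).add(row);
        }
        return outboxes;
    }

    /** Receiver side: merge batches arriving from upstream fragments into one input stream. */
    static List<Object[]> receive(List<List<Object[]>> batches) {
        List<Object[]> merged = new ArrayList<>();
        batches.forEach(merged::addAll);
        return merged;
    }
}
```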
Request Flow
A typical SQL query can be received by any node. The receiving node parses, validates, and plans the query, maps the resulting fragments onto the nodes that own the affected partitions, and gathers the fragment outputs to return the result to the client.
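From the client's perspective the flow starts with a connection to any node. The example below uses the Ignite 3 Java client; the builder and result-set methods shown reflect the public client API and should be verified against the release in use, and the `person` table is assumed to exist.

```java
// Requests can target any node; the receiving node plans and distributes the query.
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.sql.ResultSet;
import org.apache.ignite.sql.SqlRow;

public class QueryFlowExample {
    public static void main(String[] args) {
        try (IgniteClient client = IgniteClient.builder()
                .addresses("127.0.0.1:10800") // any node can accept the request
                .build()) {
            // Parsed, planned, and mapped on the receiving node; fragments run cluster-wide.
            ResultSet<SqlRow> rows = client.sql().execute(null, "SELECT id, name FROM person");
            while (rows.hasNext()) {
                SqlRow row = rows.next();
                System.out.println(row.intValue("id") + " " + row.stringValue("name"));
            }
            rows.close();
        }
    }
}
```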
Component Initialization Order
Components start in dependency order:
VaultManager → ClusterService → CMG → MetaStorage → PlacementDriver →
ReplicaManager → TxManager → TableManager → SqlQueryProcessor
This ordering ensures each component's dependencies are available before it starts.
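A small sketch of how such an order can be derived from declared dependencies; the component names match the chain above, but the `StartupOrderSketch` class and its dependency map are illustrative, not Ignite 3 code.

```java
// Illustrative dependency-ordered startup via a depth-first topological sort.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class StartupOrderSketch {
    // Each component lists the component it depends on.
    static final Map<String, List<String>> DEPENDENCIES = new LinkedHashMap<>();
    static {
        DEPENDENCIES.put("VaultManager", List.of());
        DEPENDENCIES.put("ClusterService", List.of("VaultManager"));
        DEPENDENCIES.put("CMG", List.of("ClusterService"));
        DEPENDENCIES.put("MetaStorage", List.of("CMG"));
        DEPENDENCIES.put("PlacementDriver", List.of("MetaStorage"));
        DEPENDENCIES.put("ReplicaManager", List.of("PlacementDriver"));
        DEPENDENCIES.put("TxManager", List.of("ReplicaManager"));
        DEPENDENCIES.put("TableManager", List.of("TxManager"));
        DEPENDENCIES.put("SqlQueryProcessor", List.of("TableManager"));
    }

    /** Depth-first topological sort: every dependency starts before its dependents. */
    static List<String> startOrder() {
        Set<String> started = new LinkedHashSet<>();
        DEPENDENCIES.keySet().forEach(c -> visit(c, started));
        return new ArrayList<>(started);
    }

    private static void visit(String component, Set<String> started) {
        if (started.contains(component)) return;
        DEPENDENCIES.get(component).forEach(dep -> visit(dep, started));
        started.add(component);
    }

    public static void main(String[] args) {
        // Prints the chain shown above, from VaultManager through SqlQueryProcessor.
        System.out.println(startOrder());
    }
}
```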
Next Steps
- Storage Architecture for storage layer details
- Data Partitioning for partition distribution and replication
- Transactions and MVCC for transaction processing details