Distribution and Colocation
Data distribution determines how Ignite spreads data across cluster nodes. Colocation ensures related data resides on the same nodes, enabling efficient joins and transactions without cross-node data movement.
Distribution Zones
A distribution zone defines the rules for data placement across the cluster. Each table belongs to exactly one distribution zone that controls:
- Number of partitions
- Replication factor
- Which nodes can store data
- Scale-up and scale-down behavior
Create a distribution zone:
CREATE ZONE my_zone WITH
PARTITIONS = 25,
REPLICAS = 3,
DATA_NODES_FILTER = '$[?(@.region == "us-east")]',
DATA_NODES_AUTO_ADJUST_SCALE_UP = 300,
DATA_NODES_AUTO_ADJUST_SCALE_DOWN = 600,
STORAGE_PROFILES = 'default';
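Tables are placed in a zone when they are created; the colocation examples later in this section use the same clause. A minimal sketch with a hypothetical table:
-- All data for this table is stored according to my_zone's rules
CREATE TABLE events (
event_id INT PRIMARY KEY,
payload VARCHAR(1000)
) WITH PRIMARY_ZONE = 'my_zone';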
Partitioning
When a table is created, its data is divided into partitions based on the zone configuration. Each partition is identified by a number from 0 to PARTITIONS - 1.
Rendezvous Hashing
Ignite uses rendezvous hashing (Highest Random Weight) to assign partitions to nodes. For each partition, the algorithm:
- Computes a hash combining each node's ID with the partition number
- Sorts nodes by their hash value in descending order
- Assigns the partition to the nodes with the highest hash values, one node per replica
This approach provides:
- Deterministic placement: The same partition always maps to the same nodes given the same cluster topology
- Minimal rebalancing: Adding or removing a node changes the mapping only for partitions whose highest-ranked nodes change; most partitions stay where they are
- Stable ordering: Every node computes the same replica ordering for each partition
Partition Count Guidelines
Set partition count based on cluster size and available cores:
recommended_partitions = (node_count * cores_per_node * 2) / replica_count
Examples:
- 3 nodes, 8 cores each, 3 replicas: 16 to 32 partitions
- 7 nodes, 16 cores each, 3 replicas: 75 to 150 partitions
The maximum partition count is 65,000 per table. Using significantly more partitions than recommended creates overhead from partition management and distribution tracking.
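For illustration, a zone sized for the first example above (3 nodes, 8 cores each, 3 replicas) might look like the following sketch; the zone name is hypothetical and the partition count sits at the upper end of the recommended range:
-- (3 nodes * 8 cores * 2) / 3 replicas = 16, so 16 to 32 partitions is reasonable
CREATE ZONE small_cluster_zone WITH
PARTITIONS = 32,
REPLICAS = 3,
STORAGE_PROFILES = 'default';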
Replication
Each partition is replicated across multiple nodes for fault tolerance. The REPLICAS parameter controls how many copies exist.
Consensus Groups
The replicas of a partition form a Raft consensus group. The group has two member types:
| Type | Role |
|---|---|
| Peers | Vote on writes, contribute to quorum, can become primary |
| Learners | Receive data asynchronously, cannot vote, used for read scaling |
The consensus group size is the number of peers that participate in voting, derived from the quorum size:
consensus_group_size = (quorum_size * 2) - 1
For example, 5 replicas with a configured quorum size of 2 yield a consensus group of 3 peers; the remaining 2 replicas are learners.
Quorum Requirements
Writes require acknowledgment from a quorum of peers before completion. Losing the majority of the consensus group puts the partition in read-only state.
| Replicas | Default Quorum | Consensus Peers |
|---|---|---|
| 1 | 1 | 1 |
| 2-4 | 2 | 3 |
| 5+ | 3 | 5 |
For high availability, use an odd number of replicas (3 or 5) to handle node failures without losing quorum.
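As a sketch, an existing zone can be grown to 5 replicas, assuming the ALTER ZONE ... SET form covered in the Distribution Zones SQL Reference:
-- With 5 replicas and the default quorum of 3, two nodes can fail
-- without the partition losing write availability
ALTER ZONE my_zone SET REPLICAS = 5;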
Primary Replicas and Leases
Each partition has one primary replica that handles all read-write transactions. The primary is determined through a lease mechanism managed by the placement driver.
Lease properties:
- Short-lived: Leases expire after a few seconds if not renewed
- Non-revocable: A lease cannot be revoked before expiration
- Single holder: Only one node holds the lease for a partition at any time
- Negotiated: The candidate must accept the lease before becoming primary
Data Colocation
Colocation ensures that related data from different tables is stored on the same partition. This enables:
- Joins without cross-node data transfer
- Transactions touching related data without distributed coordination
- Colocated compute jobs that access all needed data locally
Colocation Keys
Tables colocate when they share the same:
- Distribution zone
- Colocation key column(s) with matching values
Define colocation when creating tables:
-- Parent table
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100)
) WITH PRIMARY_ZONE = 'my_zone';
-- Child table colocated by customer_id
CREATE TABLE orders (
order_id INT,
customer_id INT,
total DECIMAL(10,2),
PRIMARY KEY (order_id, customer_id)
) COLOCATE BY (customer_id)
WITH PRIMARY_ZONE = 'my_zone';
-- Another colocated table
CREATE TABLE order_items (
item_id INT,
order_id INT,
customer_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (item_id, customer_id)
) COLOCATE BY (customer_id)
WITH PRIMARY_ZONE = 'my_zone';
Colocation Requirements
For colocation to work:
- Same zone: Tables must use the same distribution zone
- Matching columns: Colocation key columns must have compatible types
- Primary key inclusion: Colocation columns must be part of the primary key
- Consistent hashing: All colocated tables use the same hash function on colocation keys
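For contrast, the following sketch shows a hypothetical table that is not colocated with customers: it has no COLOCATE BY clause, so its rows are distributed by its own primary key (it is reused in the next example):
-- Distributed by product_id rather than customer_id, so rows that match
-- a given customer's orders may live on any node
CREATE TABLE products (
product_id INT PRIMARY KEY,
name VARCHAR(100)
) WITH PRIMARY_ZONE = 'my_zone';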
Query Optimization
The SQL engine detects colocated tables and optimizes join execution:
-- This join executes locally on each node without data shuffling
SELECT o.order_id, c.name, o.total
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.customer_id = 100;
Without colocation, the same query would require transferring data between nodes.
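For comparison, a join on a column that is not the colocation key cannot assume that matching rows are stored together. Using the hypothetical products table sketched above:
-- order_items is distributed by customer_id and products by product_id,
-- so matching rows may sit on different nodes and must be shuffled to join
SELECT i.order_id, p.name
FROM order_items i
JOIN products p ON i.product_id = p.product_id;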
Partition Rebalancing
When cluster topology changes, Ignite redistributes partitions to maintain balanced data distribution.
Rebalancing is controlled by scale timers:
- DATA_NODES_AUTO_ADJUST_SCALE_UP: Delay before adding a new node to the data distribution (default: immediate)
- DATA_NODES_AUTO_ADJUST_SCALE_DOWN: Delay before removing a departed node from the distribution (default: immediate)
Setting delays prevents unnecessary rebalancing during rolling restarts or brief network issues.
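A sketch of tuning the timers on an existing zone, assuming the ALTER ZONE ... SET form covered in the Distribution Zones SQL Reference (values are in seconds and purely illustrative):
-- Wait 2 minutes before rebalancing onto a newly joined node
ALTER ZONE my_zone SET DATA_NODES_AUTO_ADJUST_SCALE_UP = 120;
-- Wait 15 minutes before rebalancing away from a departed node
ALTER ZONE my_zone SET DATA_NODES_AUTO_ADJUST_SCALE_DOWN = 900;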
Node Filtering
The DATA_NODES_FILTER parameter selects which nodes participate in the zone using JSONPath expressions against node attributes:
-- Only nodes in us-east region
CREATE ZONE us_east_zone WITH
DATA_NODES_FILTER = '$[?(@.region == "us-east")]',
STORAGE_PROFILES = 'default';
-- Nodes with SSD storage and at least 32GB RAM
CREATE ZONE high_performance WITH
DATA_NODES_FILTER = '$[?(@.storage == "ssd" && @.memory >= 32)]',
STORAGE_PROFILES = 'default';
Node attributes are configured in each node's configuration file.
Design Constraints
- Zone immutability: Once a table is assigned to a zone, it cannot be moved to a different zone
- Partition count fixed: The number of partitions cannot be changed after zone creation
- Colocation at creation: Colocation must be specified when creating the table
- Single leaseholder: Only one node can be the primary for a partition at any time
- Consensus majority: Losing the majority of consensus peers makes the partition read-only
Related Topics
- Distribution Zones SQL Reference for zone DDL syntax
- Data Partitioning for partition internals
- Disaster Recovery for handling partition failures