Apache Ignite 3 is a memory-first distributed SQL database platform that consolidates transactions, analytics, and compute workloads previously requiring separate systems. Built from the ground up, it represents a complete departure from traditional caching solutions toward a unified distributed computing platform with microsecond latencies and collocated processing capabilities.
Forget everything you knew about Apache Ignite. Version 3.0 is a complete architectural rewrite that transforms Ignite from a caching platform into a memory-first distributed computing platform with microsecond latencies and collocated processing.
Architectural Foundation: Schema-Driven Design
The core architectural shift in Ignite 3 is that your schema becomes the foundation for data placement, query optimization, and compute job scheduling. Instead of managing separate systems with different data models, you define your schema once and it drives everything.
// Unified platform connection
IgniteClient ignite = IgniteClient.builder()
.addresses("node1:10800", "node2:10800", "node3:10800")
.build();
Schema Creation: Ignite 3 supports three approaches for schema creation:
- SQL DDL - Traditional
CREATE TABLEstatements - Java Annotations API - POJO markup with
@Table,@Column, etc. - Java Builder API - Programmatic
TableDefinition.builder()approach
We use the Java Annotations API in this blog for their compile-time type safety and clear colocation syntax.
@Table(zone = @Zone(value = "MusicStore", storageProfiles = "default"))
public class Artist {
@Id
private Integer ArtistId;
@Column(value = "Name", length = 120, nullable = false)
private String Name;
// Constructors, getters, setters...
}
@Table(
zone = @Zone(value = "MusicStore", storageProfiles = "default"),
colocateBy = @ColumnRef("ArtistId"),
indexes = @Index(value = "IFK_AlbumArtistId", columns = {
@ColumnRef("ArtistId") })
)
public class Album {
@Id
private Integer AlbumId;
@Id
private Integer ArtistId;
@Column(value = "Title", length = 160, nullable = false)
private String Title;
// Constructors, getters, setters...
}
The colocateBy annotation ensures that albums are stored on the same nodes as their corresponding artists, eliminating distributed join overhead and enabling local processing.
Multiple APIs, Single Schema
Ignite 3 provides different API views into the same schema, eliminating impedance mismatch between operational and analytical workloads:
// RecordView for structured operations
RecordView<Artist> artists = ignite.tables()
.table("Artist")
.recordView(Artist.class);
// KeyValueView for high-performance access patterns
KeyValueView<Long, Album> albums = ignite.tables()
.table("Album")
.keyValueView(Long.class, Album.class);
// SQL for analytics using Apache Calcite engine
SqlStatement analytics = ignite.sql()
.statementBuilder()
.query("SELECT a.Name, COUNT(al.AlbumId) as AlbumCount " +
"FROM Artist a JOIN Album al ON a.ArtistId = al.ArtistId " +
"GROUP BY a.Name");
// Collocated compute jobs
ComputeJob<String> job = ComputeJob.colocated("Artist", 42,
RecommendationJob.class);
JobExecution<String> recommendation = ignite.compute()
.submit(ignite.clusterNodes(), job, "rock");
This approach eliminates the typical data serialization and movement overhead between different systems while maintaining type safety and schema evolution capabilities.
This represents a fundamental architectural shift from Ignite 2.x, that accessed data as key-value operations using the cache API. Ignite 3 puts an evolvable schema first and uses memory-centric storage to deliver microsecond latencies for all operations, not just cache hits.
Memory-First Storage Architecture
Unlike disk-first distributed databases, Ignite 3 uses a memory-first storage model with configurable persistence options:
-
aimem: Pure in-memory storage for maximum performance -
aipersist: Memory-first with persistence for durability -
RocksDB: Disk-based storage for write-heavy workloads
The memory-first approach delivers microsecond response times for hot data while providing flexible cost-performance trade-offs through configurable memory-to-disk ratios.
Storage Engine Characteristics
| Engine | Primary Use Case | Latency Profile | Durability |
|---|---|---|---|
| aimem | Ultra-low latency | Microseconds | Volatile |
| aipersist | Balanced performance | Microseconds (memory) | Persistent |
| RocksDB | Write-heavy workloads | Variable | Persistent |
Consistency and Concurrency Model
Ignite 3 implements Raft consensus for strong consistency and MVCC (Multi-Version Concurrency Control) for transaction isolation:
- Raft consensus: Ensures data consistency across replicas without split-brain scenarios
- MVCC transactions: Provides snapshot isolation and deadlock-free concurrency
- ACID compliance: Full transactional guarantees across distributed operations
This consistency model applies uniformly across all APIs, whether you're using RecordView operations, SQL queries, or compute jobs.
Collocated Processing: Compute-to-Data Architecture
One of Ignite 3's key architectural advantages is collocated processing, which brings computation to where data is stored rather than moving data to compute resources:
// Traditional approach: data movement overhead
// 1. Query data from database
// 2. Move data to compute cluster
// 3. Process data remotely
// 4. Return results
// Ignite 3 approach: compute colocation
ComputeJob<Result> job = ComputeJob.colocated("Customer", customerId,
RiskAnalysisJob.class);
CompletableFuture<Result> result = ignite.compute()
.submitAsync(job, parameters);
This compute-to-data pattern eliminates network serialization overhead and enables processing of large datasets without data movement. Instead of moving terabytes of data to processing nodes, you move kilobytes of code to where the data lives.
System Consolidation Benefits
Traditional distributed architectures typically require separate systems for different workloads:
Traditional Multi-System Architecture:
- Transactional database (PostgreSQL, MySQL) - millisecond latencies
- Analytics database (ClickHouse, Snowflake) - batch processing
- Caching layer (Redis, Hazelcast) - separate consistency model
- Compute cluster (Spark, Flink) - data movement overhead
- Message queue (Kafka, RabbitMQ) - separate operational model
- Stream processing (Kafka Streams, Pulsar) - additional complexity
Ignite 3 Unified Platform:
- Schema-driven storage with multiple storage engines - microsecond latencies
- SQL analytics through Apache Calcite - real-time processing
- Collocated compute processing - zero data movement
- Built-in streaming with flow control - integrated backpressure
- ACID transactions across all operations - single consistency model
- One operational model and consistency guarantee
Operational Advantages
- Unified Schema Evolution: Schema changes propagate automatically across all access patterns
- Single Consistency Model: ACID guarantees across transactions, analytics, and compute
- Reduced Operational Complexity: One system to monitor, tune, and scale
- Eliminated Data Movement: Processing happens where data lives
- Cost-Elastic Scaling: Adjust memory-to-disk ratios based on performance requirements
Streaming and Flow Control
Ignite 3 includes built-in streaming capabilities with configurable backpressure mechanisms:
// Publisher with flow control configuration
StreamingOptions options = StreamingOptions.builder()
.pageSize(1000)
.autoFlushFrequency(Duration.ofMillis(100))
.retryLimit(3)
.build();
// Handle millions of events with automatic backpressure
CompletableFuture<Void> streaming = ignite.sql()
.streamAsync("INSERT INTO events VALUES (?, ?, ?)",
eventStream,
options);
The streaming API provides automatic flow control through configurable page sizes, flush intervals, and retry policies, preventing system overload without data loss.
Performance Characteristics
Ignite 3's memory-first architecture delivers significantly different performance characteristics compared to disk-based distributed databases:
- Latency: Microsecond response times for memory-resident data vs. millisecond latencies for disk-based systems
- Throughput: Handles millions of operations per second per node
- Scalability: Linear scaling through data partitioning and colocation
- Consistency: ACID transactions with minimal overhead due to memory speeds
The 10-1000x performance improvement comes from eliminating disk I/O bottlenecks and data movement overhead through collocated processing.
Migration and Adoption Strategy
For technical teams considering Ignite 3:
Assessment Phase
- Workload Analysis: Identify performance-critical paths requiring microsecond latencies
- Data Model Mapping: Design colocation strategies for your entities
- Integration Points: Plan API migration from current multi-system architecture
- Performance Benchmarking: Compare memory-first vs. disk-first performance for your workloads
Implementation Approach
- Start with New Features: Use Ignite 3 for new development requiring low latency
- Gradual Migration: Move performance-critical workloads first
- Schema Design: Leverage colocation for optimal data locality
- Operational Integration: Integrate monitoring and deployment pipelines
Technical Considerations
Schema Design Best Practices
- Use
colocateByannotations to ensure related data stays together - Design partition keys to distribute load evenly across nodes
- Consider query patterns when defining indexes and colocation strategies
- Plan for schema evolution with backward-compatible changes
Performance Optimization
- Size memory regions appropriately for your working set
- Use collocated compute jobs to minimize data movement
- Leverage appropriate storage engines for different workload patterns
- Monitor memory usage and adjust disk ratios as needed
Operational Requirements
- Plan for Raft consensus network requirements (low-latency, reliable connectivity)
- Design backup and recovery procedures for persistent storage engines
- Implement monitoring for memory usage, query performance, and compute job execution
- Establish capacity planning procedures for memory-first architecture
Summary
Apache Ignite 3 represents a schema-driven distributed computing platform that consolidates transaction processing, analytics, and compute workloads into a single memory-first architecture. Key architectural elements include:
- Schema-driven design: Single schema definition drives data placement, query optimization, and compute colocation
- Memory-first storage: Multiple storage engines with microsecond latency characteristics
- Collocated processing: Compute-to-data architecture that eliminates data movement overhead
- Unified APIs: Multiple access patterns (RecordView, KeyValueView, SQL, Compute) for the same schema
- ACID consistency: Raft consensus and MVCC transactions across all operations
- Built-in streaming: Flow control and backpressure mechanisms for high-velocity data ingestion
The platform addresses scenarios where traditional multi-system architectures create operational complexity and performance bottlenecks through data movement between separate databases, compute clusters, and analytics systems.
Explore the Ignite 3 documentation for detailed implementation guides and API references.