Apache Ignite provides reactive streaming with automatic backpressure control. DataStreamer delivers high-throughput ingestion while respecting cluster capacity: the system coordinates producer and consumer rates automatically, preventing memory overflow and maintaining cluster stability under high-velocity data streams.
DataStreamer implements reactive Publisher patterns. Producers publish data items, and the system requests items based on cluster capacity. This pull-based approach prevents producers from overwhelming the cluster with data faster than it can be processed.
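The pull-based pattern can be sketched with the JDK's standard `java.util.concurrent.Flow` interfaces, the reactive contract this style of streamer builds on. The class names below are illustrative, not part of Ignite's API: the subscriber signals demand with `request(n)` and only pulls the next item after consuming one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

public class PullDemo {
    /** A subscriber that pulls items in small demand windows instead of accepting an unbounded push. */
    static final class PullSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(2);               // initial demand: at most 2 items in flight
        }
        @Override public void onNext(Integer item) {
            received.add(item);
            subscription.request(1);    // pull the next item only after consuming one
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    public static List<Integer> run() throws InterruptedException {
        PullSubscriber sub = new PullSubscriber();
        try (SubmissionPublisher<Integer> pub = new SubmissionPublisher<>()) {
            pub.subscribe(sub);
            for (int i = 0; i < 5; i++) {
                pub.submit(i);          // submit() blocks when the subscriber's demand is exhausted
            }
        }                               // close() signals onComplete after the buffer drains
        sub.done.await(5, TimeUnit.SECONDS);
        return sub.received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());      // prints [0, 1, 2, 3, 4]
    }
}
```

Because the producer only supplies items on demand, slowing the subscriber's `request` calls automatically slows the producer, with no explicit rate limiting.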
The streaming system signals producers when ready for more data. Producers slow down when cluster capacity decreases. Producers speed up when capacity increases. This dynamic coordination happens automatically without manual tuning.
DataStreamer maintains internal buffers sized based on cluster capacity. Buffers absorb temporary rate mismatches. The system applies backpressure before buffers overflow. This prevents out-of-memory conditions during ingestion spikes.
The Publisher interface provides natural flow control. Applications receive signals when the cluster needs more data. Applications pause data generation when backpressure applies. This coordination works across network boundaries.
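The buffer-and-signal behavior described above can be simulated with the JDK's `SubmissionPublisher`, again as an illustration rather than Ignite's own implementation. With a one-slot buffer and a consumer that signals no demand, `offer` reports backpressure (a timed-out, dropped submission returns a negative value) instead of letting the buffer overflow.

```java
import java.util.concurrent.Flow;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

public class BackpressureDemo {
    /** A stalled consumer: it subscribes but never signals demand, so the buffer fills. */
    static final class StalledSubscriber implements Flow.Subscriber<Integer> {
        @Override public void onSubscribe(Flow.Subscription s) { /* no request(): zero demand */ }
        @Override public void onNext(Integer item) { }
        @Override public void onError(Throwable t) { }
        @Override public void onComplete() { }
    }

    /** Returns the result codes of two offers against a one-slot buffer. */
    public static int[] run() throws InterruptedException {
        try (SubmissionPublisher<Integer> pub =
                 new SubmissionPublisher<>(ForkJoinPool.commonPool(), 1)) {
            pub.subscribe(new StalledSubscriber());
            // First item fits in the buffer: offer returns a non-negative lag estimate.
            int first = pub.offer(1, 50, TimeUnit.MILLISECONDS, (sub, item) -> false);
            // Buffer is full and the consumer signals no demand: offer times out and
            // reports the drop with a negative value instead of overflowing memory.
            int second = pub.offer(2, 50, TimeUnit.MILLISECONDS, (sub, item) -> false);
            return new int[] { first, second };
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run();
        System.out.println("buffered: " + (r[0] >= 0) + ", backpressured: " + (r[1] < 0));
    }
}
```

A blocking `submit` gives the same effect for producers that prefer to wait rather than drop.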
DataStreamer groups individual items into batches automatically. Batch sizes adapt to network conditions and cluster load. Larger batches reduce network overhead. Smaller batches reduce latency. The system balances throughput and latency dynamically.
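A minimal sketch of the grouping step, with the simplification that the batch size is fixed; a real streamer also flushes partial batches on a timer and adapts the size to load:

```java
import java.util.ArrayList;
import java.util.List;

public class Batcher {
    /** Groups individual items into fixed-size batches; the trailing partial batch flushes at the end. */
    public static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>(batchSize);
        for (T item : items) {
            current.add(item);
            if (current.size() == batchSize) {        // full batch: hand off, start a new one
                batches.add(current);
                current = new ArrayList<>(batchSize);
            }
        }
        if (!current.isEmpty()) batches.add(current); // flush the remainder on close
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(toBatches(List.of(1, 2, 3, 4, 5), 2)); // [[1, 2], [3, 4], [5]]
    }
}
```

The batch size is the throughput/latency knob: a larger size amortizes per-request network cost, a smaller one delivers items sooner.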
The streamer routes data items to partition owners directly. Single-hop writes avoid coordinator overhead. Items for the same partition group together in batches. This optimization delivers maximum throughput for partitioned data.
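The routing idea can be shown with a toy partition function. The `hashCode`-modulo mapping here is a stand-in for the table's real affinity function; the point is that grouping items by owning partition lets each batch ship to a single node in one hop.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionRouter {
    /** Maps a key to one of `partitions` buckets; a real system uses the table's affinity function. */
    public static int partitionFor(Object key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    /** Groups keys by owning partition so each batch targets exactly one node. */
    public static Map<Integer, List<String>> route(List<String> keys, int partitions) {
        Map<Integer, List<String>> perPartition = new HashMap<>();
        for (String key : keys) {
            perPartition.computeIfAbsent(partitionFor(key, partitions), p -> new ArrayList<>())
                        .add(key);
        }
        return perPartition;
    }

    public static void main(String[] args) {
        // Keys that hash to the same partition end up in the same single-hop batch.
        System.out.println(route(List.of("a", "b", "a", "c"), 4));
    }
}
```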
DataStreamer processes multiple batches in parallel across cluster nodes. Each node processes its partition data independently. This parallelism scales linearly with cluster size. Adding nodes increases total ingestion throughput proportionally.
Streamed data is written directly to memory, with no disk I/O during ingestion. Replication handles durability through distributed consensus. This memory-first approach delivers the throughput needed for high-velocity streams.
DataStreamer supports transactional writes. Batches commit atomically. Failures trigger automatic rollback. This ensures consistency for streamed data. Applications choose between throughput-optimized non-transactional mode or consistency-optimized transactional mode.
Streaming writes create new MVCC versions. Concurrent queries see consistent snapshots. Long-running aggregations don't block streaming ingestion. Readers never block writers. This enables mixed streaming and analytical workloads.
DataStreamer supports upsert operations: new records are inserted and existing records are updated in place. Applications specify keys for conflict resolution. This handles duplicate events in streaming scenarios without application-level deduplication logic.
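Upsert semantics reduce to "last write for a key wins", which makes replayed events idempotent. A map-based sketch (the `sensor-1` key is just an example):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UpsertDemo {
    /** Upsert: insert the record if the key is absent, overwrite it if present. */
    public static void apply(Map<String, Integer> table, String key, int value) {
        table.put(key, value);
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new LinkedHashMap<>();
        apply(table, "sensor-1", 10);
        apply(table, "sensor-1", 10);   // duplicate event replayed: same state, no growth
        apply(table, "sensor-1", 12);   // newer reading for the same key: overwrites
        System.out.println(table);      // prints {sensor-1=12}
    }
}
```

Because applying the same event twice leaves the table unchanged, producers can safely retry after failures.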
The system preserves ordering within partitions. Events for the same key process in order. Events across partitions process in parallel. This ordering guarantee simplifies event stream processing logic.
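One way to get this guarantee, shown here as an illustrative sketch rather than Ignite's internals, is a single-threaded worker per partition: events for the same key always land on the same worker and serialize, while different partitions run in parallel.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OrderingDemo {
    /** One single-threaded worker per partition: same-key events serialize, partitions parallelize. */
    public static Map<String, List<Integer>> process(List<Map.Entry<String, Integer>> events,
                                                     int partitions) throws InterruptedException {
        ExecutorService[] workers = new ExecutorService[partitions];
        for (int p = 0; p < partitions; p++) workers[p] = Executors.newSingleThreadExecutor();

        Map<String, List<Integer>> seenPerKey = new ConcurrentHashMap<>();
        for (Map.Entry<String, Integer> e : events) {
            int p = Math.floorMod(e.getKey().hashCode(), partitions); // same key -> same worker
            workers[p].execute(() ->
                seenPerKey.computeIfAbsent(e.getKey(), k -> new CopyOnWriteArrayList<>())
                          .add(e.getValue()));
        }
        for (ExecutorService w : workers) {
            w.shutdown();
            w.awaitTermination(5, TimeUnit.SECONDS);
        }
        return seenPerKey;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Map.Entry<String, Integer>> events = List.of(
            Map.entry("a", 1), Map.entry("b", 1), Map.entry("a", 2), Map.entry("a", 3));
        System.out.println(process(events, 4).get("a")); // prints [1, 2, 3] -- submission order
    }
}
```

Events for key "a" are observed in submission order because they all queue on one worker; events for other keys may interleave freely.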
Learn about DataStreamer configuration, batching strategies, and reactive patterns in the Streaming Documentation.