Ignite 3
Edit

Calcite-based SQL Engine

Starting the 2.13 version, Apache Ignite includes a new SQL engine based on the Apache Calcite framework.

Apache Calcite is a dynamic data management framework, which mainly serves for mediating between applications, one or more data storage locations, and data processing engines. For more information on Apache Calcite, please see the product documentation.

The current H2-based SQL engine has a number of fundamental limitations of query execution in a distributed environment. To address these limitations, a new SQL engine was implemented. The new engine uses tools provided by Apache Calcite for parsing and planning queries. It also has a new query execution flow.

Calcite Module Libraries

To use a Calcite-based engine, please make sure that the Calcite module libraries are in a classpath.

Standalone Mode

When starting a standalone node, move optional/ignite-calcite folder to the libs folder before running ignite.{sh|bat} script. In this case, the content of the module folder is added to the classpath.

Maven Configuration

If you are using Maven to manage dependencies of your project, you can add Calcite module dependency as follows: Replace ${ignite.version} with the actual Apache Ignite version you are interested in:

<dependency>
    <groupId>org.apache.ignite</groupId>
    <artifactId>ignite-calcite</artifactId>
    <version>${ignite.version}</version>
</dependency>

Configuring Query Engines

To enable engine, add the explicit CalciteQueryEngineConfiguration instance to the SqlConfiguration.QueryEnginesConfiguration property.

Below is a configuration example of two configured query engines (H2-based and Calcite-based engines) where the Calcite-based engine is chosen as a default one:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="sqlConfiguration">
        <bean class="org.apache.ignite.configuration.SqlConfiguration">
            <property name="queryEnginesConfiguration">
                <list>
                    <bean class="org.apache.ignite.indexing.IndexingQueryEngineConfiguration">
                        <property name="default" value="false"/>
                    </bean>
                    <bean class="org.apache.ignite.calcite.CalciteQueryEngineConfiguration">
                        <property name="default" value="true"/>
                    </bean>
                </list>
            </property>
        </bean>
    </property>
    ...
</bean>
IgniteConfiguration cfg = new IgniteConfiguration().setSqlConfiguration(
    new SqlConfiguration().setQueryEnginesConfiguration(
        new IndexingQueryEngineConfiguration(),
        new CalciteQueryEngineConfiguration().setDefault(true)
    )
);

Routing Queries to Query Engine

Normally, all queries are routed to the query engine that is configured by default. If more than one engine is configured through queryEnginesConfiguration, it’s possible to use another engine instead of the one configured default for individual queries or for the whole connection.

JDBC

To choose a query engine for the JDBC connection, use the queryEngine connection parameter:

jdbc:ignite:thin://127.0.0.1:10800?queryEngine=calcite

ODBC

To configure the query engine for the ODBC connection, use the QUERY_ENGINE property:

[IGNITE_CALCITE]
DRIVER={Apache Ignite};
SERVER=127.0.0.1;
PORT=10800;
SCHEMA=PUBLIC;
QUERY_ENGINE=CALCITE

Memory Quotas

The Calcite-based SQL engine can track and limit heap memory accounted by Calcite query execution operators. This is useful for protecting a server node from a single large query or from many concurrent memory-heavy queries.

Two types of quotas can be configured in CalciteQueryEngineConfiguration:

  • globalMemoryQuota - a per-node heap memory quota for all Calcite SQL queries running on the node.

  • queryMemoryQuota - a per-node heap memory quota for each Calcite SQL query running on the node.

Both quotas are disabled by default (0). The quota values are specified in bytes. If a quota is exceeded, the query fails with an exception. The global quota error message contains Global memory quota for SQL queries exceeded, and the per-query quota error message contains Query quota exceeded.

The quota is applied to query execution structures that keep rows in heap memory, for example sorting, hash joins, hash aggregates, set operations, collection operations, spools, and result materialization. It is not a process memory limit and it does not account for Ignite data regions, direct memory, JVM native memory, or the operating system page cache. The quotas are per-node. For a distributed query, total memory consumption across the cluster can be higher because the limit is applied independently on every participating node. Size these quotas together with -Xmx and the expected SQL concurrency.

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="sqlConfiguration">
        <bean class="org.apache.ignite.configuration.SqlConfiguration">
            <property name="queryEnginesConfiguration">
                <list>
                    <bean class="org.apache.ignite.calcite.CalciteQueryEngineConfiguration">
                        <property name="default" value="true"/>
                        <property name="globalMemoryQuota" value="#{4L * 1024 * 1024 * 1024}"/>
                        <property name="queryMemoryQuota" value="#{512L * 1024 * 1024}"/>
                    </bean>
                </list>
            </property>
        </bean>
    </property>
</bean>
IgniteConfiguration cfg = new IgniteConfiguration().setSqlConfiguration(
    new SqlConfiguration().setQueryEnginesConfiguration(
        new CalciteQueryEngineConfiguration()
            .setDefault(true)
            .setGlobalMemoryQuota(4L * 1024 * 1024 * 1024)
            .setQueryMemoryQuota(512L * 1024 * 1024)
    )
);

SQL Reference

DDL

Data definition language (DDL) statements are compliant with the old H2-based engine. You can find the DDL syntax description here.

DML

The new SQL engine mostly inherits data manipulation language (DML) statements syntax from the Apache Calcite framework. See the Apache Calcite SQL grammar description here.

In most cases, statement syntax is compliant with the old SQL engine. But there are still some differences between DML dialects in H2-based engine and Calcite-based engine. For example, note the MERGE statement syntax has changed.

Supported Functions

The Calcite-based SQL engine currently supports the following user-facing functions and operators.

Aggregate functions

Name Syntax Description

COUNT

COUNT([ALL | DISTINCT] value) or COUNT(*)

Returns the number of input rows or non-null input values.

SUM

SUM([ALL | DISTINCT] numeric)

Returns the sum of numeric input values.

AVG

AVG([ALL | DISTINCT] numeric)

Returns the average value of numeric input values.

MIN

MIN([ALL | DISTINCT] value)

Returns the minimum input value.

MAX

MAX([ALL | DISTINCT] value)

Returns the maximum input value.

ANY_VALUE

ANY_VALUE([ALL | DISTINCT] value)

Returns one arbitrary input value.

LISTAGG

LISTAGG(value[, separator]) WITHIN GROUP (ORDER BY sortExpr)

Concatenates values from a group in the requested order.

GROUP_CONCAT

GROUP_CONCAT(value[, separator] [ORDER BY sortExpr])

Concatenates values from a group, optionally using a separator and ordering.

STRING_AGG

STRING_AGG(value, separator [ORDER BY sortExpr])

Concatenates string values from a group using a separator.

ARRAY_AGG

ARRAY_AGG(value [ORDER BY sortExpr])

Collects input values into an array.

ARRAY_CONCAT_AGG

ARRAY_CONCAT_AGG(arrayValue [ORDER BY sortExpr])

Concatenates arrays from input rows into one array.

EVERY

EVERY(condition)

Returns TRUE if every input condition is TRUE.

SOME

SOME(condition)

Returns TRUE if at least one input condition is TRUE.

BIT_AND

BIT_AND(integer)

Aggregates integer values using bitwise AND.

BIT_OR

BIT_OR(integer)

Aggregates integer values using bitwise OR.

BIT_XOR

BIT_XOR(integer)

Aggregates integer values using bitwise XOR.

Aggregate FILTER clause

aggregateFunction(…​) FILTER (WHERE condition)

Applies an aggregate function only to input rows that satisfy the condition.

String functions and predicates

Name Syntax Description

UPPER

UPPER(string)

Converts a string to uppercase.

LOWER

LOWER(string)

Converts a string to lowercase.

INITCAP

INITCAP(string)

Converts each word to initial uppercase followed by lowercase characters.

TO_BASE64

TO_BASE64(string)

Encodes a string using Base64.

FROM_BASE64

FROM_BASE64(string)

Decodes a Base64 string.

MD5

MD5(string)

Returns the MD5 hash of a string as a hexadecimal string.

SHA1

SHA1(string)

Returns the SHA-1 hash of a string as a hexadecimal string.

SUBSTRING

SUBSTRING(string FROM start [FOR length])

Returns a substring starting at the specified position.

LEFT

LEFT(string, length)

Returns the leftmost characters of a string.

RIGHT

RIGHT(string, length)

Returns the rightmost characters of a string.

REPLACE

REPLACE(string, search[, replacement])

Replaces occurrences of a search string.

TRANSLATE

TRANSLATE(string, fromString, toString)

Replaces characters from one set with the corresponding characters from another set.

CHR

CHR(integer)

Returns the character for the specified code point.

CHAR_LENGTH

CHAR_LENGTH(string)

Returns the number of characters in a string.

CHARACTER_LENGTH

CHARACTER_LENGTH(string)

Returns the number of characters in a string.

LENGTH

LENGTH(string)

Alias for CHAR_LENGTH(string).

||

string || string

Concatenates two strings.

CONCAT

CONCAT(string, string) or CONCAT(string[, string]…​)

Concatenates strings.

OVERLAY

OVERLAY(string1 PLACING string2 FROM start [FOR length])

Replaces part of a string with another string.

POSITION

POSITION(substring IN string [FROM start])

Returns the position of the first occurrence of a substring.

ASCII

ASCII(string)

Returns the code point of the first character of a string.

REPEAT

REPEAT(string, count)

Returns a string repeated the specified number of times.

SPACE

SPACE(count)

Returns a string that contains the specified number of spaces.

STRCMP

STRCMP(string1, string2)

Compares two strings and returns an integer comparison result.

SOUNDEX

SOUNDEX(string)

Returns the phonetic representation of a string.

DIFFERENCE

DIFFERENCE(string1, string2)

Returns a similarity score for the SOUNDEX values of two strings.

REVERSE

REVERSE(string)

Returns a string with characters in reverse order.

TRIM

TRIM([{BOTH | LEADING | TRAILING} chars FROM] string)

Removes characters from the start, end, or both ends of a string.

LTRIM

LTRIM(string)

Removes spaces from the start of a string.

RTRIM

RTRIM(string)

Removes spaces from the end of a string.

LIKE

string LIKE pattern [ESCAPE escapeChar]

Checks whether a string matches a SQL LIKE pattern.

SIMILAR TO

string SIMILAR TO pattern [ESCAPE escapeChar]

Checks whether a string matches a SQL regular expression pattern.

Regular expression functions and operators

Name Syntax Description

~

string ~ pattern

Checks whether a string matches a case-sensitive POSIX regular expression.

~*

string ~* pattern

Checks whether a string matches a case-insensitive POSIX regular expression.

!~

string !~ pattern

Checks whether a string does not match a case-sensitive POSIX regular expression.

!~*

string !~* pattern

Checks whether a string does not match a case-insensitive POSIX regular expression.

REGEXP_REPLACE

REGEXP_REPLACE(string, regexp, replacement[, position[, occurrence[, matchType]]])

Replaces substrings that match a regular expression.

REGEXP_SUBSTR

REGEXP_SUBSTR(string, regexp[, position[, occurrence]])

Returns the substring that matches a regular expression.

Numeric and math functions

Name Syntax Description

MOD

MOD(numeric1, numeric2) or numeric1 % numeric2

Returns the remainder after division.

BITAND

BITAND(integer1, integer2)

Returns the bitwise AND of two integer values.

BITOR

BITOR(integer1, integer2)

Returns the bitwise OR of two integer values.

BITXOR

BITXOR(integer1, integer2)

Returns the bitwise XOR of two integer values.

EXP

EXP(numeric)

Returns Euler’s number raised to a power.

POWER

POWER(numeric1, numeric2)

Returns one numeric value raised to the power of another.

LN

LN(numeric)

Returns the natural logarithm.

LOG10

LOG10(numeric)

Returns the base-10 logarithm.

ABS

ABS(numeric)

Returns the absolute value.

RAND

RAND([seed])

Returns a random double value.

RAND_INTEGER

RAND_INTEGER([seed,] bound)

Returns a random integer from 0 inclusive to bound exclusive.

ACOS

ACOS(numeric)

Returns the arc cosine.

ACOSH

ACOSH(numeric)

Returns the inverse hyperbolic cosine.

ASIN

ASIN(numeric)

Returns the arc sine.

ASINH

ASINH(numeric)

Returns the inverse hyperbolic sine.

ATAN

ATAN(numeric)

Returns the arc tangent.

ATANH

ATANH(numeric)

Returns the inverse hyperbolic tangent.

ATAN2

ATAN2(numeric1, numeric2)

Returns the angle from rectangular coordinates.

SQRT

SQRT(numeric)

Returns the square root.

CBRT

CBRT(numeric)

Returns the cube root.

COS

COS(numeric)

Returns the cosine.

COSH

COSH(numeric)

Returns the hyperbolic cosine.

COT

COT(numeric)

Returns the cotangent.

COTH

COTH(numeric)

Returns the hyperbolic cotangent.

DEGREES

DEGREES(numeric)

Converts radians to degrees.

RADIANS

RADIANS(numeric)

Converts degrees to radians.

ROUND

ROUND(numeric[, scale])

Rounds a numeric value to the specified scale.

SIGN

SIGN(numeric)

Returns the sign of a numeric value.

SIN

SIN(numeric)

Returns the sine.

SINH

SINH(numeric)

Returns the hyperbolic sine.

TAN

TAN(numeric)

Returns the tangent.

TANH

TANH(numeric)

Returns the hyperbolic tangent.

SEC

SEC(numeric)

Returns the secant.

SECH

SECH(numeric)

Returns the hyperbolic secant.

CSC

CSC(numeric)

Returns the cosecant.

CSCH

CSCH(numeric)

Returns the hyperbolic cosecant.

TRUNCATE

TRUNCATE(numeric[, scale])

Truncates a numeric value to the specified scale.

PI

PI

Returns an approximation of pi.

Date and time functions

Name Syntax Description

EXTRACT

EXTRACT(timeUnit FROM datetime)

Returns the specified field from a date, time, or timestamp value.

FLOOR

FLOOR(datetime TO timeUnit)

Rounds a date, time, or timestamp value down to the specified unit.

CEIL

CEIL(datetime TO timeUnit)

Rounds a date, time, or timestamp value up to the specified unit.

TIMESTAMPADD

TIMESTAMPADD(timeUnit, interval, datetime)

Adds an interval to a timestamp.

TIMESTAMPDIFF

TIMESTAMPDIFF(timeUnit, datetime1, datetime2)

Returns the number of time-unit boundaries between two datetime values.

LAST_DAY

LAST_DAY(date)

Returns the date of the last day of the month.

DAYNAME

DAYNAME(datetime)

Returns the name of the day of the week.

MONTHNAME

MONTHNAME(datetime)

Returns the name of the month.

DAYOFMONTH

DAYOFMONTH(datetime)

Returns the day of the month.

DAYOFWEEK

DAYOFWEEK(datetime)

Returns the day of the week.

DAYOFYEAR

DAYOFYEAR(datetime)

Returns the day of the year.

YEAR

YEAR(datetime)

Returns the year.

QUARTER

QUARTER(datetime)

Returns the quarter of the year.

MONTH

MONTH(datetime)

Returns the month number.

WEEK

WEEK(datetime)

Returns the week number.

HOUR

HOUR(datetime)

Returns the hour.

MINUTE

MINUTE(datetime)

Returns the minute.

SECOND

SECOND(datetime)

Returns the second.

TIMESTAMP_SECONDS

TIMESTAMP_SECONDS(integer)

Converts seconds since 1970-01-01 00:00:00 to a timestamp.

TIMESTAMP_MILLIS

TIMESTAMP_MILLIS(integer)

Converts milliseconds since 1970-01-01 00:00:00 to a timestamp.

TIMESTAMP_MICROS

TIMESTAMP_MICROS(integer)

Converts microseconds since 1970-01-01 00:00:00 to a timestamp.

UNIX_SECONDS

UNIX_SECONDS(timestamp)

Returns seconds since 1970-01-01 00:00:00.

UNIX_MILLIS

UNIX_MILLIS(timestamp)

Returns milliseconds since 1970-01-01 00:00:00.

UNIX_MICROS

UNIX_MICROS(timestamp)

Returns microseconds since 1970-01-01 00:00:00.

UNIX_DATE

UNIX_DATE(date)

Returns the number of days since 1970-01-01.

DATE_FROM_UNIX_DATE

DATE_FROM_UNIX_DATE(integer)

Converts days since 1970-01-01 to a date.

DATE

DATE(value) or DATE(year, month, day)

Converts a value to a date or creates a date from date fields.

TIME

TIME(value) or TIME(hour, minute, second)

Converts a value to a time or creates a time from time fields.

DATETIME

DATETIME(value) or DATETIME(year, month, day, hour, minute, second)

Converts a value to a timestamp or creates a timestamp from date and time fields.

CURRENT_TIME

CURRENT_TIME

Returns the current time.

CURRENT_TIMESTAMP

CURRENT_TIMESTAMP

Returns the current timestamp.

CURRENT_DATE

CURRENT_DATE

Returns the current date.

LOCALTIME

LOCALTIME

Returns the current local time.

LOCALTIMESTAMP

LOCALTIMESTAMP

Returns the current local timestamp.

JDBC current time escape functions

{fn CURDATE()}, {fn CURTIME()}, {fn NOW()}

JDBC escape aliases for the current date, current time, and current timestamp.

TO_CHAR

TO_CHAR(datetime, format)

Formats a date, time, or timestamp value.

TO_DATE

TO_DATE(string, format)

Parses a date from a string.

TO_TIMESTAMP

TO_TIMESTAMP(string, format)

Parses a timestamp from a string.

XML functions

Name Syntax Description

EXTRACTVALUE

EXTRACTVALUE(xml, xpath)

Returns the text selected by an XPath expression.

XMLTRANSFORM

XMLTRANSFORM(xml, xslt)

Transforms XML with an XSLT stylesheet.

EXTRACT

"EXTRACT"(xml, xpath)

Returns the XML fragment selected by an XPath expression. Quote the name when needed to distinguish it from date/time EXTRACT.

EXISTSNODE

EXISTSNODE(xml, xpath)

Returns 1 if an XPath expression selects at least one XML node; otherwise returns 0.

JSON functions and predicates

Name Syntax Description

FORMAT JSON

value FORMAT JSON

Marks a value as JSON-formatted input.

JSON_VALUE

JSON_VALUE(jsonValue, path)

Extracts a scalar SQL value from JSON using a JSON path expression.

JSON_QUERY

JSON_QUERY(jsonValue, path)

Extracts a JSON object or array from JSON using a JSON path expression.

JSON_TYPE

JSON_TYPE(jsonValue)

Returns the type of a JSON value.

JSON_EXISTS

JSON_EXISTS(jsonValue, path)

Returns whether JSON satisfies the specified path expression.

JSON_DEPTH

JSON_DEPTH(jsonValue)

Returns the depth of a JSON value.

JSON_KEYS

JSON_KEYS(jsonValue[, path])

Returns the keys from a JSON object.

JSON_PRETTY

JSON_PRETTY(jsonValue)

Returns formatted JSON.

JSON_LENGTH

JSON_LENGTH(jsonValue[, path])

Returns the length of a JSON value.

JSON_REMOVE

JSON_REMOVE(jsonValue, path[, path]…​)

Removes data selected by path expressions and returns the updated JSON.

JSON_STORAGE_SIZE

JSON_STORAGE_SIZE(jsonValue)

Returns the number of bytes used by the JSON binary representation.

JSON_OBJECT

JSON_OBJECT(jsonKey : jsonValue[, jsonKey : jsonValue]…​)

Builds a JSON object from key-value pairs.

JSON_ARRAY

JSON_ARRAY([jsonValue[, jsonValue]…​])

Builds a JSON array from values.

IS JSON

value IS [NOT] JSON [VALUE]

Checks whether a value is, or is not, JSON.

IS JSON OBJECT

value IS [NOT] JSON OBJECT

Checks whether a value is, or is not, a JSON object.

IS JSON ARRAY

value IS [NOT] JSON ARRAY

Checks whether a value is, or is not, a JSON array.

IS JSON SCALAR

value IS [NOT] JSON SCALAR

Checks whether a value is, or is not, a JSON scalar value.

Collection functions and operators

Name Syntax Description

ARRAY

ARRAY[value[, value]…​] or ARRAY(query)

Creates an array from values or from a query result.

MAP

MAP[key, value[, key, value]…​] or MAP(query)

Creates a map from key-value pairs or from a query result.

ITEM

array[index] or map[key]

Returns an item from an array or map.

CARDINALITY

CARDINALITY(collection)

Returns the number of elements in a collection.

IS EMPTY

collection IS EMPTY

Checks whether a collection has no elements.

IS NOT EMPTY

collection IS NOT EMPTY

Checks whether a collection has at least one element.

Other functions and operators

Name Syntax Description

ROW

ROW(value[, value]…​)

Creates a row value.

CAST

CAST(value AS type)

Converts a value to the specified type.

Infix cast

value::type

Converts a value to the specified type using PostgreSQL-style cast syntax.

TYPEOF

TYPEOF(value)

Returns the Ignite SQL type of a value.

COALESCE

COALESCE(value, value[, value]…​)

Returns the first non-null value.

NVL

NVL(value1, value2)

Returns value1 when it is not null; otherwise returns value2.

NULLIF

NULLIF(value1, value2)

Returns NULL when the values are equal; otherwise returns value1.

CASE

CASE WHEN condition THEN result [ELSE result] END

Returns a result selected by conditional branches.

DECODE

DECODE(value, search, result[, search, result]…​[, default])

Compares a value with search values and returns the matching result.

LEAST

LEAST(value[, value]…​)

Returns the least value from the arguments.

GREATEST

GREATEST(value[, value]…​)

Returns the greatest value from the arguments.

COMPRESS

COMPRESS(string)

Compresses a string and returns binary data.

OCTET_LENGTH

OCTET_LENGTH(binary)

Returns the number of bytes in binary data.

QUERY_ENGINE

QUERY_ENGINE()

Returns the name of the query engine used by the query.

SYSTEM_RANGE

TABLE(SYSTEM_RANGE(start, end[, increment]))

Returns a table with one BIGINT column named X and one row for each value in the range.

Supported Data Types

Below are the data types supported by the Calcite-based SQL engine:

Data type Mapped to Java class

BOOLEAN

java.lang.Boolean

DECIMAL

java.math.BigDecimal

DOUBLE

java.lang.Double

REAL/FLOAT

java.lang.Float

INT

java.lang.Integer

BIGINT

java.lang.Long

SMALLINT

java.lang.Short

TINYINT

java.lang.Byte

CHAR/VARCHAR

java.lang.String

DATE

java.sql.Date

TIME

java.sql.Time

TIMESTAMP

java.sql.Timestamp

INTERVAL YEAR TO MONTH

java.time.Period

INTERVAL DAY TO SECOND

java.time.Duration

BINARY/VARBINARY

byte[]

UUID

java.util.UUID

OTHER

java.lang.Object

Optimizer hints

The query optimizer does its best to build the fastest excution plan. However, this is a far way to create an optimizer which is the most effective for each case. You can better know about the data design, application design or data distribution in the cluster. SQL hints can help the optimizer to make optimizations more rationally or build execution plan faster.

Note

SQL hints are optional to apply and might be skipped in some cases.

Hints format

SQL hints are defined by a special comment /*+ HINT */ reffered as hint block. Spaces before and after the hint name are required. The hint block is placed right after SELECT or after a table name. Several hint blocks for one SELECT or one table are not allowed. Several hints in one hint block are separated with comma.

Example:

SELECT /*+ FORCE_INDEX(IDX_TBL1_V2), EXPAND_DISTINCT_AGG */ V2, AVG(DISTINCT V3) FROM TBL1 WHERE V1=? and V2=? GROUP BY V2

SELECT * FROM TBL1 /*+ FORCE_INDEX(IDX_TBL1_V2) */ where V1=? and V2=?

It is allowed to define several hints for the same relation operator. To use several hints, separate them by comma (spaces are optional).

Example:

SELECT /*+ NO_INDEX, EXPAND_DISTINCT_AGG */ SUM(DISTINCT V1), AVG(DISTINCT V2) FROM TBL1 WHERE V3=? GROUP BY V3

Hint parameters

Hint parameters, if required, are placed in brackets after the hint name and separated by commas.

The hint parameter can be quoted. Quoted parameter is case-sensitive. The quoted and unquoted parameters cannot be defined for the same hint.

Example:

SELECT /*+ FORCE_INDEX(TBL1_IDX2,TBL2_IDX1) */ T1.V1, T2.V1 FROM TBL1 T1, TBL2 T2 WHERE T1.V1 = T2.V1 AND T1.V2 > ? AND T2.V2 > ?;

SELECT /*+ FORCE_INDEX('TBL2_idx1') */ T1.V1, T2.V1 FROM TBL1 T1, TBL2 T2 WHERE T1.V1 = T2.V1 AND T1.V2 > ? AND T2.V2 > ?;

Hint scope

Hints of a SELECT are "visible" for this operation and the following relation operators, queries and subqueries. Hints in a subquery have effective scope only for this subquery and its subqueries. Hint, defined for a table, is effective only for this table.

Example:

SELECT /*+ NO_INDEX(TBL1_IDX2), FORCE_INDEX(TBL2_IDX2) */ T1.V1 FROM TBL1 T1 WHERE T1.V2 IN (SELECT T2.V2 FROM TBL2 T2 WHERE T2.V1=? AND T2.V2=?);

SELECT T1.V1 FROM TBL1 T1 WHERE T1.V2 IN (SELECT /*+ FORCE_INDEX(TBL2_IDX2) */ T2.V2 FROM TBL2 T2 WHERE T2.V1=? AND T2.V2=?);

SELECT T1.V1 FROM TBL1 T1 JOIN TBL2 /*+ MERGE_JOIN */ T2 ON T1.V2=T2.V2 and T1.V3=T2.V3 and T2.V3=?;

Note that only the first query has a hint in such a case as:

SELECT /*+ FORCE_INDEX */ V1 FROM TBL1 WHERE V1=? AND V2=?
UNION ALL
SELECT V1 FROM TBL1 WHERE V3>?

But there are exceptions: hints of engine or optimizer level, such as DISABLE_RULE or QUERY_ENGINE. Such hints should be defined at the beginning of the query and are related to the whole query.

Hints priority

Hints, defined in subqueries or in the following SELECTs, have priority over the preceding ones. In the following example, an index for TBL2 is actually applied.

SELECT /*+ NO_INDEX */ * FROM TBL1 T1 WHERE T1.V1 = (SELECT /*+ FORCE_INDEX */ T2.V1 FROM TBL2 T2 where T2.V2=? and T2.V3=?)

Table hints usually have a bigger priority. In the following example, an index for TBL1 is actually applied.

SELECT /*+ NO_INDEX */ * FROM TBL /*+ FORCE_INDEX(IDX_TBL1_V2) */ where V1=? and V2=? and V3=?;

Hints errors

The optimizer tries to apply every hint and its parameters, if possible. But it skips the hint or hint parameter if:

  • There is no such supported hint.

  • Required hint parameters are not passed.

  • The hint parameters have been passed, but the hint does not support any parameter.

  • The hint parameter is incorrect or refers to a nonexistent object, such as a nonexistent index or table.

  • The current hints or current parameters are incompatible with the previous ones, such as forcing the use and disabling of the same index.

Hint limitations

Currently, SQL hints do not recognize the aliases. You can’t refer to an alias like this:

SELECT /*+ MERGE_JOIN(T2) */ T2.V1 FROM TBL1 T1 JOIN TBL2 T2 ON T1.V3=T2.V1 WHERE T1.V2=? AND T2.V2=?

Instead, a table name have to be used:

SELECT /*+ MERGE_JOIN(TBL2) */ T2.V1 FROM TBL1 T1 JOIN TBL2 T2 ON T1.V3=T2.V1 WHERE T1.V2=? AND T2.V2=?

Supportted hints

FORCE_INDEX / NO_INDEX

Forces or disables index scan.

Parameters:
  • Empty. To force an index scan for every undelying table. Optimizer will choose any available index. Or to disable all indexes.

  • Single index name to use or skip exactly this index.

  • Several index names. They can relate to different tables. The optimizer will choose indexes for scanning or skip them all.

Example:
SELECT /*+ FORCE_INDEX */ T1.* FROM TBL1 T1 WHERE T1.V1 = T2.V1 AND T1.V2 > ?;

SELECT /*+ FORCE_INDEX(TBL1_IDX2, TBL2_IDX1) */ T1.V1, T2.V1 FROM TBL1 T1, TBL2 T2 WHERE T1.V1 = T2.V1 AND T1.V2 > ? AND T2.V2 > ?;

SELECT /*+ NO_INDEX */ T1.* FROM TBL1 T1 WHERE T1.V1 = T2.V1 AND T1.V2 > ?;

SELECT /*+ NO_INDEX(TBL1_IDX2, TBL2_IDX1) */ T1.V1, T2.V1 FROM TBL1 T1, TBL2 T2 WHERE T1.V1 = T2.V1 AND T1.V2 > ? AND T2.V2 > ?;

SELECT T1.V1, T2.V2 FROM TBL1 t1 JOIN TBL2 /*+ FORCE_INDEX(IDX2_2) */ T2 on T1.V3=T2.V3 and T1.V1=T2.V2 and T2.V1=?";

ENFORCE_JOIN_ORDER

Forces join order as appears in a query. Fastens building of joins plan.

Example:
SELECT /*+ ENFORCE_JOIN_ORDER */ T1.V1, T2.V1, T2.V2, T3.V1, T3.V2, T3.V3 FROM TBL1 T1 JOIN TBL2 T2 ON T1.V3=T2.V1 JOIN TBL3 T3 ON T2.V3=T3.V1 AND T2.V2=T3.V2

SELECT t1.v1, t3.v2 FROM TBL1 t1 JOIN TBL3 t3 on t1.v3=t3.v3 WHERE t1.v2 in (SELECT /*+ ENFORCE_JOIN_ORDER */ t2.v2 FROM TBL2 t2 JOIN TBL3 t3 ON t2.v1=t3.v1)

MERGE_JOIN, NL_JOIN, CNL_JOIN, HASH_JOIN

Forces certain join type: Merge, Nested Loop, Correlated Nested Loop and Hash Join respectively.

Every of those has the negation like 'NO_INDEX': CNL_JOIN, NO_CNL_JOIN, NO_HASH_JOIN. The negation hint disables certain join type.

Parameters:
  • Empty. To force or disable certain join type for every join.

  • Single or several tables names force or disable certain join type only for joining of these tables.

Example:
SELECT /*+ MERGE_JOIN */ t1.v1, t2.v2 FROM TBL1 t1, TBL2 t2 WHERE t1.v3=t2.v3

SELECT /*+ NL_JOIN(TBL3,TBL1) */ t4.v1, t2.v2 FROM TBL1 t4 JOIN TBL2 t2 on t1.v3=t2.v3 WHERE t2.v1 in (SELECT t3.v3 FROM TBL3 t3 JOIN TBL1 t4 on t3.v2=t4.v2)

SELECT t1.v1, t2.v2 FROM TBL2 t1 JOIN TBL1 t2 on t1.v3=t2.v3 WHERE t2.v3 in (SELECT /*+ NO_CNL_JOIN(TBL4) */ t3.v3 FROM TBL3 t3 JOIN TBL4 t4 on t3.v1=t4.v1)

SELECT t4.v1, t2.v2 FROM TBL1 t4 JOIN TBL2 t2 on t1.v3=t2.v3 WHERE t2.v1 in (SELECT t3.v3 FROM TBL3 t3 JOIN TBL1 /*+ HASH_JOIN */ t4 on t3.v2=t4.v2)

EXPAND_DISTINCT_AGG

If the optimizer wraps aggregation operations with a join, forces expanding of only distinct aggregates to the join. Removes duplicates before the joining and speeds up it.

Example:
SELECT /*+ EXPAND_DISTINCT_AGG */ SUM(DISTINCT V1), AVG(DISTINCT V2) FROM TBL1 GROUP BY V3

QUERY_ENGINE

Selects a particular engine to run individual queries. This is an engine level hint.

Parameters:

Single parameter required: the engine name.

Example:
SELECT /*+ QUERY_ENGINE('calcite') */ V1 FROM TBL1

DISABLE_RULE

Disables certain optimizer rules. This is an optimizer level hint.

Parameters:
  • One or more optimizer rules for skipping.

Example:
SELECT /*+ DISABLE_RULE('MergeJoinConverter') */ T1.* FROM TBL1 T1 JOIN TBL2 T2 ON T1.V1=T2.V1 WHERE T2.V2=?

Transaction-aware Queries

The Calcite-based query engine supports SQL transactions with READ_COMMITTED isolation. To ensure backward compatibility, transaction support is disabled by default. To enable it, use the following configuration:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="transactionConfiguration">
        <bean class="org.apache.ignite.configuration.TransactionConfiguration">
            <property name="txAwareQueriesEnabled" value="true" />
            ...
        </bean>
    </property>
    ...
</bean>
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.getTransactionConfiguration().setTxAwareQueriesEnabled(true);

The following API "transactional aware":

  • key-value API.

  • SQL queries with Calcite engine.

  • Scan queries.

"Transactional aware" means:

  1. ACID in the same way as key-value API.

  2. Any data modified or deleted within a transaction will not be visible to concurrent transactions until it is committed.

  3. Any data modified by a query will be returned with updated values by subsequent queries within the same transaction.

So, when txAwareQueriesEnabled = true enabled the usage:

  • INSERT, UPDATE, DELETE statements with regular transaction guarantees.

  • Mix key-value, SQL and Scan queries to process same dataset.

Note

SELECT …​ FOR UPDATE

Currently, no locks are held by the UPDATE or SELECT statements. SELECT …​ FOR UPDATE statement not supported. This means lost updates can occur when multiple transactions concurrently modify the same key.

For example, if you execute a query like SET salary = salary + 50 WHERE id = 1 concurrently across multiple threads, note that due to the lost-update anomaly, the final value may not equal the original salary plus 50 multiplied by the number of threads.