space_object:create_index()

object space_object¶

space_object:create_index(index-name[, index_opts])¶

Create an index.

It is mandatory to create an index for a space before trying to insert tuples into it or select tuples from it. The first created index will be used as the primary-key index, so it must be unique.

Parameters:	space_object (`space_object`) – an object reference index_name (`string`) – name of index, which should conform to the rules for object names index_opts (`table`) – index options (see index_opts)
Return:	index object
Rtype:	index_object

Possible errors:

too many parts
index ‘…’ already exists
primary key must be unique

Building or rebuilding a large index will cause occasional yields so that other requests will not be blocked. If the other requests cause an illegal situation such as a duplicate key in a unique index, building or rebuilding such index will fail.

Example:

-- Create a space --
bands = box.schema.space.create('bands')

-- Specify field names and types --
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

index_opts

object index_opts¶

Index options that include the index name, type, identifiers of key fields, and so on. These options are passed to the space_object.create_index() method.

Note

These options are also passed to index_object:alter().

index_opts.type¶: The index type.

Type: string

Default: TREE

Possible values: TREE, HASH, RTREE, BITSET

index_opts.id¶: A unique numeric identifier of the index, which is generated automatically.

Type: number

Default: last index’s ID + 1

index_opts.unique¶

Specify whether an index may be unique. When true, the index cannot contain the same key value twice.

Type: boolean

Default: true

Example:

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

index_opts.if_not_exists¶: Specify whether to swallow an error on an attempt to create an index with a duplicated name.

Type: boolean

Default: false

index_opts.parts¶

Specify the index’s key parts.

Type: a table of key_part values

Default: {1, ‘unsigned’}

Example:

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

Note

Alternative way to declare index parts

Before version 2.7.1, if an index consisted of a single part and had some options like is_nullable or collation and its definition was written as

my_space:create_index('one_part_idx', {parts = {1, 'unsigned', is_nullable=true}})

(with the only brackets) then options were ignored by Tarantool.

Since version 2.7.1 it is allowed to omit extra braces in an index definition and use both ways:

-- with extra braces
my_space:create_index('one_part_idx', {parts = {{1, 'unsigned', is_nullable=true}}})

-- without extra braces
my_space:create_index('one_part_idx', {parts = {1, 'unsigned', is_nullable=true}})

index_opts.dimension¶: The RTREE index dimension.

Type: number

Default: 2

index_opts.distance¶: The RTREE index distance type.

Type: string

Default: euclid

Possible values: euclid, manhattan

index_opts.sequence¶: Create a generator for indexes using a sequence object. Learn more from specifying a sequence in create_index().

Type: string or number

index_opts.func¶: Specify the identifier of the functional index function.

Type: string

index_opts.hint¶

Since: 2.6.1

Specify whether hint optimization is enabled for the TREE index:

If true, the index works faster.
If false, the index size is reduced by half.

Type: boolean

Default: true

index_opts.bloom_fpr¶

Vinyl only

Specify the bloom filter’s false positive rate.

Type: number

Default: vinyl_bloom_fpr

index_opts.page_size¶

Vinyl only

Specify the size of a page used for read and write disk operations.

Type: number

Default: vinyl_page_size

index_opts.range_size¶

Vinyl only

Specify the default maximum range size (in bytes) for a vinyl index.

Type: number

Default: vinyl_range_size

index_opts.run_count_per_level¶

Vinyl only

Specify the maximum number of runs per level in the LSM tree.

Type: number

Default: vinyl_run_count_per_level

index_opts.run_size_ratio¶

Vinyl only

Specify the ratio between the sizes of different levels in the LSM tree.

Type: number

Default: vinyl_run_size_ratio

key_part

object key_part¶

A descriptor of a single part in a multipart key. A table of parts is passed to the index_opts.parts option.

key_part.field¶

Specify the field number or name.

Note

To create a key part by a field name, you need to specify space_object:format() first.

Type: string or number

Examples: Creating an index using field names and numbers

key_part.type¶: Specify the field type. If the field type is specified in space_object:format(), key_part.type inherits this value.

Type: string

Default: scalar

Possible values: listed in Indexed field types

key_part.collation¶

Specify the collation used to compare field values. If the field collation is specified in space_object:format(), key_part.collation inherits this value.

Type: string

Possible values: listed in the box.space._collation system space

Example:

-- Create a space --
box.schema.space.create('tester')

-- Use the 'unicode' collation --
box.space.tester:create_index('unicode', { parts = { { field = 1,
                                                        type = 'string',
                                                        collation = 'unicode' } } })

-- Use the 'unicode_ci' collation --
box.space.tester:create_index('unicode_ci', { parts = { { field = 1,
                                                        type = 'string',
                                                        collation = 'unicode_ci' } } })

-- Insert test data --
box.space.tester:insert { 'ЕЛЕ' }
box.space.tester:insert { 'елейный' }
box.space.tester:insert { 'ёлка' }

-- Returns nil --
select_unicode = box.space.tester.index.unicode:select({ 'ЁлКа' })
-- Returns 'ёлка' --
select_unicode_ci = box.space.tester.index.unicode_ci:select({ 'ЁлКа' })

key_part.is_nullable¶

Specify whether nil (or its equivalent such as msgpack.NULL) can be used as a field value. If the is_nullable option is specified in space_object:format(), key_part.is_nullable inherits this value.

You can set this option to true if:

the index type is TREE
the index is not the primary index

It is also legal to insert nothing at all when using trailing nullable fields. Within indexes, such null values are always treated as equal to other null values and are always treated as less than non-null values. Nulls may appear multiple times even in a unique index.

Type: boolean

Default: false

Example:

box.space.tester:create_index('I', {unique = true, parts = {{field = 2, type = 'number', is_nullable = true}}})

Warning

It is legal to create multiple indexes for the same field with different is_nullable values or to call space_object:format() with a different is_nullable value from what is used for an index. When there is a contradiction, the rule is: null is illegal unless is_nullable=true for every index and for the space format.

key_part.exclude_null¶

Since: 2.8.2

Specify whether an index can skip tuples with null at this key part. You can set this option to true if:

the index type is TREE
the index is not the primary index

If exclude_null is set to true, is_nullable is set to true automatically. Note that this option can be changed dynamically. In this case, the index is rebuilt.

Such indexes do not store filtered tuples at all, so indexing can be done faster.

Type: boolean

Default: false

key_part.path¶

Specify the path string for a map field.

Type: string

See the examples below:

Creating an index using the path option for map fields
Creating a multikey index using the path option with [*]

Examples

Creating an index using field names and numbers

create_index() can use field names or field numbers to define key parts.

Example 1 (field names):

To create a key part by a field name, you need to specify space_object:format() first.

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

Example 2 (field numbers):

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 1 } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 2 } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 3 } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { 3, 2 } })

Creating an index using the path option for map fields (JSON-path indexes)

To create an index for a field that is a map (a path string and a scalar value), specify the path string during index creation, like this:

parts = {field-number, 'data-type', path = 'path-name'}

The index type must be TREE or HASH and the contents of the field must always be maps with the same path.

Example 1 – The simplest use of path:

box.schema.space.create('space1')
box.space.space1:create_index('primary', { parts = { { field = 1,
                                                       type = 'scalar',
                                                       path = 'age' } } })
box.space.space1:insert({ { age = 44 } })
box.space.space1:select(44)

Example 2 – path plus format() plus JSON syntax to add clarity:

box.schema.space.create('space2')
box.space.space2:format({ { 'id', 'unsigned' }, { 'data', 'map' } })
box.space.space2:create_index('info', { parts = { { 'data.full_name["firstname"]', 'str' },
                                                  { 'data.full_name["surname"]', 'str' } } })
box.space.space2:insert({ 1, { full_name = { firstname = 'John', surname = 'Doe' } } })
box.space.space2:select { 'John' }

Creating a multikey index using the path option with [*]

The string in a path option can contain [*] which is called an array index placeholder. Indexes defined with this are useful for JSON documents that all have the same structure.

For example, when creating an index on field#2 for a string document that will start with {'data': [{'name': '...'}, {'name': '...'}], the parts section in the create_index request could look like:

parts = {{field = 2, type = 'str', path = 'data[*].name'}}

Then tuples containing names can be retrieved quickly with index_object:select({key-value}).

A single field can have multiple keys, as in this example which retrieves the same tuple twice because there are two keys ‘A’ and ‘B’ which both match the request:

my_space = box.schema.space.create('json_documents')
my_space:create_index('primary')
multikey_index = my_space:create_index('multikey', {parts = {{field = 2, type = 'str', path = 'data[*].name'}}})
my_space:insert({1,
         {data = {{name = 'A'},
                  {name = 'B'}},
          extra_field = 1}})
multikey_index:select({''}, {iterator = 'GE'})

The result of the select request looks like this:

tarantool> multikey_index:select({''},{iterator='GE'})
---
- - [1, {'data': [{'name': 'A'}, {'name': 'B'}], 'extra_field': 1}]
- [1, {'data': [{'name': 'A'}, {'name': 'B'}], 'extra_field': 1}]
...

The following restrictions exist:

[*] must be alone or must be at the end of a name in the path.
[*] must not appear twice in the path.
If an index has a path with x[*], then no other index can have a path with x.component.
[*] must not appear in the path of a primary key.
If an index has unique=true and has a path with [*], then duplicate keys from different tuples are disallowed, but duplicate keys for the same tuple are allowed.
The field’s value must have the same structure as in the path definition, or be nil (nil is not indexed).
In a space with multikey indexes, any tuple cannot contain more than ~8,000 elements indexed that way.

Creating a functional index

Functional indexes are indexes that call a user-defined function for forming the index key, rather than depending entirely on the Tarantool default formation. Functional indexes are useful for condensing or truncating or reversing or any other way that users want to customize the index.

There are several recommendations for building functional indexes:

The function definition must expect a tuple, which has the contents of fields at the time a data-change request happens, and must return a tuple, which has the contents that will be put in the index.
The create_index definition must include the specification of all key parts, and the custom function must return a table that has the same number of key parts with the same types.
The space must have a memtx engine.
The function must be persistent and deterministic (see Creating a function with body).
The key parts must not depend on JSON paths.
The function must access key-part values by index, not by field name.
Functional indexes must not be primary-key indexes.
Functional indexes cannot be altered and the function cannot be changed if it is used for an index, so the only way to change them is to drop the index and create it again.
Only sandboxed functions are suitable for functional indexes.

Example:

A function could make a key using only the first letter of a string field.

Create a space. The space needs a primary-key field, which is not the field that we will use for the functional index:

box.schema.space.create('tester')
box.space.tester:create_index('i', { parts = { { field = 1, type = 'string' } } })

Create a function. The function expects a tuple. In this example, it will work on tuple[2] because the key source is field number 2 in what we will insert. Use string.sub() from the string module to get the first character:
```
function_code = [[function(tuple) return {string.sub(tuple[2],1,1)} end]]
```

Make the function persistent using the box.schema.func.create function:

box.schema.func.create('my_func',
        { body = function_code, is_deterministic = true, is_sandboxed = true })

Create a functional index. Specify the fields whose values will be passed to the function. Specify the function:

box.space.tester:create_index('func_index', { parts = { { field = 1, type = 'string' } },
                                              func = 'my_func' })

Insert a few tuples. Select using only the first letter, it will work because that is the key. Or, select using the same function as was used for insertion:

box.space.tester:insert({ 'a', 'wombat' })
box.space.tester:insert({ 'b', 'rabbit' })
box.space.tester.index.func_index:select('w')
box.space.tester.index.func_index:select(box.func.my_func:call({ { 'tester', 'wombat' } }))

The results of the two select requests will look like this:

tarantool> box.space.tester.index.func_index:select('w')
---
- - ['a', 'wombat']
...
tarantool> box.space.tester.index.func_index:select(box.func.my_func:call({{'tester','wombat'}}));
---
- - ['a', 'wombat']
...

Here is the full code of the example:

box.schema.space.create('tester')
box.space.tester:create_index('i', { parts = { { field = 1, type = 'string' } } })
function_code = [[function(tuple) return {string.sub(tuple[2],1,1)} end]]
box.schema.func.create('my_func',
        { body = function_code, is_deterministic = true, is_sandboxed = true })
box.space.tester:create_index('func_index', { parts = { { field = 1, type = 'string' } },
                                              func = 'my_func' })
box.space.tester:insert({ 'a', 'wombat' })
box.space.tester:insert({ 'b', 'rabbit' })
box.space.tester.index.func_index:select('w')
box.space.tester.index.func_index:select(box.func.my_func:call({ { 'tester', 'wombat' } }))

Functions for functional indexes can return multiple keys. Such functions are called “multikey” functions.

To create a multikey function, the options of box.schema.func.create() must include is_multikey = true. The return value must be a table of tuples. If a multikey function returns N tuples, then N keys will be added to the index.

Example:

tester = box.schema.space.create('withdata')
tester:format({ { name = 'name', type = 'string' },
                { name = 'address', type = 'string' } })
name_index = tester:create_index('name', { parts = { { field = 1, type = 'string' } } })
function_code = [[function(tuple)
       local address = string.split(tuple[2])
       local ret = {}
       for _, v in pairs(address) do
         table.insert(ret, {utf8.upper(v)})
       end
       return ret
     end]]
box.schema.func.create('address',
        { body = function_code,
          is_deterministic = true,
          is_sandboxed = true,
          is_multikey = true })
addr_index = tester:create_index('addr', { unique = false,
                                           func = 'address',
                                           parts = { { field = 1, type = 'string',
                                                  collation = 'unicode_ci' } } })
tester:insert({ "James", "SIS Building Lambeth London UK" })
tester:insert({ "Sherlock", "221B Baker St Marylebone London NW1 6XE UK" })
addr_index:select('Uk')

Version:

space_object:create_index()

index_opts

key_part

Examples

Creating an index using field names and numbers

Creating an index using the path option for map fields (JSON-path indexes)

Creating a multikey index using the path option with [*]

Creating a functional index