Master-master

Example on GitHub: master_master

This tutorial shows how to configure and work with a master-master replica set.

Prerequisites

Before starting this tutorial:

Install the tt utility.
Create a tt environment in the current directory by executing the tt init command.
Inside the instances.enabled directory of the created tt environment, create the master_master directory.
Inside instances.enabled/master_master, create the instances.yml and config.yaml files:
- instances.yml specifies instances to run in the current environment and should look like this:
```
instance001:
instance002:
```
- The config.yaml file is intended to store a replica set configuration.

Configuring a replica set

This section describes how to configure a replica set in config.yaml.

Step 1: Configuring a failover mode

First, set the replication.failover option to off:

replication:
  failover: off

Step 2: Defining a replica set topology

Define a replica set topology inside the groups section:

The database.mode option should be set to rw to make instances work in read-write mode.
The iproto.listen option specifies an address used to listen for incoming requests and allows replicas to communicate with each other.

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3302'

Step 3: Creating a user for replication

In the credentials section, create the replicator user with the replication role:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

Step 4: Specifying advertise URIs

Set iproto.advertise.peer to advertise the current instance to other replica set members:

iproto:
  advertise:
    peer:
      login: replicator

Resulting configuration

The resulting replica set configuration should look as follows:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

iproto:
  advertise:
    peer:
      login: replicator

replication:
  failover: off

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3302'

Working with a replica set

Starting instances

After configuring a replica set, execute the tt start command from the tt environment directory:

$ tt start master_master
   • Starting an instance [master_master:instance001]...
   • Starting an instance [master_master:instance002]...

Check that instances are in the RUNNING status using the tt status command:

$ tt status master_master
INSTANCE                      STATUS      PID
master_master:instance001     RUNNING     30818
master_master:instance002     RUNNING     30819

Checking a replica set status

Connect to both instances using tt connect. Below is the example for instance001:

$ tt connect master_master:instance001
   • Connecting to the instance...
   • Connected to master_master:instance001

Check that both instances are writable using box.info.ro:

instance001:

master_master:instance001> box.info.ro
---
- false
...

instance002:

master_master:instance002> box.info.ro
---
- false
...

Execute box.info.replication to check a replica set status. In the output on instance001 below, the entry for instance002 should show follow for both upstream.status and downstream.status.

master_master:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 7
    upstream:
      status: follow
      idle: 0.21281599999929
      peer: replicator@127.0.0.1:3302
      lag: 0.00031614303588867
    name: instance002
    downstream:
      status: follow
      idle: 0.21800899999653
      vclock: {1: 7}
      lag: 0
  2:
    id: 2
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 0
    name: instance001
...
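
The same check can be scripted when the output gets long. The sketch below uses a hypothetical helper, all_peers_follow, which is not part of Tarantool. It assumes that every peer entry carries the upstream and downstream fields shown above and that the table is keyed by replica ID. The sample table only mimics that output so that the snippet also runs in a plain Lua interpreter.

```lua
-- Hypothetical helper: true when every peer (every entry except our own)
-- reports 'follow' for both upstream and downstream.
local function all_peers_follow(replication, self_id)
    for id, replica in pairs(replication) do
        if id ~= self_id then
            local up, down = replica.upstream, replica.downstream
            if up == nil or up.status ~= 'follow' then
                return false
            end
            if down == nil or down.status ~= 'follow' then
                return false
            end
        end
    end
    return true
end

-- Sample data shaped like the output above (most fields omitted).
local sample = {
    [1] = {
        id = 1,
        name = 'instance002',
        upstream = { status = 'follow' },
        downstream = { status = 'follow' },
    },
    [2] = { id = 2, name = 'instance001' }, -- the instance itself
}

print(all_peers_follow(sample, 2))
```

In a tt connect session, the call would presumably be all_peers_follow(box.info.replication, box.info.id). Locals do not survive between separate console inputs, so define the function without local there.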

To see the diagrams that illustrate how the upstream and downstream connections look, refer to Monitoring a replica set.

Note

A vclock value might include the 0 component, which relates to local space operations and can differ between instances in a replica set.

Adding data

To check that both instances get updates from each other, follow the steps below:

On instance001, create a space, format it, and create a primary index:

box.schema.space.create('bands')
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})
box.space.bands:create_index('primary', { parts = { 'id' } })

Then, add sample data to this space:

box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }

On instance002, use the select operation to make sure data is replicated:

master_master:instance002> box.space.bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
...

Add more data to the created space on instance002:

box.space.bands:insert { 3, 'Ace of Base', 1987 }
box.space.bands:insert { 4, 'The Beatles', 1960 }

Go back to instance001 and use select to make sure the new records are replicated.

Check that box.info.vclock values are the same on both instances:

instance001:

master_master:instance001> box.info.vclock
---
- {2: 5, 1: 9}
...

instance002:

master_master:instance002> box.info.vclock
---
- {2: 5, 1: 9}
...
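
Comparing vclock values by eye becomes error-prone once the 0 component shows up on one instance only. The sketch below is a hypothetical helper, vclocks_match, that is not part of Tarantool. It compares two vclock tables and skips component 0, on the assumption that this component is local to each instance. The value 3 for component 0 is made up for illustration.

```lua
-- Hypothetical helper: compares two vclock tables, ignoring component 0,
-- which tracks local operations and may differ between instances.
local function vclocks_match(a, b)
    for id, lsn in pairs(a) do
        if id ~= 0 and b[id] ~= lsn then
            return false
        end
    end
    for id, lsn in pairs(b) do
        if id ~= 0 and a[id] ~= lsn then
            return false
        end
    end
    return true
end

-- Values copied from the outputs above; [0] = 3 is an invented local component.
local on_instance001 = { [1] = 9, [2] = 5 }
local on_instance002 = { [0] = 3, [1] = 9, [2] = 5 }

print(vclocks_match(on_instance001, on_instance002))
```

In practice, each table would be copied from box.info.vclock on the corresponding instance.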

Resolving replication conflicts

Note

To learn how to fix and prevent replication conflicts using trigger functions, see Resolving replication conflicts.

Inserting conflicting records

To insert conflicting records on instance001 and instance002, follow the steps below:

Stop instance001 using the tt stop command:
```
$ tt stop master_master:instance001
```

On instance002, insert a new record:

box.space.bands:insert { 5, 'incorrect data', 0 }

Stop instance002 using tt stop:
```
$ tt stop master_master:instance002
```
Start instance001 again:
```
$ tt start master_master:instance001
```
Connect to instance001 and insert a record that should conflict with a record already inserted on instance002:
```
box.space.bands:insert { 5, 'Pink Floyd', 1965 }
```

Start instance002 again:

$ tt start master_master:instance002

Then, check box.info.replication on instance001. upstream.status should be stopped because of the Duplicate key exists error:

master_master:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 9
    upstream:
      peer: replicator@127.0.0.1:3302
      lag: 143.52251672745
      status: stopped
      idle: 3.9462469999999
      message: Duplicate key exists in unique index "primary" in space "bands" with
        old tuple - [5, "Pink Floyd", 1965] and new tuple - [5, "incorrect data", 0]
    name: instance002
    downstream:
      status: stopped
      message: 'unexpected EOF when reading from socket, called on fd 12, aka 127.0.0.1:3301,
        peer of 127.0.0.1:59258: Broken pipe'
      system_message: Broken pipe
  2:
    id: 2
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 6
    name: instance001
...
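
To pull only the failing connections out of such output, a small filter can help. The sketch below is a hypothetical helper, stopped_upstreams, that is not part of Tarantool. It assumes the upstream.status and upstream.message fields shown above, and the sample message is abbreviated from that output.

```lua
-- Hypothetical helper: collects the name and error message of every
-- peer whose upstream connection is in the 'stopped' status.
local function stopped_upstreams(replication)
    local found = {}
    for _, replica in pairs(replication) do
        local up = replica.upstream
        if up ~= nil and up.status == 'stopped' then
            table.insert(found, { name = replica.name, message = up.message })
        end
    end
    return found
end

-- Sample data shaped like the output above (message abbreviated).
local sample = {
    [1] = {
        name = 'instance002',
        upstream = {
            status = 'stopped',
            message = 'Duplicate key exists in unique index "primary" in space "bands"',
        },
        downstream = { status = 'stopped' },
    },
    [2] = { name = 'instance001' },
}

for _, item in ipairs(stopped_upstreams(sample)) do
    print(item.name, item.message)
end
```

On a running instance, box.info.replication would be passed in place of sample.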

[Diagram: upstream and downstream connections after inserting conflicting records]

Reseeding a replica

To resolve a replication conflict, instance002 should get the correct data from instance001 first. To achieve this, instance002 should be rebootstrapped:

In the config.yaml file, change database.mode of instance002 to ro:
```
instance002:
  database:
    mode: ro
```

Reload configurations on both instances using the reload() function provided by the config module:

instance001:

master_master:instance001> require('config'):reload()
---
...

instance002:

master_master:instance002> require('config'):reload()
---
...

Delete write-ahead logs and snapshots stored in the var/lib/instance002 directory.

Note

var/lib is the default directory used by tt to store write-ahead logs and snapshots. Learn more from Configuration.

Restart instance002 using the tt restart command:
```
$ tt restart master_master:instance002
```

Connect to instance002 and make sure it received the correct data from instance001:

master_master:instance002> box.space.bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
  - [5, 'Pink Floyd', 1965]
...

Restarting replication

After reseeding a replica, you need to resolve a replication conflict that keeps replication stopped:

Execute box.info.replication on instance001. upstream.status is still stopped:

master_master:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 9
    upstream:
      peer: replicator@127.0.0.1:3302
      lag: 143.52251672745
      status: stopped
      idle: 1309.943383
      message: Duplicate key exists in unique index "primary" in space "bands" with
        old tuple - [5, "Pink Floyd", 1965] and new tuple - [5, "incorrect data",
        0]
    name: instance002
    downstream:
      status: follow
      idle: 0.47881799999959
      vclock: {2: 6, 1: 9}
      lag: 0
  2:
    id: 2
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 6
    name: instance001
...

[Diagram: replication status after reseeding a replica]

In the config.yaml file, clear the iproto option for instance001 by setting its value to {} to disconnect this instance from instance002. Set database.mode to ro:
```
instance001:
  database:
    mode: ro
  iproto: {}
```

Reload configuration on instance001 only:

master_master:instance001> require('config'):reload()
---
...

Change database.mode values back to rw for both instances and restore iproto.listen for instance001:

instance001:
  database:
    mode: rw
  iproto:
    listen:
    - uri: '127.0.0.1:3301'
instance002:
  database:
    mode: rw
  iproto:
    listen:
    - uri: '127.0.0.1:3302'

Reload configurations on both instances one more time:

instance001:

master_master:instance001> require('config'):reload()
---
...

instance002:

master_master:instance002> require('config'):reload()
---
...

Check box.info.replication. upstream.status should be follow now.

master_master:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 9
    upstream:
      status: follow
      idle: 0.21281300000192
      peer: replicator@127.0.0.1:3302
      lag: 0.00031113624572754
    name: instance002
    downstream:
      status: follow
      idle: 0.035179000002245
      vclock: {2: 6, 1: 9}
      lag: 0
  2:
    id: 2
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 6
    name: instance001
...

Adding and removing instances

The process of adding instances to a replica set and removing them is similar for all failover modes. Learn how to do this from the Master-replica: manual failover tutorial.

Before removing an instance from a replica set with replication.failover set to off, make sure this instance is in read-only mode.
