Building MariaDB Server from the sources

Shinguz - Fri, 2024-04-05 08:47

Recently I had to test a new MariaDB feature that was developed at our request (MDEV-33782). To test this feature I had to build the MariaDB server myself from source, which I have not done for a long time. So a new challenge, especially with CMake...

I followed the MariaDB documentation Get, Build and Test Latest MariaDB the Lazy Way to build the server.

On Ubuntu 22.04 it did not work for me, for reasons unknown to me. So I cloned an Ubuntu 23.04 (Lunar Lobster) LXC container and built the MariaDB server in it.
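
Setting up such a container with LXD might look roughly like this (a sketch; the container name is made up, and the author actually cloned an existing container rather than launching a fresh one):

shell> lxc launch ubuntu:lunar mariadb-build
shell> lxc exec mariadb-build -- bash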

To make the whole thing work, however, the package sources had to be added to the file /etc/apt/sources.list in the container first:

deb-src http://de.archive.ubuntu.com/ubuntu lunar main restricted universe multiverse
deb-src http://de.archive.ubuntu.com/ubuntu lunar-updates main restricted universe multiverse
deb-src http://de.archive.ubuntu.com/ubuntu lunar-security main restricted universe multiverse
deb-src http://de.archive.ubuntu.com/ubuntu lunar-backports main restricted universe multiverse
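
After adding the deb-src entries, the package index has to be refreshed, otherwise apt build-dep cannot resolve the source packages:

shell> apt update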

Then we proceeded according to the instructions:

shell> apt install build-essential bison
shell> apt build-dep mariadb-server

The corresponding branch was cloned:

shell> # git clone https://github.com/andremralves/server.git mariadb-MDEV-33782
shell> # git branch --all
shell> git clone --branch MDEV-33782 --single-branch https://github.com/andremralves/server.git mariadb-MDEV-33782
shell> cd mariadb-MDEV-33782
shell> # git checkout 11.5

and then the server was built. This took about 20 minutes on my old machine. CMake initially ran into an error, which was solved by installing the missing package (MDEV-33815):

shell> apt install libgnutls28-dev
shell> cmake . -DBUILD_CONFIG=mysql_release && make -j8

The tests were executed:

shell> cd mysql-test
shell> ./mtr rpl.rpl_create_drop_event
Logging: ./mtr rpl.rpl_create_drop_event
VS config:
vardir: /root/mariadb-MDEV-33782/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/root/mariadb-MDEV-33782/mysql-test/var'...
Checking supported features...
MariaDB Version 11.5.0-MariaDB
 - SSL connections supported
 - binaries built with wsrep patch
Collecting tests...
Installing system database...

==============================================================================

TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[01] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[01] mysql-test-run: WARNING: running this script as _root_ will cause some tests to be skipped
rpl.rpl_create_drop_event 'mix'           [ pass ]    522
rpl.rpl_create_drop_event 'row'           [ pass ]    525
rpl.rpl_create_drop_event 'stmt'          [ pass ]    525
--------------------------------------------------------------------------
The servers were restarted 2 times
Spent 1.572 of 14 seconds executing testcases

Completed: All 3 tests were successful.
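
This ran only a single test case. To be more thorough, mtr can also run a whole suite in parallel; this is an optional extra beyond the original procedure:

shell> ./mtr --suite=rpl --parallel=4 --force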

And then a binary tarball was built for further testing.

shell> make package
Run CPack packaging tool...
CPack: Create package using TGZ
CPack: Install projects
CPack: - Run preinstall target for: MariaDB
CPack: - Install project: MariaDB []
CPack: Create package
CPack: - package: /root/mariadb-MDEV-33782/mariadb-11.5.0-linux-x86_64.tar.gz generated.
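
Such a generic tarball can then be copied to a test machine and unpacked, roughly like this (a sketch; the target directory is arbitrary):

shell> tar xzf mariadb-11.5.0-linux-x86_64.tar.gz -C /opt
shell> /opt/mariadb-11.5.0-linux-x86_64/bin/mariadbd --version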


MaxScale configuration synchronisation

Shinguz - Thu, 2024-04-04 09:53

Overview

A feature that I recently discovered while browsing is the MaxScale configuration synchronisation functionality.

This is not primarily about a MariaDB replication cluster or a MariaDB Galera cluster, but about a cluster consisting of two or more MaxScale nodes. Or more precisely, the exchange of the configuration between these MaxScale nodes.

Pon Suresh Pandian has already written a blog article about this feature in 2022, which is even more detailed than this post here.

Preparations

An LXD container environment was prepared, consisting of 3 database containers (deb12-n1 (10.139.158.33), deb12-n2 (10.139.158.178), deb12-n3 (10.139.158.39)) and 2 MaxScale containers (deb12-mxs1 (10.139.158.66), deb12-mxs2 (10.139.158.174)). The database version is a MariaDB 10.11.6 from the Debian repository and MaxScale was downloaded in version 22.08.5 from the MariaDB plc website.

The database configuration looks similar for all 3 nodes (server_id and the binary log basename are of course adjusted per node):

#
# /etc/mysql/mariadb.conf.d/99-fromdual.cnf
#
[server]
server_id = 1
log_bin = deb12-n1-binlog
binlog_format = row
bind_address = *
proxy_protocol_networks = ::1, 10.139.158.0/24, localhost
gtid_strict_mode = on
log_slave_updates = on
skip_name_resolve = on

The MaxScale nodes were built as described in the article Sharding with MariaDB MaxScale.

The maxscale_admin user has exactly the same rights as described there; the maxscale_monitor user has the following rights:

RELOAD, SUPER, REPLICATION SLAVE, READ_ONLY ADMIN

See also here: Required Grants.
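
Expressed as SQL, granting the monitor user these rights could look like this (a sketch; host pattern and password are assumptions following the conventions of the article above):

SQL> CREATE USER 'maxscale_monitor'@'%' IDENTIFIED BY 'secret';
SQL> GRANT RELOAD, SUPER, REPLICATION SLAVE, READ_ONLY ADMIN ON *.* TO 'maxscale_monitor'@'%';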

The MaxScale start configuration looks like this:

#
# /etc/maxscale.cnf
#
[maxscale]
threads = auto
admin_gui = false

[deb12-n1]
type = server
address = 10.139.158.33
port = 3306
proxy_protocol = true

[deb12-n2]
type = server
address = 10.139.158.178
port = 3306
proxy_protocol = true

[Replication-Monitor]
type = monitor
module = mariadbmon
servers = deb12-n1,deb12-n2
user = maxscale_monitor
password = secret
monitor_interval = 500ms
auto_failover = true
auto_rejoin = true
enforce_read_only_slaves = true
replication_user = replication
replication_password = secret
cooperative_monitoring_locks = majority_of_running

[WriteListener]
type = listener
service = WriteService
port = 3306

[WriteService]
type = service
router = readwritesplit
servers = deb12-n1,deb12-n2
user = maxscale_admin
password = secret
transaction_replay = true
transaction_replay_timeout = 30s

Important: The configuration should look the same on all MaxScale nodes!

And then a few more checks were done to be sure that everything is correct:

shell> maxctrl list listeners
┌───────────────┬──────┬──────┬─────────┬──────────────┐
│ Name          │ Port │ Host │ State   │ Service      │
├───────────────┼──────┼──────┼─────────┼──────────────┤
│ WriteListener │ 3306 │ ::   │ Running │ WriteService │
└───────────────┴──────┴──────┴─────────┴──────────────┘

shell> maxctrl list services
┌──────────────┬────────────────┬─────────────┬───────────────────┬────────────────────┐
│ Service      │ Router         │ Connections │ Total Connections │ Targets            │
├──────────────┼────────────────┼─────────────┼───────────────────┼────────────────────┤
│ WriteService │ readwritesplit │ 0           │ 0                 │ deb12-n1, deb12-n2 │
└──────────────┴────────────────┴─────────────┴───────────────────┴────────────────────┘

shell> maxctrl list servers
┌──────────┬────────────────┬──────┬─────────────┬─────────────────┬────────┬─────────────────────┐
│ Server   │ Address        │ Port │ Connections │ State           │ GTID   │ Monitor             │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┼─────────────────────┤
│ deb12-n1 │ 10.139.158.33  │ 3306 │ 0           │ Master, Running │ 0-1-19 │ Replication-Monitor │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┼─────────────────────┤
│ deb12-n2 │ 10.139.158.178 │ 3306 │ 0           │ Slave, Running  │ 0-1-19 │ Replication-Monitor │
└──────────┴────────────────┴──────┴─────────────┴─────────────────┴────────┴─────────────────────┘

SQL> SELECT @@hostname, test.* FROM test.test;
+------------+----+-----------+---------------------+
| @@hostname | id | data      | ts                  |
+------------+----+-----------+---------------------+
| deb12-n2   |  1 | Some data | 2024-03-26 09:40:21 |
+------------+----+-----------+---------------------+

SQL> SELECT @@hostname, test.* FROM test.test FOR UPDATE;
+------------+----+-----------+---------------------+
| @@hostname | id | data      | ts                  |
+------------+----+-----------+---------------------+
| deb12-n1   |  1 | Some data | 2024-03-26 09:40:21 |
+------------+----+-----------+---------------------+

And another test to verify whether MaxScale really executes the failover correctly:

shell> systemctl stop mariadb

2024-03-26 16:27:05 error  : Monitor was unable to connect to server deb12-n2[10.139.158.178:3306] : 'Can't connect to server on '10.139.158.178' (115)'
2024-03-26 16:27:05 notice : Server changed state: deb12-n2[10.139.158.178:3306]: master_down. [Master, Running] -> [Down]
2024-03-26 16:27:05 warning: [mariadbmon] Primary has failed. If primary does not return in 4 monitor tick(s), failover begins.
2024-03-26 16:27:07 notice : [mariadbmon] Selecting a server to promote and replace 'deb12-n2'. Candidates are: 'deb12-n1'.
2024-03-26 16:27:07 notice : [mariadbmon] Selected 'deb12-n1'.
2024-03-26 16:27:07 notice : [mariadbmon] Performing automatic failover to replace failed primary 'deb12-n2'.
2024-03-26 16:27:07 notice : [mariadbmon] Failover 'deb12-n2' -> 'deb12-n1' performed.
2024-03-26 16:27:07 notice : Server changed state: deb12-n1[10.139.158.33:3306]: new_master. [Slave, Running] -> [Master, Running]

shell> systemctl start mariadb

2024-03-26 16:28:03 notice : Server changed state: deb12-n2[10.139.158.178:3306]: server_up. [Down] -> [Running]
2024-03-26 16:28:03 notice : [mariadbmon] Directing standalone server 'deb12-n2' to replicate from 'deb12-n1'.
2024-03-26 16:28:03 notice : [mariadbmon] Replica connection from deb12-n2 to [10.139.158.33]:3306 created and started.
2024-03-26 16:28:03 notice : [mariadbmon] 1 server(s) redirected or rejoined the cluster.
2024-03-26 16:28:03 notice : Server changed state: deb12-n2[10.139.158.178:3306]: new_slave. [Running] -> [Slave, Running]
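
Besides this automatic failover, a controlled switchover can also be triggered manually via the mariadbmon module. This was not part of the test above, but the command would look like this:

shell> maxctrl call command mariadbmon switchover Replication-Monitor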

Which MaxScale node is currently responsible for monitoring and failover (cooperative_monitoring) can be determined as follows:

shell> maxctrl show monitor Replication-Monitor | grep -e 'Diagnostics' -e '"primary"' -e 'lock_held' | uniq
│ Monitor Diagnostics │ {                        │
│                     │     "primary": true,     │
│                     │     "lock_held": true,   │

Make sure that everything works properly up to this point; otherwise the next steps serve no real purpose.

Activate MaxScale configuration synchronisation

A separate database user with the following rights is required for configuration synchronisation:

SQL> CREATE USER 'maxscale_confsync'@'%' IDENTIFIED BY 'secret';
SQL> GRANT SELECT, INSERT, UPDATE, CREATE ON `mysql`.`maxscale_config` TO maxscale_confsync@'%';

MaxScale must then be configured accordingly (on both MaxScale nodes) so that configuration synchronisation is activated. This configuration takes place in the global MaxScale section:

#
# /etc/maxscale.cnf
#
[maxscale]
config_sync_cluster = Replication-Monitor
config_sync_user = maxscale_confsync
config_sync_password = secret

The MaxScale nodes are then restarted:

shell> systemctl restart maxscale

MaxScale configuration synchronisation can also be activated and deactivated dynamically:

shell> maxctrl show maxscale | grep config_sync
│ │ "config_sync_cluster": null,      │
│ │ "config_sync_db": "mysql",        │
│ │ "config_sync_interval": "5000ms", │
│ │ "config_sync_password": null,     │
│ │ "config_sync_timeout": "10000ms", │
│ │ "config_sync_user": null,         │

Here it is important to keep the correct order of the 3 commands, otherwise there will be an error: synchronisation presumably starts as soon as config_sync_cluster is set, so user and password must already be in place at that point:

shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_user='maxscale_confsync'
shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_password='secret'
shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_cluster='Replication-Monitor'
Change MaxScale parameters

As a first test, we focussed on the MaxScale monitor variable monitor_interval, which in this case even differs between the two MaxScale nodes:

shell> maxctrl show monitor Replication-Monitor | grep monitor_interval
│ │ "monitor_interval": "750ms",

shell> maxctrl show monitor Replication-Monitor | grep monitor_interval
│ │ "monitor_interval": "1000ms",

The variable can now be set on a MaxScale node with the alter monitor command:

shell> MAXCTRL_WARNINGS=0 maxctrl alter monitor Replication-Monitor monitor_interval=500ms
OK

which, on the one hand, can be seen in the MaxScale error log:

2024-03-26 14:09:16 notice : (ConfigManager); Updating to configuration version 1

On the other hand, the value should be propagated to the second MaxScale node within 5 seconds (config_sync_interval), which can be checked with the above command.

Add a new slave and make it known to MaxScale

A new slave (deb12-n3) is first created and added to the MariaDB replication cluster by hand. The slave is then made known to a MaxScale node:

shell> maxctrl create server deb12-n3 10.139.158.39

shell> MAXCTRL_WARNINGS=0 maxctrl link monitor Replication-Monitor deb12-n3
OK

shell> MAXCTRL_WARNINGS=0 maxctrl link service WriteService deb12-n3
OK

shell> maxctrl list servers
┌──────────┬────────────────┬──────┬─────────────┬─────────────────┬────────────┬─────────────────────┐
│ Server   │ Address        │ Port │ Connections │ State           │ GTID       │ Monitor             │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n1 │ 10.139.158.33  │ 3306 │ 3           │ Slave, Running  │ 0-2-479618 │ Replication-Monitor │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n2 │ 10.139.158.178 │ 3306 │ 3           │ Master, Running │ 0-2-479618 │ Replication-Monitor │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n3 │ 10.139.158.39  │ 3306 │ 1           │ Slave, Running  │ 0-2-479618 │ Replication-Monitor │
└──────────┴────────────────┴──────┴─────────────┴─────────────────┴────────────┴─────────────────────┘
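
For reference, the manual step of attaching deb12-n3 to the replication cluster could look roughly like this (a sketch; the replication credentials come from the monitor configuration above, and deb12-n2 is assumed to be the current master):

SQL> CHANGE MASTER TO master_host='10.139.158.178', master_port=3306, master_user='replication', master_password='secret', master_use_gtid=slave_pos;
SQL> START SLAVE;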
Remove an old slave and make it known to MaxScale

Before a slave can be decommissioned, it should be removed from the replication cluster on one of the MaxScale nodes:

shell> maxctrl destroy server deb12-n1 --force
OK

shell> maxctrl list servers
┌──────────┬────────────────┬──────┬─────────────┬─────────────────┬────────────┬─────────────────────┐
│ Server   │ Address        │ Port │ Connections │ State           │ GTID       │ Monitor             │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n2 │ 10.139.158.178 │ 3306 │ 3           │ Master, Running │ 0-2-493034 │ Replication-Monitor │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n3 │ 10.139.158.39  │ 3306 │ 1           │ Slave, Running  │ 0-2-493032 │ Replication-Monitor │
└──────────┴────────────────┴──────┴─────────────┴─────────────────┴────────────┴─────────────────────┘

The slave can then be removed.
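
On the removed node itself, replication could then be stopped and the instance shut down (a minimal sketch):

SQL> STOP SLAVE;
SQL> RESET SLAVE ALL;
shell> systemctl stop mariadb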

How is the configuration synchronised?

The configuration of the two MaxScale nodes is synchronised via the database, which I personally consider an unfortunate design decision, as a configuration change could potentially cause chaos if the master fails or if network problems occur between the database nodes...

The configuration is stored in the table mysql.maxscale_config, which looks like this:

CREATE TABLE `maxscale_config` (
  `cluster` varchar(256) NOT NULL,
  `version` bigint(20) NOT NULL,
  `config` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`config`)),
  `origin` varchar(254) NOT NULL,
  `nodes` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`nodes`)),
  PRIMARY KEY (`cluster`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

This table has approximately the following content:

SQL> SELECT cluster, version, CONCAT(SUBSTR(config, 1, 32), ' ... ', SUBSTR(config, -32)) AS config, origin, nodes
       FROM mysql.maxscale_config;
+---------------------+---------+------------------------------------------------------------------------+------------+------------------------------------------+
| cluster             | version | config                                                                 | origin     | nodes                                    |
+---------------------+---------+------------------------------------------------------------------------+------------+------------------------------------------+
| Replication-Monitor |       2 | {"config":[{"id":"deb12-n1","typ ... ter_name":"Replication-Monitor"}  | deb12-mxs1 | {"deb12-mxs1": "OK", "deb12-mxs2": "OK"} |
+---------------------+---------+------------------------------------------------------------------------+------------+------------------------------------------+

A local copy is kept on each node as a fallback:

shell> cut -b-32 /var/lib/maxscale/maxscale-config.json
{"config":[{"id":"deb12-n2","typ
What happens in the event of a conflict?

See also: Error Handling in Configuration Synchronization

If the configuration is changed simultaneously (within config_sync_interval?) on two different MaxScale nodes, we receive the following error message:

Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `PATCH monitors/Replication-Monitor`
{
    "errors": [
        {
            "detail": "Cannot start configuration change: Configuration conflict detected: version stored in the cluster (3) is not the same as the local version (2), MaxScale is out of sync."
        }
    ]
}

The following command may help to recognise the problem in the event of major faults:

shell> maxctrl show maxscale | grep -A9 'Config Sync'
│ Config Sync │ {                                                           │
│             │     "checksum": "0052fe6f775168bf00778abbe37775f6f642adc7", │
│             │     "nodes": {                                              │
│             │         "deb12-mxs1": "OK",                                 │
│             │         "deb12-mxs2": "OK"                                  │
│             │     },                                                      │
│             │     "origin": "deb12-mxs2",                                 │
│             │     "status": "OK",                                         │
│             │     "version": 3                                            │
│             │ }                                                           │
Tests

All tests were also carried out under load. The following tests ran in parallel:

  • insert_test.php
  • insert_test.sh (a rough equivalent is sketched after this list)
  • mixed_test.php
  • while [ true ] ; do mariadb -s --user=app --host=10.139.158.174 --port=3306 --password=secret --execute='SELECT @@hostname, COUNT(*) FROM test.test GROUP BY @@hostname' ; sleep 0.5 ; done
  • while [ true ] ; do mariadb -s --user=app --host=10.139.158.174 --port=3306 --password=secret --execute='SELECT @@hostname, COUNT(*) FROM test.test GROUP BY @@hostname FOR UPDATE' ; sleep 0.5 ; done
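
The insert and mixed test scripts themselves are not shown in the post; a rough bash equivalent of such an insert load (hypothetical, modeled on the read loops above and assuming the id and ts columns fill themselves in) could be:

shell> while [ true ] ; do mariadb --user=app --host=10.139.158.174 --port=3306 --password=secret --execute="INSERT INTO test.test (data) VALUES ('Some data')" ; sleep 0.1 ; done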

All tests ran flawlessly and without problems throughout all of these manipulations.

Deactivate MaxScale configuration synchronisation again

Execute the following command on both MaxScale nodes to end configuration synchronisation:

shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_cluster=''


MaxScale Konfigurations-Synchronisation

Oli Sennhauser - Tue, 2024-04-02 17:43
Inhaltsverzeichnis
Übersicht

Ein Feature, welches ich beim Stöbern kürzlich entdecke habe, ist die MaxScale Konfigurations-Synchronisation Funktionalität.

Es geht hier nicht primär um einen MariaDB Replikations-Cluster oder einen MariaDB Galera Cluster sondern um einen Cluster bestehend aus zwei oder mehreren MaxScale Knoten. Bzw. etwas genauer gesagt, den Austausch der Konfiguration unter diesen MaxScale Knoten.

Pon Suresh Pandian hat bereits 2022 einen Blog-Artikel über dieses Feature geschrieben, der noch etwas ausführlicher ist, als dieser Beitrag hier.

Vorbereitungen

Es wurde eine LXD Container-Umgebung vorbereitet, bestehend aus 3 Datenbank-Containern (deb12-n1 (10.139.158.33), deb12-n2 (10.139.158.178), deb12-n3(10.139.158.39)) und 2 MaxScale Containern (deb12-mxs1 (10.139.158.66), deb12-mxs2(10.139.158.174)). Die Datenbankversion ist eine MariaDB 10.11.6 aus dem Debian-Repository und MaxScale wurde in der Version 22.08.5 von der MariaDB plc Website heruntergeladen.

Die Datenbank-Konfiguration sieht für alle 3 Knoten analog aus:

# # /etc/mysql/mariadb.conf.d/99-fromdual.cnf # [server] server_id = 1 log_bin = deb12-n1-binlog binlog_format = row bind_address = * proxy_protocol_networks = ::1, 10.139.158.0/24, localhost gtid_strict_mode = on log_slave_updates = on skip_name_resolve = on

Die MaxScale Knoten wurden wie im Artikel Sharding mit MariaDB MaxScale beschrieben gebaut.

Der maxscale_admin User hat genau die selben Rechte wie dort beschrieben, der maxscale_monitor User hat folgende Rechte erhalten:

RELOAD, SUPER, REPLICATION SLAVE, READ_ONLY ADMIN

Siehe auch hier: Required Grants.

Die MaxScale Start-Konfiguration sieht wie folgt aus:

# # /etc/maxscale.cnf # [maxscale] threads = auto admin_gui = false [deb12-n1] type = server address = 10.139.158.33 port = 3306 proxy_protocol = true [deb12-n2] type = server address = 10.139.158.178 port = 3306 proxy_protocol = true [Replication-Monitor] type = monitor module = mariadbmon servers = deb12-n1,deb12-n2 user = maxscale_monitor password = secret monitor_interval = 500ms auto_failover = true auto_rejoin = true enforce_read_only_slaves = true replication_user = replication replication_password = secret cooperative_monitoring_locks = majority_of_running [WriteListener] type = listener service = WriteService port = 3306 [WriteService] type = service router = readwritesplit servers = deb12-n1,deb12-n2 user = maxscale_admin password = secret transaction_replay = true transaction_replay_timeout = 30s

Wichtig: Die Konfiguration sollte auf allen MaxScale Knoten gleich aussehen!

Und dann noch ein paar Checks um sicher zu sein, dass alles stimmt:

shell> maxctrl list listeners ┌───────────────┬──────┬──────┬─────────┬──────────────┐ │ Name │ Port │ Host │ State │ Service │ ├───────────────┼──────┼──────┼─────────┼──────────────┤ │ WriteListener │ 3306 │ :: │ Running │ WriteService │ └───────────────┴──────┴──────┴─────────┴──────────────┘ shell> maxctrl list services ┌──────────────┬────────────────┬─────────────┬───────────────────┬────────────────────┐ │ Service │ Router │ Connections │ Total Connections │ Targets │ ├──────────────┼────────────────┼─────────────┼───────────────────┼────────────────────┤ │ WriteService │ readwritesplit │ 0 │ 0 │ deb12-n1, deb12-n2 │ └──────────────┴────────────────┴─────────────┴───────────────────┴────────────────────┘ shell> maxctrl list servers ┌──────────┬────────────────┬──────┬─────────────┬─────────────────┬────────┬─────────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┼─────────────────────┤ │ deb12-n1 │ 10.139.158.33 │ 3306 │ 0 │ Master, Running │ 0-1-19 │ Replication-Monitor │ ├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┼─────────────────────┤ │ deb12-n2 │ 10.139.158.178 │ 3306 │ 0 │ Slave, Running │ 0-1-19 │ Replication-Monitor │ └──────────┴────────────────┴──────┴─────────────┴─────────────────┴────────┴─────────────────────┘ SQL> SELECT @@hostname, test.* FROM test.test; +------------+----+-----------+---------------------+ | @@hostname | id | data | ts | +------------+----+-----------+---------------------+ | deb12-n2 | 1 | Some data | 2024-03-26 09:40:21 | +------------+----+-----------+---------------------+ SQL> SELECT @@hostname, test.* FROM test.test FOR UPDATE; +------------+----+-----------+---------------------+ | @@hostname | id | data | ts | +------------+----+-----------+---------------------+ | deb12-n1 | 1 | Some data | 2024-03-26 09:40:21 | +------------+----+-----------+---------------------+

Und noch ein weiterer Test ob MaxScale den Failover auch wirklich richtig ausführt:

shell> systemctl stop mariadb 2024-03-26 16:27:05 error : Monitor was unable to connect to server deb12-n2[10.139.158.178:3306] : 'Can't connect to server on '10.139.158.178' (115)' 2024-03-26 16:27:05 notice : Server changed state: deb12-n2[10.139.158.178:3306]: master_down. [Master, Running] -> [Down] 2024-03-26 16:27:05 warning: [mariadbmon] Primary has failed. If primary does not return in 4 monitor tick(s), failover begins. 2024-03-26 16:27:07 notice : [mariadbmon] Selecting a server to promote and replace 'deb12-n2'. Candidates are: 'deb12-n1'. 2024-03-26 16:27:07 notice : [mariadbmon] Selected 'deb12-n1'. 2024-03-26 16:27:07 notice : [mariadbmon] Performing automatic failover to replace failed primary 'deb12-n2'. 2024-03-26 16:27:07 notice : [mariadbmon] Failover 'deb12-n2' -> 'deb12-n1' performed. 2024-03-26 16:27:07 notice : Server changed state: deb12-n1[10.139.158.33:3306]: new_master. [Slave, Running] -> [Master, Running] shell> systemctl start mariadb 2024-03-26 16:28:03 notice : Server changed state: deb12-n2[10.139.158.178:3306]: server_up. [Down] -> [Running] 2024-03-26 16:28:03 notice : [mariadbmon] Directing standalone server 'deb12-n2' to replicate from 'deb12-n1'. 2024-03-26 16:28:03 notice : [mariadbmon] Replica connection from deb12-n2 to [10.139.158.33]:3306 created and started. 2024-03-26 16:28:03 notice : [mariadbmon] 1 server(s) redirected or rejoined the cluster. 2024-03-26 16:28:03 notice : Server changed state: deb12-n2[10.139.158.178:3306]: new_slave. [Running] -> [Slave, Running]

Welcher MaxScale Knoten gerade für das Monitoring und Failover zuständig ist (cooperatve_monitoring) kann wir folgt eruiert werden:

shell> maxctrl show monitor Replication-Monitor | grep -e 'Diagnostics' -e '"primary"' -e 'lock_held' | uniq │ Monitor Diagnostics │ { │ │ │ "primary": true, │ │ │ "lock_held": true, │

Es sollte sichergestellt sein, das bis hier hin alles sauber funktioniert. Ansonsten haben die weiteren Schritte keinen wirklichen Sinn.

MaxScale Konfigurations-Synchronisation aktivieren

Für die Konfigurations-Synchronisation braucht es einen eigenen Datenbank-User mit folgenden Rechten:

SQL> CREATE USER 'maxscale_confsync'@'%' IDENTIFIED BY 'secret'; SQL> GRANT SELECT, INSERT, UPDATE, CREATE ON `mysql`.`maxscale_config` TO maxscale_confsync@'%';

Anschliessend muss MaxScale entsprechend konfiguriert werden (auf beiden MaxScale Knoten), damit die Konfigurations-Synchronisation aktiviert wird. Diese Konfiguration erfolgt im globalen MaxScale Abschnitt:

# # /etc/maxscale.cnf # [maxscale] config_sync_cluster = Replication-Monitor config_sync_user = maxscale_confsync config_sync_password = secret

Dann erfolgt der Neustart der MaxScale Knoten:

shell> systemctl restart maxscale

Die MaxScale Konfigurations-Synchronisation kann auch dynamisch aktiviert und deaktiviert werden:

shell> maxctrl show maxscale | grep config_sync │ │ "config_sync_cluster": null, │ │ │ "config_sync_db": "mysql", │ │ │ "config_sync_interval": "5000ms", │ │ │ "config_sync_password": null, │ │ │ "config_sync_timeout": "10000ms", │ │ │ "config_sync_user": null, │

Hier ist wichtig, die richtige Reihenfolge der 3 Befehle einzuhalten, da es sonst einen Fehler gibt:

shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_user='maxscale_confsync' shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_password='secret' shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_cluster='Replication-Monitor'
MaxScale Parameter ändern

Als ersten Test haben wir die MaxScale Monitor Variable monitor_interval ins Auge gefasst, die in diesem Fall auf beiden MaxScale-Knoten sogar unterschiedlich ist:

shell> maxctrl show monitor Replication-Monitor | grep monitor_interval │ │ "monitor_interval": "750ms", shell> maxctrl show monitor Replication-Monitor | grep monitor_interval │ │ "monitor_interval": "1000ms",

Mit dem alter monitor Befehl kann die Variable jetzt auf einem MaxScale Knoten gesetzt werden:

shell> MAXCTRL_WARNINGS=0 maxctrl alter monitor Replication-Monitor monitor_interval=500ms OK

was einerseits im MaxScale Error Log ersichtlich ist:

2024-03-26 14:09:16 notice : (ConfigManager); Updating to configuration version 1

anderseits sollte der Wert innerhalb von 5 Sekunden (config_sync_interval) auf den zweiten MaxScale Knoten propagiert werden, was mit dem obigen Befehl überprüft werden kann.

Neuer Slave hinzufügen und MaxScale bekannt machen

Ein neuer Slave (deb12-n3) wird zuerst erstellt und dem MariaDB Replikations-Cluster von Hand hinzugefügt. Anschliessen wird der Slave einem MaxScale Knoten bekannt gemacht:

shell> maxctrl create server deb12-n3 10.139.158.39 shell> MAXCTRL_WARNINGS=0 maxctrl link monitor Replication-Monitor deb12-n3 OK shell> MAXCTRL_WARNINGS=0 maxctrl link service WriteService deb12-n3 OK shell> maxctrl list servers ┌──────────┬────────────────┬──────┬─────────────┬─────────────────┬────────────┬─────────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤ │ deb12-n1 │ 10.139.158.33 │ 3306 │ 3 │ Slave, Running │ 0-2-479618 │ Replication-Monitor │ ├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤ │ deb12-n2 │ 10.139.158.178 │ 3306 │ 3 │ Master, Running │ 0-2-479618 │ Replication-Monitor │ ├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤ │ deb12-n3 │ 10.139.158.39 │ 3306 │ 1 │ Slave, Running │ 0-2-479618 │ Replication-Monitor │ └──────────┴────────────────┴──────┴─────────────┴─────────────────┴────────────┴─────────────────────┘
Alter Slave entfernen und MaxScale bekannt machen

Bevor ein Slave gelöscht werden kann, sollte er bei einem MaxScale Knoten aus dem Replikations-Cluster entfernt werden:

shell> maxctrl destroy server deb12-n1 --force OK shell> maxctrl list servers ┌──────────┬────────────────┬──────┬─────────────┬─────────────────┬────────────┬─────────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤ │ deb12-n2 │ 10.139.158.178 │ 3306 │ 3 │ Master, Running │ 0-2-493034 │ Replication-Monitor │ ├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤ │ deb12-n3 │ 10.139.158.39 │ 3306 │ 1 │ Slave, Running │ 0-2-493032 │ Replication-Monitor │ └──────────┴────────────────┴──────┴─────────────┴─────────────────┴────────────┴─────────────────────┘

Anschliessend kann der Slave abgebaut werden.

Wie wird die Konfiguration synchronisiert?

Die Konfiguration der beiden MaxScale Knoten wird über die Datenbank synchronisiert, was ich persönlich als unglückliche Design-Entscheidung ansehe, da eine Konfigurationsänderung potentiell hackelt, wenn der Master kaputt geht oder Netzwerkprobleme zwischen den Datenbanknoten auftreten...

Die Konfiguration wird in der Tabelle mysql.maxscale_config abgespeichert, welche wie folgt aussieht:

CREATE TABLE `maxscale_config` ( `cluster` varchar(256) NOT NULL, `version` bigint(20) NOT NULL, `config` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`config`)), `origin` varchar(254) NOT NULL, `nodes` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`nodes`)), PRIMARY KEY (`cluster`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

Diese Tabelle hat in etwa folgenden Inhalt:

SQL> SELECT cluster, version, CONCAT(SUBSTR(config, 1, 32), ' ... ', SUBSTR(config, -32)) AS config , origin, nodes FROM mysql.maxscale_config; +---------------------+---------+-----------------------------------------------------------------------+------------+------------------------------------------+ | cluster | version | config | origin | nodes | +---------------------+---------+-----------------------------------------------------------------------+------------+------------------------------------------+ | Replication-Monitor | 2 | {"config":[{"id":"deb12-n1","typ ... ter_name":"Replication-Monitor"} | deb12-mxs1 | {"deb12-mxs1": "OK", "deb12-mxs2": "OK"} | +---------------------+---------+-----------------------------------------------------------------------+------------+------------------------------------------+

Eine lokale Kopie liegt zur Sicherheit auf jedem Knoten vor:

shell> cut -b-32 /var/lib/maxscale/maxscale-config.json {"config":[{"id":"deb12-n2","typ
Was passiert im Konfliktfall?

Siehe hierzu auch: Error Handling in Configuration Synchronization

Wenn zeitgleich (innerhalb von config_sync_interval?) auf zwei unterschiedlichen MaxScale Knoten die Konfiguration geändert wird erhalten wir folgende Fehlermeldung:

Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `PATCH monitors/Replication-Monitor` { "errors": [ { "detail": "Cannot start configuration change: Configuration conflict detected: version stored in the cluster (3) is not the same as the local version (2), MaxScale is out of sync." } ] }

Folgender Befehl hilft allenfalls bei grösseren Störungen das Problem zu erkennen:

shell> maxctrl show maxscale | grep -A9 'Config Sync' │ Config Sync │ { │ │ │ "checksum": "0052fe6f775168bf00778abbe37775f6f642adc7", │ │ │ "nodes": { │ │ │ "deb12-mxs1": "OK", │ │ │ "deb12-mxs2": "OK" │ │ │ }, │ │ │ "origin": "deb12-mxs2", │ │ │ "status": "OK", │ │ │ "version": 3 │ │ │ } │
Tests

Alle Tests wurden auch unter Last ausgeführt. Folgende Tests liefen parallel:

  • insert_test.php
  • insert_test.sh
  • mixed_test.php
  • while [ true ] ; do mariadb -s --user=app --host=10.139.158.174 --port=3306 --password=secret --execute='SELECT @@hostname, COUNT(*) FROM test.test GROUP BY @@hostname' ; sleep 0.5 ; done
  • while [ true ] ; do mariadb -s --user=app --host=10.139.158.174 --port=3306 --password=secret --execute='SELECT @@hostname, COUNT(*) FROM test.test GROUP BY @@hostname FOR UPDATE' ; sleep 0.5 ; done

Alle Tests sind bei allen Manipulationen einwandfrei und ohne Probleme durchgelaufen.

MaxScale Konfigurations-Synchronisation wieder deaktivieren

Auf beiden MaxScale Knoten ist folgender Befehl auszuführen und damit ist die Konfigurations-Synchronisation wieder beendet:

shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_cluster=''
Literatur/Quellen
Taxonomy upgrade extras: maxscaleconfigurationclusterload balancer

MaxScale Konfigurations-Synchronisation

Oli Sennhauser - Tue, 2024-04-02 17:43
Table of contents
Overview

A feature that I recently discovered while browsing is the MaxScale configuration synchronisation functionality.

This is not primarily about a MariaDB replication cluster or a MariaDB Galera cluster, but about a cluster consisting of two or more MaxScale nodes, or, more precisely, about exchanging the configuration between these MaxScale nodes.

Pon Suresh Pandian already wrote a blog article about this feature back in 2022, which is somewhat more detailed than this post here.

Preparations

An LXD container environment was prepared, consisting of 3 database containers (deb12-n1 (10.139.158.33), deb12-n2 (10.139.158.178), deb12-n3 (10.139.158.39)) and 2 MaxScale containers (deb12-mxs1 (10.139.158.66), deb12-mxs2 (10.139.158.174)). The database version is MariaDB 10.11.6 from the Debian repository, and MaxScale version 22.08.5 was downloaded from the MariaDB plc website.

The database configuration looks analogous for all 3 nodes (server_id and the binary log name must be adapted per node):

#
# /etc/mysql/mariadb.conf.d/99-fromdual.cnf
#

[server]
server_id               = 1
log_bin                 = deb12-n1-binlog
binlog_format           = row
bind_address            = *
proxy_protocol_networks = ::1, 10.139.158.0/24, localhost
gtid_strict_mode        = on
log_slave_updates       = on
skip_name_resolve       = on

The MaxScale nodes were built as described in the article Sharding with MariaDB MaxScale.

The maxscale_admin user has exactly the same privileges as described there; the maxscale_monitor user was given the following privileges:

RELOAD, SUPER, REPLICATION SLAVE, READ_ONLY ADMIN

See also: Required Grants.
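
For reference, creating such a monitor user could look like this (a minimal sketch; the host range and password are assumptions from this test setup):

SQL> CREATE USER 'maxscale_monitor'@'10.139.158.%' IDENTIFIED BY 'secret';
SQL> GRANT RELOAD, SUPER, REPLICATION SLAVE, READ_ONLY ADMIN ON *.* TO 'maxscale_monitor'@'10.139.158.%';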

The MaxScale start configuration looks as follows:

#
# /etc/maxscale.cnf
#

[maxscale]
threads = auto
admin_gui = false

[deb12-n1]
type = server
address = 10.139.158.33
port = 3306
proxy_protocol = true

[deb12-n2]
type = server
address = 10.139.158.178
port = 3306
proxy_protocol = true

[Replication-Monitor]
type = monitor
module = mariadbmon
servers = deb12-n1,deb12-n2
user = maxscale_monitor
password = secret
monitor_interval = 500ms
auto_failover = true
auto_rejoin = true
enforce_read_only_slaves = true
replication_user = replication
replication_password = secret
cooperative_monitoring_locks = majority_of_running

[WriteListener]
type = listener
service = WriteService
port = 3306

[WriteService]
type = service
router = readwritesplit
servers = deb12-n1,deb12-n2
user = maxscale_admin
password = secret
transaction_replay = true
transaction_replay_timeout = 30s

Important: The configuration should look identical on all MaxScale nodes!

And then a few checks to be sure that everything is correct:

shell> maxctrl list listeners
┌───────────────┬──────┬──────┬─────────┬──────────────┐
│ Name          │ Port │ Host │ State   │ Service      │
├───────────────┼──────┼──────┼─────────┼──────────────┤
│ WriteListener │ 3306 │ ::   │ Running │ WriteService │
└───────────────┴──────┴──────┴─────────┴──────────────┘

shell> maxctrl list services
┌──────────────┬────────────────┬─────────────┬───────────────────┬────────────────────┐
│ Service      │ Router         │ Connections │ Total Connections │ Targets            │
├──────────────┼────────────────┼─────────────┼───────────────────┼────────────────────┤
│ WriteService │ readwritesplit │ 0           │ 0                 │ deb12-n1, deb12-n2 │
└──────────────┴────────────────┴─────────────┴───────────────────┴────────────────────┘

shell> maxctrl list servers
┌──────────┬────────────────┬──────┬─────────────┬─────────────────┬────────┬─────────────────────┐
│ Server   │ Address        │ Port │ Connections │ State           │ GTID   │ Monitor             │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┼─────────────────────┤
│ deb12-n1 │ 10.139.158.33  │ 3306 │ 0           │ Master, Running │ 0-1-19 │ Replication-Monitor │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┼─────────────────────┤
│ deb12-n2 │ 10.139.158.178 │ 3306 │ 0           │ Slave, Running  │ 0-1-19 │ Replication-Monitor │
└──────────┴────────────────┴──────┴─────────────┴─────────────────┴────────┴─────────────────────┘

SQL> SELECT @@hostname, test.* FROM test.test;
+------------+----+-----------+---------------------+
| @@hostname | id | data      | ts                  |
+------------+----+-----------+---------------------+
| deb12-n2   |  1 | Some data | 2024-03-26 09:40:21 |
+------------+----+-----------+---------------------+

SQL> SELECT @@hostname, test.* FROM test.test FOR UPDATE;
+------------+----+-----------+---------------------+
| @@hostname | id | data      | ts                  |
+------------+----+-----------+---------------------+
| deb12-n1   |  1 | Some data | 2024-03-26 09:40:21 |
+------------+----+-----------+---------------------+

And another test to check whether MaxScale really performs the failover correctly:

shell> systemctl stop mariadb

2024-03-26 16:27:05 error  : Monitor was unable to connect to server deb12-n2[10.139.158.178:3306] : 'Can't connect to server on '10.139.158.178' (115)'
2024-03-26 16:27:05 notice : Server changed state: deb12-n2[10.139.158.178:3306]: master_down. [Master, Running] -> [Down]
2024-03-26 16:27:05 warning: [mariadbmon] Primary has failed. If primary does not return in 4 monitor tick(s), failover begins.
2024-03-26 16:27:07 notice : [mariadbmon] Selecting a server to promote and replace 'deb12-n2'. Candidates are: 'deb12-n1'.
2024-03-26 16:27:07 notice : [mariadbmon] Selected 'deb12-n1'.
2024-03-26 16:27:07 notice : [mariadbmon] Performing automatic failover to replace failed primary 'deb12-n2'.
2024-03-26 16:27:07 notice : [mariadbmon] Failover 'deb12-n2' -> 'deb12-n1' performed.
2024-03-26 16:27:07 notice : Server changed state: deb12-n1[10.139.158.33:3306]: new_master. [Slave, Running] -> [Master, Running]

shell> systemctl start mariadb

2024-03-26 16:28:03 notice : Server changed state: deb12-n2[10.139.158.178:3306]: server_up. [Down] -> [Running]
2024-03-26 16:28:03 notice : [mariadbmon] Directing standalone server 'deb12-n2' to replicate from 'deb12-n1'.
2024-03-26 16:28:03 notice : [mariadbmon] Replica connection from deb12-n2 to [10.139.158.33]:3306 created and started.
2024-03-26 16:28:03 notice : [mariadbmon] 1 server(s) redirected or rejoined the cluster.
2024-03-26 16:28:03 notice : Server changed state: deb12-n2[10.139.158.178:3306]: new_slave. [Running] -> [Slave, Running]

Which MaxScale node is currently responsible for monitoring and failover (cooperative_monitoring) can be determined as follows:

shell> maxctrl show monitor Replication-Monitor | grep -e 'Diagnostics' -e '"primary"' -e 'lock_held' | uniq
│ Monitor Diagnostics │ {                      │
│                     │     "primary": true,   │
│                     │     "lock_held": true, │

It should be ensured that everything works properly up to this point. Otherwise, the further steps make little sense.

Activating MaxScale configuration synchronisation

Configuration synchronisation requires a dedicated database user with the following privileges:

SQL> CREATE USER 'maxscale_confsync'@'%' IDENTIFIED BY 'secret';
SQL> GRANT SELECT, INSERT, UPDATE, CREATE ON `mysql`.`maxscale_config` TO maxscale_confsync@'%';

MaxScale must then be configured accordingly (on both MaxScale nodes) to activate configuration synchronisation. This is done in the global MaxScale section:

#
# /etc/maxscale.cnf
#

[maxscale]
config_sync_cluster  = Replication-Monitor
config_sync_user     = maxscale_confsync
config_sync_password = secret

Then the MaxScale nodes are restarted:

shell> systemctl restart maxscale

MaxScale configuration synchronisation can also be activated and deactivated dynamically:

shell> maxctrl show maxscale | grep config_sync
│ │     "config_sync_cluster": null,      │
│ │     "config_sync_db": "mysql",        │
│ │     "config_sync_interval": "5000ms", │
│ │     "config_sync_password": null,     │
│ │     "config_sync_timeout": "10000ms", │
│ │     "config_sync_user": null,         │

It is important to keep the correct order of the following 3 commands, as otherwise an error occurs:

shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_user='maxscale_confsync'
shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_password='secret'
shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_cluster='Replication-Monitor'

Changing MaxScale parameters

As a first test we looked at the MaxScale monitor variable monitor_interval, which in this case even differs between the two MaxScale nodes:

shell> maxctrl show monitor Replication-Monitor | grep monitor_interval
│ │     "monitor_interval": "750ms", │

shell> maxctrl show monitor Replication-Monitor | grep monitor_interval
│ │     "monitor_interval": "1000ms", │

The variable can now be set on one MaxScale node with the alter monitor command:

shell> MAXCTRL_WARNINGS=0 maxctrl alter monitor Replication-Monitor monitor_interval=500ms
OK

which, on the one hand, can be seen in the MaxScale error log:

2024-03-26 14:09:16 notice : (ConfigManager); Updating to configuration version 1

and, on the other hand, the value should be propagated to the second MaxScale node within 5 seconds (config_sync_interval), which can be verified with the command above.
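
To follow the propagation on the second MaxScale node, a simple loop such as the following can be used (a sketch, assuming the watch utility is available):

shell> watch -n 1 "maxctrl show monitor Replication-Monitor | grep monitor_interval"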

Adding a new slave and making it known to MaxScale

A new slave (deb12-n3) is first created and added to the MariaDB replication cluster by hand. The slave is then made known to one of the MaxScale nodes:

shell> maxctrl create server deb12-n3 10.139.158.39
shell> MAXCTRL_WARNINGS=0 maxctrl link monitor Replication-Monitor deb12-n3
OK
shell> MAXCTRL_WARNINGS=0 maxctrl link service WriteService deb12-n3
OK

shell> maxctrl list servers
┌──────────┬────────────────┬──────┬─────────────┬─────────────────┬────────────┬─────────────────────┐
│ Server   │ Address        │ Port │ Connections │ State           │ GTID       │ Monitor             │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n1 │ 10.139.158.33  │ 3306 │ 3           │ Slave, Running  │ 0-2-479618 │ Replication-Monitor │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n2 │ 10.139.158.178 │ 3306 │ 3           │ Master, Running │ 0-2-479618 │ Replication-Monitor │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n3 │ 10.139.158.39  │ 3306 │ 1           │ Slave, Running  │ 0-2-479618 │ Replication-Monitor │
└──────────┴────────────────┴──────┴─────────────┴─────────────────┴────────────┴─────────────────────┘

Removing an old slave and making it known to MaxScale

Before a slave can be deleted, it should be removed from the replication cluster on one of the MaxScale nodes:

shell> maxctrl destroy server deb12-n1 --force
OK

shell> maxctrl list servers
┌──────────┬────────────────┬──────┬─────────────┬─────────────────┬────────────┬─────────────────────┐
│ Server   │ Address        │ Port │ Connections │ State           │ GTID       │ Monitor             │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n2 │ 10.139.158.178 │ 3306 │ 3           │ Master, Running │ 0-2-493034 │ Replication-Monitor │
├──────────┼────────────────┼──────┼─────────────┼─────────────────┼────────────┼─────────────────────┤
│ deb12-n3 │ 10.139.158.39  │ 3306 │ 1           │ Slave, Running  │ 0-2-493032 │ Replication-Monitor │
└──────────┴────────────────┴──────┴─────────────┴─────────────────┴────────────┴─────────────────────┘

The slave can then be decommissioned.
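
A possible sequence for decommissioning the old slave could look like this (a sketch; to be adapted to your own operating procedures):

SQL> STOP SLAVE;
SQL> RESET SLAVE ALL;

shell> systemctl stop mariadb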

How is the configuration synchronised?

The configuration of the two MaxScale nodes is synchronised via the database, which I personally consider an unfortunate design decision, as a configuration change can potentially stall if the master breaks or network problems occur between the database nodes...

The configuration is stored in the table mysql.maxscale_config, which looks as follows:

CREATE TABLE `maxscale_config` (
  `cluster` varchar(256) NOT NULL,
  `version` bigint(20) NOT NULL,
  `config` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`config`)),
  `origin` varchar(254) NOT NULL,
  `nodes` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`nodes`)),
  PRIMARY KEY (`cluster`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

This table has roughly the following content:

SQL> SELECT cluster, version, CONCAT(SUBSTR(config, 1, 32), ' ... ', SUBSTR(config, -32)) AS config, origin, nodes FROM mysql.maxscale_config;
+---------------------+---------+-----------------------------------------------------------------------+------------+------------------------------------------+
| cluster             | version | config                                                                | origin     | nodes                                    |
+---------------------+---------+-----------------------------------------------------------------------+------------+------------------------------------------+
| Replication-Monitor |       2 | {"config":[{"id":"deb12-n1","typ ... ter_name":"Replication-Monitor"} | deb12-mxs1 | {"deb12-mxs1": "OK", "deb12-mxs2": "OK"} |
+---------------------+---------+-----------------------------------------------------------------------+------------+------------------------------------------+

A local copy is kept on each node for safety:

shell> cut -b-32 /var/lib/maxscale/maxscale-config.json
{"config":[{"id":"deb12-n2","typ
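
Since the local copy is plain JSON, it can also be pretty-printed for inspection, for example with jq (assuming jq is installed):

shell> jq . /var/lib/maxscale/maxscale-config.json | head
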
What happens in the event of a conflict?

See also: Error Handling in Configuration Synchronization

If the configuration is changed on two different MaxScale nodes at the same time (within config_sync_interval?), we get the following error message:

Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `PATCH monitors/Replication-Monitor`
{
  "errors": [
    {
      "detail": "Cannot start configuration change: Configuration conflict detected: version stored in the cluster (3) is not the same as the local version (2), MaxScale is out of sync."
    }
  ]
}

In the event of larger disruptions, the following command may help to identify the problem:

shell> maxctrl show maxscale | grep -A9 'Config Sync'
│ Config Sync │ {                                                           │
│             │     "checksum": "0052fe6f775168bf00778abbe37775f6f642adc7", │
│             │     "nodes": {                                              │
│             │         "deb12-mxs1": "OK",                                 │
│             │         "deb12-mxs2": "OK"                                  │
│             │     },                                                      │
│             │     "origin": "deb12-mxs2",                                 │
│             │     "status": "OK",                                         │
│             │     "version": 3                                            │
│             │ }                                                           │

Tests

All tests were also executed under load. The following tests ran in parallel:

  • insert_test.php
  • insert_test.sh
  • mixed_test.php
  • while [ true ] ; do mariadb -s --user=app --host=10.139.158.174 --port=3306 --password=secret --execute='SELECT @@hostname, COUNT(*) FROM test.test GROUP BY @@hostname' ; sleep 0.5 ; done
  • while [ true ] ; do mariadb -s --user=app --host=10.139.158.174 --port=3306 --password=secret --execute='SELECT @@hostname, COUNT(*) FROM test.test GROUP BY @@hostname FOR UPDATE' ; sleep 0.5 ; done

All tests ran flawlessly and without problems during all manipulations.

Deactivating MaxScale configuration synchronisation again

The following command must be executed on both MaxScale nodes; this terminates configuration synchronisation again:

shell> MAXCTRL_WARNINGS=0 maxctrl alter maxscale config_sync_cluster=''
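
If desired, the leftovers can then be cleaned up as well (a sketch; only do this once configuration synchronisation has been deactivated on ALL MaxScale nodes):

SQL> DROP TABLE mysql.maxscale_config;
SQL> DROP USER 'maxscale_confsync'@'%';

shell> rm /var/lib/maxscale/maxscale-config.json
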
References/Sources
Taxonomy upgrade extras: maxscale, configuration, cluster, load balancer

Sharding with MariaDB MaxScale

Shinguz - Tue, 2024-03-19 17:02
Table of contents
Overview

This feature should more or less work with MariaDB MaxScale 6.x.y, 22.08.x, 23.02.x, 23.08.x and 24.02.x. We tested it with the latest MaxScale version 23.08.5, as we encountered problems with an older version (MXS-5026).

shell> maxscale --version
MaxScale 23.08.5

We used MariaDB 10.11 as the database backend (shards).

Less than approx. 2% of all MariaDB installations known to us are what we technically understand by multi-tenant systems (each customer in its own database, also called a schema).

This MariaDB MaxScale feature is therefore used relatively rarely and there is an increased risk of encountering bugs that no-one has come across before!

This feature is called SchemaRouter in MariaDB MaxScale and is still declared as beta quality (MXS-5025):

maxctrl> show module schemarouter
┌─────────────┬────────────────────────────────────────────────┐
│ Module      │ schemarouter                                   │
├─────────────┼────────────────────────────────────────────────┤
│ Type        │ Router                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Version     │ V1.0.0                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Maturity    │ Beta                                           │
├─────────────┼────────────────────────────────────────────────┤
│ Description │ A database sharding router for simple sharding │
├─────────────┼────────────────────────────────────────────────┤
│ ...

The target topology should look like this: Each customer (client, tenant) is located in its own database (= schema). The databases are distributed across several MariaDB instances (shards). So that the application can access the database transparently, a pair of MaxScale load balancers is connected in front of it, which knows where the customer is located and forwards the traffic to the shard accordingly. To ensure that the MaxScale load balancers are designed for high availability, a virtual IP (VIP) is connected upstream, e.g. using Keepalived. If this is still too simple for you, you can design each individual shard as a master/slave or Galera cluster construct...
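
For completeness, a minimal Keepalived sketch for such a VIP could look like this (interface, virtual_router_id and VIP address are assumptions and must be adapted to your environment):

#
# /etc/keepalived/keepalived.conf (sketch)
#

vrrp_instance VI_MAXSCALE {
    state MASTER              # BACKUP on the second MaxScale node
    interface eth0            # assumption: adapt to your NIC
    virtual_router_id 42      # assumption: must be unique per VRRP segment
    priority 100              # use a lower priority on the BACKUP node
    advert_int 1
    virtual_ipaddress {
        10.139.158.200/24     # assumption: the VIP the application connects to
    }
}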


Preparation of the shards (MariaDB database instances)

The first problem we had with this PoC was with the test database. By deleting the test database on all shards, the problem disappeared. Alternatively, you can run mariadb-secure-installation, which you should do on production systems anyway, or you can use the MaxScale configuration parameters ignore_tables or ignore_tables_regex to allow the same tables in different shards (MXS-5027).

See also: MaxScale Router Parameters.
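
A minimal sketch of how this could look in the SchemaRouter service section (the regular expression is an assumption for ignoring the test schema on all shards):

[Sharded-Service]
type = service
router = schemarouter
ignore_tables_regex = ^test\.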

Create test data

So that we have something to play with, we have created test data:

-- On shard 1: 2 customers

SQL> CREATE DATABASE customer_0010;
SQL> CREATE TABLE customer_0010.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0010.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0010.address VALUES (1, 'Customer 10 GmbH');
SQL> INSERT INTO customer_0010.sales VALUES (1, 'Apples', 5, 1.2, 6), (2, 'Pears', 2, 0.9, 1.8), (3, 'Bread', 1, 2.5, 2.5);

SQL> CREATE DATABASE customer_0011;
SQL> CREATE TABLE customer_0011.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0011.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0011.address VALUES (1, 'Customer 11 SE');
SQL> INSERT INTO customer_0011.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salad', 5, 1.2, 6);

-- On shard 2: 3 customers

SQL> CREATE DATABASE customer_0020;
SQL> CREATE TABLE customer_0020.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0020.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0020.address VALUES (1, 'Customer 20 AG');
SQL> INSERT INTO customer_0020.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salad', 5, 1.2, 6);

SQL> CREATE DATABASE customer_0021;
SQL> CREATE TABLE customer_0021.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0021.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0021.address VALUES (1, 'Customer 21 GmbH');
SQL> INSERT INTO customer_0021.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salad', 5, 1.2, 6);

SQL> CREATE DATABASE customer_0022;
SQL> CREATE TABLE customer_0022.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0022.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0022.address VALUES (1, 'Customer 22 Gebr.');
SQL> INSERT INTO customer_0022.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salad', 5, 1.2, 6);

-- On shard 3: 1 customer

SQL> CREATE DATABASE customer_0030;
SQL> CREATE TABLE customer_0030.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0030.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0030.address VALUES (1, 'Customer 30 GmbH');
SQL> INSERT INTO customer_0030.sales VALUES (1, 'Pickles', 2, 2.2, 4.4), (2, 'Salad', 1, 3.1, 3.1), (3, 'Pudding', 5, 2.2, 11.0), (4, 'Asparagus', 12, .3, 3.6);

Create roles and users

Since in a sharded system, in contrast to a Galera cluster for example, the individual database instances do not know anything about each other and do not communicate with each other, we have to create the roles and users or accounts individually on EACH shard.

MariaDB MaxScale needs a user for the SchemaRouter service and the monitor (on each shard).

As the name suggests, the monitor user is responsible for monitoring and the SchemaRouter service user is responsible for collecting the user account information from the sharding backends and forwarding the queries to the correct shard.

Since a redundant system typically works with at least two MaxScale routers and we wanted to prevent the privileges of the accounts from diverging, we work with roles for both the MaxScale users and the application users.

MaxScale Monitor User

SQL> CREATE ROLE maxscale_monitor_role;
SQL> GRANT SELECT ON mysql.user TO 'maxscale_monitor_role';
SQL> GRANT REPLICATION CLIENT ON *.* TO 'maxscale_monitor_role';
SQL> GRANT SLAVE MONITOR ON *.* TO 'maxscale_monitor_role';
SQL> GRANT FILE ON *.* TO 'maxscale_monitor_role';
SQL> GRANT CONNECTION ADMIN ON *.* TO 'maxscale_monitor_role';

SQL> SHOW GRANTS FOR maxscale_monitor_role;
+-----------------------------------------------------------------------------------------------+
| Grants for maxscale_monitor_role                                                               |
+-----------------------------------------------------------------------------------------------+
| GRANT FILE, BINLOG MONITOR, CONNECTION ADMIN, SLAVE MONITOR ON *.* TO `maxscale_monitor_role` |
| GRANT SELECT ON `mysql`.`user` TO `maxscale_monitor_role`                                     |
+-----------------------------------------------------------------------------------------------+

SQL> CREATE USER maxscale_monitor@'10.139.158.210' IDENTIFIED BY 'secret';
SQL> CREATE USER maxscale_monitor@'10.139.158.211' IDENTIFIED BY 'secret';
SQL> GRANT maxscale_monitor_role TO maxscale_monitor@'10.139.158.210';
SQL> GRANT maxscale_monitor_role TO maxscale_monitor@'10.139.158.211';
SQL> SET DEFAULT ROLE maxscale_monitor_role FOR maxscale_monitor@'10.139.158.210';
SQL> SET DEFAULT ROLE maxscale_monitor_role FOR maxscale_monitor@'10.139.158.211';

SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'maxscale_monitor%';
+-----------------------+----------------+---------+-----------------------+
| User                  | Host           | is_role | default_role          |
+-----------------------+----------------+---------+-----------------------+
| maxscale_monitor_role |                | Y       |                       |
| maxscale_monitor      | 10.139.158.210 | N       | maxscale_monitor_role |
| maxscale_monitor      | 10.139.158.211 | N       | maxscale_monitor_role |
+-----------------------+----------------+---------+-----------------------+

SQL> SHOW GRANTS FOR maxscale_monitor@'10.139.158.211';
+------------------------------------------------------------------------------------------------------------------------------+
| Grants for maxscale_monitor@10.139.158.211                                                                                    |
+------------------------------------------------------------------------------------------------------------------------------+
| GRANT `maxscale_monitor_role` TO `maxscale_monitor`@`10.139.158.211`                                                          |
| GRANT USAGE ON *.* TO `maxscale_monitor`@`10.139.158.211` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7' |
| SET DEFAULT ROLE `maxscale_monitor_role` FOR `maxscale_monitor`@`10.139.158.211`                                              |
+------------------------------------------------------------------------------------------------------------------------------+

MaxScale Admin User

SQL> CREATE ROLE maxscale_admin_role;
SQL> GRANT SHOW DATABASES ON *.* TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.user TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.db TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.tables_priv TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.columns_priv TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.proxies_priv TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.roles_mapping TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.procs_priv TO 'maxscale_admin_role';

SQL> SHOW GRANTS FOR maxscale_admin_role;
+------------------------------------------------------------------+
| Grants for maxscale_admin_role                                   |
+------------------------------------------------------------------+
| GRANT USAGE ON *.* TO `maxscale_admin_role`                      |
| GRANT SELECT ON `mysql`.`user` TO `maxscale_admin_role`          |
| GRANT SELECT ON `mysql`.`roles_mapping` TO `maxscale_admin_role` |
| GRANT SELECT ON `mysql`.`tables_priv` TO `maxscale_admin_role`   |
| GRANT SELECT ON `mysql`.`procs_priv` TO `maxscale_admin_role`    |
| GRANT SELECT ON `mysql`.`db` TO `maxscale_admin_role`            |
| GRANT SELECT ON `mysql`.`columns_priv` TO `maxscale_admin_role`  |
| GRANT SELECT ON `mysql`.`proxies_priv` TO `maxscale_admin_role`  |
+------------------------------------------------------------------+

SQL> CREATE USER maxscale_admin@'10.139.158.210' IDENTIFIED BY 'secret';
SQL> CREATE USER maxscale_admin@'10.139.158.211' IDENTIFIED BY 'secret';
SQL> GRANT maxscale_admin_role TO maxscale_admin@'10.139.158.210';
SQL> GRANT maxscale_admin_role TO maxscale_admin@'10.139.158.211';
SQL> SET DEFAULT ROLE maxscale_admin_role FOR maxscale_admin@'10.139.158.210';
SQL> SET DEFAULT ROLE maxscale_admin_role FOR maxscale_admin@'10.139.158.211';

SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'maxscale_admin%';
+---------------------+----------------+---------+---------------------+
| User                | Host           | is_role | default_role        |
+---------------------+----------------+---------+---------------------+
| maxscale_admin_role |                | Y       |                     |
| maxscale_admin      | 10.139.158.210 | N       | maxscale_admin_role |
| maxscale_admin      | 10.139.158.211 | N       | maxscale_admin_role |
+---------------------+----------------+---------+---------------------+

SQL> SHOW GRANTS FOR maxscale_admin@'10.139.158.211';
+----------------------------------------------------------------------------------------------------------------------------+
| Grants for maxscale_admin@10.139.158.211                                                                                    |
+----------------------------------------------------------------------------------------------------------------------------+
| GRANT `maxscale_admin_role` TO `maxscale_admin`@`10.139.158.211`                                                            |
| GRANT USAGE ON *.* TO `maxscale_admin`@`10.139.158.211` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7' |
| SET DEFAULT ROLE `maxscale_admin_role` FOR `maxscale_admin`@`10.139.158.211`                                                |
+----------------------------------------------------------------------------------------------------------------------------+

See also: SchemaRouter Configuration

Create application role and accounts

The application also requires a user, which we create as follows on every shard:

SQL> CREATE ROLE app_role;
SQL> GRANT SELECT, INSERT, UPDATE, DELETE ON `customer_%`.* TO 'app_role';
SQL> GRANT SHOW DATABASES ON *.* TO 'app_role';
SQL> GRANT CREATE, DROP, ALTER ON *.* TO 'app_role';   -- For creating new tenant databases

SQL> SHOW GRANTS FOR app_role;
+----------------------------------------------------------------------+
| Grants for app_role                                                  |
+----------------------------------------------------------------------+
| GRANT SHOW DATABASES ON *.* TO `app_role`                            |
| GRANT SELECT, INSERT, UPDATE, DELETE ON `customer_%`.* TO `app_role` |
+----------------------------------------------------------------------+

SQL> CREATE USER app@'10.139.158.%' IDENTIFIED BY 'secret';
SQL> GRANT app_role TO app@'10.139.158.%';
SQL> SET DEFAULT ROLE app_role FOR app@'10.139.158.%';

SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'app%';
+----------+--------------+---------+--------------+
| User     | Host         | is_role | default_role |
+----------+--------------+---------+--------------+
| app_role |              | Y       |              |
| app      | 10.139.158.% | N       | app_role     |
+----------+--------------+---------+--------------+

SQL> SHOW GRANTS FOR app@'10.139.158.%';
+---------------------------------------------------------------------------------------------------------------+
| Grants for app@10.139.158.%                                                                                   |
+---------------------------------------------------------------------------------------------------------------+
| GRANT `app_role` TO `app`@`10.139.158.%`                                                                      |
| GRANT USAGE ON *.* TO `app`@`10.139.158.%` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7' |
| SET DEFAULT ROLE `app_role` FOR `app`@`10.139.158.%`                                                          |
+---------------------------------------------------------------------------------------------------------------+

Proxy protocol

Load balancers and proxies replace the client's IP address with their own IP address. On the one hand, this means that the database can no longer see where the client originally came from; on the other hand, access authorisations can no longer be assigned per user and IP, as the load balancer's IP is always checked.

These two problems can be solved using the proxy protocol.

To do this, both the database and the load balancer, in this case MaxScale, must have the proxy protocol activated.

On the database side, the proxy protocol is activated as follows:

#
# my.cnf
#

[mariadbd]
proxy_protocol_networks = ::1, 10.139.158.0/24, localhost

and on the MaxScale side with:

#
# /etc/maxscale.cnf
#

[shard]
type = server
proxy_protocol = true

You can check the two settings with:

SQL> SHOW GLOBAL VARIABLES LIKE 'proxy%';
+-------------------------+---------------------------------+
| Variable_name           | Value                           |
+-------------------------+---------------------------------+
| proxy_protocol_networks | ::1, 10.139.158.0/24, localhost |
+-------------------------+---------------------------------+

shell> maxctrl show server shard1 | grep proxy
│ │     "proxy_protocol": true, │

Sources:


MaxScale SchemaRouter configuration

Next, we prepare the MaxScale configuration for sharding. The file recommended by MariaDB is /etc/maxscale.cnf. Whether it makes more sense to create a separate configuration file under /etc/maxscale.cnf.d/ or even to configure the entire MaxScale dynamically (/var/lib/maxscale/maxscale.cnf.d/*.cnf) remains to be seen in the long term. See also warnings below. The configuration file for this sharding PoC looks like this:

#
# /etc/maxscale.cnf
#

[maxscale]
threads = auto
admin_gui = false

[shard1]
type = server
address = 10.139.158.1
port = 3363
proxy_protocol = true

[shard2]
type = server
address = 10.139.158.1
port = 3364
proxy_protocol = true

[shard3]
type = server
address = 10.139.158.1
port = 3365
proxy_protocol = true

[Sharding-Monitor]
type = monitor
module = galeramon
servers = shard1,shard2,shard3
user = maxscale_monitor
password = secret
monitor_interval = 1s

[Sharded-Service-Listener]
type = listener
service = Sharded-Service
protocol = MariaDBClient
port = 3306

[Sharded-Service]
type = service
router = schemarouter
servers = shard1,shard2,shard3
user = maxscale_admin
password = secret
auth_all_servers = true

Note: Recommendation of the MaxScale developer: "One workaround might be to actually use galeramon to monitor the nodes instead of mariadbmon."

Starting and stopping the MaxScale Load Balancer

MaxScale is started and stopped as usual via SystemD:

shell> systemctl restart maxscale
shell> systemctl status maxscale
● maxscale.service - MariaDB MaxScale Database Proxy
     Loaded: loaded (/lib/systemd/system/maxscale.service; enabled; vendor preset: enabled)
    Drop-In: /run/systemd/system/service.d
             └─zzz-lxc-service.conf
     Active: active (running) since Tue 2024-02-27 09:52:57 UTC; 39s ago
    Process: 187 ExecStart=/usr/bin/maxscale (code=exited, status=0/SUCCESS)
   Main PID: 188 (maxscale)
      Tasks: 10 (limit: 18663)
     Memory: 4.6M
        CPU: 150ms
     CGroup: /system.slice/maxscale.service
             └─188 /usr/bin/maxscale

systemd[1]: Starting MariaDB MaxScale Database Proxy...
maxscale[188]: Module 'galeramon' loaded from '/usr/lib/x86_64-linux-gnu/maxscale/libgaleramon.so'.
maxscale[188]: Module 'schemarouter' loaded from '/usr/lib/x86_64-linux-gnu/maxscale/libschemarouter.so'.
maxscale[188]: Using up to 2.3GiB of memory for query classifier cache
systemd[1]: Started MariaDB MaxScale Database Proxy.

If there were errors or warnings, you can see them in the MaxScale error log:

shell> grep -v notice /var/log/maxscale/maxscale.log
2024-02-13 16:47:22   MariaDB MaxScale is shut down.
----------------------------------------------------
MariaDB MaxScale  /var/log/maxscale/maxscale.log  Tue Feb 13 16:47:22 2024
----------------------------------------------------------------------------
2024-02-27 09:52:56   warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. File is for module 'mariadbmon'. Current module is 'galeramon'.
2024-02-27 09:52:56   warning: [galeramon] Invalid 'wsrep_local_index' on server 'shard1': 18446744073709551615

Application tests
Simple application tests

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='show databases'
+--------------------+
| Database           |
+--------------------+
| customer_0010      |
| customer_0011      |
| customer_0020      |
| customer_0021      |
| customer_0022      |
| customer_0030      |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+

New command show shards

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0030 --execute='show shards' | grep customer_00.* | sort | column -t
customer_0010.address  shard1
customer_0010.sales    shard1
customer_0010.         shard1
customer_0011.address  shard1
customer_0011.sales    shard1
customer_0011.         shard1
customer_0020.address  shard2
customer_0020.sales    shard2
customer_0020.         shard2
customer_0021.address  shard2
customer_0021.sales    shard2
customer_0021.         shard2
customer_0022.address  shard2
customer_0022.sales    shard2
customer_0022.         shard2
customer_0030.address  shard3
customer_0030.sales    shard3
customer_0030.         shard3

New databases are not displayed immediately, but only when the cached data has been updated (refresh_interval (300s / 5 min)).
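
If you do not want to wait for the cache to expire, the database map cache can also be invalidated manually, as described in more detail below:

shell> maxctrl call command schemarouter invalidate Sharded-Service
OK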

See also: Custom SQL commands

More general test

As a reminder:

Shard   Port   Customer          State
#1      3363   customer_001<n>   Running
#2      3364   customer_002<n>   Running
#3      3365   customer_003<n>   Running
#4      3366   customer_004<n>   Running

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3363 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --database=customer_0010 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3363 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --database=customer_0020 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3364 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='use customer_0020; SELECT @@port'
+--------+
| @@port |
+--------+
|   3364 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0010 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3363 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0020 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3364 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0030 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3365 |
+--------+

Less simple (backup) test

shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0010 > /tmp/customer_0010.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0020 > /tmp/customer_0020.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0030 > /tmp/customer_0030.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0011 > /tmp/customer_0011.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0021 > /tmp/customer_0021.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0030 > /tmp/customer_0030.sql
shell> echo $?
0

shell> ll /tmp/customer_00*sql
-rw-rw-r-- 1 oli oli 2738 Mar 18 12:07 /tmp/customer_0010.sql
-rw-rw-r-- 1 oli oli 2904 Mar 18 12:08 /tmp/customer_0011.sql
-rw-rw-r-- 1 oli oli 2712 Mar 18 12:08 /tmp/customer_0020.sql
-rw-rw-r-- 1 oli oli 2906 Mar 18 12:08 /tmp/customer_0021.sql
-rw-rw-r-- 1 oli oli 2964 Mar 18 12:08 /tmp/customer_0030.sql

shell> tail -n 1 /tmp/customer_*.sql
==> /tmp/customer_0010.sql <==
-- Dump completed on 2024-02-13 14:39:21

==> /tmp/customer_0011.sql <==
-- Dump completed on 2024-02-13 14:39:35

==> /tmp/customer_0020.sql <==
-- Dump completed on 2024-02-13 14:40:15

==> /tmp/customer_0021.sql <==
-- Dump completed on 2024-02-13 14:40:42

==> /tmp/customer_0030.sql <==
-- Dump completed on 2024-02-13 14:40:52

shell> cat /tmp/customer_00*sql | grep -A1 -i insert
INSERT INTO `address` VALUES (1,'Customer 10 GmbH');
--
INSERT INTO `sales` VALUES (1,'Apples',5,1.20,6.00),
--
INSERT INTO `address` VALUES (1,'Customer 11 SE');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 20 AG');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 21 GmbH');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 30 GmbH');
--
INSERT INTO `sales` VALUES (1,'Pickles',2,2.20,4.40),

In MaxScale 23.08.4 there was a pretty bad bug: A return value of 0 but no data in the backup!!! See also the tickets: MXS-4966: mariadb-dump gets an error dumping schemas and MXS-4947: Tables in information_schema are treated as a normal tables. Symptoms of the bug look like this:

Error: Couldn't read status information for table address ()
Error: Couldn't read status information for table sales ()

We therefore strongly recommend upgrading to MaxScale 23.08.5!

More complex application tests

We have created a somewhat more complex test (./sharding_test.php) that processes the following queries:

SET NAMES utf8mb4
SHOW DATABASES
use customer_
START TRANSACTION;
SELECT MIN(id) AS first, MAX(id) AS last FROM `sales`
INSERT INTO sales (id, product, sales, amount, total_amount) VALUES (%d, '%s', %f, %f, %f)
UPDATE sales SET product = 'Prepare to delete' WHERE id = %d
DELETE FROM sales WHERE id = %d
COMMIT

This test ran flawlessly. The corresponding control query:

SQL> SELECT * FROM customer_0021.sales WHERE id >= (SELECT MAX(id) - 10 FROM customer_0021.sales);

Various load scenarios can also be tested with db_bench or the Acronis perfkit. For more information, see here.

Cross-shard tests

In any case, you might come up with the idea of running cross-shard queries. This will NOT work, which should not really be surprising: firstly, because it is not easy to implement, and secondly, because it is documented here:

"Note: As the sharding solution in MaxScale is relatively simple, cross-database queries between two or more shards are not supported."

Source: Simple Sharding with Two Servers

and

"USE db1 is routed to the server with db1. If the database is divided to multiple servers, only one server will get the command."

Source: SchemaRouter.

Here is a test with UNION:

SQL> use customer_0030
Database changed
SQL> SELECT * FROM customer_0020.sales UNION SELECT * FROM customer_0030.sales;
ERROR 1146 (42S02): Table 'customer_0020.sales' doesn't exist

And here is the proof to the contrary:

SQL> use customer_0020
Database changed
SQL> SELECT * FROM customer_0020.sales UNION SELECT * FROM customer_0030.sales;
ERROR 1146 (42S02): Table 'customer_0030.sales' doesn't exist

And here is the test with JOIN:

SQL> use customer_0020
SQL> SELECT * FROM customer_0020.sales a JOIN customer_0030.sales b ON a.id = b.id WHERE a.sales > 1;
ERROR 1146 (42S02): Table 'customer_0030.sales' doesn't exist

SQL> use customer_0030
SQL> SELECT * FROM customer_0020.sales a JOIN customer_0030.sales b ON a.id = b.id WHERE a.sales > 1;
ERROR 1146 (42S02): Table 'customer_0020.sales' doesn't exist

Operation of a MaxScale sharding system

In this chapter we discuss some points that can be useful for the operation of a MariaDB MaxScale sharding system.

Do-on-all-shards

Since it can always happen that O/S or database operations have to be executed on all shards, it would certainly make sense to create a script that executes the same command on all shards in turn:

shell> ./do-on-all-shards.sh --sql='SHOW DATABASES'

A script of this type should greatly reduce the error rate during operation. Operations such as the re-sharding of a tenant, as described below, are also sensibly scripted and executed centrally (MXS-5029).
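
A minimal sketch of such a script could look like this (the shard list and the credentials are hard-coded assumptions from this PoC):

#!/bin/bash
#
# do-on-all-shards.sh - run the same SQL statement on all shards in turn.
# Minimal sketch: adapt shard list, user and password to your environment.
#

SHARDS='10.139.158.1:3364 10.139.158.1:3365 10.139.158.1:3366'
SQL="${1#--sql=}"               # strip the --sql= prefix from the first argument

for shard in ${SHARDS} ; do
  host="${shard%:*}"
  port="${shard#*:}"
  echo "--- ${host}:${port}"
  mariadb --user=app --password=secret --host=${host} --port=${port} \
          --execute="${SQL}" || echo "ERROR on ${host}:${port}"
done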

Invalidating the database map cache

The invalidate command can be used to invalidate the database map cache of the MariaDB MaxScale SchemaRouter. This allows us to quickly update the cache after adding or removing tenants.

shell> maxctrl call command schemarouter invalidate Sharded-Service
OK

In contrast to the invalidate command, which updates the entries after the next refresh_interval, the clear command deletes the entries and a remap is executed immediately.
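
The clear variant is invoked analogously:

shell> maxctrl call command schemarouter clear Sharded-Service
OK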

If you want to invalidate the database map cache remotely with a REST API call, you can do this as follows:

shell> curl -i -X POST -u api_admin:secret http://10.139.158.211:8989/v1/maxscale/modules/schemarouter/clear?Sharded-Service
HTTP/1.1 204 No Content
Connection: close
Date: Mon, 18 Mar 24 11:49:58 GMT
X-Frame-Options: Deny
X-XSS-Protection: 1
Referrer-Policy: same-origin
Cache-Control: no-cache

Sources:


How to change SchemaRouter variables dynamically?

The refresh_interval specifies the lifetime of the entries in the SchemaRouter database map cache. The default value is 300 s (5 min). Refresh interval is therefore, in my opinion, an unfortunate term, as it does not define the interval between two mappings but the lifetime of the cache entries (lifetime? timeout?). As soon as an entry has been deleted, a new refresh of the "database map" is triggered on each shard. The command currently looks like this:

SELECT LOWER(t.table_schema), LOWER(t.table_name) FROM information_schema.tables t
UNION ALL
SELECT LOWER(s.schema_name), '' FROM information_schema.schemata s

It looks like a simple connect is enough to trigger the refresh of the database map.
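
For example, a trivial connection like the following should already trigger the refresh (an assumption based on the observation above):

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='SELECT 1'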

The current value for refresh_interval can be queried as follows:

shell> maxctrl show service Sharded-Service | grep refresh_interval | awk -F'│' '{ print $3 }'
 "refresh_interval": "300000ms",

The following command helps to change the value dynamically:

shell> MAXCTRL_WARNINGS=0 maxctrl alter service Sharded-Service refresh_interval=10s
OK

The value should not be set too small, as all other connections are stopped during the mapping process.

Sources:


Adding and removing a tenant

Adding a new tenant to a shard is not a major problem:

SQL> CREATE DATABASE customer_0029;
SQL> use customer_0029
SQL> CREATE TABLE address LIKE customer_template.address;
SQL> CREATE TABLE sales LIKE customer_template.sales;

shell> maxctrl call command schemarouter invalidate Sharded-Service
OK

Removing a tenant from a shard, on the other hand, is somewhat more complicated and must be done in consultation with the application:

SQL> DROP DATABASE customer_0011;

shell> ./sharding_test.php
.....ERROR: Table 'customer_0011.sales' doesn't exist...ERROR: Unknown database 'customer_0011'.ERROR: Unknown database 'customer_0011'......ERROR: Unknown database 'customer_0011'...

shell> maxctrl call command schemarouter clear Sharded-Service
OK

At least I have not come up with a cleverer variant yet. See also Moving a tenant below.

Moving a tenant

The combination of adding and removing would then be moving a tenant from one shard to another shard, also known as re-sharding. This also requires a concerted action to be planned together with the application.

If this is not possible, at least the time that the application receives errors can be reduced... The following procedure can be used to move a tenant from shard 2 to shard 3:

SQL> use customer_0020; LOCK TABLES address READ, sales READ;
-- On shard 2; at best, the application will be blocked!

shell> mariadb-dump --user=app --password=secret --host=10.139.158.1 --port=3364 --single-transaction --skip-add-locks --databases customer_0020 | mariadb --user=app --password=secret --host=10.139.158.1 --port=3365
# Copy tenant 20 from shard 2 to shard 3.

SQL> DROP DATABASE customer_0020;
-- Deleting tenant 20 does not work!
ERROR 1192 (HY000): Can't execute the given command because you have active locked tables or an active transaction

SQL> UNLOCK TABLES; DROP DATABASE customer_0020;
-- This is how to delete tenant 20.

shell> maxctrl call command schemarouter clear Sharded-Service
# Update the MaxScale database map. Do it quickly!!!

Until the database map is refreshed, the following errors may occur:

error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.address' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.sales' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); Duplicate tables found, closing session.

And on the application side too:

ERROR: Error: duplicate tables found on two different shards
Adding or removing a shard

Moving a tenant from one shard to another shard is a small re-sharding operation. It becomes somewhat more complex if you want to add new shards or remove old shards. Subsequently (after the addition or before the removal), a large re-sharding would then take place. The first step is to add a shard to the cluster:

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID          │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 0           │ Running │ 0-3363-26014  │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 0           │ Running │ 0-3364-240612 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 0           │ Running │ 0-3365-289873 │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The prepared shard is made known to MaxScale:

shell> maxctrl create server shard4 10.139.158.1 3366
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 1           │ Running │ 0-3363-23676 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-52321 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-39751 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 0           │ Down    │              │                  │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

The new shard is then linked to the MaxScale Monitor and the service:

shell> MAXCTRL_WARNINGS=0 maxctrl link monitor Sharding-Monitor shard4
OK
shell> MAXCTRL_WARNINGS=0 maxctrl link service Sharded-Service shard4
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 1           │ Running │ 0-3363-24961 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-56215 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-45177 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-32    │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Whether this second step is also absolutely necessary was not investigated.

You can follow the entire process in the MariaDB MaxScale error log:

warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. Servers described in the journal are different from the ones configured on the current monitor.
warning: Saving runtime modifications to 'Sharding-Monitor' in '/var/lib/maxscale/maxscale.cnf.d/Sharding-Monitor.cnf'. The modified values will override the values found in the static configuration files.
notice : shard4 sent version string '10.11.7-MariaDB-log'. Detected type: MariaDB, version: 10.11.7.
notice : Server 'shard4' charset: latin1_swedish_ci
notice : Server changed state: shard4[10.139.158.1:3366]: server_up. [Down] -> [Running]
warning: Saving runtime modifications to 'Sharded-Service' in '/var/lib/maxscale/maxscale.cnf.d/Sharded-Service.cnf'. The modified values will override the values found in the static configuration files.
notice : Added 'shard4' to 'Sharded-Service'

What we must not forget here is to also equip the new shard with the proxy protocol:

shell> maxctrl show server shard4 | grep proxy
│ │     "proxy_protocol": false, │

shell> MAXCTRL_WARNINGS=0 maxctrl alter server shard4 proxy_protocol=true
OK

And now new tenants can be added to the new shard or old tenants can be moved to the new shard... In our setup, we want to move all tenants from shard 1 to shard 4 and also create a new tenant customer_0040 on shard 4. The individual steps required for this are listed above.

Once shard 1 has been emptied, it can be dismantled:

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 1           │ Running │ 0-3363-25916 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-62887 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-54035 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-2247  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

A shard is deleted with the destroy server command. Before this works, however, the shard must be removed from the monitor and the service:

shell> MAXCTRL_WARNINGS=0 maxctrl unlink service Sharded-Service shard1
OK
shell> MAXCTRL_WARNINGS=0 maxctrl unlink monitor Sharding-Monitor shard1
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 0           │ Running │              │                  │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-64394 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-56072 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-3267  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Once the shard has been removed from the monitor and the service, it can then be deleted:

shell> maxctrl destroy server shard1
Warning: Object 'shard1' is defined in a static configuration file and cannot be permanently deleted. If MaxScale is restarted, the object will appear again.

To hide these warnings, run:

 export MAXCTRL_WARNINGS=0

OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-65018 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-56886 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-3648  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

And you can follow the changes in the MaxScale Error Log:

notice : Removed 'shard1' from 'Sharded-Service'
warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. Servers described in the journal are different from the ones configured on the current monitor.
notice : Destroyed server 'shard1' at 10.139.158.1:3363

Important: I was informed that with destroy server --force the unlink service and unlink monitor commands are automatically executed by MaxScale.

Source:


Customising the configuration files

During the shard operations described above we received some warnings:

Warning: Object 'shard1' is defined in a static configuration file and cannot be permanently deleted. If MaxScale is restarted, the object will appear again.

and

Warning: Saving runtime modifications to 'Sharding-Monitor' in '/var/lib/maxscale/maxscale.cnf.d/Sharding-Monitor.cnf'. The modified values will override the values found in the static configuration files.

The corresponding configuration files are automatically created by MaxScale when dynamic system changes are made:

shell> ll /var/lib/maxscale/maxscale.cnf.d/ /etc/maxscale.cnf
-rw-r--r-- 1 root root 612 Feb 13 14:23 /etc/maxscale.cnf

/var/lib/maxscale/maxscale.cnf.d/:
total 12
-rw------- 1 maxscale maxscale 187 Feb 13 16:08 Sharding-Monitor.cnf
-rw------- 1 maxscale maxscale 150 Feb 13 16:07 Sharded-Service.cnf
-rw------- 1 maxscale maxscale  52 Feb 13 15:46 shard4.cnf

shell> cat /var/lib/maxscale/maxscale.cnf.d/*
[Sharded-Service]
debug=true
refresh_interval=10000ms
auth_all_servers=true
log_debug=true
password=secret
router=schemarouter
type=service
user=maxscale_admin
targets=shard2,shard3,shard4

[Sharding-Monitor]
module=galeramon
monitor_interval=1000ms
password=secret
servers=shard2,shard3,shard4
type=monitor
user=maxscale_monitor

[shard4]
address=10.139.158.1
port=3366
type=server

The configuration files still need to be improved accordingly. You should generally consider whether you should not configure everything dynamically via commands in a highly dynamic system...

Maintenance work on the shard

If a shard is to be taken offline for maintenance work, shard2 in this example, this can be done as follows:

shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 1 │ Running │ 0-3364-69817 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 1 │ Running │ 0-3365-63166 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 1 │ Running │ 0-3366-6902 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘ shell> maxctrl set server shard2 drain OK shell> maxctrl set server shard2 maintenance OK shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬──────────────────────┬───────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 0 │ Maintenance, Running │ 0-3364-240612 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 0 │ Running │ 0-3365-289873 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 0 │ Running │ 0-3366-119848 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴──────────────────────┴───────────────┴──────────────────┘

At this point, maintenance work can be carried out on the machine or the database...

Afterwards, BOTH statuses must be cleared again if both have been set (MXS-5028):

shell> maxctrl clear server shard2 maintenance OK shell> maxctrl clear server shard2 drain OK shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 0 │ Running │ 0-3364-240612 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 0 │ Running │ 0-3365-289873 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 0 │ Running │ 0-3366-119848 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The difference between drain and maintenance is the following: with drain, no new connections to the shard are allowed, but existing connections are allowed to continue until they are closed. With maintenance, existing connections are terminated immediately and forcibly.
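To watch the connections on a draining shard die down before starting the maintenance window, you can simply poll the server list, for example like this:

shell> watch -n 1 'maxctrl list servers'

As soon as the Connections column for the drained shard shows 0, the maintenance work can begin.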

Observation of a MariaDB MaxScale sharding system

The MaxScale CLI client maxctrl can be used to query the status of the MariaDB MaxScale load balancer. There are numerous commands for this, mainly list and show:

shell> maxctrl show module schemarouter | head -n 12 ┌─────────────┬────────────────────────────────────────────────┐ │ Module │ schemarouter │ ├─────────────┼────────────────────────────────────────────────┤ │ Type │ Router │ ├─────────────┼────────────────────────────────────────────────┤ │ Version │ V1.0.0 │ ├─────────────┼────────────────────────────────────────────────┤ │ Maturity │ Beta │ ├─────────────┼────────────────────────────────────────────────┤ │ Description │ A database sharding router for simple sharding │ ├─────────────┼────────────────────────────────────────────────┤ │ Parameters │ ... │ shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 4 │ Running │ 0-3364-290859 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 4 │ Running │ 0-3365-322671 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 4 │ Running │ 0-3366-140018 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The information in the Connections column is confusing, because in this case we actually only have 1, 1 and 2 connections open against this sharding system, not 4 on each shard.

However, if you look at the situation on the respective shard with SHOW PROCESSLIST, you can see that MaxScale establishes a connection to EACH shard for every incoming client connection. So the display above is technically correct, just not what you might expect:

SQL> SHOW PROCESSLIST; +--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+ | 123 | root | localhost | customer_0021 | Query | 0 | starting | show processlist | 0.000 | | 68107 | maxscale_monitor | 10.139.158.211:35418 | NULL | Sleep | 0 | | NULL | 0.000 | | 113372 | app | 10.139.158.1:47548 | NULL | Sleep | 47 | | NULL | 0.000 | | 113538 | app | 10.139.158.1:49058 | NULL | Sleep | 41 | | NULL | 0.000 | | 113662 | app | 10.139.158.1:47072 | NULL | Sleep | 37 | | NULL | 0.000 | | 114789 | app | 10.139.158.1:39574 | customer_0022 | Query | 0 | Updating | UPDATE sales SET product = 'Prepare to delete' WHERE id = 15622 | 0.000 | +--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+

This does not scale for large systems with hundreds or thousands of clients! The MariaDB thread pool feature might help in such a case.

According to the MaxScale developer, this is expected behaviour... (MXS-4977)
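If the many mostly idle backend connections become a problem on the shard side, the MariaDB thread pool mentioned above can be enabled in the database configuration. A sketch; the values must be tuned to your workload (thread_pool_size defaults to the number of CPUs and is shown here only as an example):

#
# my.cnf
#

[mariadbd]
thread_handling  = pool-of-threads
thread_pool_size = 8

Note that this only reduces the cost of idle connections on the database side; the number of connections MaxScale opens per client session remains unchanged.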

shell> maxctrl list services ┌─────────────────┬──────────────┬─────────────┬───────────────────┬────────────────────────┐ │ Service │ Router │ Connections │ Total Connections │ Targets │ ├─────────────────┼──────────────┼─────────────┼───────────────────┼────────────────────────┤ │ Sharded-Service │ schemarouter │ 4 │ 82776 │ shard2, shard3, shard4 │ └─────────────────┴──────────────┴─────────────┴───────────────────┴────────────────────────┘ shell> maxctrl list listeners ┌──────────────────────────┬──────┬──────┬─────────┬─────────────────┐ │ Name │ Port │ Host │ State │ Service │ ├──────────────────────────┼──────┼──────┼─────────┼─────────────────┤ │ Sharded-Service-Listener │ 3306 │ :: │ Running │ Sharded-Service │ └──────────────────────────┴──────┴──────┴─────────┴─────────────────┘ shell> maxctrl list monitors ┌──────────────────┬─────────┬────────────────────────┐ │ Monitor │ State │ Servers │ ├──────────────────┼─────────┼────────────────────────┤ │ Sharding-Monitor │ Running │ shard2, shard3, shard4 │ └──────────────────┴─────────┴────────────────────────┘ shell> maxctrl show server shard2 | head -n 20 ┌─────────────────────┬──────────────────────────────────────────────┐ │ Server │ shard2 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Source │ /etc/maxscale.cnf │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Address │ 10.139.158.1 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Port │ 3364 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ State │ Running │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Version │ 10.11.7-MariaDB-log │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Uptime │ 178960 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Last Event │ server_down │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Triggered At │ Sun, 04 Feb 2024 07:37:17 GMT │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Services │ Sharded-Service │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Monitors │ Sharding-Monitor │ ├─────────────────────┼──────────────────────────────────────────────┤ ... 
├─────────────────────┼──────────────────────────────────────────────┤ │ Current Connections │ 5 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Total Connections │ 27 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Max Connections │ 5 │ shell> maxctrl show service Sharded-Service ┌─────────────────────┬──────────────────────────────────────────────────────┐ │ Service │ Sharded-Service │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Source │ /var/lib/maxscale/maxscale.cnf.d/Sharded-Service.cnf │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Router │ schemarouter │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ State │ Started │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Started At │ 3/18/2024, 1:52:30 PM │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Users Loaded At │ 3/18/2024, 1:52:30 PM │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Current Connections │ 4 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Total Connections │ 84590 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Max Connections │ 5 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Cluster │ │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Servers │ shard2 │ │ │ shard3 │ │ │ shard4 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Services │ │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Filters │ │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Parameters │ { │ │ │ "auth_all_servers": true, │ │ │ "connection_keepalive": "300000ms", │ │ │ "debug": true, │ │ │ "disable_sescmd_history": false, │ │ │ "enable_root_user": false, │ │ │ "force_connection_keepalive": false, │ │ │ "idle_session_pool_time": "-1ms", │ │ │ "ignore_tables": [], │ │ │ "ignore_tables_regex": null, │ │ │ "localhost_match_wildcard_host": true, │ │ │ "log_auth_warnings": true, │ │ │ "log_debug": true, │ │ │ "log_info": false, │ │ │ "log_notice": false, │ │ │ "log_warning": false, │ │ │ "max_connections": 0, │ │ │ "max_sescmd_history": 50, │ │ │ "max_staleness": "150000ms", │ │ │ "multiplex_timeout": "60000ms", │ │ │ "net_write_timeout": "0ms", │ │ │ "password": "*****", │ │ │ "prune_sescmd_history": true, │ │ │ "rank": "primary", │ │ │ "refresh_databases": false, │ │ │ "refresh_interval": "10000ms", │ │ │ "retain_last_statements": -1, │ │ │ "router": "schemarouter", │ │ │ "session_trace": false, │ │ │ "strip_db_esc": true, │ │ │ "type": "service", │ │ │ "user": "maxscale_admin", │ │ │ "user_accounts_file": null, │ │ │ "user_accounts_file_usage": "add_when_load_ok", │ │ │ "version_string": null, │ │ │ "wait_timeout": "0ms" │ │ │ } │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Router Diagnostics │ { │ │ │ "average_session": 0.028822357131634554, │ │ │ "longest_sescmd_chain": 4, │ │ │ "longest_session": 50, │ │ │ "queries": 761134, │ │ │ "sescmd_percentage": 44.44342257736483, │ │ │ "shard_map_hits": 84356, │ │ │ "shard_map_misses": 5, │ │ │ "shard_map_stale": 229, │ │ │ "shard_map_updates": 216, │ │ │ "shortest_session": 0, │ │ │ "times_sescmd_limit_exceeded": 0 │ │ │ } │ 
└─────────────────────┴──────────────────────────────────────────────────────┘

See also MaxScale SchemaRouter Router diagnostics.

shell> maxctrl show monitor Sharding-Monitor ┌─────────────────────┬──────────────────────────────────────────────────────────┐ │ Monitor │ Sharding-Monitor │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Source │ /etc/maxscale.cnf │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Module │ galeramon │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ State │ Running │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Servers │ shard1 │ │ │ shard2 │ │ │ shard3 │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Parameters │ { │ │ │ "available_when_donor": false, │ │ │ "backend_connect_attempts": 1, │ │ │ "backend_connect_timeout": "3000ms", │ │ │ "backend_read_timeout": "3000ms", │ │ │ "backend_write_timeout": "3000ms", │ │ │ "disable_master_failback": false, │ │ │ "disable_master_role_setting": false, │ │ │ "disk_space_check_interval": "0ms", │ │ │ "disk_space_threshold": null, │ │ │ "events": "all,master_down,master_up,...,new_donor", │ │ │ "journal_max_age": "28800000ms", │ │ │ "module": "galeramon", │ │ │ "monitor_interval": "1000ms", │ │ │ "password": "*****", │ │ │ "root_node_as_master": false, │ │ │ "script": null, │ │ │ "script_timeout": "90000ms", │ │ │ "set_donor_nodes": false, │ │ │ "type": "monitor", │ │ │ "use_priority": false, │ │ │ "user": "maxscale_monitor" │ │ │ } │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Monitor Diagnostics │ { │ │ │ "disable_master_failback": false, │ │ │ "disable_master_role_setting": false, │ │ │ "root_node_as_master": false, │ │ │ "server_info": [ │ │ │ { │ │ │ "gtid_binlog_pos": "0-3363-26014", │ │ │ "gtid_current_pos": "0-3363-26014", │ │ │ "master_id": 0, │ │ │ "name": "shard1", │ │ │ "read_only": false, │ │ │ "server_id": 3363 │ │ │ }, │ │ │ { │ │ │ "gtid_binlog_pos": "0-3364-240612", │ │ │ "gtid_current_pos": "0-3364-240612", │ │ │ "master_id": 0, │ │ │ "name": "shard2", │ │ │ "read_only": false, │ │ │ "server_id": 3364 │ │ │ }, │ │ │ { │ │ │ "gtid_binlog_pos": "0-3365-289873", │ │ │ "gtid_current_pos": "0-3365-289873", │ │ │ "master_id": 0, │ │ │ "name": "shard3", │ │ │ "read_only": false, │ │ │ "server_id": 3365 │ │ │ } │ │ │ ], │ │ │ "set_donor_nodes": false, │ │ │ "use_priority": false │ │ │ } │ └─────────────────────┴──────────────────────────────────────────────────────────┘ shell> maxctrl list sessions; ┌───────┬──────┬──────────────┬───────────────────────┬───────┬─────────────────┬────────┬──────────────┐ │ Id │ User │ Host │ Connected │ Idle │ Service │ Memory │ I/O-Activity │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 87240 │ app │ 10.139.158.1 │ 3/18/2024, 2:33:54 PM │ 0 │ Sharded-Service │ 68644 │ 33 │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 72654 │ app │ 10.139.158.1 │ 3/18/2024, 2:25:27 PM │ 506.3 │ Sharded-Service │ 199328 │ 0 │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 72364 │ app │ 10.139.158.1 │ 3/18/2024, 2:25:18 PM │ 516 │ Sharded-Service │ 199328 │ 0 │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 72530 │ app │ 10.139.158.1 │ 3/18/2024, 2:25:23 PM │ 510.5 │ Sharded-Service │ 199328 │ 0 │ 
└───────┴──────┴──────────────┴───────────────────────┴───────┴─────────────────┴────────┴──────────────┘ shell> maxctrl show session 26 ┌───────────────────────┬───────────────────────────────────────┐ │ Id │ 26 │ ├───────────────────────┼───────────────────────────────────────┤ │ Service │ Sharded-Service │ ├───────────────────────┼───────────────────────────────────────┤ │ State │ Session started │ ├───────────────────────┼───────────────────────────────────────┤ │ User │ app │ ├───────────────────────┼───────────────────────────────────────┤ │ Host │ 10.139.158.1 │ ├───────────────────────┼───────────────────────────────────────┤ │ Port │ 42854 │ ├───────────────────────┼───────────────────────────────────────┤ │ Database │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Connected │ 2/4/2024, 9:31:12 AM │ ├───────────────────────┼───────────────────────────────────────┤ │ Idle │ 610.4 │ ├───────────────────────┼───────────────────────────────────────┤ │ Parameters │ { │ │ │ "log_error": false, │ │ │ "log_info": false, │ │ │ "log_notice": false, │ │ │ "log_warning": false │ │ │ } │ ├───────────────────────┼───────────────────────────────────────┤ │ Client TLS Cipher │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Connection attributes │ { │ │ │ "_client_name": "libmariadb", │ │ │ "_client_version": "3.3.8", │ │ │ "_os": "Linux", │ │ │ "_pid": "251037", │ │ │ "_platform": "x86_64", │ │ │ "_server_host": "10.139.158.211", │ │ │ "program_name": "mysql" │ │ │ } │ ├───────────────────────┼───────────────────────────────────────┤ │ Connections │ shard1 │ │ │ shard2 │ │ │ shard3 │ ├───────────────────────┼───────────────────────────────────────┤ │ Connection IDs │ 666 │ │ │ 139 │ │ │ 138 │ ├───────────────────────┼───────────────────────────────────────┤ │ Queries │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Log │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Memory │ { │ │ │ "connection_buffers": { │ │ │ "backends": { │ │ │ "shard1": { │ │ │ "misc": 678, │ │ │ "readq": 65536, │ │ │ "total": 66214, │ │ │ "writeq": 0 │ │ │ }, │ │ │ "shard2": { │ │ │ "misc": 662, │ │ │ "readq": 0, │ │ │ "total": 662, │ │ │ "writeq": 0 │ │ │ }, │ │ │ "shard3": { │ │ │ "misc": 678, │ │ │ "readq": 65536, │ │ │ "total": 66214, │ │ │ "writeq": 0 │ │ │ } │ │ │ }, │ │ │ "client": { │ │ │ "misc": 654, │ │ │ "readq": 65536, │ │ │ "total": 66190, │ │ │ "writeq": 0 │ │ │ }, │ │ │ "total": 199280 │ │ │ }, │ │ │ "exec_metadata": 0, │ │ │ "last_queries": 0, │ │ │ "sescmd_history": 48, │ │ │ "total": 199328, │ │ │ "variables": 0 │ │ │ } │ ├───────────────────────┼───────────────────────────────────────┤ │ I/O Activity │ 0 │ └───────────────────────┴───────────────────────────────────────┘ shell> maxctrl show listener Sharded-Service-Listener ┌────────────┬───────────────────────────────────────────┐ │ Name │ Sharded-Service-Listener │ ├────────────┼───────────────────────────────────────────┤ │ Source │ /etc/maxscale.cnf │ ├────────────┼───────────────────────────────────────────┤ │ Service │ Sharded-Service │ ├────────────┼───────────────────────────────────────────┤ │ Parameters │ { │ │ │ "MariaDBProtocol": { │ │ │ "allow_replication": true │ │ │ }, │ │ │ "address": "::", │ │ │ "authenticator": null, │ │ │ "authenticator_options": null, │ │ │ "connection_init_sql_file": null, │ │ │ "connection_metadata": [ │ │ │ "character_set_client=auto", │ │ │ "character_set_connection=auto", │ │ │ "character_set_results=auto", │ │ │ 
"max_allowed_packet=auto", │ │ │ "system_time_zone=auto", │ │ │ "time_zone=auto", │ │ │ "tx_isolation=auto" │ │ │ ], │ │ │ "port": 3306, │ │ │ "protocol": "MariaDBProtocol", │ │ │ "proxy_protocol_networks": null, │ │ │ "service": "Sharded-Service", │ │ │ "socket": null, │ │ │ "sql_mode": "default", │ │ │ "ssl": false, │ │ │ "ssl_ca": null, │ │ │ "ssl_cert": null, │ │ │ "ssl_cert_verify_depth": 9, │ │ │ "ssl_cipher": null, │ │ │ "ssl_crl": null, │ │ │ "ssl_key": null, │ │ │ "ssl_verify_peer_certificate": false, │ │ │ "ssl_verify_peer_host": false, │ │ │ "ssl_version": "MAX", │ │ │ "type": "listener", │ │ │ "user_mapping_file": null │ │ │ } │ └────────────┴───────────────────────────────────────────┘ shell> maxctrl show module schemarouter ┌─────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ Module │ schemarouter │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Type │ Router │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Version │ V1.0.0 │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Maturity │ Beta │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Description │ A database sharding router for simple sharding │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Parameters │ [ │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Enable debug mode", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "debug", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": [], │ │ │ "description": "List of tables to ignore when checking for duplicates", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "ignore_tables", │ │ │ "type": "stringlist" │ │ │ }, │ │ │ { │ │ │ "description": "Regex of tables to ignore when checking for duplicates", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "ignore_tables_regex", │ │ │ "type": "regex" │ │ │ }, │ │ │ { │ │ │ "default_value": "150000ms", │ │ │ "description": "Maximum allowed staleness of a database map entry before clients block and wait for an update", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "max_staleness", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Refresh database mapping information", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "refresh_databases", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "300000ms", │ │ │ "description": "How often to refresh the database mapping information", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "refresh_interval", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Retrieve users from all backend servers instead of only one", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "auth_all_servers", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "300000ms", │ │ │ "description": "How ofted idle connections are pinged", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": 
"connection_keepalive", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "deprecated": true, │ │ │ "description": "Alias for 'wait_timeout'", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "connection_timeout", │ │ │ "type": "duration" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Disable session command history", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "disable_sescmd_history", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Allow the root user to connect to this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "enable_root_user", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Ping connections unconditionally", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "force_connection_keepalive", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "-1ms", │ │ │ "description": "Put connections into pool after session has been idle for this long", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "idle_session_pool_time", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "description": "Match localhost to wildcard host", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "localhost_match_wildcard_host", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "description": "Log a warning when client authentication fails", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_auth_warnings", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log debug messages for this service (debug builds only)", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_debug", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log info messages for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_info", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log notice messages for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_notice", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log warning messages for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_warning", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": 0, │ │ │ "description": "Maximum number of connections", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "max_connections", │ │ │ "type": "count" │ │ │ }, │ │ │ { │ │ │ "default_value": 50, │ │ │ "description": "Session command history size", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "max_sescmd_history", │ │ │ "type": "count" │ │ │ }, │ │ │ { │ │ │ "default_value": "60000ms", │ │ │ "description": "How long a session can wait for a connection to become available", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "multiplex_timeout", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": "0ms", │ │ │ "description": "Network write timeout", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "net_write_timeout", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "description": "Password for the user used to retrieve database users", │ │ │ "mandatory": true, │ │ │ 
"modifiable": true, │ │ │ "name": "password", │ │ │ "type": "password" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "description": "Prune old session command history if the limit is exceeded", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "prune_sescmd_history", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "primary", │ │ │ "description": "Service rank", │ │ │ "enum_values": [ │ │ │ "primary", │ │ │ "secondary" │ │ │ ], │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "rank", │ │ │ "type": "enum" │ │ │ }, │ │ │ { │ │ │ "default_value": -1, │ │ │ "description": "Number of statements kept in memory", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "retain_last_statements", │ │ │ "type": "int" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Enable session tracing for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "session_trace", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "deprecated": true, │ │ │ "description": "Track session state using server responses", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "session_track_trx_state", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "deprecated": true, │ │ │ "description": "Strip escape characters from database names", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "strip_db_esc", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "description": "Username used to retrieve database users", │ │ │ "mandatory": true, │ │ │ "modifiable": true, │ │ │ "name": "user", │ │ │ "type": "string" │ │ │ }, │ │ │ { │ │ │ "description": "Load additional users from a file", │ │ │ "mandatory": false, │ │ │ "modifiable": false, │ │ │ "name": "user_accounts_file", │ │ │ "type": "path" │ │ │ }, │ │ │ { │ │ │ "default_value": "add_when_load_ok", │ │ │ "description": "When and how the user accounts file is used", │ │ │ "enum_values": [ │ │ │ "add_when_load_ok", │ │ │ "file_only_always" │ │ │ ], │ │ │ "mandatory": false, │ │ │ "modifiable": false, │ │ │ "name": "user_accounts_file_usage", │ │ │ "type": "enum" │ │ │ }, │ │ │ { │ │ │ "description": "Custom version string to use", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "version_string", │ │ │ "type": "string" │ │ │ }, │ │ │ { │ │ │ "default_value": "0ms", │ │ │ "description": "Connection idle timeout", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "wait_timeout", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ } │ │ │ ] │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Commands │ [ │ │ │ { │ │ │ "attributes": { │ │ │ "arg_max": 1, │ │ │ "arg_min": 1, │ │ │ "description": "Clear schemarouter shard map cache", │ │ │ "method": "POST", │ │ │ "parameters": [ │ │ │ { │ │ │ "description": "The schemarouter service", │ │ │ "required": true, │ │ │ "type": "SERVICE" │ │ │ } │ │ │ ] │ │ │ }, │ │ │ "id": "clear", │ │ │ "links": { │ │ │ "self": "http://127.0.0.1:8989/v1/modules/schemarouter/clear/" │ │ │ }, │ │ │ "type": "module_command" │ │ │ }, │ │ │ { │ │ │ "attributes": { │ │ │ "arg_max": 1, │ │ │ "arg_min": 1, │ │ │ "description": "Invalidate schemarouter shard map cache", │ │ │ "method": "POST", │ │ │ "parameters": [ │ │ │ { │ │ │ "description": "The schemarouter service", │ │ │ "required": true, │ │ │ "type": "SERVICE" │ │ │ } │ │ │ ] │ │ │ }, │ │ │ "id": "invalidate", │ │ │ 
"links": { │ │ │ "self": "http://127.0.0.1:8989/v1/modules/schemarouter/invalidate/" │ │ │ }, │ │ │ "type": "module_command" │ │ │ } │ │ │ ] │ └─────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ shell> maxctrl show commands schemarouter ┌────────────┬────────────┬──────────────────────────┐ │ Command │ Parameters │ Descriptions │ ├────────────┼────────────┼──────────────────────────┤ │ clear │ SERVICE │ The schemarouter service │ ├────────────┼────────────┼──────────────────────────┤ │ invalidate │ SERVICE │ The schemarouter service │ └────────────┴────────────┴──────────────────────────┘ shell> maxctrl show dbusers Sharded-Service ┌───────────────────────┬────────────────┬───────────────────────┬───────┬───────┬────────┬───────┬──────┐ │ User │ Host │ Plugin │ TLS │ Super │ Global │ Proxy │ Role │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ PUBLIC │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ app │ 10.139.158.% │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ app_role │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ mariadb.sys │ localhost │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_admin │ 10.139.158.210 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_admin │ 10.139.158.211 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_admin_role │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_monitor │ 10.139.158.210 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_monitor │ 10.139.158.211 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_monitor_role │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ mysql │ localhost │ mysql_native_password │ false │ true │ true │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ root │ localhost │ mysql_native_password │ false │ true │ true │ false │ │ └───────────────────────┴────────────────┴───────────────────────┴───────┴───────┴────────┴───────┴──────┘ shell> maxctrl show commands mariadbmon ┌───────────────────────────┬─────────────────────────────┬───────────────────────────────────────────────────────────────────────────────┐ │ Command │ Parameters │ Descriptions │ 
├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ switchover │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ switchover-force │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-switchover │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ failover │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-failover │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ rejoin │ MONITOR, SERVER │ Monitor name, Joining server │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-rejoin │ MONITOR, SERVER │ Monitor name, Joining server │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ reset-replication │ MONITOR, [SERVER] │ Monitor name, Primary server (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-reset-replication │ MONITOR, [SERVER] │ Monitor name, Primary server (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ release-locks │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-release-locks │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ fetch-cmd-result │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ cancel-cmd │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-add-node │ MONITOR, STRING, STRING │ Monitor name, Hostname/IP of node to add to ColumnStore cluster, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-remove-node │ MONITOR, STRING, STRING │ Monitor name, Hostname/IP of node to remove from ColumnStore cluster, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ cs-get-status │ MONITOR │ Monitor name │ 
├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-get-status │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-start-cluster │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-stop-cluster │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-set-readonly │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-set-readwrite │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-rebuild-server │ MONITOR, SERVER, [SERVER] │ Monitor name, Target server, Source server (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-create-backup │ MONITOR, SERVER, STRING │ Monitor name, Source server, Backup name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-restore-from-backup │ MONITOR, SERVER, STRING │ Monitor name, Target server, Backup name │ └───────────────────────────┴─────────────────────────────┴───────────────────────────────────────────────────────────────────────────────┘
Taxonomy upgrade extras: sharding, maxscale, schemarouter, load balancer, multi-tenant

Sharding with MariaDB MaxScale

Shinguz - Tue, 2024-03-19 17:02
Overview

This feature should more or less work with MariaDB MaxScale 6.x.y, 22.08.x, 23.02.x, 23.08.x and 24.02.x. We have tested it with the latest MaxScale version 23.08.5, as we encountered problems with an older version (MXS-5026).

shell> maxscale --version
MaxScale 23.08.5

We used MariaDB 10.11 as the database backend (shards).

Less than approximately 2% of all MariaDB installations known to us are what we technically understand as multi-tenant systems: each customer in its own database (also called a schema).

This MariaDB MaxScale feature is therefore used relatively rarely, and there is an increased risk of encountering bugs that no one has come across before!

This feature is called SchemaRouter in MariaDB MaxScale and is still declared as beta quality (MXS-5025):

maxctrl> show module schemarouter
┌─────────────┬────────────────────────────────────────────────┐
│ Module      │ schemarouter                                   │
├─────────────┼────────────────────────────────────────────────┤
│ Type        │ Router                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Version     │ V1.0.0                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Maturity    │ Beta                                           │
├─────────────┼────────────────────────────────────────────────┤
│ Description │ A database sharding router for simple sharding │
├─────────────┼────────────────────────────────────────────────┤
│ ...

The target topology should look like this: each customer (client, tenant) is located in their own database (= schema). The databases are distributed across several MariaDB instances (shards). So that the application can access the database transparently, a pair of MaxScale load balancers is placed in front of the shards; it knows where each customer is located and forwards the traffic to the corresponding shard. To make the MaxScale load balancers highly available, a virtual IP (VIP) is placed upstream, e.g. using Keepalived. If this is still too simple for you, you can design each individual shard as a master/slave or Galera cluster construct...
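For the VIP, a minimal Keepalived sketch could look as follows. Interface name, router ID, priority and the VIP itself are example values and must be adapted to your environment:

#
# /etc/keepalived/keepalived.conf
#

vrrp_instance VI_MAXSCALE {
    state MASTER                 # BACKUP on the second MaxScale node
    interface eth0
    virtual_router_id 42
    priority 100                 # use a lower priority on the BACKUP node
    advert_int 1
    virtual_ipaddress {
        10.139.158.200/24        # example VIP in front of both MaxScale nodes
    }
}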


Preparation of the shards (MariaDB database instances)

The first problem we encountered in this PoC was with the test database. Deleting the test database on all shards made the problem disappear. Alternatively, you can run mariadb-secure-installation, which you should do on production systems anyway, or you can use the MaxScale configuration parameters ignore_tables or ignore_tables_regex to allow tables of the same name in different shards (MXS-5027).

See also: MaxScale Router Parameters.
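Concretely, the cleanup could for example look like this. The first variant is run on EACH shard; the regex in the second variant is only a sketch, and the exact matching semantics should be checked against the MaxScale documentation:

SQL> DROP DATABASE IF EXISTS test;

or, in the service section of the MaxScale configuration:

[Sharded-Service]
ignore_tables_regex = ^test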

Create test data

So that we have something to play with, we created the following test data:

-- On shard 1: 2 customers

SQL> CREATE DATABASE customer_0010;
SQL> CREATE TABLE customer_0010.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0010.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0010.address VALUES (1, 'Customer 10 GmbH');
SQL> INSERT INTO customer_0010.sales VALUES (1, 'Apples', 5, 1.2, 6), (2, 'Pears', 2, 0.9, 1.8), (3, 'Bread', 1, 2.5, 2.5);

SQL> CREATE DATABASE customer_0011;
SQL> CREATE TABLE customer_0011.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0011.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0011.address VALUES (1, 'Customer 11 SE');
SQL> INSERT INTO customer_0011.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salad', 5, 1.2, 6);

-- On shard 2: 3 customers

SQL> CREATE DATABASE customer_0020;
SQL> CREATE TABLE customer_0020.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0020.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0020.address VALUES (1, 'Customer 20 AG');
SQL> INSERT INTO customer_0020.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salad', 5, 1.2, 6);

SQL> CREATE DATABASE customer_0021;
SQL> CREATE TABLE customer_0021.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0021.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0021.address VALUES (1, 'Customer 21 GmbH');
SQL> INSERT INTO customer_0021.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salad', 5, 1.2, 6);

SQL> CREATE DATABASE customer_0022;
SQL> CREATE TABLE customer_0022.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0022.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0022.address VALUES (1, 'Customer 22 Gebr.');
SQL> INSERT INTO customer_0022.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salad', 5, 1.2, 6);

-- On shard 3: 1 customer

SQL> CREATE DATABASE customer_0030;
SQL> CREATE TABLE customer_0030.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0030.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0030.address VALUES (1, 'Customer 30 GmbH');
SQL> INSERT INTO customer_0030.sales VALUES (1, 'Pickles', 2, 2.2, 4.4), (2, 'Salad', 1, 3.1, 3.1), (3, 'Pudding', 5, 2.2, 11.0), (4, 'Asparagus', 12, .3, 3.6);

Create roles and users

Since, in a sharded system, in contrast to a Galera cluster for example, the individual database instances know nothing about each other and do not communicate with each other, we have to create the roles and users (accounts) individually on EACH shard.

MariaDB MaxScale needs a user for the SchemaRouter service and the monitor (on each shard).

As the name suggests, the monitor user is responsible for monitoring and the SchemaRouter service user is responsible for collecting the user account information from the sharding backends and forwarding the queries to the correct shard.

Since a redundant system typically works with at least two MaxScale routers and we wanted to prevent the privileges of the accounts from diverging, we work with roles for both the MaxScale users and the application users.
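Since the following statements have to be executed on EVERY shard, it can be worthwhile to script this. A simple sketch, assuming the statements are collected in a (hypothetical) file create_accounts.sql:

shell> for port in 3363 3364 3365; do
           mariadb --user=root --host=10.139.158.1 --port=${port} < create_accounts.sql
       done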

MaxScale Monitor User

SQL> CREATE ROLE maxscale_monitor_role;
SQL> GRANT SELECT ON mysql.user TO 'maxscale_monitor_role';
SQL> GRANT REPLICATION CLIENT ON *.* TO 'maxscale_monitor_role';
SQL> GRANT SLAVE MONITOR ON *.* TO 'maxscale_monitor_role';
SQL> GRANT FILE ON *.* TO 'maxscale_monitor_role';
SQL> GRANT CONNECTION ADMIN ON *.* TO 'maxscale_monitor_role';

SQL> SHOW GRANTS FOR maxscale_monitor_role;
+-----------------------------------------------------------------------------------------------+
| Grants for maxscale_monitor_role                                                               |
+-----------------------------------------------------------------------------------------------+
| GRANT FILE, BINLOG MONITOR, CONNECTION ADMIN, SLAVE MONITOR ON *.* TO `maxscale_monitor_role`  |
| GRANT SELECT ON `mysql`.`user` TO `maxscale_monitor_role`                                      |
+-----------------------------------------------------------------------------------------------+

SQL> CREATE USER maxscale_monitor@'10.139.158.210' IDENTIFIED BY 'secret';
SQL> CREATE USER maxscale_monitor@'10.139.158.211' IDENTIFIED BY 'secret';
SQL> GRANT maxscale_monitor_role TO maxscale_monitor@'10.139.158.210';
SQL> GRANT maxscale_monitor_role TO maxscale_monitor@'10.139.158.211';
SQL> SET DEFAULT ROLE maxscale_monitor_role FOR maxscale_monitor@'10.139.158.210';
SQL> SET DEFAULT ROLE maxscale_monitor_role FOR maxscale_monitor@'10.139.158.211';

SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'maxscale_monitor%';
+-----------------------+----------------+---------+-----------------------+
| User                  | Host           | is_role | default_role          |
+-----------------------+----------------+---------+-----------------------+
| maxscale_monitor_role |                | Y       |                       |
| maxscale_monitor      | 10.139.158.210 | N       | maxscale_monitor_role |
| maxscale_monitor      | 10.139.158.211 | N       | maxscale_monitor_role |
+-----------------------+----------------+---------+-----------------------+

SQL> SHOW GRANTS FOR maxscale_monitor@'10.139.158.211';
+------------------------------------------------------------------------------------------------------------------------------+
| Grants for maxscale_monitor@10.139.158.211                                                                                    |
+------------------------------------------------------------------------------------------------------------------------------+
| GRANT `maxscale_monitor_role` TO `maxscale_monitor`@`10.139.158.211`                                                          |
| GRANT USAGE ON *.* TO `maxscale_monitor`@`10.139.158.211` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7'  |
| SET DEFAULT ROLE `maxscale_monitor_role` FOR `maxscale_monitor`@`10.139.158.211`                                              |
+------------------------------------------------------------------------------------------------------------------------------+

MaxScale Admin User

SQL> CREATE ROLE maxscale_admin_role;
SQL> GRANT SHOW DATABASES ON *.* TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.user TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.db TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.tables_priv TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.columns_priv TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.proxies_priv TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.roles_mapping TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.procs_priv TO 'maxscale_admin_role';

SQL> SHOW GRANTS FOR maxscale_admin_role;
+------------------------------------------------------------------+
| Grants for maxscale_admin_role                                   |
+------------------------------------------------------------------+
| GRANT USAGE ON *.* TO `maxscale_admin_role`                      |
| GRANT SELECT ON `mysql`.`user` TO `maxscale_admin_role`          |
| GRANT SELECT ON `mysql`.`roles_mapping` TO `maxscale_admin_role` |
| GRANT SELECT ON `mysql`.`tables_priv` TO `maxscale_admin_role`   |
| GRANT SELECT ON `mysql`.`procs_priv` TO `maxscale_admin_role`    |
| GRANT SELECT ON `mysql`.`db` TO `maxscale_admin_role`            |
| GRANT SELECT ON `mysql`.`columns_priv` TO `maxscale_admin_role`  |
| GRANT SELECT ON `mysql`.`proxies_priv` TO `maxscale_admin_role`  |
+------------------------------------------------------------------+

SQL> CREATE USER maxscale_admin@'10.139.158.210' IDENTIFIED BY 'secret';
SQL> CREATE USER maxscale_admin@'10.139.158.211' IDENTIFIED BY 'secret';
SQL> GRANT maxscale_admin_role TO maxscale_admin@'10.139.158.210';
SQL> GRANT maxscale_admin_role TO maxscale_admin@'10.139.158.211';
SQL> SET DEFAULT ROLE maxscale_admin_role FOR maxscale_admin@'10.139.158.210';
SQL> SET DEFAULT ROLE maxscale_admin_role FOR maxscale_admin@'10.139.158.211';

SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'maxscale_admin%';
+---------------------+----------------+---------+---------------------+
| User                | Host           | is_role | default_role        |
+---------------------+----------------+---------+---------------------+
| maxscale_admin_role |                | Y       |                     |
| maxscale_admin      | 10.139.158.210 | N       | maxscale_admin_role |
| maxscale_admin      | 10.139.158.211 | N       | maxscale_admin_role |
+---------------------+----------------+---------+---------------------+

SQL> SHOW GRANTS FOR maxscale_admin@'10.139.158.211';
+----------------------------------------------------------------------------------------------------------------------------+
| Grants for maxscale_admin@10.139.158.211                                                                                    |
+----------------------------------------------------------------------------------------------------------------------------+
| GRANT `maxscale_admin_role` TO `maxscale_admin`@`10.139.158.211`                                                            |
| GRANT USAGE ON *.* TO `maxscale_admin`@`10.139.158.211` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7'  |
| SET DEFAULT ROLE `maxscale_admin_role` FOR `maxscale_admin`@`10.139.158.211`                                                |
+----------------------------------------------------------------------------------------------------------------------------+

See also: SchemaRouter Configuration

Create application role and accounts

The application also requires a user, which we again create on EACH shard as follows:

SQL> CREATE ROLE app_role;
SQL> GRANT SELECT, INSERT, UPDATE, DELETE ON `customer_%`.* TO 'app_role';
SQL> GRANT SHOW DATABASES ON *.* TO 'app_role';
SQL> GRANT CREATE, DROP, ALTER ON *.* TO 'app_role';   -- For creating new tenant databases

SQL> SHOW GRANTS FOR app_role;
+----------------------------------------------------------------------+
| Grants for app_role                                                  |
+----------------------------------------------------------------------+
| GRANT SHOW DATABASES ON *.* TO `app_role`                            |
| GRANT SELECT, INSERT, UPDATE, DELETE ON `customer_%`.* TO `app_role` |
+----------------------------------------------------------------------+

SQL> CREATE USER app@'10.139.158.%' IDENTIFIED BY 'secret';
SQL> GRANT app_role TO app@'10.139.158.%';
SQL> SET DEFAULT ROLE app_role FOR app@'10.139.158.%';

SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'app%';
+----------+--------------+---------+--------------+
| User     | Host         | is_role | default_role |
+----------+--------------+---------+--------------+
| app_role |              | Y       |              |
| app      | 10.139.158.% | N       | app_role     |
+----------+--------------+---------+--------------+

SQL> SHOW GRANTS FOR app@'10.139.158.%';
+---------------------------------------------------------------------------------------------------------------+
| Grants for app@10.139.158.%                                                                                    |
+---------------------------------------------------------------------------------------------------------------+
| GRANT `app_role` TO `app`@`10.139.158.%`                                                                       |
| GRANT USAGE ON *.* TO `app`@`10.139.158.%` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7'  |
| SET DEFAULT ROLE `app_role` FOR `app`@`10.139.158.%`                                                           |
+---------------------------------------------------------------------------------------------------------------+
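To quickly verify directly on a shard that the default role is actually active for the application account, something like the following can be run (here against shard1; the query should report app_role):

shell> mariadb --user=app --password=secret --host=10.139.158.1 --port=3363 --execute='SELECT CURRENT_USER(), CURRENT_ROLE()'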
Proxy protocol

Load balancers and proxies have the property that they replace the IP addresses of the clients with their own IP address. On the one hand, this means that you can no longer see in the database where the client originally came from; on the other hand, you can no longer tie access privileges to specific user/IP combinations, as it is always the IP of the load balancer that is checked.

These two problems can be solved using the proxy protocol.

To do this, both the database and the load balancer, in this case MaxScale, must have the proxy protocol activated.

On the database side, the proxy protocol is activated as follows:

#
# my.cnf
#

[mariadbd]
proxy_protocol_networks = ::1, 10.139.158.0/24, localhost

and on the MaxScale side with:

#
# /etc/maxscale.cnf
#

[shard]
type           = server
proxy_protocol = true

You can check the two settings with:

SQL> SHOW GLOBAL VARIABLES LIKE 'proxy%';
+-------------------------+---------------------------------+
| Variable_name           | Value                           |
+-------------------------+---------------------------------+
| proxy_protocol_networks | ::1, 10.139.158.0/24, localhost |
+-------------------------+---------------------------------+

shell> maxctrl show server shard1 | grep proxy
│              │ "proxy_protocol": true, │
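Whether the client IP actually arrives unchanged at the database can then be checked via the processlist, for example like this (the host shown naturally depends on where your client connects from):

SQL> SELECT user, host FROM information_schema.PROCESSLIST WHERE user = 'app';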

MaxScale SchemaRouter configuration

Next, we prepare the MaxScale configuration for sharding. The file recommended by MariaDB is /etc/maxscale.cnf. Whether it makes more sense to create a separate configuration file under /etc/maxscale.cnf.d/, or even to configure all of MaxScale dynamically (/var/lib/maxscale/maxscale.cnf.d/*.cnf), remains to be seen in the long term. See also the warnings below. The configuration file for this sharding PoC looks like this:

#
# /etc/maxscale.cnf
#

[maxscale]
threads   = auto
admin_gui = false

[shard1]
type           = server
address        = 10.139.158.1
port           = 3363
proxy_protocol = true

[shard2]
type           = server
address        = 10.139.158.1
port           = 3364
proxy_protocol = true

[shard3]
type           = server
address        = 10.139.158.1
port           = 3365
proxy_protocol = true

[sharding monitor]
type             = monitor
module           = galeramon
servers          = shard1,shard2,shard3
user             = maxscale_monitor
password         = secret
monitor_interval = 1s

[Sharded-Service-Listener]
type     = listener
service  = Sharded-Service
protocol = MariaDBClient
port     = 3306

[Sharded-Service]
type             = service
router           = schemarouter
servers          = shard1,shard2,shard3
user             = maxscale_admin
password         = secret
auth_all_servers = true

Note: Recommendation of the MaxScale developer: "One workaround might be to actually use galeramon to monitor the nodes instead of mariadbmon."

Starting and stopping the MaxScale Load Balancer

MaxScale is started and stopped as usual via SystemD:

shell> systemctl restart maxscale
shell> systemctl status maxscale
● maxscale.service - MariaDB MaxScale Database Proxy
     Loaded: loaded (/lib/systemd/system/maxscale.service; enabled; vendor preset: enabled)
    Drop-In: /run/systemd/system/service.d
             └─zzz-lxc-service.conf
     Active: active (running) since Tue 2024-02-27 09:52:57 UTC; 39s ago
    Process: 187 ExecStart=/usr/bin/maxscale (code=exited, status=0/SUCCESS)
   Main PID: 188 (maxscale)
      Tasks: 10 (limit: 18663)
     Memory: 4.6M
        CPU: 150ms
     CGroup: /system.slice/maxscale.service
             └─188 /usr/bin/maxscale

systemd[1]: Starting MariaDB MaxScale Database Proxy...
maxscale[188]: Module 'galeramon' loaded from '/usr/lib/x86_64-linux-gnu/maxscale/libgaleramon.so'.
maxscale[188]: Module 'schemarouter' loaded from '/usr/lib/x86_64-linux-gnu/maxscale/libschemarouter.so'.
maxscale[188]: Using up to 2.3GiB of memory for query classifier cache
systemd[1]: Started MariaDB MaxScale Database Proxy.

If there were errors or warnings, you can see them in the MaxScale error log:

shell> grep -v notice /var/log/maxscale/maxscale.log
2024-02-13 16:47:22   MariaDB MaxScale is shut down.
----------------------------------------------------
MariaDB MaxScale  /var/log/maxscale/maxscale.log  Tue Feb 13 16:47:22 2024
----------------------------------------------------------------------------
2024-02-27 09:52:56   warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. File is for module 'mariadbmon'. Current module is 'galeramon'.
2024-02-27 09:52:56   warning: [galeramon] Invalid 'wsrep_local_index' on server 'shard1': 18446744073709551615

Application tests

Simple application tests

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='show databases'
+--------------------+
| Database           |
+--------------------+
| customer_0010      |
| customer_0011      |
| customer_0020      |
| customer_0021      |
| customer_0022      |
| customer_0030      |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+

New command show shards

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0030 --execute='show shards' | grep customer_00.* | sort | column -t
customer_0010.address  shard1
customer_0010.sales    shard1
customer_0010.         shard1
customer_0011.address  shard1
customer_0011.sales    shard1
customer_0011.         shard1
customer_0020.address  shard2
customer_0020.sales    shard2
customer_0020.         shard2
customer_0021.address  shard2
customer_0021.sales    shard2
customer_0021.         shard2
customer_0022.address  shard2
customer_0022.sales    shard2
customer_0022.         shard2
customer_0030.address  shard3
customer_0030.sales    shard3
customer_0030.         shard3

New databases are not displayed immediately, but only after the cached data has been refreshed (refresh_interval, default 300 s = 5 min).

See also: Custom SQL commands

More general test

As a reminder:

Shard  Port  Customer         State
#1     3363  customer_001<n>  Running
#2     3364  customer_002<n>  Running
#3     3365  customer_003<n>  Running
#4     3366  customer_004<n>  Running

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3363 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --database=customer_0010 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3363 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --database=customer_0020 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3364 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='use customer_0020; SELECT @@port'
+--------+
| @@port |
+--------+
|   3364 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0010 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3363 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0020 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3364 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0030 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3365 |
+--------+

Less simple (backup) test

shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0010 > /tmp/customer_0010.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0020 > /tmp/customer_0020.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0030 > /tmp/customer_0030.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0011 > /tmp/customer_0011.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0021 > /tmp/customer_0021.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0030 > /tmp/customer_0030.sql
shell> echo $?
0

shell> ll /tmp/customer_00*sql
-rw-rw-r-- 1 oli oli 2738 Mar 18 12:07 /tmp/customer_0010.sql
-rw-rw-r-- 1 oli oli 2904 Mar 18 12:08 /tmp/customer_0011.sql
-rw-rw-r-- 1 oli oli 2712 Mar 18 12:08 /tmp/customer_0020.sql
-rw-rw-r-- 1 oli oli 2906 Mar 18 12:08 /tmp/customer_0021.sql
-rw-rw-r-- 1 oli oli 2964 Mar 18 12:08 /tmp/customer_0030.sql

shell> tail -n 1 /tmp/customer_*.sql
==> /tmp/customer_0010.sql <==
-- Dump completed on 2024-02-13 14:39:21

==> /tmp/customer_0011.sql <==
-- Dump completed on 2024-02-13 14:39:35

==> /tmp/customer_0020.sql <==
-- Dump completed on 2024-02-13 14:40:15

==> /tmp/customer_0021.sql <==
-- Dump completed on 2024-02-13 14:40:42

==> /tmp/customer_0030.sql <==
-- Dump completed on 2024-02-13 14:40:52

shell> cat /tmp/customer_00*sql | grep -A1 -i insert
INSERT INTO `address` VALUES (1,'Customer 10 GmbH');
--
INSERT INTO `sales` VALUES (1,'Apples',5,1.20,6.00),
--
INSERT INTO `address` VALUES (1,'Customer 11 SE');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 20 AG');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 21 GmbH');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 30 GmbH');
--
INSERT INTO `sales` VALUES (1,'Pickles',2,2.20,4.40),
--

In MaxScale 23.08.4 there was a pretty bad bug: A return value of 0 but no data in the backup!!! See also the tickets: MXS-4966: mariadb-dump gets an error dumping schemas and MXS-4947: Tables in information_schema are treated as a normal tables. Symptoms of the bug look like this:

Error: Couldn't read status information for table address ()
Error: Couldn't read status information for table sales ()

We therefore strongly recommend upgrading to MaxScale 23.08.5!
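
Which version is actually running can be checked quickly before and after the upgrade (the exact output format may vary by version):

shell> maxscale --version
shell> maxctrl --version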

More complex application tests

We have created a somewhat more complex test (./sharding_test.php) that processes the following queries:

SET NAMES utf8mb4
SHOW DATABASES
use customer_
START TRANSACTION;
SELECT MIN(id) AS first, MAX(id) AS last FROM `sales`
INSERT INTO sales (id, product, sales, amount, total_amount) VALUES (%d, '%s', %f, %f, %f)
UPDATE sales SET product = 'Prepare to delete' WHERE id = %d
DELETE FROM sales WHERE id = %d
COMMIT

This test ran flawlessly. The corresponding control query:

SQL> SELECT * FROM customer_0021.sales WHERE id >= (SELECT MAX(id) - 10 FROM customer_0021.sales);

Various load scenarios can also be tested with db_bench or the Acronis perfkit. For more information, see here.

Cross-shard tests

In any case, sooner or later you will come up with the idea of running cross-shard queries. This will NOT work, which should not really be surprising: firstly, because it is not easy to implement, and secondly, because it is documented:

"Note: As the sharding solution in MaxScale is relatively simple, cross-database queries between two or more shards are not supported."

Source: Simple Sharding with Two Servers

and

"USE db1 is routed to the server with db1. If the database is divided to multiple servers, only one server will get the command."

Source: SchemaRouter.

Here is a test with UNION:

SQL> use customer_0030
Database changed
SQL> SELECT * FROM customer_0020.sales UNION SELECT * FROM customer_0030.sales;
ERROR 1146 (42S02): Table 'customer_0020.sales' doesn't exist

And here the cross-check in the other direction:

SQL> use customer_0020
Database changed
SQL> SELECT * FROM customer_0020.sales UNION SELECT * FROM customer_0030.sales;
ERROR 1146 (42S02): Table 'customer_0030.sales' doesn't exist

And here is the test with JOIN:

SQL> use customer_0020
SQL> SELECT * FROM customer_0020.sales a JOIN customer_0030.sales b ON a.id = b.id WHERE a.sales > 1;
ERROR 1146 (42S02): Table 'customer_0030.sales' doesn't exist

SQL> use customer_0030
SQL> SELECT * FROM customer_0020.sales a JOIN customer_0030.sales b ON a.id = b.id WHERE a.sales > 1;
ERROR 1146 (42S02): Table 'customer_0020.sales' doesn't exist

Operation of a MaxScale sharding system

In this chapter we discuss some points that can be useful for the operation of a MariaDB MaxScale sharding system.

Do-on-all-shards

Since O/S or database operations will occasionally have to be executed on all shards, it certainly makes sense to create a script that executes the same command on all shards in turn:

shell> ./do-on-all-shards.sh --sql='SHOW DATABASES'

A script of this type should greatly reduce the error rate during operation. Operations such as the re-sharding of a tenant, as described below, are also sensibly scripted and executed centrally (MXS-5029).
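
A minimal sketch of such a script, using the shard host, ports and credentials of this PoC (do-on-all-shards.sh itself is our hypothetical helper, not a MaxScale tool):

#!/bin/bash
#
# do-on-all-shards.sh -- execute the same SQL command on all shards in turn.

SQL="${1#--sql=}"     # called as: ./do-on-all-shards.sh --sql='SHOW DATABASES'
HOST='10.139.158.1'

for PORT in 3363 3364 3365 ; do
  echo "### Shard ${HOST}:${PORT}"
  mariadb --user=app --password=secret --host="${HOST}" --port="${PORT}" \
          --execute="${SQL}" || { echo "ERROR on shard ${HOST}:${PORT}" ; exit 1 ; }
done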

Invalidating the database map cache

The invalidate command can be used to invalidate the database map cache of the MariaDB MaxScale SchemaRouter. This allows us to quickly update the cache after adding or removing tenants.

shell> maxctrl call command schemarouter invalidate Sharded-Service
OK

In contrast to the invalidate command, which updates the entries after the next refresh_interval, the clear command deletes the entries and a remap is executed immediately.

If you want to invalidate the database map cache remotely with a REST API call, you can do this as follows:

shell> curl -i -X POST -u api_admin:secret http://10.139.158.211:8989/v1/maxscale/modules/schemarouter/clear?Sharded-Service
HTTP/1.1 204 No Content
Connection: close
Date: Mon, 18 Mar 24 11:49:58 GMT
X-Frame-Options: Deny
X-XSS-Protection: 1
Referrer-Policy: same-origin
Cache-Control: no-cache
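
The gentler invalidate module command can be triggered via the REST API in the same way (path analogous to the clear call above):

shell> curl -i -X POST -u api_admin:secret http://10.139.158.211:8989/v1/maxscale/modules/schemarouter/invalidate?Sharded-Service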

How to change SchemaRouter variables dynamically?

The refresh_interval specifies the lifetime of the entries in the SchemaRouter database map cache. The default value is 300 s (5 min). Refresh interval is therefore, in my opinion, an unfortunate term, as it defines not the interval between two mappings but the lifetime of the cache entries (lifetime? timeout?). As soon as an entry has been deleted, a new refresh of the "database map" is triggered on each shard. The command currently looks like this:

SELECT LOWER(t.table_schema), LOWER(t.table_name)
  FROM information_schema.tables t
 UNION ALL
SELECT LOWER(s.schema_name), ''
  FROM information_schema.schemata s

It looks like a simple connect is enough to trigger the refresh of the database map.
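
A forced remap can therefore be scripted as an invalidate followed by a short connect, a sketch using the values of this setup:

shell> maxctrl call command schemarouter invalidate Sharded-Service
shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='SELECT 1' > /dev/null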

The current value for refresh_interval can be queried as follows:

shell> maxctrl show service Sharded-Service | grep refresh_interval | awk -F'│' '{ print $3 }'
    "refresh_interval": "300000ms",

The following command helps to change the value dynamically:

shell> MAXCTRL_WARNINGS=0 maxctrl alter service Sharded-Service refresh_interval=10s
OK

The value should not be set too small, as all other connections are stopped during the mapping process.

Adding and removing a tenant

Adding a new tenant to a shard is not a major problem:

SQL> CREATE DATABASE customer_0029;
SQL> use customer_0029
SQL> CREATE TABLE address LIKE customer_template.address;
SQL> CREATE TABLE sales LIKE customer_template.sales;

shell> maxctrl call command schemarouter invalidate Sharded-Service
OK
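
These steps lend themselves to a small wrapper script. A sketch under the assumptions of this PoC (add_tenant.sh is a hypothetical helper; host, credentials and the customer_template schema are taken from above):

#!/bin/bash
#
# add_tenant.sh -- create a new tenant on one shard and refresh the MaxScale cache.

TENANT="${1:?Usage: add_tenant.sh <tenant> <shard_port>}"
PORT="${2:?Usage: add_tenant.sh <tenant> <shard_port>}"

mariadb --user=app --password=secret --host=10.139.158.1 --port="${PORT}" <<EOF
CREATE DATABASE ${TENANT};
CREATE TABLE ${TENANT}.address LIKE customer_template.address;
CREATE TABLE ${TENANT}.sales LIKE customer_template.sales;
EOF

maxctrl call command schemarouter invalidate Sharded-Service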

Removing a tenant from a shard, on the other hand, is somewhat more complicated and must be done in consultation with the application:

SQL> DROP DATABASE customer_0011;

shell> ./sharding_test.php
.....ERROR: Table 'customer_0011.sales' doesn't exist...ERROR: Unknown database 'customer_0011'.ERROR: Unknown database 'customer_0011'......ERROR: Unknown database 'customer_0011'...

shell> maxctrl call command schemarouter clear Sharded-Service
OK

At least I have not come up with a cleverer variant yet. See also Moving a tenant below.

Moving a tenant

The combination of adding and removing would then be moving a tenant from one shard to another shard, also known as re-sharding. This also requires a concerted action to be planned together with the application.

If this is not possible, at least the time window in which the application receives errors can be reduced... The following procedure can be used to move a tenant from shard 2 to shard 3:

SQL> use customer_0020; LOCK TABLES address READ, sales READ;   -- On shard 2; at best, the application is blocked!

shell> mariadb-dump --user=app --password=secret --host=10.139.158.1 --port=3364 --single-transaction --skip-add-locks --databases customer_0020 | mariadb --user=app --password=secret --host=10.139.158.1 --port=3365   # Copy tenant 20 from shard 2 to shard 3

SQL> DROP DATABASE customer_0020;   -- Deleting tenant 20 does not work yet!
ERROR 1192 (HY000): Can't execute the given command because you have active locked tables or an active transaction
SQL> UNLOCK TABLES; DROP DATABASE customer_0020;   -- This is how tenant 20 is deleted.

shell> maxctrl call command schemarouter clear Sharded-Service   # Update the MaxScale database map. Do it quickly!!!

Until the database map is refreshed, the following errors may occur:

error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.address' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.sales' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); Duplicate tables found, closing session.

And on the application side too:

ERROR: Error: duplicate tables found on two different shards
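
To keep this error window as small as possible, the whole move is also best scripted. A sketch under the assumptions of this PoC (move_tenant.sh is a hypothetical helper; the table locking shown above is omitted here for brevity):

#!/bin/bash
#
# move_tenant.sh -- move one tenant from a source shard to a target shard and
# remap MaxScale as quickly as possible afterwards.

TENANT='customer_0020'
SRC_PORT=3364   # shard2
DST_PORT=3365   # shard3

# Copy the tenant; the application should not write to it during this step!
mariadb-dump --user=app --password=secret --host=10.139.158.1 --port=${SRC_PORT} \
             --single-transaction --skip-add-locks --databases ${TENANT} \
  | mariadb --user=app --password=secret --host=10.139.158.1 --port=${DST_PORT} || exit 1

# Drop the tenant on the source shard and update the database map immediately
mariadb --user=app --password=secret --host=10.139.158.1 --port=${SRC_PORT} \
        --execute="DROP DATABASE ${TENANT}"
maxctrl call command schemarouter clear Sharded-Service
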
Adding or removing a shard

Moving a tenant from one shard to another shard is a small re-sharding operation. It becomes somewhat more complex if you want to add new shards or remove old shards. Subsequently (after the addition or before the removal), a large re-sharding would then take place. The first step is to add a shard to the cluster:

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID          │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 0           │ Running │ 0-3363-26014  │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 0           │ Running │ 0-3364-240612 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 0           │ Running │ 0-3365-289873 │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The prepared shard is made known to MaxScale:

shell> maxctrl create server shard4 10.139.158.1 3366
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 1           │ Running │ 0-3363-23676 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-52321 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-39751 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 0           │ Down    │              │                  │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

The new shard is then linked to the MaxScale Monitor and the service:

shell> MAXCTRL_WARNINGS=0 maxctrl link monitor Sharding-Monitor shard4
OK
shell> MAXCTRL_WARNINGS=0 maxctrl link service Sharded-Service shard4
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 1           │ Running │ 0-3363-24961 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-56215 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-45177 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-32    │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Whether this second step is also absolutely necessary was not investigated.

You can follow the entire process in the MariaDB MaxScale error log:

warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. Servers described in the journal are different from the ones configured on the current monitor.
warning: Saving runtime modifications to 'Sharding-Monitor' in '/var/lib/maxscale/maxscale.cnf.d/Sharding-Monitor.cnf'. The modified values will override the values found in the static configuration files.
notice : shard4 sent version string '10.11.7-MariaDB-log'. Detected type: MariaDB, version: 10.11.7.
notice : Server 'shard4' charset: latin1_swedish_ci
notice : Server changed state: shard4[10.139.158.1:3366]: server_up. [Down] -> [Running]
warning: Saving runtime modifications to 'Sharded-Service' in '/var/lib/maxscale/maxscale.cnf.d/Sharded-Service.cnf'. The modified values will override the values found in the static configuration files.
notice : Added 'shard4' to 'Sharded-Service'

What we must not forget here is to also equip the new shard with the proxy protocol:

shell> maxctrl show server shard4 | grep proxy
│              │     "proxy_protocol": false,                 │

shell> MAXCTRL_WARNINGS=0 maxctrl alter server shard4 proxy_protocol=true
OK

And now new tenants can be added to the new shard or old tenants can be moved to the new shard... In our setup, we want to move all tenants from shard 1 to shard 4 and also create a new tenant customer_0040 on shard 4. The individual steps required for this are listed above.

Once shard 1 has been emptied, it can be dismantled:

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 1           │ Running │ 0-3363-25916 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-62887 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-54035 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-2247  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

A shard is deleted with the destroy server command. Before this works, however, a shard must be removed from the monitor and the service:

shell> MAXCTRL_WARNINGS=0 maxctrl unlink service Sharded-Service shard1
OK
shell> MAXCTRL_WARNINGS=0 maxctrl unlink monitor Sharding-Monitor shard1
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 0           │ Running │              │                  │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-64394 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-56072 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-3267  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Once the shard has been removed from the monitor and the service, it can then be deleted:

shell> maxctrl destroy server shard1
Warning: Object 'shard1' is defined in a static configuration file and cannot be permanently deleted. If MaxScale is restarted, the object will appear again.

To hide these warnings, run:

 export MAXCTRL_WARNINGS=0

OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-65018 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-56886 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-3648  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

And you can follow the changes in the MaxScale Error Log:

notice : Removed 'shard1' from 'Sharded-Service'
warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. Servers described in the journal are different from the ones configured on the current monitor.
notice : Destroyed server 'shard1' at 10.139.158.1:3363

Important: I was informed that with destroy server --force the unlink service and unlink monitor commands are automatically executed by MaxScale.
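
With that, the dismantling of a shard can be reduced to a single call; using the names of this setup:

shell> maxctrl destroy server shard1 --force
OK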

Customising the configuration files

During the shard operations described above we received some warnings:

Warning: Object 'shard1' is defined in a static configuration file and cannot be permanently deleted. If MaxScale is restarted, the object will appear again.

and

Warning: Saving runtime modifications to 'Sharding-Monitor' in '/var/lib/maxscale/maxscale.cnf.d/Sharding-Monitor.cnf'. The modified values will override the values found in the static configuration files.

The corresponding configuration files are automatically created by MaxScale when dynamic system changes are made:

shell> ll /var/lib/maxscale/maxscale.cnf.d/ /etc/maxscale.cnf
-rw-r--r-- 1 root root 612 Feb 13 14:23 /etc/maxscale.cnf

/var/lib/maxscale/maxscale.cnf.d/:
total 12
-rw------- 1 maxscale maxscale 187 Feb 13 16:08 Sharding-Monitor.cnf
-rw------- 1 maxscale maxscale 150 Feb 13 16:07 Sharded-Service.cnf
-rw------- 1 maxscale maxscale  52 Feb 13 15:46 shard4.cnf

shell> cat /var/lib/maxscale/maxscale.cnf.d/*
[Sharded-Service]
debug=true
refresh_interval=10000ms
auth_all_servers=true
log_debug=true
password=secret
router=schemarouter
type=service
user=maxscale_admin
targets=shard2,shard3,shard4

[Sharding-Monitor]
module=galeramon
monitor_interval=1000ms
password=secret
servers=shard2,shard3,shard4
type=monitor
user=maxscale_monitor

[shard4]
address=10.139.158.1
port=3366
type=server

The configuration files would still have to be tidied up accordingly. In a highly dynamic system, you should generally consider whether it would not be better to configure everything dynamically via commands...

Maintenance work on the shard

If a shard is to be taken offline for maintenance work, here in the example shard2, this can be done as follows:

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-69817 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-63166 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-6902  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

shell> maxctrl set server shard2 drain
OK
shell> maxctrl set server shard2 maintenance
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬──────────────────────┬───────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State                │ GTID          │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 0           │ Maintenance, Running │ 0-3364-240612 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 0           │ Running              │ 0-3365-289873 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 0           │ Running              │ 0-3366-119848 │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴──────────────────────┴───────────────┴──────────────────┘

At this point, maintenance work can be carried out on the machine or the database...

Afterwards, BOTH statuses must be cleared again if both have been set (MXS-5028):

shell> maxctrl clear server shard2 maintenance
OK
shell> maxctrl clear server shard2 drain
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID          │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 0           │ Running │ 0-3364-240612 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 0           │ Running │ 0-3365-289873 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 0           │ Running │ 0-3366-119848 │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The difference between drain and maintenance is that with drain, no new connections are allowed to the shard, but existing connections wait until they are closed. With maintenance, the connections are terminated immediately by force.
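
The gentle variant can also be scripted: drain first, wait until all connections are gone, and only then switch to maintenance. A sketch, assuming maxctrl's --tsv output with the connection count in the fourth column:

#!/bin/bash
#
# Take a shard out of service gently: drain first, then maintenance.

SERVER="${1:-shard2}"

maxctrl set server "${SERVER}" drain

# Wait until no more connections are open on the drained server
while [ "$(maxctrl list servers --tsv | awk -v s="${SERVER}" '$1 == s { print $4 }')" != '0' ] ; do
  sleep 1
done

maxctrl set server "${SERVER}" maintenance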

Observation of a MariaDB MaxScale sharding system

The MaxScale CLI client maxctrl can be used to query the status of the MariaDB MaxScale load balancer. There are numerous commands for this, mainly list and show:

shell> maxctrl show module schemarouter | head -n 12
┌─────────────┬────────────────────────────────────────────────┐
│ Module      │ schemarouter                                   │
├─────────────┼────────────────────────────────────────────────┤
│ Type        │ Router                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Version     │ V1.0.0                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Maturity    │ Beta                                           │
├─────────────┼────────────────────────────────────────────────┤
│ Description │ A database sharding router for simple sharding │
├─────────────┼────────────────────────────────────────────────┤
│ Parameters  │ ...                                            │

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID          │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 4           │ Running │ 0-3364-290859 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 4           │ Running │ 0-3365-322671 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 4           │ Running │ 0-3366-140018 │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The information in the Connections column is confusing, because in this case we actually only have 1, 1 and 2 client connections open against this sharding system.

However, if you look at the situation on the respective shard with SHOW PROCESSLIST, you can see that MaxScale also establishes a connection on EACH shard for each incoming connection. So the display above is actually technically correct, just not what you would expect:

SQL> SHOW PROCESSLIST;
+--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+
| Id     | User             | Host                 | db            | Command | Time | State    | Info                                                            | Progress |
+--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+
|    123 | root             | localhost            | customer_0021 | Query   |    0 | starting | show processlist                                                |    0.000 |
|  68107 | maxscale_monitor | 10.139.158.211:35418 | NULL          | Sleep   |    0 |          | NULL                                                            |    0.000 |
| 113372 | app              | 10.139.158.1:47548   | NULL          | Sleep   |   47 |          | NULL                                                            |    0.000 |
| 113538 | app              | 10.139.158.1:49058   | NULL          | Sleep   |   41 |          | NULL                                                            |    0.000 |
| 113662 | app              | 10.139.158.1:47072   | NULL          | Sleep   |   37 |          | NULL                                                            |    0.000 |
| 114789 | app              | 10.139.158.1:39574   | customer_0022 | Query   |    0 | Updating | UPDATE sales SET product = 'Prepare to delete' WHERE id = 15622 |    0.000 |
+--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+

This does not scale for large systems with hundreds or thousands of clients! Perhaps the MariaDB thread pool feature would help in this case.

According to the MaxScale developer, this is expected behaviour... (MXS-4977)
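
If this fan-out of backend connections becomes a problem, the MariaDB thread pool mentioned above can be activated on each shard. A minimal sketch of the server-side settings (the values are examples, not recommendations):

#
# my.cnf on every shard
#

[mariadb]
thread_handling         = pool-of-threads
thread_pool_size        = 8       # roughly the number of CPU cores
thread_pool_max_threads = 65536   # upper limit, the MariaDB default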

shell> maxctrl list services
┌─────────────────┬──────────────┬─────────────┬───────────────────┬────────────────────────┐
│ Service         │ Router       │ Connections │ Total Connections │ Targets                │
├─────────────────┼──────────────┼─────────────┼───────────────────┼────────────────────────┤
│ Sharded-Service │ schemarouter │ 4           │ 82776             │ shard2, shard3, shard4 │
└─────────────────┴──────────────┴─────────────┴───────────────────┴────────────────────────┘

shell> maxctrl list listeners
┌──────────────────────────┬──────┬──────┬─────────┬─────────────────┐
│ Name                     │ Port │ Host │ State   │ Service         │
├──────────────────────────┼──────┼──────┼─────────┼─────────────────┤
│ Sharded-Service-Listener │ 3306 │ ::   │ Running │ Sharded-Service │
└──────────────────────────┴──────┴──────┴─────────┴─────────────────┘

shell> maxctrl list monitors
┌──────────────────┬─────────┬────────────────────────┐
│ Monitor          │ State   │ Servers                │
├──────────────────┼─────────┼────────────────────────┤
│ Sharding-Monitor │ Running │ shard2, shard3, shard4 │
└──────────────────┴─────────┴────────────────────────┘

shell> maxctrl show server shard2 | head -n 20
┌─────────────────────┬──────────────────────────────────────────────┐
│ Server              │ shard2                                       │
├─────────────────────┼──────────────────────────────────────────────┤
│ Source              │ /etc/maxscale.cnf                            │
├─────────────────────┼──────────────────────────────────────────────┤
│ Address             │ 10.139.158.1                                 │
├─────────────────────┼──────────────────────────────────────────────┤
│ Port                │ 3364                                         │
├─────────────────────┼──────────────────────────────────────────────┤
│ State               │ Running                                      │
├─────────────────────┼──────────────────────────────────────────────┤
│ Version             │ 10.11.7-MariaDB-log                          │
├─────────────────────┼──────────────────────────────────────────────┤
│ Uptime              │ 178960                                       │
├─────────────────────┼──────────────────────────────────────────────┤
│ Last Event          │ server_down                                  │
├─────────────────────┼──────────────────────────────────────────────┤
│ Triggered At        │ Sun, 04 Feb 2024 07:37:17 GMT                │
├─────────────────────┼──────────────────────────────────────────────┤
│ Services            │ Sharded-Service                              │
├─────────────────────┼──────────────────────────────────────────────┤
│ Monitors            │ Sharding-Monitor                             │
├─────────────────────┼──────────────────────────────────────────────┤
...
├─────────────────────┼──────────────────────────────────────────────┤
│ Current Connections │ 5                                            │
├─────────────────────┼──────────────────────────────────────────────┤
│ Total Connections   │ 27                                           │
├─────────────────────┼──────────────────────────────────────────────┤
│ Max Connections     │ 5                                            │

shell> maxctrl show service Sharded-Service
Service:              Sharded-Service
Source:               /var/lib/maxscale/maxscale.cnf.d/Sharded-Service.cnf
Router:               schemarouter
State:                Started
Started At:           3/18/2024, 1:52:30 PM
Users Loaded At:      3/18/2024, 1:52:30 PM
Current Connections:  4
Total Connections:    84590
Max Connections:      5
Cluster:
Servers:              shard2, shard3, shard4
Services:
Filters:
Parameters:
{
  "auth_all_servers": true,
  "connection_keepalive": "300000ms",
  "debug": true,
  "disable_sescmd_history": false,
  "enable_root_user": false,
  "force_connection_keepalive": false,
  "idle_session_pool_time": "-1ms",
  "ignore_tables": [],
  "ignore_tables_regex": null,
  "localhost_match_wildcard_host": true,
  "log_auth_warnings": true,
  "log_debug": true,
  "log_info": false,
  "log_notice": false,
  "log_warning": false,
  "max_connections": 0,
  "max_sescmd_history": 50,
  "max_staleness": "150000ms",
  "multiplex_timeout": "60000ms",
  "net_write_timeout": "0ms",
  "password": "*****",
  "prune_sescmd_history": true,
  "rank": "primary",
  "refresh_databases": false,
  "refresh_interval": "10000ms",
  "retain_last_statements": -1,
  "router": "schemarouter",
  "session_trace": false,
  "strip_db_esc": true,
  "type": "service",
  "user": "maxscale_admin",
  "user_accounts_file": null,
  "user_accounts_file_usage": "add_when_load_ok",
  "version_string": null,
  "wait_timeout": "0ms"
}
Router Diagnostics:
{
  "average_session": 0.028822357131634554,
  "longest_sescmd_chain": 4,
  "longest_session": 50,
  "queries": 761134,
  "sescmd_percentage": 44.44342257736483,
  "shard_map_hits": 84356,
  "shard_map_misses": 5,
  "shard_map_stale": 229,
  "shard_map_updates": 216,
  "shortest_session": 0,
  "times_sescmd_limit_exceeded": 0
}

See also MaxScale SchemaRouter Router diagnostics.

shell> maxctrl show monitor Sharding-Monitor
Monitor:  Sharding-Monitor
Source:   /etc/maxscale.cnf
Module:   galeramon
State:    Running
Servers:  shard1, shard2, shard3
Parameters:
{
  "available_when_donor": false,
  "backend_connect_attempts": 1,
  "backend_connect_timeout": "3000ms",
  "backend_read_timeout": "3000ms",
  "backend_write_timeout": "3000ms",
  "disable_master_failback": false,
  "disable_master_role_setting": false,
  "disk_space_check_interval": "0ms",
  "disk_space_threshold": null,
  "events": "all,master_down,master_up,...,new_donor",
  "journal_max_age": "28800000ms",
  "module": "galeramon",
  "monitor_interval": "1000ms",
  "password": "*****",
  "root_node_as_master": false,
  "script": null,
  "script_timeout": "90000ms",
  "set_donor_nodes": false,
  "type": "monitor",
  "use_priority": false,
  "user": "maxscale_monitor"
}
Monitor Diagnostics:
{
  "disable_master_failback": false,
  "disable_master_role_setting": false,
  "root_node_as_master": false,
  "server_info": [
    { "gtid_binlog_pos": "0-3363-26014", "gtid_current_pos": "0-3363-26014", "master_id": 0, "name": "shard1", "read_only": false, "server_id": 3363 },
    { "gtid_binlog_pos": "0-3364-240612", "gtid_current_pos": "0-3364-240612", "master_id": 0, "name": "shard2", "read_only": false, "server_id": 3364 },
    { "gtid_binlog_pos": "0-3365-289873", "gtid_current_pos": "0-3365-289873", "master_id": 0, "name": "shard3", "read_only": false, "server_id": 3365 }
  ],
  "set_donor_nodes": false,
  "use_priority": false
}

shell> maxctrl list sessions
┌───────┬──────┬──────────────┬───────────────────────┬───────┬─────────────────┬────────┬──────────────┐
│ Id    │ User │ Host         │ Connected             │ Idle  │ Service         │ Memory │ I/O-Activity │
├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤
│ 87240 │ app  │ 10.139.158.1 │ 3/18/2024, 2:33:54 PM │ 0     │ Sharded-Service │ 68644  │ 33           │
│ 72654 │ app  │ 10.139.158.1 │ 3/18/2024, 2:25:27 PM │ 506.3 │ Sharded-Service │ 199328 │ 0            │
│ 72364 │ app  │ 10.139.158.1 │ 3/18/2024, 2:25:18 PM │ 516   │ Sharded-Service │ 199328 │ 0            │
│ 72530 │ app  │ 10.139.158.1 │ 3/18/2024, 2:25:23 PM │ 510.5 │ Sharded-Service │ 199328 │ 0            │
└───────┴──────┴──────────────┴───────────────────────┴───────┴─────────────────┴────────┴──────────────┘

shell> maxctrl show session 26
Id:         26
Service:    Sharded-Service
State:      Session started
User:       app
Host:       10.139.158.1
Port:       42854
Database:
Connected:  2/4/2024, 9:31:12 AM
Idle:       610.4
Parameters:
{ "log_error": false, "log_info": false, "log_notice": false, "log_warning": false }
Client TLS Cipher:
Connection attributes:
{
  "_client_name": "libmariadb",
  "_client_version": "3.3.8",
  "_os": "Linux",
  "_pid": "251037",
  "_platform": "x86_64",
  "_server_host": "10.139.158.211",
  "program_name": "mysql"
}
Connections:     shard1, shard2, shard3
Connection IDs:  666, 139, 138
Queries:
Log:
Memory:
{
  "connection_buffers": {
    "backends": {
      "shard1": { "misc": 678, "readq": 65536, "total": 66214, "writeq": 0 },
      "shard2": { "misc": 662, "readq": 0, "total": 662, "writeq": 0 },
      "shard3": { "misc": 678, "readq": 65536, "total": 66214, "writeq": 0 }
    },
    "client": { "misc": 654, "readq": 65536, "total": 66190, "writeq": 0 },
    "total": 199280
  },
  "exec_metadata": 0,
  "last_queries": 0,
  "sescmd_history": 48,
  "total": 199328,
  "variables": 0
}
I/O Activity:  0

shell> maxctrl show listener Sharded-Service-Listener
Name:     Sharded-Service-Listener
Source:   /etc/maxscale.cnf
Service:  Sharded-Service
Parameters:
{
  "MariaDBProtocol": { "allow_replication": true },
  "address": "::",
  "authenticator": null,
  "authenticator_options": null,
  "connection_init_sql_file": null,
  "connection_metadata": [
    "character_set_client=auto",
    "character_set_connection=auto",
    "character_set_results=auto",
    "max_allowed_packet=auto",
    "system_time_zone=auto",
    "time_zone=auto",
    "tx_isolation=auto"
  ],
  "port": 3306,
  "protocol": "MariaDBProtocol",
  "proxy_protocol_networks": null,
  "service": "Sharded-Service",
  "socket": null,
  "sql_mode": "default",
  "ssl": false,
  "ssl_ca": null,
  "ssl_cert": null,
  "ssl_cert_verify_depth": 9,
  "ssl_cipher": null,
  "ssl_crl": null,
  "ssl_key": null,
  "ssl_verify_peer_certificate": false,
  "ssl_verify_peer_host": false,
  "ssl_version": "MAX",
  "type": "listener",
  "user_mapping_file": null
}

shell> maxctrl show module schemarouter
Module:       schemarouter
Type:         Router
Version:      V1.0.0
Maturity:     Beta
Description:  A database sharding router for simple sharding
Parameters:
[
  { "default_value": false, "description": "Enable debug mode", "mandatory": false, "modifiable": true, "name": "debug", "type": "bool" },
  { "default_value": [], "description": "List of tables to ignore when checking for duplicates", "mandatory": false, "modifiable": true, "name": "ignore_tables", "type": "stringlist" },
  { "description": "Regex of tables to ignore when checking for duplicates", "mandatory": false, "modifiable": true, "name": "ignore_tables_regex", "type": "regex" },
  { "default_value": "150000ms", "description": "Maximum allowed staleness of a database map entry before clients block and wait for an update", "mandatory": false, "modifiable": true, "name": "max_staleness", "type": "duration", "unit": "ms" },
  { "default_value": false, "description": "Refresh database mapping information", "mandatory": false, "modifiable": true, "name": "refresh_databases", "type": "bool" },
  { "default_value": "300000ms", "description": "How often to refresh the database mapping information", "mandatory": false, "modifiable": true, "name": "refresh_interval", "type": "duration", "unit": "ms" },
  { "default_value": false, "description": "Retrieve users from all backend servers instead of only one", "mandatory": false, "modifiable": true, "name": "auth_all_servers", "type": "bool" },
  { "default_value": "300000ms", "description": "How ofted idle connections are pinged", "mandatory": false, "modifiable": true, "name": "connection_keepalive", "type": "duration", "unit": "ms" },
  { "deprecated": true, "description": "Alias for 'wait_timeout'", "mandatory": false, "modifiable": true, "name": "connection_timeout", "type": "duration" },
  { "default_value": false, "description": "Disable session command history", "mandatory": false, "modifiable": true, "name": "disable_sescmd_history", "type": "bool" },
  { "default_value": false, "description": "Allow the root user to connect to this service", "mandatory": false, "modifiable": true, "name": "enable_root_user", "type": "bool" },
  { "default_value": false, "description": "Ping connections unconditionally", "mandatory": false, "modifiable": true, "name": "force_connection_keepalive", "type": "bool" },
  { "default_value": "-1ms", "description": "Put connections into pool after session has been idle for this long", "mandatory": false, "modifiable": true, "name": "idle_session_pool_time", "type": "duration", "unit": "ms" },
  { "default_value": true, "description": "Match localhost to wildcard host", "mandatory": false, "modifiable": true, "name": "localhost_match_wildcard_host", "type": "bool" },
  { "default_value": true, "description": "Log a warning when client authentication fails", "mandatory": false, "modifiable": true, "name": "log_auth_warnings", "type": "bool" },
  { "default_value": false, "description": "Log debug messages for this service (debug builds only)", "mandatory": false, "modifiable": true, "name": "log_debug", "type": "bool" },
  { "default_value": false, "description": "Log info messages for this service", "mandatory": false, "modifiable": true, "name": "log_info", "type": "bool" },
  { "default_value": false, "description": "Log notice messages for this service", "mandatory": false, "modifiable": true, "name": "log_notice", "type": "bool" },
  { "default_value": false, "description": "Log warning messages for this service", "mandatory": false, "modifiable": true, "name": "log_warning", "type": "bool" },
  { "default_value": 0, "description": "Maximum number of connections", "mandatory": false, "modifiable": true, "name": "max_connections", "type": "count" },
  { "default_value": 50, "description": "Session command history size", "mandatory": false, "modifiable": true, "name": "max_sescmd_history", "type": "count" },
  { "default_value": "60000ms", "description": "How long a session can wait for a connection to become available", "mandatory": false, "modifiable": true, "name": "multiplex_timeout", "type": "duration", "unit": "ms" },
  { "default_value": "0ms", "description": "Network write timeout", "mandatory": false, "modifiable": true, "name": "net_write_timeout", "type": "duration", "unit": "ms" },
  { "description": "Password for the user used to retrieve database users", "mandatory": true, "modifiable": true, "name": "password", "type": "password" },
  { "default_value": true, "description": "Prune old session command history if the limit is exceeded", "mandatory": false, "modifiable": true, "name": "prune_sescmd_history", "type": "bool" },
  { "default_value": "primary", "description": "Service rank", "enum_values": [ "primary", "secondary" ], "mandatory": false, "modifiable": true, "name": "rank", "type": "enum" },
  { "default_value": -1, "description": "Number of statements kept in memory", "mandatory": false, "modifiable": true, "name": "retain_last_statements", "type": "int" },
  { "default_value": false, "description": "Enable session tracing for this service", "mandatory": false, "modifiable": true, "name": "session_trace", "type": "bool" },
  { "default_value": false, "deprecated": true, "description": "Track session state using server responses", "mandatory": false, "modifiable": true, "name": "session_track_trx_state", "type": "bool" },
  { "default_value": true, "deprecated": true, "description": "Strip escape characters from database names", "mandatory": false, "modifiable": true, "name": "strip_db_esc", "type": "bool" },
  { "description": "Username used to retrieve database users", "mandatory": true, "modifiable": true, "name": "user", "type": "string" },
  { "description": "Load additional users from a file", "mandatory": false, "modifiable": false, "name": "user_accounts_file", "type": "path" },
  { "default_value": "add_when_load_ok", "description": "When and how the user accounts file is used", "enum_values": [ "add_when_load_ok", "file_only_always" ], "mandatory": false, "modifiable": false, "name": "user_accounts_file_usage", "type": "enum" },
  { "description": "Custom version string to use", "mandatory": false, "modifiable": true, "name": "version_string", "type": "string" },
  { "default_value": "0ms", "description": "Connection idle timeout", "mandatory": false, "modifiable": true, "name": "wait_timeout", "type": "duration", "unit": "ms" }
]
Commands:
[
  { "attributes": { "arg_max": 1, "arg_min": 1, "description": "Clear schemarouter shard map cache", "method": "POST", "parameters": [ { "description": "The schemarouter service", "required": true, "type": "SERVICE" } ] }, "id": "clear", "links": { "self": "http://127.0.0.1:8989/v1/modules/schemarouter/clear/" }, "type": "module_command" },
  { "attributes": { "arg_max": 1, "arg_min": 1, "description": "Invalidate schemarouter shard map cache", "method": "POST", "parameters": [ { "description": "The schemarouter service", "required": true, "type": "SERVICE" } ] }, "id": "invalidate", "links": { "self": "http://127.0.0.1:8989/v1/modules/schemarouter/invalidate/" }, "type": "module_command" }
]

shell> maxctrl show commands schemarouter
┌────────────┬────────────┬──────────────────────────┐
│ Command    │ Parameters │ Descriptions             │
├────────────┼────────────┼──────────────────────────┤
│ clear      │ SERVICE    │ The schemarouter service │
├────────────┼────────────┼──────────────────────────┤
│ invalidate │ SERVICE    │ The schemarouter service │
└────────────┴────────────┴──────────────────────────┘

shell> maxctrl show dbusers Sharded-Service
┌───────────────────────┬────────────────┬───────────────────────┬───────┬───────┬────────┬───────┬──────┐
│ User                  │ Host           │ Plugin                │ TLS   │ Super │ Global │ Proxy │ Role │
├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤
│ PUBLIC                │                │                       │ false │ false │ false  │ false │      │
│ app                   │ 10.139.158.%   │ mysql_native_password │ false │ false │ false  │ false │      │
│ app_role              │                │                       │ false │ false │ false  │ false │      │
│ mariadb.sys           │ localhost      │ mysql_native_password │ false │ false │ false  │ false │      │
│ maxscale_admin        │ 10.139.158.210 │ mysql_native_password │ false │ false │ false  │ false │      │
│ maxscale_admin        │ 10.139.158.211 │ mysql_native_password │ false │ false │ false  │ false │      │
│ maxscale_admin_role   │                │                       │ false │ false │ false  │ false │      │
│ maxscale_monitor      │ 10.139.158.210 │ mysql_native_password │ false │ false │ false  │ false │      │
│ maxscale_monitor      │ 10.139.158.211 │ mysql_native_password │ false │ false │ false  │ false │      │
│ maxscale_monitor_role │                │                       │ false │ false │ false  │ false │      │
│ mysql                 │ localhost      │ mysql_native_password │ false │ true  │ true   │ false │      │
│ root                  │ localhost      │ mysql_native_password │ false │ true  │ true   │ false │      │
└───────────────────────┴────────────────┴───────────────────────┴───────┴───────┴────────┴───────┴──────┘

shell> maxctrl show commands mariadbmon
┌───────────────────────────┬─────────────────────────────┬───────────────────────────────────────────────────────────────────────────────┐
│ Command                   │ Parameters                  │ Descriptions                                                                  │
├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ switchover                │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional)             │
│ switchover-force          │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional)             │
│ async-switchover          │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional)             │
│ failover                  │ MONITOR                     │ Monitor name                                                                  │
│ async-failover            │ MONITOR                     │ Monitor name                                                                  │
│ rejoin                    │ MONITOR, SERVER             │ Monitor name, Joining server                                                  │
│ async-rejoin              │ MONITOR, SERVER             │ Monitor name, Joining server                                                  │
│ reset-replication         │ MONITOR, [SERVER]           │ Monitor name, Primary server (optional)                                       │
│ async-reset-replication   │ MONITOR, [SERVER]           │ Monitor name, Primary server (optional)                                       │
│ release-locks             │ MONITOR                     │ Monitor name                                                                  │
│ async-release-locks       │ MONITOR                     │ Monitor name                                                                  │
│ fetch-cmd-result          │ MONITOR                     │ Monitor name                                                                  │
│ cancel-cmd                │ MONITOR                     │ Monitor name                                                                  │
│ async-cs-add-node         │ MONITOR, STRING, STRING     │ Monitor name, Hostname/IP of node to add to ColumnStore cluster, Timeout     │
│ async-cs-remove-node      │ MONITOR, STRING, STRING     │ Monitor name, Hostname/IP of node to remove from ColumnStore cluster, Timeout │
│ cs-get-status             │ MONITOR                     │ Monitor name                                                                  │
│ async-cs-get-status       │ MONITOR                     │ Monitor name                                                                  │
│ async-cs-start-cluster    │ MONITOR, STRING             │ Monitor name, Timeout                                                         │
│ async-cs-stop-cluster     │ MONITOR, STRING             │ Monitor name, Timeout                                                         │
│ async-cs-set-readonly     │ MONITOR, STRING             │ Monitor name, Timeout                                                         │
│ async-cs-set-readwrite    │ MONITOR, STRING             │ Monitor name, Timeout                                                         │
│ async-rebuild-server      │ MONITOR, SERVER, [SERVER]   │ Monitor name, Target server, Source server (optional)                         │
│ async-create-backup       │ MONITOR, SERVER, STRING     │ Monitor name, Source server, Backup name                                      │
│ async-restore-from-backup │ MONITOR, SERVER, STRING     │ Monitor name, Target server, Backup name                                      │
└───────────────────────────┴─────────────────────────────┴───────────────────────────────────────────────────────────────────────────────┘
Literature / Sources
Taxonomy upgrade extras: sharding, maxscale, schemarouter, load balancer, multi-tenant

Sharding with MariaDB MaxScale

Oli Sennhauser - Mon, 2024-03-18 16:08
Overview

This feature should work more or less with MariaDB MaxScale 6.x.y, 22.08.x, 23.02.x, 23.08.x and 24.02.x. We tested it with the latest MaxScale version 23.08.05, since we ran into problems with an older version (MXS-5026).

shell> maxscale --version MaxScale 23.08.5

We used MariaDB 10.11 as the database backend (shards).

Less than about 2% of all MariaDB installations known to us are what we technically understand as multi-tenant systems (each tenant (customer) in its own database, also called a schema).

This MariaDB MaxScale feature is therefore used relatively rarely, and there is an increased risk of running into bugs that nobody has hit before!

In MariaDB MaxScale this feature is called SchemaRouter and is still declared to be of beta quality (MXS-5025):

maxctrl> show module schemarouter ┌─────────────┬────────────────────────────────────────────────┐ │ Module │ schemarouter │ ├─────────────┼────────────────────────────────────────────────┤ │ Type │ Router │ ├─────────────┼────────────────────────────────────────────────┤ │ Version │ V1.0.0 │ ├─────────────┼────────────────────────────────────────────────┤ │ Maturity │ Beta │ ├─────────────┼────────────────────────────────────────────────┤ │ Description │ A database sharding router for simple sharding │ ├─────────────┼────────────────────────────────────────────────┤ │ ...

The target topology should look as follows: Each customer (tenant) lives in its own database (= schema). The databases are distributed over several MariaDB instances (shards). So that the application can access the database transparently, a pair of MaxScale load balancers is placed in front of the shards; it knows where each customer is located and routes the traffic to the corresponding shard. To make the MaxScale load balancers highly available, a virtual IP (VIP) is placed in front of them, e.g. using Keepalived. Anyone who finds this too simple can additionally build each individual shard as a master/slave or Galera cluster construct...

Preparing the shards (MariaDB database instances)

As a first step in this PoC we had problems with the test database. After dropping the test database on all shards the problem disappeared. Alternatively you can run mariadb-secure-installation, which you should do on production systems anyway, or you can use the MaxScale configuration parameters ignore_tables or ignore_tables_regex to allow tables with the same name in different shards (MXS-5027). A sketch for dropping the test database on all shards follows below.

See also: MaxScale Router Parameters.
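Since the shards know nothing about each other, the test database has to be dropped on every shard individually. A minimal sketch for this, assuming the shard ports of this PoC and a root account that can log in without a password (both assumptions):

shell> for PORT in 3363 3364 3365; do
         mariadb --user=root --host=10.139.158.1 --port=${PORT} --execute='DROP DATABASE IF EXISTS test'
       done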

Creating test data

To have something to play with, we created some test data:

-- On shard 1: 2 customers
SQL> CREATE DATABASE customer_0010;
SQL> CREATE TABLE customer_0010.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0010.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0010.address VALUES (1, 'Customer 10 GmbH');
SQL> INSERT INTO customer_0010.sales VALUES (1, 'Apples', 5, 1.2, 6), (2, 'Pears', 2, 0.9, 1.8), (3, 'Bread', 1, 2.5, 2.5);
SQL> CREATE DATABASE customer_0011;
SQL> CREATE TABLE customer_0011.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0011.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0011.address VALUES (1, 'Customer 11 SE');
SQL> INSERT INTO customer_0011.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salat', 5, 1.2, 6);

-- On shard 2: 3 customers
SQL> CREATE DATABASE customer_0020;
SQL> CREATE TABLE customer_0020.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0020.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0020.address VALUES (1, 'Customer 20 AG');
SQL> INSERT INTO customer_0020.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salat', 5, 1.2, 6);
SQL> CREATE DATABASE customer_0021;
SQL> CREATE TABLE customer_0021.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0021.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0021.address VALUES (1, 'Customer 21 GmbH');
SQL> INSERT INTO customer_0021.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salat', 5, 1.2, 6);
SQL> CREATE DATABASE customer_0022;
SQL> CREATE TABLE customer_0022.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0022.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0022.address VALUES (1, 'Customer 22 Gebr.');
SQL> INSERT INTO customer_0022.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salat', 5, 1.2, 6);

-- On shard 3: 1 customer
SQL> CREATE DATABASE customer_0030;
SQL> CREATE TABLE customer_0030.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0030.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0030.address VALUES (1, 'Customer 30 GmbH');
SQL> INSERT INTO customer_0030.sales VALUES (1, 'Pickles', 2, 2.2, 4.4), (2, 'Salat', 1, 3.1, 3.1), (3, 'Pudding', 5, 2.2, 11.0), (4, 'Asparagus', 12, .3, 3.6);
Creating roles and users

Since in a sharded system, in contrast to e.g. a Galera cluster, the individual database instances do not know anything about each other and do not communicate with each other, we have to create the roles and users (accounts) separately on EACH shard.

MariaDB MaxScale needs one user each for the SchemaRouter service and the monitor (on every shard).

The monitor user is, as the name suggests, responsible for monitoring; the SchemaRouter service user collects the user account information from the sharding backends and forwards the queries to the correct shard.

Since a redundant system typically runs with at least two MaxScale routers, and since we wanted to prevent the account privileges from drifting apart, we work with roles, both for the MaxScale users and for the application users.

MaxScale Monitor User SQL> CREATE ROLE maxscale_monitor_role; SQL> GRANT SELECT ON mysql.user TO 'maxscale_monitor_role'; SQL> GRANT REPLICATION CLIENT ON *.* TO 'maxscale_monitor_role'; SQL> GRANT SLAVE MONITOR ON *.* TO 'maxscale_monitor_role'; SQL> GRANT FILE ON *.* TO 'maxscale_monitor_role'; SQL> GRANT CONNECTION ADMIN ON *.* TO 'maxscale_monitor_role'; SQL> SHOW GRANTS FOR maxscale_monitor_role; +-----------------------------------------------------------------------------------------------+ | Grants for maxscale_monitor_role | +-----------------------------------------------------------------------------------------------+ | GRANT FILE, BINLOG MONITOR, CONNECTION ADMIN, SLAVE MONITOR ON *.* TO `maxscale_monitor_role` | | GRANT SELECT ON `mysql`.`user` TO `maxscale_monitor_role` | +-----------------------------------------------------------------------------------------------+ SQL> CREATE USER maxscale_monitor@'10.139.158.210' IDENTIFIED BY 'secret'; SQL> CREATE USER maxscale_monitor@'10.139.158.211' IDENTIFIED BY 'secret'; SQL> GRANT maxscale_monitor_role TO maxscale_monitor@'10.139.158.210'; SQL> GRANT maxscale_monitor_role TO maxscale_monitor@'10.139.158.211'; SQL> SET DEFAULT ROLE maxscale_monitor_role FOR maxscale_monitor@'10.139.158.210'; SQL> SET DEFAULT ROLE maxscale_monitor_role FOR maxscale_monitor@'10.139.158.211'; SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'maxscale_monitor%'; +-----------------------+----------------+---------+-----------------------+ | User | Host | is_role | default_role | +-----------------------+----------------+---------+-----------------------+ | maxscale_monitor_role | | Y | | | maxscale_monitor | 10.139.158.210 | N | maxscale_monitor_role | | maxscale_monitor | 10.139.158.211 | N | maxscale_monitor_role | +-----------------------+----------------+---------+-----------------------+ SQL> SHOW GRANTS FOR maxscale_monitor@'10.139.158.211'; +------------------------------------------------------------------------------------------------------------------------------+ | Grants for maxscale_monitor@10.139.158.211 | +------------------------------------------------------------------------------------------------------------------------------+ | GRANT `maxscale_monitor_role` TO `maxscale_monitor`@`10.139.158.211` | | GRANT USAGE ON *.* TO `maxscale_monitor`@`10.139.158.211` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7' | | SET DEFAULT ROLE `maxscale_monitor_role` FOR `maxscale_monitor`@`10.139.158.211` | +------------------------------------------------------------------------------------------------------------------------------+
MaxScale Admin User SQL> CREATE ROLE maxscale_admin_role; SQL> GRANT SHOW DATABASES ON *.* TO 'maxscale_admin_role'; SQL> GRANT SELECT ON mysql.user TO 'maxscale_admin_role'; SQL> GRANT SELECT ON mysql.db TO 'maxscale_admin_role'; SQL> GRANT SELECT ON mysql.tables_priv TO 'maxscale_admin_role'; SQL> GRANT SELECT ON mysql.columns_priv TO 'maxscale_admin_role'; SQL> GRANT SELECT ON mysql.proxies_priv TO 'maxscale_admin_role'; SQL> GRANT SELECT ON mysql.roles_mapping TO 'maxscale_admin_role'; SQL> GRANT SELECT ON mysql.procs_priv TO 'maxscale_admin_role'; SQL> SHOW GRANTS FOR maxscale_admin_role; +------------------------------------------------------------------+ | Grants for maxscale_admin_role | +------------------------------------------------------------------+ | GRANT USAGE ON *.* TO `maxscale_admin_role` | | GRANT SELECT ON `mysql`.`user` TO `maxscale_admin_role` | | GRANT SELECT ON `mysql`.`roles_mapping` TO `maxscale_admin_role` | | GRANT SELECT ON `mysql`.`tables_priv` TO `maxscale_admin_role` | | GRANT SELECT ON `mysql`.`procs_priv` TO `maxscale_admin_role` | | GRANT SELECT ON `mysql`.`db` TO `maxscale_admin_role` | | GRANT SELECT ON `mysql`.`columns_priv` TO `maxscale_admin_role` | | GRANT SELECT ON `mysql`.`proxies_priv` TO `maxscale_admin_role` | +------------------------------------------------------------------+ SQL> CREATE USER maxscale_admin@'10.139.158.210' IDENTIFIED BY 'secret'; SQL> CREATE USER maxscale_admin@'10.139.158.211' IDENTIFIED BY 'secret'; SQL> GRANT maxscale_admin_role TO maxscale_admin@'10.139.158.210'; SQL> GRANT maxscale_admin_role TO maxscale_admin@'10.139.158.211'; SQL> SET DEFAULT ROLE maxscale_admin_role FOR maxscale_admin@'10.139.158.210'; SQL> SET DEFAULT ROLE maxscale_admin_role FOR maxscale_admin@'10.139.158.211'; SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'maxscale_admin%'; +---------------------+----------------+---------+---------------------+ | User | Host | is_role | default_role | +---------------------+----------------+---------+---------------------+ | maxscale_admin_role | | Y | | | maxscale_admin | 10.139.158.210 | N | maxscale_admin_role | | maxscale_admin | 10.139.158.211 | N | maxscale_admin_role | +---------------------+----------------+---------+---------------------+ SQL> SHOW GRANTS FOR maxscale_admin@'10.139.158.211'; +----------------------------------------------------------------------------------------------------------------------------+ | Grants for maxscale_admin@10.139.158.211 | +----------------------------------------------------------------------------------------------------------------------------+ | GRANT `maxscale_admin_role` TO `maxscale_admin`@`10.139.158.211` | | GRANT USAGE ON *.* TO `maxscale_admin`@`10.139.158.211` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7' | | SET DEFAULT ROLE `maxscale_admin_role` FOR `maxscale_admin`@`10.139.158.211` | +----------------------------------------------------------------------------------------------------------------------------+

See also: SchemaRouter Configuration

Creating the application role and accounts

The application also needs a user, which we create on every shard as follows:

SQL> CREATE ROLE app_role; SQL> GRANT SELECT, INSERT, UPDATE, DELETE ON `customer_%`.* TO 'app_role'; SQL> GRANT SHOW DATABASES ON *.* TO 'app_role'; SQL> GRANT CREATE, DROP, ALTER ON *.* TO 'app_role'; -- For creating new tenant databases SQL> SHOW GRANTS FOR app_role; +----------------------------------------------------------------------+ | Grants for app_role | +----------------------------------------------------------------------+ | GRANT SHOW DATABASES ON *.* TO `app_role` | | GRANT SELECT, INSERT, UPDATE, DELETE ON `customer_%`.* TO `app_role` | +----------------------------------------------------------------------+ SQL> CREATE USER app@'10.139.158.%' IDENTIFIED BY 'secret'; SQL> GRANT app_role TO app@'10.139.158.%'; SQL> SET DEFAULT ROLE app_role FOR app@'10.139.158.%'; SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'app%'; +----------+--------------+---------+--------------+ | User | Host | is_role | default_role | +----------+--------------+---------+--------------+ | app_role | | Y | | | app | 10.139.158.% | N | app_role | +----------+--------------+---------+--------------+ SQL> SHOW GRANTS FOR app@'10.139.158.%'; +---------------------------------------------------------------------------------------------------------------+ | Grants for app@10.139.158.% | +---------------------------------------------------------------------------------------------------------------+ | GRANT `app_role` TO `app`@`10.139.158.%` | | GRANT USAGE ON *.* TO `app`@`10.139.158.%` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7' | | SET DEFAULT ROLE `app_role` FOR `app`@`10.139.158.%` | +---------------------------------------------------------------------------------------------------------------+
Proxy protocol

Load balancers and proxies have the property that they replace the clients' IP addresses with their own. On the one hand this means that on the database you can no longer see where the client originally came from; on the other hand access privileges can no longer be granted per user and IP, since the check is always done against the IP of the load balancer.

Both problems can be solved using the proxy protocol.

For this, the proxy protocol must be activated on both the database and the load balancer, in this case MaxScale.

On the database side, the proxy protocol is activated as follows:

# # my.cnf # [mariadbd] proxy_protocol_networks = ::1, 10.139.158.0/24, localhost

and on the MaxScale side with:

# # /etc/maxscale.cnf # [shard<n>] type = server proxy_protocol = true

Both settings can be verified with:

SQL> SHOW GLOBAL VARIABLES LIKE 'proxy%'; +-------------------------+---------------------------------+ | Variable_name | Value | +-------------------------+---------------------------------+ | proxy_protocol_networks | ::1, 10.139.158.0/24, localhost | +-------------------------+---------------------------------+ shell> maxctrl show server shard1 | grep proxy │ │ "proxy_protocol": true, │

Sources:


MaxScale SchemaRouter configuration

Next, we prepare the MaxScale configuration for sharding. The file recommended by MariaDB is /etc/maxscale.cnf. Whether it makes more sense to create a separate configuration file under /etc/maxscale.cnf.d/ or even to configure the whole of MaxScale dynamically (/var/lib/maxscale/maxscale.cnf.d/*.cnf) will become clear in the long run. See also the warnings further below. The configuration file for this sharding PoC looks as follows:

# # /etc/maxscale.cnf # [maxscale] threads = auto admin_gui = false [shard1] type = server address = 10.139.158.1 port = 3363 proxy_protocol = true [shard2] type=server address=10.139.158.1 port=3364 proxy_protocol = true [shard3] type = server address = 10.139.158.1 port = 3365 proxy_protocol = true [Sharding-Monitor] type = monitor module = galeramon servers = shard1,shard2,shard3 user = maxscale_monitor password = secret monitor_interval = 1s [Sharded-Service-Listener] type = listener service = Sharded-Service protocol = MariaDBClient port = 3306 [Sharded-Service] type = service router = schemarouter servers = shard1,shard2,shard3 user = maxscale_admin password = secret auth_all_servers = true

Note: Recommendation of the MaxScale developer: "One workaround might be to actually use galeramon to monitor the nodes instead of mariadbmon."

Starting and stopping the MaxScale load balancer

MaxScale is started and stopped as usual via systemd:

shell> systemctl restart maxscale shell> systemctl status maxscale ● maxscale.service - MariaDB MaxScale Database Proxy Loaded: loaded (/lib/systemd/system/maxscale.service; enabled; vendor preset: enabled) Drop-In: /run/systemd/system/service.d └─zzz-lxc-service.conf Active: active (running) since Tue 2024-02-27 09:52:57 UTC; 39s ago Process: 187 ExecStart=/usr/bin/maxscale (code=exited, status=0/SUCCESS) Main PID: 188 (maxscale) Tasks: 10 (limit: 18663) Memory: 4.6M CPU: 150ms CGroup: /system.slice/maxscale.service └─188 /usr/bin/maxscale systemd[1]: Starting MariaDB MaxScale Database Proxy... maxscale[188]: Module 'galeramon' loaded from '/usr/lib/x86_64-linux-gnu/maxscale/libgaleramon.so'. maxscale[188]: Module 'schemarouter' loaded from '/usr/lib/x86_64-linux-gnu/maxscale/libschemarouter.so'. maxscale[188]: Using up to 2.3GiB of memory for query classifier cache systemd[1]: Started MariaDB MaxScale Database Proxy.

If there were errors or warnings, they can be seen in the MaxScale error log:

shell> grep -v notice /var/log/maxscale/maxscale.log 2024-02-13 16:47:22 MariaDB MaxScale is shut down. ---------------------------------------------------- MariaDB MaxScale /var/log/maxscale/maxscale.log Tue Feb 13 16:47:22 2024 ---------------------------------------------------------------------------- 2024-02-27 09:52:56 warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. File is for module 'mariadbmon'. Current module is 'galeramon'. 2024-02-27 09:52:56 warning: [galeramon] Invalid 'wsrep_local_index' on server 'shard1': 18446744073709551615
Application tests
Simple application tests
shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='show databases'
+--------------------+
| Database           |
+--------------------+
| customer_0010      |
| customer_0011      |
| customer_0020      |
| customer_0021      |
| customer_0022      |
| customer_0030      |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
New command show shards
shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0030 --execute='show shards' | grep customer_00.* | sort | column -t
customer_0010.address  shard1
customer_0010.sales    shard1
customer_0010.         shard1
customer_0011.address  shard1
customer_0011.sales    shard1
customer_0011.         shard1
customer_0020.address  shard2
customer_0020.sales    shard2
customer_0020.         shard2
customer_0021.address  shard2
customer_0021.sales    shard2
customer_0021.         shard2
customer_0022.address  shard2
customer_0022.sales    shard2
customer_0022.         shard2
customer_0030.address  shard3
customer_0030.sales    shard3
customer_0030.         shard3

New databases are not displayed immediately, but only once the cached data has been updated (refresh_interval, 300 s / 5 min by default).
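If you do not want to wait for the cache entries to expire, the database map cache can also be invalidated manually; this is covered in more detail further below:

shell> maxctrl call command schemarouter invalidate Sharded-Service
OK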

See also: Custom SQL commands

More general tests

As a reminder:

Shard  Port  Customer         State
#1     3363  customer_001<n>  Running
#2     3364  customer_002<n>  Running
#3     3365  customer_003<n>  Running
#4     3366  customer_004<n>  Running
shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='SELECT @@port' +--------+ | @@port | +--------+ | 3363 | +--------+ shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --database=customer_0010 --execute='SELECT @@port' +--------+ | @@port | +--------+ | 3363 | +--------+ shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --database=customer_0020 --execute='SELECT @@port' +--------+ | @@port | +--------+ | 3364 | +--------+ shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='use customer_0020; SELECT @@port' +--------+ | @@port | +--------+ | 3364 | +--------+ shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0010 --execute='SELECT @@port' +--------+ | @@port | +--------+ | 3363 | +--------+ shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0020 --execute='SELECT @@port' +--------+ | @@port | +--------+ | 3364 | +--------+ shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0030 --execute='SELECT @@port' +--------+ | @@port | +--------+ | 3365 | +--------+
Less simple (backup) tests
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0010 > /tmp/customer_0010.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0020 > /tmp/customer_0020.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0030 > /tmp/customer_0030.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0011 > /tmp/customer_0011.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0021 > /tmp/customer_0021.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0030 > /tmp/customer_0030.sql
shell> echo $?
0
shell> ll /tmp/customer_00*sql
-rw-rw-r-- 1 oli oli 2738 Mar 18 12:07 /tmp/customer_0010.sql
-rw-rw-r-- 1 oli oli 2904 Mar 18 12:08 /tmp/customer_0011.sql
-rw-rw-r-- 1 oli oli 2712 Mar 18 12:08 /tmp/customer_0020.sql
-rw-rw-r-- 1 oli oli 2906 Mar 18 12:08 /tmp/customer_0021.sql
-rw-rw-r-- 1 oli oli 2964 Mar 18 12:08 /tmp/customer_0030.sql
shell> tail -n 1 /tmp/customer_*.sql
==> /tmp/customer_0010.sql <==
-- Dump completed on 2024-02-13 14:39:21
==> /tmp/customer_0011.sql <==
-- Dump completed on 2024-02-13 14:39:35
==> /tmp/customer_0020.sql <==
-- Dump completed on 2024-02-13 14:40:15
==> /tmp/customer_0021.sql <==
-- Dump completed on 2024-02-13 14:40:42
==> /tmp/customer_0030.sql <==
-- Dump completed on 2024-02-13 14:40:52
shell> cat /tmp/customer_00*sql | grep -A1 -i insert
INSERT INTO `address` VALUES (1,'Customer 10 GmbH');
--
INSERT INTO `sales` VALUES (1,'Apples',5,1.20,6.00),
--
INSERT INTO `address` VALUES (1,'Customer 11 SE');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 20 AG');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 21 GmbH');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 30 GmbH');
--
INSERT INTO `sales` VALUES (1,'Pickles',2,2.20,4.40),

MaxScale 23.08.4 contained a rather nasty bug: a return value of 0 but no data in the backup! See also the tickets MXS-4966: mariadb-dump gets an error dumping schemas and MXS-4947: Tables in information_schema are treated as a normal tables. The symptoms of the bug look as follows:

Error: Couldn't read status information for table address () Error: Couldn't read status information for table sales ()

An upgrade to MaxScale 23.08.5 is therefore strongly recommended.

More complex application tests

We also created a somewhat more complex test (./sharding_test.php) which processes the following queries:

SET NAMES utf8mb4 SHOW DATABASES use customer_<nnnn> START TRANSACTION; SELECT MIN(id) AS first, MAX(id) AS last FROM `sales` INSERT INTO sales (id, product, sales, amount, total_amount) VALUES (%d, '%s', %f, %f, %f) INSERT INTO sales (id, product, sales, amount, total_amount) VALUES (%d, '%s', %f, %f, %f) UPDATE sales SET product = 'Prepare to delete' WHERE id = %d DELETE FROM sales WHERE id = %d COMMIT
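For manual experiments, such a round trip can also be replayed directly through the MaxScale listener. A minimal sketch, assuming tenant customer_0021; the id value 999001 is made up:

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0021 <<_EOF
SET NAMES utf8mb4;
START TRANSACTION;
SELECT MIN(id) AS first, MAX(id) AS last FROM sales;
INSERT INTO sales (id, product, sales, amount, total_amount) VALUES (999001, 'Bananas', 3, 1.10, 3.30);
UPDATE sales SET product = 'Prepare to delete' WHERE id = 999001;
DELETE FROM sales WHERE id = 999001;
COMMIT;
_EOF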

The PHP test ran through flawlessly. The corresponding control query:

SQL> SELECT * FROM customer_0021.sales WHERE id >= (SELECT MAX(id) - 10 FROM customer_0021.sales);

Various load scenarios can additionally be tested with db_bench or the Acronis perfkit. For further information see here.

Cross-shard tests

One might possibly get the idea of running cross-shard queries. This will NOT work, which should not really come as a surprise, since firstly it is not easy to implement and secondly it is documented here:

"Note: As the sharding solution in MaxScale is relatively simple, cross-database queries between two or more shards are not supported."

Source: Simple Sharding with Two Servers

and

"USE db1 is routed to the server with db1. If the database is divided to multiple servers, only one server will get the command."

Source: SchemaRouter

Here a test with UNION:

SQL> use customer_0030 Database changed SQL> SELECT * FROM customer_0020.sales UNION SELECT * FROM customer_0030.sales; ERROR 1146 (42S02): Table 'customer_0020.sales' doesn't exist

And here the counter-proof:

SQL> use customer_0020 Database changed SQL> SELECT * FROM customer_0020.sales UNION SELECT * FROM customer_0030.sales; ERROR 1146 (42S02): Table 'customer_0030.sales' doesn't exist

And here the test with a JOIN:

SQL> use customer_0020 SQL> SELECT * FROM customer_0020.sales a JOIN customer_0030.sales b ON a.id = b.id WHERE a.sales > 1 ; ERROR 1146 (42S02): Table 'customer_0030.sales' doesn't exist SQL> use customer_0030 SQL> SELECT * FROM customer_0020.sales a JOIN customer_0030.sales b ON a.id = b.id WHERE a.sales > 1 ; ERROR 1146 (42S02): Table 'customer_0020.sales' doesn't exist
Operating a MaxScale sharding system

In this chapter we discuss some points that can be useful for operating a MariaDB MaxScale sharding system.

Do-on-all-shards

Since it can happen again and again that O/S or database operations have to be executed on all shards, it certainly makes sense to create a script that executes the same command on all shards, one after the other:

shell> ./do-on-all-shards.sh --sql='SHOW DATABASES'

A script of this kind should greatly reduce the error rate in operations. Operations such as re-sharding a tenant, as described further below, are also best scripted and executed centrally (MXS-5029). A minimal sketch of such a script follows below.
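A minimal sketch, assuming the shard ports and the app account of this PoC; error handling and O/S commands are left out:

#!/bin/bash
#
# do-on-all-shards.sh - minimal sketch, not production-ready
#

# Extract the SQL statement from the --sql=... argument
SQL=''
for arg in "$@"; do
  case "${arg}" in
    --sql=*) SQL="${arg#--sql=}" ;;
  esac
done

# Run the same statement on every shard, one after the other
for PORT in 3363 3364 3365 3366; do
  echo "=== Shard on port ${PORT} ==="
  mariadb --user=app --password=secret --host=10.139.158.1 --port=${PORT} --execute="${SQL}"
done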

Invalidating the database map cache

With the invalidate command you can invalidate the database map cache of the MariaDB MaxScale SchemaRouter. This gives us the possibility to quickly bring the cache up to date again after adding or removing tenants:

shell> maxctrl call command schemarouter invalidate Sharded-Service OK

In contrast to the invalidate command, where the entries are brought up to date after the next refresh_interval, the clear command deletes the entries and a remap is executed immediately.
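The clear command is called in the same way:

shell> maxctrl call command schemarouter clear Sharded-Service
OK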

If you want to clear the database map cache remotely with a REST API call, this works as follows:

shell> curl -i -X POST -u api_admin:secret http://10.139.158.211:8989/v1/maxscale/modules/schemarouter/clear?Sharded-Service HTTP/1.1 204 No Content Connection: close Date: Mon, 18 Mar 24 11:49:58 GMT X-Frame-Options: Deny X-XSS-Protection: 1 Referrer-Policy: same-origin Cache-Control: no-cache
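The invalidate module command should be reachable via the REST API in the same way (an assumption, derived from the self links in the module output above; not verified here):

shell> curl -i -X POST -u api_admin:secret http://10.139.158.211:8989/v1/maxscale/modules/schemarouter/invalidate?Sharded-Service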

Sources:


How to change SchemaRouter variables dynamically?

The refresh_interval specifies the lifetime of the entries in the SchemaRouter database map cache. The default value is 300 s (5 min). Refresh interval is therefore, in my opinion, an unfortunate term, since it does not define the interval between two mappings but the lifetime of the cache entries (lifetime? timeout?). As soon as an entry has been deleted, a new refresh of the "database map" is triggered on every shard. The query currently looks as follows:

SELECT LOWER(t.table_schema), LOWER(t.table_name) FROM information_schema.tables t UNION ALL SELECT LOWER(s.schema_name), '' FROM information_schema.schemata s

As it looks, a simple connect is enough to trigger the refresh of the database map.

The current value of refresh_interval can be queried as follows:

shell> maxctrl show service Sharded-Service | grep refresh_interval | awk -F'│' '{ print $3 }' "refresh_interval": "300000ms",

The following command helps to change the value dynamically:

shell> MAXCTRL_WARNINGS=0 maxctrl alter service Sharded-Service refresh_interval=10s OK

The value should not be set too small, since all other connections are paused during the mapping process.

Sources:


Adding and removing a tenant

Adding a new tenant to a shard is not a big problem:

SQL> CREATE DATABASE customer_0029; SQL> use customer_0029 SQL> CREATE TABLE address LIKE customer_template.address; SQL> CREATE TABLE sales LIKE customer_template.sales; shell> maxctrl call command schemarouter invalidate Sharded-Service OK
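The customer_template database is assumed here to already exist on every shard. A sketch of how it could be created, mirroring the tenant tables from the test data above:

SQL> CREATE DATABASE customer_template;
SQL> CREATE TABLE customer_template.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_template.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));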

Removing a tenant from a shard, on the other hand, is somewhat more complicated and must be coordinated with the application:

SQL> DROP DATABASE customer_0011; shell> ./sharding_test.php .....ERROR: Table 'customer_0011.sales' doesn't exist...ERROR: Unknown database 'customer_0011'.ERROR: Unknown database 'customer_0011'......ERROR: Unknown database 'customer_0011'... shell> maxctrl call command schemarouter clear Sharded-Service OK

At least I have not yet come up with a smarter variant. See also Moving a tenant below.

Moving a tenant

The combination of adding and removing is then moving a tenant from one shard to another shard, also called re-sharding. Here, too, a concerted action must be planned together with the application.

If this is not possible, at least the time during which the application receives errors can be reduced... The following procedure can be used to move a tenant from shard 2 to shard 3:

SQL> use customer_0020; LOCK TABLES address READ, sales READ;   -- On shard 2; the application may be blocked!
shell> mariadb-dump --user=app --password=secret --host=10.139.158.1 --port=3364 --single-transaction --skip-add-locks --databases customer_0020 | mariadb --user=app --password=secret --host=10.139.158.1 --port=3365   # Copy tenant 20 from shard 2 to shard 3
SQL> DROP DATABASE customer_0020;   -- Dropping tenant 20 does not work!
ERROR 1192 (HY000): Can't execute the given command because you have active locked tables or an active transaction
SQL> UNLOCK TABLES; DROP DATABASE customer_0020;   # This is how dropping tenant 20 works.
shell> maxctrl call command schemarouter clear Sharded-Service   # Update the MaxScale database map. Be quick!!!
Until the database map has been refreshed, the following errors can occur:
error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.address' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.sales' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); Duplicate tables found, closing session.

And on the application side to:

ERROR: Error: duplicate tables found on two different shards
Adding or removing a shard

Moving a tenant from one shard to another shard is the small re-sharding. It becomes somewhat more complex when you want to add new shards or remove old shards. Afterwards (after adding or before removing), a big re-sharding would take place. First, extending the cluster by one shard:

shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard1 │ 10.139.158.1 │ 3363 │ 0 │ Running │ 0-3363-26014 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 0 │ Running │ 0-3364-240612 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 0 │ Running │ 0-3365-289873 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The prepared shard is made known to MaxScale:

shell> maxctrl create server shard4 10.139.158.1 3366 OK shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard1 │ 10.139.158.1 │ 3363 │ 1 │ Running │ 0-3363-23676 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 1 │ Running │ 0-3364-52321 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 1 │ Running │ 0-3365-39751 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 0 │ Down │ │ │ └────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Then the new shard is linked to the MaxScale monitor and the service:

shell> MAXCTRL_WARNINGS=0 maxctrl link monitor Sharding-Monitor shard4 OK shell> MAXCTRL_WARNINGS=0 maxctrl link service Sharded-Service shard4 OK shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard1 │ 10.139.158.1 │ 3363 │ 1 │ Running │ 0-3363-24961 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 1 │ Running │ 0-3364-56215 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 1 │ Running │ 0-3365-45177 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 1 │ Running │ 0-3366-32 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Whether this second step is also strictly necessary was not investigated.

The whole process can be followed in the MariaDB MaxScale error log:

warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. Servers described in the journal are different from the ones configured on the current monitor. warning: Saving runtime modifications to 'Sharding-Monitor' in '/var/lib/maxscale/maxscale.cnf.d/Sharding-Monitor.cnf'. The modified values will override the values found in the static configuration files. notice : shard4 sent version string '10.11.7-MariaDB-log'. Detected type: MariaDB, version: 10.11.7. notice : Server 'shard4' charset: latin1_swedish_ci notice : Server changed state: shard4[10.139.158.1:3366]: server_up. [Down] -> [Running] warning: Saving runtime modifications to 'Sharded-Service' in '/var/lib/maxscale/maxscale.cnf.d/Sharded-Service.cnf'. The modified values will override the values found in the static configuration files. notice : Added 'shard4' to 'Sharded-Service'

What we must not forget here is to also enable the proxy protocol on the new shard:

shell> maxctrl show server shard4 | grep proxy │ │ "proxy_protocol": false, │ MAXCTRL_WARNINGS=0 maxctrl alter server shard4 proxy_protocol=true OK

And now new tenants can be added on the new shard or old tenants can be moved to the new shard... In our set-up we want to move all tenants from shard 1 to shard 4 and additionally create a new tenant customer_0040 on shard 4. The individual steps required for this are listed above; a sketch combining them follows below.
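A sketch of these steps combined, assuming the two tenants still on shard 1 and omitting the LOCK TABLES step shown above:

shell> for db in customer_0010 customer_0011; do
         mariadb-dump --user=app --password=secret --host=10.139.158.1 --port=3363 --single-transaction --databases ${db} | mariadb --user=app --password=secret --host=10.139.158.1 --port=3366
         mariadb --user=app --password=secret --host=10.139.158.1 --port=3363 --execute="DROP DATABASE ${db}"
       done
shell> mariadb --user=app --password=secret --host=10.139.158.1 --port=3366 --execute='CREATE DATABASE customer_0040'
shell> maxctrl call command schemarouter clear Sharded-Service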

After shard 1 has been emptied, it can be dismantled:

shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard1 │ 10.139.158.1 │ 3363 │ 1 │ Running │ 0-3363-25916 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 1 │ Running │ 0-3364-62887 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 1 │ Running │ 0-3365-54035 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 1 │ Running │ 0-3366-2247 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

A shard is deleted with the destroy server command. But before this works, the shard must be removed from the monitor and the service:

shell> MAXCTRL_WARNINGS=0 maxctrl unlink service Sharded-Service shard1 OK shell> MAXCTRL_WARNINGS=0 maxctrl unlink monitor Sharding-Monitor shard1 OK shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard1 │ 10.139.158.1 │ 3363 │ 0 │ Running │ │ │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 1 │ Running │ 0-3364-64394 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 1 │ Running │ 0-3365-56072 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 1 │ Running │ 0-3366-3267 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Once the shard has been detached from the monitor and the service, it can then be deleted:

shell> maxctrl destroy server shard1 Warning: Object 'shard1' is defined in a static configuration file and cannot be permanently deleted. If MaxScale is restarted, the object will appear again. To hide these warnings, run: export MAXCTRL_WARNINGS=0 OK shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 1 │ Running │ 0-3364-65018 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 1 │ Running │ 0-3365-56886 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 1 │ Running │ 0-3366-3648 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

And in the MaxScale error log you can nicely follow the changes:

notice : Removed 'shard1' from 'Sharded-Service' warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. Servers described in the journal are different from the ones configured on the current monitor. notice : Destroyed server 'shard1' at 10.139.158.1:3363

Important: I was informed that with destroy server --force the unlink service and unlink monitor commands are executed automatically by MaxScale.
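If that is correct, the whole teardown should presumably be reducible to a single command (not verified in this PoC):

shell> maxctrl destroy server shard1 --force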

Source:


Adjusting the configuration files

During the shard operations described above we received some warnings:

Warning: Object 'shard1' is defined in a static configuration file and cannot be permanently deleted. If MaxScale is restarted, the object will appear again.

and

Warning: Saving runtime modifications to 'Sharding-Monitor' in '/var/lib/maxscale/maxscale.cnf.d/Sharding-Monitor.cnf'. The modified values will override the values found in the static configuration files.

The corresponding configuration files are created automatically by MaxScale on dynamic system changes:

shell> ll /var/lib/maxscale/maxscale.cnf.d/ /etc/maxscale.cnf -rw-r--r-- 1 root root 612 Feb 13 14:23 /etc/maxscale.cnf /var/lib/maxscale/maxscale.cnf.d/: total 12 -rw------- 1 maxscale maxscale 187 Feb 13 16:08 Sharding-Monitor.cnf -rw------- 1 maxscale maxscale 150 Feb 13 16:07 Sharded-Service.cnf -rw------- 1 maxscale maxscale 52 Feb 13 15:46 shard4.cnf cat /var/lib/maxscale/maxscale.cnf.d/* [Sharded-Service] debug=true refresh_interval=10000ms auth_all_servers=true log_debug=true password=secret router=schemarouter type=service user=maxscale_admin targets=shard2,shard3,shard4 [Sharding-Monitor] module=galeramon monitor_interval=1000ms password=secret servers=shard2,shard3,shard4 type=monitor user=maxscale_monitor [shard4] address=10.139.158.1 port=3366 type=server

So corresponding touch-ups to the configuration files are still needed. In general, one should consider whether, for a highly dynamic system, everything should not be configured dynamically via commands...

Maintenance work on a shard

If a shard is to be taken offline for maintenance work, in this example shard2, this can be accomplished as follows:

shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 1 │ Running │ 0-3364-69817 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 1 │ Running │ 0-3365-63166 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 1 │ Running │ 0-3366-6902 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘ shell> maxctrl set server shard2 drain OK shell> maxctrl set server shard2 maintenance OK shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬──────────────────────┬───────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 0 │ Maintenance, Running │ 0-3364-240612 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 0 │ Running │ 0-3365-289873 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 0 │ Running │ 0-3366-119848 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴──────────────────────┴───────────────┴──────────────────┘

At this point the maintenance work on the machine or the database can be carried out...

Afterwards, BOTH states must be cleared again, provided both were set (MXS-5028):

shell> maxctrl clear server shard2 maintenance OK shell> maxctrl clear server shard2 drain OK maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 0 │ Running │ 0-3364-240612 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 0 │ Running │ 0-3365-289873 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 0 │ Running │ 0-3366-119848 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The difference between drain and maintenance is that with drain no new connections to the shard are accepted, but existing connections are waited for until they are closed. With maintenance, the connections are forcibly terminated immediately.

Observing a MariaDB MaxScale sharding system

You can query the state of the MariaDB MaxScale load balancer with the MaxScale CLI client maxctrl. There are numerous commands for this, mainly list and show:

shell> maxctrl show module schemarouter | head -n 12 ┌─────────────┬────────────────────────────────────────────────┐ │ Module │ schemarouter │ ├─────────────┼────────────────────────────────────────────────┤ │ Type │ Router │ ├─────────────┼────────────────────────────────────────────────┤ │ Version │ V1.0.0 │ ├─────────────┼────────────────────────────────────────────────┤ │ Maturity │ Beta │ ├─────────────┼────────────────────────────────────────────────┤ │ Description │ A database sharding router for simple sharding │ ├─────────────┼────────────────────────────────────────────────┤ │ Parameters │ ... │ shell> maxctrl list servers ┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard2 │ 10.139.158.1 │ 3364 │ 4 │ Running │ 0-3364-290859 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard3 │ 10.139.158.1 │ 3365 │ 4 │ Running │ 0-3365-322671 │ Sharding-Monitor │ ├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤ │ shard4 │ 10.139.158.1 │ 3366 │ 4 │ Running │ 0-3366-140018 │ Sharding-Monitor │ └────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The information in the Connections column is confusing, since in this case we only have 1, 1 and 2 connections open on the respective shards of this sharding system.

But if you look at the situation on the respective shard with SHOW PROCESSLIST, you can see that MaxScale opens a connection to EVERY shard for each incoming connection. So the display above is actually technically correct, just not what one would expect:

SQL> SHOW PROCESSLIST; +--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+ | 123 | root | localhost | customer_0021 | Query | 0 | starting | show processlist | 0.000 | | 68107 | maxscale_monitor | 10.139.158.211:35418 | NULL | Sleep | 0 | | NULL | 0.000 | | 113372 | app | 10.139.158.1:47548 | NULL | Sleep | 47 | | NULL | 0.000 | | 113538 | app | 10.139.158.1:49058 | NULL | Sleep | 41 | | NULL | 0.000 | | 113662 | app | 10.139.158.1:47072 | NULL | Sleep | 37 | | NULL | 0.000 | | 114789 | app | 10.139.158.1:39574 | customer_0022 | Query | 0 | Updating | UPDATE sales SET product = 'Prepare to delete' WHERE id = 15622 | 0.000 | +--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+

This does not scale for large systems with hundreds or thousands of tenants! Possibly the MariaDB thread pool feature could help in such a case; a sketch follows below.

According to the MaxScale developer, this is expected behaviour... (MXS-4977)
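A sketch of how the thread pool could be enabled on every shard (an assumption; this was not tested in this PoC):

#
# my.cnf
#
[mariadbd]
thread_handling = pool-of-threads
# thread_pool_size defaults to the number of CPU cores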

shell> maxctrl list services ┌─────────────────┬──────────────┬─────────────┬───────────────────┬────────────────────────┐ │ Service │ Router │ Connections │ Total Connections │ Targets │ ├─────────────────┼──────────────┼─────────────┼───────────────────┼────────────────────────┤ │ Sharded-Service │ schemarouter │ 4 │ 82776 │ shard2, shard3, shard4 │ └─────────────────┴──────────────┴─────────────┴───────────────────┴────────────────────────┘ shell> maxctrl list listeners ┌──────────────────────────┬──────┬──────┬─────────┬─────────────────┐ │ Name │ Port │ Host │ State │ Service │ ├──────────────────────────┼──────┼──────┼─────────┼─────────────────┤ │ Sharded-Service-Listener │ 3306 │ :: │ Running │ Sharded-Service │ └──────────────────────────┴──────┴──────┴─────────┴─────────────────┘ shell> maxctrl list monitors ┌──────────────────┬─────────┬────────────────────────┐ │ Monitor │ State │ Servers │ ├──────────────────┼─────────┼────────────────────────┤ │ Sharding-Monitor │ Running │ shard2, shard3, shard4 │ └──────────────────┴─────────┴────────────────────────┘ shell> maxctrl show server shard2 | head -n 20 ┌─────────────────────┬──────────────────────────────────────────────┐ │ Server │ shard2 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Source │ /etc/maxscale.cnf │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Address │ 10.139.158.1 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Port │ 3364 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ State │ Running │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Version │ 10.11.7-MariaDB-log │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Uptime │ 178960 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Last Event │ server_down │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Triggered At │ Sun, 04 Feb 2024 07:37:17 GMT │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Services │ Sharded-Service │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Monitors │ Sharding-Monitor │ ├─────────────────────┼──────────────────────────────────────────────┤ ... 
├─────────────────────┼──────────────────────────────────────────────┤ │ Current Connections │ 5 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Total Connections │ 27 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Max Connections │ 5 │ shell> maxctrl show service Sharded-Service ┌─────────────────────┬──────────────────────────────────────────────────────┐ │ Service │ Sharded-Service │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Source │ /var/lib/maxscale/maxscale.cnf.d/Sharded-Service.cnf │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Router │ schemarouter │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ State │ Started │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Started At │ 3/18/2024, 1:52:30 PM │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Users Loaded At │ 3/18/2024, 1:52:30 PM │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Current Connections │ 4 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Total Connections │ 84590 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Max Connections │ 5 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Cluster │ │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Servers │ shard2 │ │ │ shard3 │ │ │ shard4 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Services │ │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Filters │ │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Parameters │ { │ │ │ "auth_all_servers": true, │ │ │ "connection_keepalive": "300000ms", │ │ │ "debug": true, │ │ │ "disable_sescmd_history": false, │ │ │ "enable_root_user": false, │ │ │ "force_connection_keepalive": false, │ │ │ "idle_session_pool_time": "-1ms", │ │ │ "ignore_tables": [], │ │ │ "ignore_tables_regex": null, │ │ │ "localhost_match_wildcard_host": true, │ │ │ "log_auth_warnings": true, │ │ │ "log_debug": true, │ │ │ "log_info": false, │ │ │ "log_notice": false, │ │ │ "log_warning": false, │ │ │ "max_connections": 0, │ │ │ "max_sescmd_history": 50, │ │ │ "max_staleness": "150000ms", │ │ │ "multiplex_timeout": "60000ms", │ │ │ "net_write_timeout": "0ms", │ │ │ "password": "*****", │ │ │ "prune_sescmd_history": true, │ │ │ "rank": "primary", │ │ │ "refresh_databases": false, │ │ │ "refresh_interval": "10000ms", │ │ │ "retain_last_statements": -1, │ │ │ "router": "schemarouter", │ │ │ "session_trace": false, │ │ │ "strip_db_esc": true, │ │ │ "type": "service", │ │ │ "user": "maxscale_admin", │ │ │ "user_accounts_file": null, │ │ │ "user_accounts_file_usage": "add_when_load_ok", │ │ │ "version_string": null, │ │ │ "wait_timeout": "0ms" │ │ │ } │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Router Diagnostics │ { │ │ │ "average_session": 0.028822357131634554, │ │ │ "longest_sescmd_chain": 4, │ │ │ "longest_session": 50, │ │ │ "queries": 761134, │ │ │ "sescmd_percentage": 44.44342257736483, │ │ │ "shard_map_hits": 84356, │ │ │ "shard_map_misses": 5, │ │ │ "shard_map_stale": 229, │ │ │ "shard_map_updates": 216, │ │ │ "shortest_session": 0, │ │ │ "times_sescmd_limit_exceeded": 0 │ │ │ } │ 
└─────────────────────┴──────────────────────────────────────────────────────┘

See also MaxScale SchemaRouter Router diagnostics.

shell> maxctrl show monitor Sharding-Monitor ┌─────────────────────┬──────────────────────────────────────────────────────────┐ │ Monitor │ Sharding-Monitor │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Source │ /etc/maxscale.cnf │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Module │ galeramon │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ State │ Running │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Servers │ shard1 │ │ │ shard2 │ │ │ shard3 │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Parameters │ { │ │ │ "available_when_donor": false, │ │ │ "backend_connect_attempts": 1, │ │ │ "backend_connect_timeout": "3000ms", │ │ │ "backend_read_timeout": "3000ms", │ │ │ "backend_write_timeout": "3000ms", │ │ │ "disable_master_failback": false, │ │ │ "disable_master_role_setting": false, │ │ │ "disk_space_check_interval": "0ms", │ │ │ "disk_space_threshold": null, │ │ │ "events": "all,master_down,master_up,...,new_donor", │ │ │ "journal_max_age": "28800000ms", │ │ │ "module": "galeramon", │ │ │ "monitor_interval": "1000ms", │ │ │ "password": "*****", │ │ │ "root_node_as_master": false, │ │ │ "script": null, │ │ │ "script_timeout": "90000ms", │ │ │ "set_donor_nodes": false, │ │ │ "type": "monitor", │ │ │ "use_priority": false, │ │ │ "user": "maxscale_monitor" │ │ │ } │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Monitor Diagnostics │ { │ │ │ "disable_master_failback": false, │ │ │ "disable_master_role_setting": false, │ │ │ "root_node_as_master": false, │ │ │ "server_info": [ │ │ │ { │ │ │ "gtid_binlog_pos": "0-3363-26014", │ │ │ "gtid_current_pos": "0-3363-26014", │ │ │ "master_id": 0, │ │ │ "name": "shard1", │ │ │ "read_only": false, │ │ │ "server_id": 3363 │ │ │ }, │ │ │ { │ │ │ "gtid_binlog_pos": "0-3364-240612", │ │ │ "gtid_current_pos": "0-3364-240612", │ │ │ "master_id": 0, │ │ │ "name": "shard2", │ │ │ "read_only": false, │ │ │ "server_id": 3364 │ │ │ }, │ │ │ { │ │ │ "gtid_binlog_pos": "0-3365-289873", │ │ │ "gtid_current_pos": "0-3365-289873", │ │ │ "master_id": 0, │ │ │ "name": "shard3", │ │ │ "read_only": false, │ │ │ "server_id": 3365 │ │ │ } │ │ │ ], │ │ │ "set_donor_nodes": false, │ │ │ "use_priority": false │ │ │ } │ └─────────────────────┴──────────────────────────────────────────────────────────┘ shell> maxctrl list sessions; ┌───────┬──────┬──────────────┬───────────────────────┬───────┬─────────────────┬────────┬──────────────┐ │ Id │ User │ Host │ Connected │ Idle │ Service │ Memory │ I/O-Activity │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 87240 │ app │ 10.139.158.1 │ 3/18/2024, 2:33:54 PM │ 0 │ Sharded-Service │ 68644 │ 33 │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 72654 │ app │ 10.139.158.1 │ 3/18/2024, 2:25:27 PM │ 506.3 │ Sharded-Service │ 199328 │ 0 │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 72364 │ app │ 10.139.158.1 │ 3/18/2024, 2:25:18 PM │ 516 │ Sharded-Service │ 199328 │ 0 │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 72530 │ app │ 10.139.158.1 │ 3/18/2024, 2:25:23 PM │ 510.5 │ Sharded-Service │ 199328 │ 0 │ 
└───────┴──────┴──────────────┴───────────────────────┴───────┴─────────────────┴────────┴──────────────┘ shell> maxctrl show session 26 ┌───────────────────────┬───────────────────────────────────────┐ │ Id │ 26 │ ├───────────────────────┼───────────────────────────────────────┤ │ Service │ Sharded-Service │ ├───────────────────────┼───────────────────────────────────────┤ │ State │ Session started │ ├───────────────────────┼───────────────────────────────────────┤ │ User │ app │ ├───────────────────────┼───────────────────────────────────────┤ │ Host │ 10.139.158.1 │ ├───────────────────────┼───────────────────────────────────────┤ │ Port │ 42854 │ ├───────────────────────┼───────────────────────────────────────┤ │ Database │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Connected │ 2/4/2024, 9:31:12 AM │ ├───────────────────────┼───────────────────────────────────────┤ │ Idle │ 610.4 │ ├───────────────────────┼───────────────────────────────────────┤ │ Parameters │ { │ │ │ "log_error": false, │ │ │ "log_info": false, │ │ │ "log_notice": false, │ │ │ "log_warning": false │ │ │ } │ ├───────────────────────┼───────────────────────────────────────┤ │ Client TLS Cipher │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Connection attributes │ { │ │ │ "_client_name": "libmariadb", │ │ │ "_client_version": "3.3.8", │ │ │ "_os": "Linux", │ │ │ "_pid": "251037", │ │ │ "_platform": "x86_64", │ │ │ "_server_host": "10.139.158.211", │ │ │ "program_name": "mysql" │ │ │ } │ ├───────────────────────┼───────────────────────────────────────┤ │ Connections │ shard1 │ │ │ shard2 │ │ │ shard3 │ ├───────────────────────┼───────────────────────────────────────┤ │ Connection IDs │ 666 │ │ │ 139 │ │ │ 138 │ ├───────────────────────┼───────────────────────────────────────┤ │ Queries │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Log │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Memory │ { │ │ │ "connection_buffers": { │ │ │ "backends": { │ │ │ "shard1": { │ │ │ "misc": 678, │ │ │ "readq": 65536, │ │ │ "total": 66214, │ │ │ "writeq": 0 │ │ │ }, │ │ │ "shard2": { │ │ │ "misc": 662, │ │ │ "readq": 0, │ │ │ "total": 662, │ │ │ "writeq": 0 │ │ │ }, │ │ │ "shard3": { │ │ │ "misc": 678, │ │ │ "readq": 65536, │ │ │ "total": 66214, │ │ │ "writeq": 0 │ │ │ } │ │ │ }, │ │ │ "client": { │ │ │ "misc": 654, │ │ │ "readq": 65536, │ │ │ "total": 66190, │ │ │ "writeq": 0 │ │ │ }, │ │ │ "total": 199280 │ │ │ }, │ │ │ "exec_metadata": 0, │ │ │ "last_queries": 0, │ │ │ "sescmd_history": 48, │ │ │ "total": 199328, │ │ │ "variables": 0 │ │ │ } │ ├───────────────────────┼───────────────────────────────────────┤ │ I/O Activity │ 0 │ └───────────────────────┴───────────────────────────────────────┘ shell> maxctrl show listener Sharded-Service-Listener ┌────────────┬───────────────────────────────────────────┐ │ Name │ Sharded-Service-Listener │ ├────────────┼───────────────────────────────────────────┤ │ Source │ /etc/maxscale.cnf │ ├────────────┼───────────────────────────────────────────┤ │ Service │ Sharded-Service │ ├────────────┼───────────────────────────────────────────┤ │ Parameters │ { │ │ │ "MariaDBProtocol": { │ │ │ "allow_replication": true │ │ │ }, │ │ │ "address": "::", │ │ │ "authenticator": null, │ │ │ "authenticator_options": null, │ │ │ "connection_init_sql_file": null, │ │ │ "connection_metadata": [ │ │ │ "character_set_client=auto", │ │ │ "character_set_connection=auto", │ │ │ "character_set_results=auto", │ │ │ 
"max_allowed_packet=auto", │ │ │ "system_time_zone=auto", │ │ │ "time_zone=auto", │ │ │ "tx_isolation=auto" │ │ │ ], │ │ │ "port": 3306, │ │ │ "protocol": "MariaDBProtocol", │ │ │ "proxy_protocol_networks": null, │ │ │ "service": "Sharded-Service", │ │ │ "socket": null, │ │ │ "sql_mode": "default", │ │ │ "ssl": false, │ │ │ "ssl_ca": null, │ │ │ "ssl_cert": null, │ │ │ "ssl_cert_verify_depth": 9, │ │ │ "ssl_cipher": null, │ │ │ "ssl_crl": null, │ │ │ "ssl_key": null, │ │ │ "ssl_verify_peer_certificate": false, │ │ │ "ssl_verify_peer_host": false, │ │ │ "ssl_version": "MAX", │ │ │ "type": "listener", │ │ │ "user_mapping_file": null │ │ │ } │ └────────────┴───────────────────────────────────────────┘ shell> maxctrl show module schemarouter ┌─────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ Module │ schemarouter │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Type │ Router │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Version │ V1.0.0 │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Maturity │ Beta │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Description │ A database sharding router for simple sharding │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Parameters │ [ │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Enable debug mode", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "debug", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": [], │ │ │ "description": "List of tables to ignore when checking for duplicates", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "ignore_tables", │ │ │ "type": "stringlist" │ │ │ }, │ │ │ { │ │ │ "description": "Regex of tables to ignore when checking for duplicates", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "ignore_tables_regex", │ │ │ "type": "regex" │ │ │ }, │ │ │ { │ │ │ "default_value": "150000ms", │ │ │ "description": "Maximum allowed staleness of a database map entry before clients block and wait for an update", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "max_staleness", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Refresh database mapping information", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "refresh_databases", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "300000ms", │ │ │ "description": "How often to refresh the database mapping information", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "refresh_interval", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Retrieve users from all backend servers instead of only one", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "auth_all_servers", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "300000ms", │ │ │ "description": "How ofted idle connections are pinged", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": 
"connection_keepalive", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "deprecated": true, │ │ │ "description": "Alias for 'wait_timeout'", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "connection_timeout", │ │ │ "type": "duration" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Disable session command history", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "disable_sescmd_history", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Allow the root user to connect to this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "enable_root_user", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Ping connections unconditionally", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "force_connection_keepalive", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "-1ms", │ │ │ "description": "Put connections into pool after session has been idle for this long", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "idle_session_pool_time", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "description": "Match localhost to wildcard host", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "localhost_match_wildcard_host", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "description": "Log a warning when client authentication fails", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_auth_warnings", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log debug messages for this service (debug builds only)", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_debug", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log info messages for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_info", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log notice messages for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_notice", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log warning messages for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_warning", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": 0, │ │ │ "description": "Maximum number of connections", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "max_connections", │ │ │ "type": "count" │ │ │ }, │ │ │ { │ │ │ "default_value": 50, │ │ │ "description": "Session command history size", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "max_sescmd_history", │ │ │ "type": "count" │ │ │ }, │ │ │ { │ │ │ "default_value": "60000ms", │ │ │ "description": "How long a session can wait for a connection to become available", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "multiplex_timeout", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": "0ms", │ │ │ "description": "Network write timeout", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "net_write_timeout", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "description": "Password for the user used to retrieve database users", │ │ │ "mandatory": true, │ │ │ 
"modifiable": true, │ │ │ "name": "password", │ │ │ "type": "password" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "description": "Prune old session command history if the limit is exceeded", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "prune_sescmd_history", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "primary", │ │ │ "description": "Service rank", │ │ │ "enum_values": [ │ │ │ "primary", │ │ │ "secondary" │ │ │ ], │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "rank", │ │ │ "type": "enum" │ │ │ }, │ │ │ { │ │ │ "default_value": -1, │ │ │ "description": "Number of statements kept in memory", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "retain_last_statements", │ │ │ "type": "int" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Enable session tracing for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "session_trace", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "deprecated": true, │ │ │ "description": "Track session state using server responses", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "session_track_trx_state", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "deprecated": true, │ │ │ "description": "Strip escape characters from database names", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "strip_db_esc", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "description": "Username used to retrieve database users", │ │ │ "mandatory": true, │ │ │ "modifiable": true, │ │ │ "name": "user", │ │ │ "type": "string" │ │ │ }, │ │ │ { │ │ │ "description": "Load additional users from a file", │ │ │ "mandatory": false, │ │ │ "modifiable": false, │ │ │ "name": "user_accounts_file", │ │ │ "type": "path" │ │ │ }, │ │ │ { │ │ │ "default_value": "add_when_load_ok", │ │ │ "description": "When and how the user accounts file is used", │ │ │ "enum_values": [ │ │ │ "add_when_load_ok", │ │ │ "file_only_always" │ │ │ ], │ │ │ "mandatory": false, │ │ │ "modifiable": false, │ │ │ "name": "user_accounts_file_usage", │ │ │ "type": "enum" │ │ │ }, │ │ │ { │ │ │ "description": "Custom version string to use", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "version_string", │ │ │ "type": "string" │ │ │ }, │ │ │ { │ │ │ "default_value": "0ms", │ │ │ "description": "Connection idle timeout", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "wait_timeout", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ } │ │ │ ] │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Commands │ [ │ │ │ { │ │ │ "attributes": { │ │ │ "arg_max": 1, │ │ │ "arg_min": 1, │ │ │ "description": "Clear schemarouter shard map cache", │ │ │ "method": "POST", │ │ │ "parameters": [ │ │ │ { │ │ │ "description": "The schemarouter service", │ │ │ "required": true, │ │ │ "type": "SERVICE" │ │ │ } │ │ │ ] │ │ │ }, │ │ │ "id": "clear", │ │ │ "links": { │ │ │ "self": "http://127.0.0.1:8989/v1/modules/schemarouter/clear/" │ │ │ }, │ │ │ "type": "module_command" │ │ │ }, │ │ │ { │ │ │ "attributes": { │ │ │ "arg_max": 1, │ │ │ "arg_min": 1, │ │ │ "description": "Invalidate schemarouter shard map cache", │ │ │ "method": "POST", │ │ │ "parameters": [ │ │ │ { │ │ │ "description": "The schemarouter service", │ │ │ "required": true, │ │ │ "type": "SERVICE" │ │ │ } │ │ │ ] │ │ │ }, │ │ │ "id": "invalidate", │ │ │ 
"links": { │ │ │ "self": "http://127.0.0.1:8989/v1/modules/schemarouter/invalidate/" │ │ │ }, │ │ │ "type": "module_command" │ │ │ } │ │ │ ] │ └─────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ shell> maxctrl show commands schemarouter ┌────────────┬────────────┬──────────────────────────┐ │ Command │ Parameters │ Descriptions │ ├────────────┼────────────┼──────────────────────────┤ │ clear │ SERVICE │ The schemarouter service │ ├────────────┼────────────┼──────────────────────────┤ │ invalidate │ SERVICE │ The schemarouter service │ └────────────┴────────────┴──────────────────────────┘ shell> maxctrl show dbusers Sharded-Service ┌───────────────────────┬────────────────┬───────────────────────┬───────┬───────┬────────┬───────┬──────┐ │ User │ Host │ Plugin │ TLS │ Super │ Global │ Proxy │ Role │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ PUBLIC │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ app │ 10.139.158.% │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ app_role │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ mariadb.sys │ localhost │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_admin │ 10.139.158.210 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_admin │ 10.139.158.211 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_admin_role │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_monitor │ 10.139.158.210 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_monitor │ 10.139.158.211 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_monitor_role │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ mysql │ localhost │ mysql_native_password │ false │ true │ true │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ root │ localhost │ mysql_native_password │ false │ true │ true │ false │ │ └───────────────────────┴────────────────┴───────────────────────┴───────┴───────┴────────┴───────┴──────┘ shell> maxctrl show commands mariadbmon ┌───────────────────────────┬─────────────────────────────┬───────────────────────────────────────────────────────────────────────────────┐ │ Command │ Parameters │ Descriptions │ 
├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ switchover │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ switchover-force │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-switchover │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ failover │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-failover │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ rejoin │ MONITOR, SERVER │ Monitor name, Joining server │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-rejoin │ MONITOR, SERVER │ Monitor name, Joining server │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ reset-replication │ MONITOR, [SERVER] │ Monitor name, Primary server (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-reset-replication │ MONITOR, [SERVER] │ Monitor name, Primary server (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ release-locks │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-release-locks │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ fetch-cmd-result │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ cancel-cmd │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-add-node │ MONITOR, STRING, STRING │ Monitor name, Hostname/IP of node to add to ColumnStore cluster, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-remove-node │ MONITOR, STRING, STRING │ Monitor name, Hostname/IP of node to remove from ColumnStore cluster, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ cs-get-status │ MONITOR │ Monitor name │ 
├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-get-status │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-start-cluster │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-stop-cluster │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-set-readonly │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-set-readwrite │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-rebuild-server │ MONITOR, SERVER, [SERVER] │ Monitor name, Target server, Source server (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-create-backup │ MONITOR, SERVER, STRING │ Monitor name, Source server, Backup name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-restore-from-backup │ MONITOR, SERVER, STRING │ Monitor name, Target server, Backup name │ └───────────────────────────┴─────────────────────────────┴───────────────────────────────────────────────────────────────────────────────┘
Taxonomy upgrade extras: sharding, maxscale, schemarouter, load balancer, multi-tenant

Sharding with MariaDB MaxScale

Oli Sennhauser - Mon, 2024-03-18 16:08
Overview

This feature should work more or less with MariaDB MaxScale 6 (2.6), 6.1 to 6.4 as well as 22.08 (6.5), 23.02 (7.0) and 23.08 (7.1). We tested it with the latest MaxScale version 23.08.05, since we ran into problems with an older version.

shell> maxscale --version
MaxScale 23.08.5

As the database backend (shards) we used MariaDB 10.11.

Less than about 2% of all MariaDB installations known to us are what we technically understand as multi-tenant systems (each tenant (customer) in its own database, also called schema).

This MariaDB MaxScale feature is therefore used relatively rarely, and there is an increased risk of running into bugs that nobody has hit before!

In MariaDB MaxScale this feature is called SchemaRouter and is still declared as beta quality:

maxctrl> show module schemarouter
┌─────────────┬────────────────────────────────────────────────┐
│ Module      │ schemarouter                                   │
├─────────────┼────────────────────────────────────────────────┤
│ Type        │ Router                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Version     │ V1.0.0                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Maturity    │ Beta                                           │
├─────────────┼────────────────────────────────────────────────┤
│ Description │ A database sharding router for simple sharding │
├─────────────┼────────────────────────────────────────────────┤
│ ...

The target topology should look as follows: each customer (tenant) resides in its own database (= schema). The databases are distributed over several MariaDB instances (shards). So that the application can access the database transparently, a pair of MaxScale load balancers is placed in front of them; it knows where each customer resides and forwards the traffic to the corresponding shard. To make the MaxScale load balancers highly available, a virtual IP (VIP) is placed in front of them, e.g. by means of Keepalived. Whoever finds this still too simple can additionally lay out every single shard as a master/slave or Galera cluster construct...
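A minimal Keepalived sketch for such a VIP in front of the two MaxScale nodes could look as follows; the VIP 10.139.158.230, the interface name and the priorities are assumptions for illustration only:

#
# /etc/keepalived/keepalived.conf (on the first MaxScale node)
#
vrrp_instance VI_MAXSCALE {
    state MASTER                  # BACKUP on the second MaxScale node
    interface eth0                # assumption: adjust to the actual NIC
    virtual_router_id 51
    priority 100                  # lower value (e.g. 90) on the BACKUP node
    advert_int 1
    virtual_ipaddress {
        10.139.158.230/24         # hypothetical VIP the application connects to
    }
}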

Preparing the shards (MariaDB database instances)

First of all we had problems with the test database in this PoC. By dropping the test database on all shards the problem disappeared. Alternatively, you can run mariadb-secure-installation, which you should do on production systems anyway, or you use the MaxScale configuration parameters ignore_tables or ignore_tables_regex to allow identically named tables in different shards.

See also: MaxScale Router Parameters.
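As a hedged example, such an exception could be configured in the service section roughly as follows, assuming the regex is matched against the fully qualified db.table names the router finds during mapping; the pattern for a shared test schema is an assumption:

#
# /etc/maxscale.cnf
#
[Sharded-Service]
...
# Assumption: tolerate identically named objects of the test schema on all shards:
ignore_tables_regex = ^test\.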

Creating test data

To have something to play with, we created some test data:

-- On shard 1: 2 customers
SQL> CREATE DATABASE customer_0010;
SQL> CREATE TABLE customer_0010.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0010.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0010.address VALUES (1, 'Customer 10 GmbH');
SQL> INSERT INTO customer_0010.sales VALUES (1, 'Apples', 5, 1.2, 6), (2, 'Pears', 2, 0.9, 1.8), (3, 'Bread', 1, 2.5, 2.5);

SQL> CREATE DATABASE customer_0011;
SQL> CREATE TABLE customer_0011.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0011.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0011.address VALUES (1, 'Customer 11 SE');
SQL> INSERT INTO customer_0011.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salat', 5, 1.2, 6);

-- On shard 2: 3 customers
SQL> CREATE DATABASE customer_0020;
SQL> CREATE TABLE customer_0020.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0020.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0020.address VALUES (1, 'Customer 20 AG');
SQL> INSERT INTO customer_0020.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salat', 5, 1.2, 6);

SQL> CREATE DATABASE customer_0021;
SQL> CREATE TABLE customer_0021.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0021.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0021.address VALUES (1, 'Customer 21 GmbH');
SQL> INSERT INTO customer_0021.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salat', 5, 1.2, 6);

SQL> CREATE DATABASE customer_0022;
SQL> CREATE TABLE customer_0022.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0022.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0022.address VALUES (1, 'Customer 22 Gebr.');
SQL> INSERT INTO customer_0022.sales VALUES (1, 'Oranges', 2, 1.7, 3.4), (2, 'Salat', 5, 1.2, 6);

-- On shard 3: 1 customer
SQL> CREATE DATABASE customer_0030;
SQL> CREATE TABLE customer_0030.address (id INT UNSIGNED, name VARCHAR(255));
SQL> CREATE TABLE customer_0030.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
SQL> INSERT INTO customer_0030.address VALUES (1, 'Customer 30 GmbH');
SQL> INSERT INTO customer_0030.sales VALUES (1, 'Pickles', 2, 2.2, 4.4), (2, 'Salat', 1, 3.1, 3.1), (3, 'Pudding', 5, 2.2, 11.0), (4, 'Asparagus', 12, .3, 3.6);
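If more tenants are needed than one wants to type by hand, a small loop can generate them. A sketch, assuming the hypothetical tenants customer_0023 to customer_0025 should land on shard 2 (port 3364):

shell> for i in $(seq -w 23 25) ; do
  mariadb --user=root --host=10.139.158.1 --port=3364 --execute="
    CREATE DATABASE customer_00${i};
    CREATE TABLE customer_00${i}.address (id INT UNSIGNED, name VARCHAR(255));
    CREATE TABLE customer_00${i}.sales (id INT UNSIGNED, product VARCHAR(255), sales TINYINT, amount DECIMAL(6, 2), total_amount DECIMAL(6, 2));
    INSERT INTO customer_00${i}.address VALUES (1, 'Customer ${i} Ltd.');"
done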
Creating roles and users

Since in a sharded system, in contrast to a Galera cluster for example, the individual database instances know nothing about each other and do not communicate with each other, we have to create the roles and users (accounts) individually on EACH shard.

MariaDB MaxScale needs one user each for the SchemaRouter service and for the monitor (on every shard).

The monitor user is, as the name says, responsible for monitoring; the SchemaRouter service user collects the user account information from the sharding backends and forwards the queries to the correct shard.

Since a redundant system typically works with at least two MaxScale routers and we wanted to prevent the privileges of the accounts from drifting apart, we work with roles, both for the MaxScale users and for the application users.

MaxScale Monitor User

SQL> CREATE ROLE maxscale_monitor_role;
SQL> GRANT SELECT ON mysql.user TO 'maxscale_monitor_role';
SQL> GRANT REPLICATION CLIENT ON *.* TO 'maxscale_monitor_role';
SQL> GRANT SLAVE MONITOR ON *.* TO 'maxscale_monitor_role';
SQL> GRANT FILE ON *.* TO 'maxscale_monitor_role';
SQL> GRANT CONNECTION ADMIN ON *.* TO 'maxscale_monitor_role';

SQL> SHOW GRANTS FOR maxscale_monitor_role;
+-----------------------------------------------------------------------------------------------+
| Grants for maxscale_monitor_role                                                               |
+-----------------------------------------------------------------------------------------------+
| GRANT FILE, BINLOG MONITOR, CONNECTION ADMIN, SLAVE MONITOR ON *.* TO `maxscale_monitor_role` |
| GRANT SELECT ON `mysql`.`user` TO `maxscale_monitor_role`                                     |
+-----------------------------------------------------------------------------------------------+

SQL> CREATE USER maxscale_monitor@'10.139.158.210' IDENTIFIED BY 'secret';
SQL> CREATE USER maxscale_monitor@'10.139.158.211' IDENTIFIED BY 'secret';
SQL> GRANT maxscale_monitor_role TO maxscale_monitor@'10.139.158.210';
SQL> GRANT maxscale_monitor_role TO maxscale_monitor@'10.139.158.211';
SQL> SET DEFAULT ROLE maxscale_monitor_role FOR maxscale_monitor@'10.139.158.210';
SQL> SET DEFAULT ROLE maxscale_monitor_role FOR maxscale_monitor@'10.139.158.211';

SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'maxscale_monitor%';
+-----------------------+----------------+---------+-----------------------+
| User                  | Host           | is_role | default_role          |
+-----------------------+----------------+---------+-----------------------+
| maxscale_monitor_role |                | Y       |                       |
| maxscale_monitor      | 10.139.158.210 | N       | maxscale_monitor_role |
| maxscale_monitor      | 10.139.158.211 | N       | maxscale_monitor_role |
+-----------------------+----------------+---------+-----------------------+

SQL> SHOW GRANTS FOR maxscale_monitor@'10.139.158.211';
+------------------------------------------------------------------------------------------------------------------------------+
| Grants for maxscale_monitor@10.139.158.211                                                                                    |
+------------------------------------------------------------------------------------------------------------------------------+
| GRANT `maxscale_monitor_role` TO `maxscale_monitor`@`10.139.158.211`                                                          |
| GRANT USAGE ON *.* TO `maxscale_monitor`@`10.139.158.211` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7'  |
| SET DEFAULT ROLE `maxscale_monitor_role` FOR `maxscale_monitor`@`10.139.158.211`                                              |
+------------------------------------------------------------------------------------------------------------------------------+
MaxScale Admin User

SQL> CREATE ROLE maxscale_admin_role;
SQL> GRANT SHOW DATABASES ON *.* TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.user TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.db TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.tables_priv TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.columns_priv TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.proxies_priv TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.roles_mapping TO 'maxscale_admin_role';
SQL> GRANT SELECT ON mysql.procs_priv TO 'maxscale_admin_role';

SQL> SHOW GRANTS FOR maxscale_admin_role;
+------------------------------------------------------------------+
| Grants for maxscale_admin_role                                   |
+------------------------------------------------------------------+
| GRANT USAGE ON *.* TO `maxscale_admin_role`                      |
| GRANT SELECT ON `mysql`.`user` TO `maxscale_admin_role`          |
| GRANT SELECT ON `mysql`.`roles_mapping` TO `maxscale_admin_role` |
| GRANT SELECT ON `mysql`.`tables_priv` TO `maxscale_admin_role`   |
| GRANT SELECT ON `mysql`.`procs_priv` TO `maxscale_admin_role`    |
| GRANT SELECT ON `mysql`.`db` TO `maxscale_admin_role`            |
| GRANT SELECT ON `mysql`.`columns_priv` TO `maxscale_admin_role`  |
| GRANT SELECT ON `mysql`.`proxies_priv` TO `maxscale_admin_role`  |
+------------------------------------------------------------------+

SQL> CREATE USER maxscale_admin@'10.139.158.210' IDENTIFIED BY 'secret';
SQL> CREATE USER maxscale_admin@'10.139.158.211' IDENTIFIED BY 'secret';
SQL> GRANT maxscale_admin_role TO maxscale_admin@'10.139.158.210';
SQL> GRANT maxscale_admin_role TO maxscale_admin@'10.139.158.211';
SQL> SET DEFAULT ROLE maxscale_admin_role FOR maxscale_admin@'10.139.158.210';
SQL> SET DEFAULT ROLE maxscale_admin_role FOR maxscale_admin@'10.139.158.211';

SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'maxscale_admin%';
+---------------------+----------------+---------+---------------------+
| User                | Host           | is_role | default_role        |
+---------------------+----------------+---------+---------------------+
| maxscale_admin_role |                | Y       |                     |
| maxscale_admin      | 10.139.158.210 | N       | maxscale_admin_role |
| maxscale_admin      | 10.139.158.211 | N       | maxscale_admin_role |
+---------------------+----------------+---------+---------------------+

SQL> SHOW GRANTS FOR maxscale_admin@'10.139.158.211';
+----------------------------------------------------------------------------------------------------------------------------+
| Grants for maxscale_admin@10.139.158.211                                                                                    |
+----------------------------------------------------------------------------------------------------------------------------+
| GRANT `maxscale_admin_role` TO `maxscale_admin`@`10.139.158.211`                                                            |
| GRANT USAGE ON *.* TO `maxscale_admin`@`10.139.158.211` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7'  |
| SET DEFAULT ROLE `maxscale_admin_role` FOR `maxscale_admin`@`10.139.158.211`                                                |
+----------------------------------------------------------------------------------------------------------------------------+

See also: SchemaRouter Configuration

Creating the application role and accounts

The application also needs a user, which we create on each shard as follows:

SQL> CREATE ROLE app_role;
SQL> GRANT SELECT, INSERT, UPDATE, DELETE ON `customer_%`.* TO 'app_role';
SQL> GRANT SHOW DATABASES ON *.* TO 'app_role';
SQL> GRANT CREATE, DROP, ALTER ON *.* TO 'app_role';   -- For creating new tenant databases

SQL> SHOW GRANTS FOR app_role;
+----------------------------------------------------------------------+
| Grants for app_role                                                  |
+----------------------------------------------------------------------+
| GRANT SHOW DATABASES ON *.* TO `app_role`                            |
| GRANT SELECT, INSERT, UPDATE, DELETE ON `customer_%`.* TO `app_role` |
+----------------------------------------------------------------------+

SQL> CREATE USER app@'10.139.158.%' IDENTIFIED BY 'secret';
SQL> GRANT app_role TO app@'10.139.158.%';
SQL> SET DEFAULT ROLE app_role FOR app@'10.139.158.%';

SQL> SELECT user, host, is_role, default_role FROM mysql.user WHERE user LIKE 'app%';
+----------+--------------+---------+--------------+
| User     | Host         | is_role | default_role |
+----------+--------------+---------+--------------+
| app_role |              | Y       |              |
| app      | 10.139.158.% | N       | app_role     |
+----------+--------------+---------+--------------+

SQL> SHOW GRANTS FOR app@'10.139.158.%';
+---------------------------------------------------------------------------------------------------------------+
| Grants for app@10.139.158.%                                                                                    |
+---------------------------------------------------------------------------------------------------------------+
| GRANT `app_role` TO `app`@`10.139.158.%`                                                                       |
| GRANT USAGE ON *.* TO `app`@`10.139.158.%` IDENTIFIED BY PASSWORD '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7'  |
| SET DEFAULT ROLE `app_role` FOR `app`@`10.139.158.%`                                                           |
+---------------------------------------------------------------------------------------------------------------+
Proxy protocol

Load balancers and proxies have the characteristic that they replace the IP addresses of the clients with their own IP addresses. On the one hand this means that you can no longer see on the database where the client originally came from; on the other hand access privileges can no longer be granted per user and IP, since the check is always done against the IP of the load balancer.

Both problems can be solved by means of the proxy protocol.

For this, both the database and the load balancer, in this case MaxScale, must have the proxy protocol activated.

On the database side the proxy protocol is activated as follows:

#
# my.cnf
#
[mariadbd]
proxy_protocol_networks = ::1, 10.139.158.0/24, localhost

and on the MaxScale side with:

#
# /etc/maxscale.cnf
#
[shard<n>]
type = server
proxy_protocol = true

The two settings can be checked as follows:

SQL> SHOW GLOBAL VARIABLES LIKE 'proxy%';
+-------------------------+---------------------------------+
| Variable_name           | Value                           |
+-------------------------+---------------------------------+
| proxy_protocol_networks | ::1, 10.139.158.0/24, localhost |
+-------------------------+---------------------------------+

shell> maxctrl show server shard1 | grep proxy
│                     │     "proxy_protocol": true,                  │



MaxScale SchemaRouter configuration

Next we prepare the MaxScale configuration for sharding. The file recommended by MariaDB is /etc/maxscale.cnf. Whether it makes more sense to create a separate configuration file under /etc/maxscale.cnf.d/ or even to configure all of MaxScale dynamically (/var/lib/maxscale/maxscale.cnf.d/*.cnf) will become apparent in the long run. See also the warnings below. The configuration file for this sharding PoC looks as follows:

#
# /etc/maxscale.cnf
#
[maxscale]
threads = auto
admin_gui = false

[shard1]
type = server
address = 10.139.158.1
port = 3363
proxy_protocol = true

[shard2]
type = server
address = 10.139.158.1
port = 3364
proxy_protocol = true

[shard3]
type = server
address = 10.139.158.1
port = 3365
proxy_protocol = true

[Sharding-Monitor]
type = monitor
module = galeramon
servers = shard1,shard2,shard3
user = maxscale_monitor
password = secret
monitor_interval = 1s

[Sharded-Service-Listener]
type = listener
service = Sharded-Service
protocol = MariaDBClient
port = 3306

[Sharded-Service]
type = service
router = schemarouter
servers = shard1,shard2,shard3
user = maxscale_admin
password = secret
auth_all_servers = true

Note: recommendation of the MaxScale developer: "One workaround might be to actually use galeramon to monitor the nodes instead of mariadbmon."

Starting and stopping the MaxScale load balancer

MaxScale is started and stopped as usual via systemd:

shell> systemctl restart maxscale
shell> systemctl status maxscale
● maxscale.service - MariaDB MaxScale Database Proxy
     Loaded: loaded (/lib/systemd/system/maxscale.service; enabled; vendor preset: enabled)
    Drop-In: /run/systemd/system/service.d
             └─zzz-lxc-service.conf
     Active: active (running) since Tue 2024-02-27 09:52:57 UTC; 39s ago
    Process: 187 ExecStart=/usr/bin/maxscale (code=exited, status=0/SUCCESS)
   Main PID: 188 (maxscale)
      Tasks: 10 (limit: 18663)
     Memory: 4.6M
        CPU: 150ms
     CGroup: /system.slice/maxscale.service
             └─188 /usr/bin/maxscale

systemd[1]: Starting MariaDB MaxScale Database Proxy...
maxscale[188]: Module 'galeramon' loaded from '/usr/lib/x86_64-linux-gnu/maxscale/libgaleramon.so'.
maxscale[188]: Module 'schemarouter' loaded from '/usr/lib/x86_64-linux-gnu/maxscale/libschemarouter.so'.
maxscale[188]: Using up to 2.3GiB of memory for query classifier cache
systemd[1]: Started MariaDB MaxScale Database Proxy.

If there were errors or warnings, they can be seen in the MaxScale error log:

shell> grep -v notice /var/log/maxscale/maxscale.log
2024-02-13 16:47:22   MariaDB MaxScale is shut down.
----------------------------------------------------
MariaDB MaxScale  /var/log/maxscale/maxscale.log  Tue Feb 13 16:47:22 2024
----------------------------------------------------------------------------
2024-02-27 09:52:56   warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. File is for module 'mariadbmon'. Current module is 'galeramon'.
2024-02-27 09:52:56   warning: [galeramon] Invalid 'wsrep_local_index' on server 'shard1': 18446744073709551615
Application tests

Simple application tests

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='show databases'
+--------------------+
| Database           |
+--------------------+
| customer_0010      |
| customer_0011      |
| customer_0020      |
| customer_0021      |
| customer_0022      |
| customer_0030      |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
New command show shards

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0030 --execute='show shards' | grep customer_00.* | sort | column -t
customer_0010.address  shard1
customer_0010.sales    shard1
customer_0010.         shard1
customer_0011.address  shard1
customer_0011.sales    shard1
customer_0011.         shard1
customer_0020.address  shard2
customer_0020.sales    shard2
customer_0020.         shard2
customer_0021.address  shard2
customer_0021.sales    shard2
customer_0021.         shard2
customer_0022.address  shard2
customer_0022.sales    shard2
customer_0022.         shard2
customer_0030.address  shard3
customer_0030.sales    shard3
customer_0030.         shard3

New databases are not shown immediately, but only once the cached data has been updated (refresh_interval, 300 s / 5 min).

See also: Custom SQL commands

More general tests

As a reminder:

Shard   Port   Customer          State
#1      3363   customer_001<n>   Running
#2      3364   customer_002<n>   Running
#3      3365   customer_003<n>   Running
#4      3366   customer_004<n>   Running
shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3363 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --database=customer_0010 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3363 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --database=customer_0020 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3364 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 --execute='use customer_0020; SELECT @@port'
+--------+
| @@port |
+--------+
|   3364 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0010 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3363 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0020 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3364 |
+--------+

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0030 --execute='SELECT @@port'
+--------+
| @@port |
+--------+
|   3365 |
+--------+
Less simple (backup) tests

shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0010 > /tmp/customer_0010.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0020 > /tmp/customer_0020.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction customer_0030 > /tmp/customer_0030.sql
shell> echo $?
0

shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0011 > /tmp/customer_0011.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0021 > /tmp/customer_0021.sql
shell> echo $?
0
shell> mariadb-dump --user=app --password=secret --host=10.139.158.211 --port=3306 --single-transaction --databases customer_0030 > /tmp/customer_0030.sql
shell> echo $?
0

shell> ll /tmp/customer_00*sql
-rw-rw-r-- 1 oli oli 2738 Mar 18 12:07 /tmp/customer_0010.sql
-rw-rw-r-- 1 oli oli 2904 Mar 18 12:08 /tmp/customer_0011.sql
-rw-rw-r-- 1 oli oli 2712 Mar 18 12:08 /tmp/customer_0020.sql
-rw-rw-r-- 1 oli oli 2906 Mar 18 12:08 /tmp/customer_0021.sql
-rw-rw-r-- 1 oli oli 2964 Mar 18 12:08 /tmp/customer_0030.sql

shell> tail -n 1 /tmp/customer_*.sql
==> /tmp/customer_0010.sql <==
-- Dump completed on 2024-02-13 14:39:21

==> /tmp/customer_0011.sql <==
-- Dump completed on 2024-02-13 14:39:35

==> /tmp/customer_0020.sql <==
-- Dump completed on 2024-02-13 14:40:15

==> /tmp/customer_0021.sql <==
-- Dump completed on 2024-02-13 14:40:42

==> /tmp/customer_0030.sql <==
-- Dump completed on 2024-02-13 14:40:52

shell> cat /tmp/customer_00*sql | grep -A1 -i insert
INSERT INTO `address` VALUES (1,'Customer 10 GmbH');
--
INSERT INTO `sales` VALUES (1,'Apples',5,1.20,6.00),
--
INSERT INTO `address` VALUES (1,'Customer 11 SE');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 20 AG');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 21 GmbH');
--
INSERT INTO `sales` VALUES (1,'Oranges',2,1.70,3.40),
--
INSERT INTO `address` VALUES (1,'Customer 30 GmbH');
--
INSERT INTO `sales` VALUES (1,'Pickles',2,2.20,4.40),

MaxScale 23.08.4 contained a pretty nasty bug: a return code of 0 but no data in the backup!!! See also the tickets MXS-4966: mariadb-dump gets an error dumping schemas and MXS-4947: Tables in information_schema are treated as normal tables. The symptoms of the bug look as follows:

Error: Couldn't read status information for table address ()
Error: Couldn't read status information for table sales ()

An upgrade to MaxScale 23.08.5 is therefore strongly recommended.

More complex application tests

We also created a somewhat more complex test (./sharding_test.php), which works through the following queries:

SET NAMES utf8mb4
SHOW DATABASES
use customer_<nnnn>
START TRANSACTION;
SELECT MIN(id) AS first, MAX(id) AS last FROM `sales`
INSERT INTO sales (id, product, sales, amount, total_amount) VALUES (%d, '%s', %f, %f, %f)
INSERT INTO sales (id, product, sales, amount, total_amount) VALUES (%d, '%s', %f, %f, %f)
UPDATE sales SET product = 'Prepare to delete' WHERE id = %d
DELETE FROM sales WHERE id = %d
COMMIT

This test ran through flawlessly. The corresponding control query:

SQL> SELECT * FROM customer_0021.sales WHERE id >= (SELECT MAX(id) - 10 FROM customer_0021.sales);
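For readers without the PHP harness, roughly the same flow can be replayed with the plain mariadb client. A sketch; the tenant customer_0021 and the id value 99991 are arbitrary assumptions:

shell> mariadb --user=app --password=secret --host=10.139.158.211 --port=3306 customer_0021 <<'_EOF'
SET NAMES utf8mb4;
START TRANSACTION;
SELECT MIN(id) AS first, MAX(id) AS last FROM `sales`;
INSERT INTO sales (id, product, sales, amount, total_amount) VALUES (99991, 'Apples', 3, 1.20, 3.60);
UPDATE sales SET product = 'Prepare to delete' WHERE id = 99991;
DELETE FROM sales WHERE id = 99991;
COMMIT;
_EOF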

Various load scenarios can additionally be tested with db_bench or the Acronis perfkit. For further information see here.

Cross-shard tests

One might possibly get the idea of running cross-shard queries. This will NOT work, which should not really come as a surprise, since firstly it is not quite trivial to implement and secondly it is documented here:

"Note: As the sharding solution in MaxScale is relatively simple, cross-database queries between two or more shards are not supported."

Source: Simple Sharding with Two Servers

and

"USE db1 is routed to the server with db1. If the database is divided to multiple servers, only one server will get the command."

Source: SchemaRouter

Here is a test with UNION:

SQL> use customer_0030
Database changed
SQL> SELECT * FROM customer_0020.sales UNION SELECT * FROM customer_0030.sales;
ERROR 1146 (42S02): Table 'customer_0020.sales' doesn't exist

And here the proof of the opposite:

SQL> use customer_0020
Database changed
SQL> SELECT * FROM customer_0020.sales UNION SELECT * FROM customer_0030.sales;
ERROR 1146 (42S02): Table 'customer_0030.sales' doesn't exist

And here the test with a JOIN:

SQL> use customer_0020
SQL> SELECT * FROM customer_0020.sales a JOIN customer_0030.sales b ON a.id = b.id WHERE a.sales > 1;
ERROR 1146 (42S02): Table 'customer_0030.sales' doesn't exist

SQL> use customer_0030
SQL> SELECT * FROM customer_0020.sales a JOIN customer_0030.sales b ON a.id = b.id WHERE a.sales > 1;
ERROR 1146 (42S02): Table 'customer_0020.sales' doesn't exist
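If such a cross-shard result set is really needed, it has to be assembled on the client side, bypassing the SchemaRouter, for example by querying the shards directly on their ports. A sketch, reusing the ports from the overview table above; sort -u emulates the duplicate elimination of UNION:

shell> ( mariadb --user=app --password=secret --host=10.139.158.1 --port=3364 --skip-column-names \
           --execute='SELECT * FROM customer_0020.sales' ; \
         mariadb --user=app --password=secret --host=10.139.158.1 --port=3365 --skip-column-names \
           --execute='SELECT * FROM customer_0030.sales' ) | sort -u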
Operating a MaxScale sharding system

In this chapter we discuss some points that can be useful for operating a MariaDB MaxScale sharding system.

Do-on-all-shards

Since it can happen time and again that O/S or database operations have to be executed on all shards, it would certainly make sense to create a script that executes the same command on all shards in turn:

shell> ./do-on-all-shards.sh --sql='SHOW DATABASES'

A script of this kind should greatly reduce the error rate in operation. Operations such as re-sharding a tenant, as described further below, are also best scripted and executed centrally.
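What follows is a minimal sketch of such a script, not the original one: the shard list (ports 3363 to 3366 on 10.139.158.1, as used in this set-up) and the app credentials are assumptions and have to be adapted:

shell> cat do-on-all-shards.sh
#!/bin/bash
# Minimal sketch: run the same SQL statement on all shards, one after the other.
# Shard list and credentials are assumptions taken from the examples in this article.

SQL="${1#--sql=}"   # accept exactly one argument of the form --sql='...'

for PORT in 3363 3364 3365 3366 ; do
  echo "=== shard at 10.139.158.1:${PORT} ==="
  mariadb --user=app --password=secret --host=10.139.158.1 --port=${PORT} --execute="${SQL}"
done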

Invalidating the database map cache

The database map cache of the MariaDB MaxScale SchemaRouter can be invalidated with the invalidate command. This gives us the possibility of quickly bringing the cache up to date again after adding or removing tenants.

shell> maxctrl call command schemarouter invalidate Sharded-Service
OK

In contrast to the invalidate command, where the entries are brought up to date after the next refresh_interval, the clear command deletes the entries and a remap is executed immediately.
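The clear command is called in the same way (it is also used in the examples further below):

shell> maxctrl call command schemarouter clear Sharded-Service
OK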

If you want to invalidate the database map cache remotely with a REST API call, this works as follows (the example uses the clear endpoint):

shell> curl -i -X POST -u api_admin:secret http://10.139.158.211:8989/v1/maxscale/modules/schemarouter/clear?Sharded-Service
HTTP/1.1 204 No Content
Connection: close
Date: Mon, 18 Mar 24 11:49:58 GMT
X-Frame-Options: Deny
X-XSS-Protection: 1
Referrer-Policy: same-origin
Cache-Control: no-cache

Sources:


How to change SchemaRouter variables dynamically?

The refresh_interval specifies the lifetime of the entries in the SchemaRouter database map cache. The default value is 300 s (5 min). Refresh interval is, in my opinion, an unfortunate term, as it does not define the interval between two mappings but the lifetime of the cache entries (lifetime? timeout?). As soon as an entry has been deleted, a new refresh of the "database map" is triggered on every shard. The query currently looks as follows:

SELECT LOWER(t.table_schema), LOWER(t.table_name) FROM information_schema.tables t
UNION ALL
SELECT LOWER(s.schema_name), '' FROM information_schema.schemata s
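To see what the SchemaRouter receives from a single shard, this mapping query can also be run manually against one of the shards (here shard2, with the connection parameters used elsewhere in this article):

shell> mariadb --user=app --password=secret --host=10.139.158.1 --port=3364 --execute="
  SELECT LOWER(t.table_schema), LOWER(t.table_name) FROM information_schema.tables t
  UNION ALL
  SELECT LOWER(s.schema_name), '' FROM information_schema.schemata s"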

As it appears, a simple connect is enough to trigger the refresh of the database map.

The current value of refresh_interval can be queried as follows:

shell> maxctrl show service Sharded-Service | grep refresh_interval | awk -F'│' '{ print $3 }'
 "refresh_interval": "300000ms",

The following command helps to change the value dynamically:

shell> MAXCTRL_WARNINGS=0 maxctrl alter service Sharded-Service refresh_interval=10s
OK

The value should not be set too small, as all other connections are halted during the mapping process.

Sources:


Adding and removing a tenant

Adding a new tenant to a shard is not a big problem:

SQL> CREATE DATABASE customer_0029;
SQL> use customer_0029
SQL> CREATE TABLE address LIKE customer_template.address;
SQL> CREATE TABLE sales LIKE customer_template.sales;

shell> maxctrl call command schemarouter invalidate Sharded-Service
OK

Removing a tenant from a shard, on the other hand, is somewhat more complicated and must be done in consultation with the application:

SQL> DROP DATABASE customer_0011;

shell> ./sharding_test.php
.....ERROR: Table 'customer_0011.sales' doesn't exist...ERROR: Unknown database 'customer_0011'.ERROR: Unknown database 'customer_0011'......ERROR: Unknown database 'customer_0011'...

shell> maxctrl call command schemarouter clear Sharded-Service
OK

At least I have not yet come up with a smarter way. See also Moving a tenant below.

Moving a tenant

The combination of adding and removing would then be moving a tenant from one shard to another shard, also called re-sharding. Here, too, a concerted action must be planned together with the application.

If this is not possible, at least the time during which the application receives errors can be reduced... The following procedure can be used to move a tenant from shard 2 to shard 3:

SQL> use customer_0020; LOCK TABLES address READ, sales READ;   -- On shard 2; the application may be blocked!

shell> mariadb-dump --user=app --password=secret --host=10.139.158.1 --port=3364 --single-transaction --skip-add-locks --databases customer_0020 | mariadb --user=app --password=secret --host=10.139.158.1 --port=3365   # Copy tenant 20 from shard 2 to shard 3.

SQL> DROP DATABASE customer_0020;   -- Deleting tenant 20 does not work yet!
ERROR 1192 (HY000): Can't execute the given command because you have active locked tables or an active transaction

SQL> UNLOCK TABLES; DROP DATABASE customer_0020;   -- This is how tenant 20 is deleted.

shell> maxctrl call command schemarouter clear Sharded-Service   # Update the MaxScale database map. Be quick!!!
Until the database map has been refreshed, the following errors can occur:

error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.address' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); 'customer_0020.sales' found on servers 'shard2','shard3' for user 'app'@'10.139.158.1'.
error : (47621) [schemarouter] (Sharded-Service); Duplicate tables found, closing session.

And on the application side to:

ERROR: Error: duplicate tables found on two different shards
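
Scripted, and under the assumption of the hosts, ports and credentials used above, the whole move could look roughly like this sketch (illustrative, not part of the original set-up):

shell> cat move-tenant.sh
#!/bin/bash
# Sketch: move tenant customer_0020 from shard 2 (port 3364) to shard 3 (port 3365).
# Hosts, ports and credentials are assumptions from the examples in this article.
# Note: the interactive LOCK TABLES step from the manual procedure is omitted here;
# the application must be quiesced instead.

# 1. Copy the tenant to the target shard.
mariadb-dump --user=app --password=secret --host=10.139.158.1 --port=3364 \
  --single-transaction --skip-add-locks --databases customer_0020 \
| mariadb --user=app --password=secret --host=10.139.158.1 --port=3365

# 2. Drop the tenant on the source shard.
mariadb --user=app --password=secret --host=10.139.158.1 --port=3364 \
  --execute='DROP DATABASE customer_0020'

# 3. Update the MaxScale database map as quickly as possible.
maxctrl call command schemarouter clear Sharded-Service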
Adding or removing a shard

Moving a tenant from one shard to another shard is the small re-sharding. It becomes somewhat more complex when you want to add new shards or remove old shards. Following that (after adding or before removing, respectively), a big re-sharding would take place. First, extending the cluster by one shard:

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID          │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 0           │ Running │ 0-3363-26014  │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 0           │ Running │ 0-3364-240612 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 0           │ Running │ 0-3365-289873 │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The prepared shard is made known to MaxScale:

shell> maxctrl create server shard4 10.139.158.1 3366
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 1           │ Running │ 0-3363-23676 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-52321 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-39751 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 0           │ Down    │              │                  │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Then the new shard is linked to the MaxScale monitor and the service:

shell> MAXCTRL_WARNINGS=0 maxctrl link monitor Sharding-Monitor shard4
OK

shell> MAXCTRL_WARNINGS=0 maxctrl link service Sharded-Service shard4
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 1           │ Running │ 0-3363-24961 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-56215 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-45177 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-32    │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Whether this second step is also strictly necessary was not investigated.

The whole process can be followed in the MariaDB MaxScale error log:

warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. Servers described in the journal are different from the ones configured on the current monitor.
warning: Saving runtime modifications to 'Sharding-Monitor' in '/var/lib/maxscale/maxscale.cnf.d/Sharding-Monitor.cnf'. The modified values will override the values found in the static configuration files.
notice : shard4 sent version string '10.11.7-MariaDB-log'. Detected type: MariaDB, version: 10.11.7.
notice : Server 'shard4' charset: latin1_swedish_ci
notice : Server changed state: shard4[10.139.158.1:3366]: server_up. [Down] -> [Running]
warning: Saving runtime modifications to 'Sharded-Service' in '/var/lib/maxscale/maxscale.cnf.d/Sharded-Service.cnf'. The modified values will override the values found in the static configuration files.
notice : Added 'shard4' to 'Sharded-Service'

What we must not forget here is to also equip the new shard with the proxy protocol:

shell> maxctrl show server shard4 | grep proxy
│              │     "proxy_protocol": false,                 │

shell> MAXCTRL_WARNINGS=0 maxctrl alter server shard4 proxy_protocol=true
OK

And now new tenants can be added on the new shard, or old tenants can be moved to the new shard... In our set-up we want to move all tenants from shard 1 to shard 4 and also create a new tenant customer_0040 on shard 4. The individual steps required for this are listed above.

Once shard 1 has been emptied, it can be dismantled:

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 1           │ Running │ 0-3363-25916 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-62887 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-54035 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-2247  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

A shard is deleted with the destroy server command. But before this works, the shard must be removed from the monitor and from the service:

shell> MAXCTRL_WARNINGS=0 maxctrl unlink service Sharded-Service shard1
OK

shell> MAXCTRL_WARNINGS=0 maxctrl unlink monitor Sharding-Monitor shard1
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard1 │ 10.139.158.1 │ 3363 │ 0           │ Running │              │                  │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-64394 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-56072 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-3267  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

Once the shard has been detached from the monitor and the service, it can then be deleted:

shell> maxctrl destroy server shard1
Warning: Object 'shard1' is defined in a static configuration file and cannot be permanently deleted. If MaxScale is restarted, the object will appear again.
To hide these warnings, run:

 export MAXCTRL_WARNINGS=0

OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-65018 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-56886 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-3648  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

And the changes can be nicely followed in the MaxScale error log:

notice : Removed 'shard1' from 'Sharded-Service'
warning: Discarding journal file '/var/lib/maxscale/Sharding-Monitor_journal.json'. Servers described in the journal are different from the ones configured on the current monitor.
notice : Destroyed server 'shard1' at 10.139.158.1:3363

Source:


Adjusting the configuration files

During the shard operations described above, we received some warnings:

Warning: Object 'shard1' is defined in a static configuration file and cannot be permanently deleted. If MaxScale is restarted, the object will appear again.

and

Warning: Saving runtime modifications to 'Sharding-Monitor' in '/var/lib/maxscale/maxscale.cnf.d/Sharding-Monitor.cnf'. The modified values will override the values found in the static configuration files.

The corresponding configuration files are created automatically by MaxScale during dynamic system changes:

shell> ll /var/lib/maxscale/maxscale.cnf.d/ /etc/maxscale.cnf
-rw-r--r-- 1 root root 612 Feb 13 14:23 /etc/maxscale.cnf

/var/lib/maxscale/maxscale.cnf.d/:
total 12
-rw------- 1 maxscale maxscale 187 Feb 13 16:08 Sharding-Monitor.cnf
-rw------- 1 maxscale maxscale 150 Feb 13 16:07 Sharded-Service.cnf
-rw------- 1 maxscale maxscale  52 Feb 13 15:46 shard4.cnf

shell> cat /var/lib/maxscale/maxscale.cnf.d/*
[Sharded-Service]
debug=true
refresh_interval=10000ms
auth_all_servers=true
log_debug=true
password=secret
router=schemarouter
type=service
user=maxscale_admin
targets=shard2,shard3,shard4

[Sharding-Monitor]
module=galeramon
monitor_interval=1000ms
password=secret
servers=shard2,shard3,shard4
type=monitor
user=maxscale_monitor

[shard4]
address=10.139.158.1
port=3366
type=server

Corresponding touch-ups must therefore still be made to the configuration files. In general, one should consider whether, in a highly dynamic system, everything should not be configured dynamically via commands...

Maintenance work on a shard

If a shard is to be taken offline for maintenance work, in this example shard2, this can be accomplished as follows:

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬──────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID         │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Running │ 0-3364-69817 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running │ 0-3365-63166 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼──────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running │ 0-3366-6902  │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴──────────────┴──────────────────┘

shell> maxctrl set server shard2 drain
OK

shell> maxctrl set server shard2 maintenance
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬──────────────────────┬───────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State                │ GTID          │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 0           │ Maintenance, Running │ 0-3364-240612 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 0           │ Running              │ 0-3365-289873 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 0           │ Running              │ 0-3366-119848 │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴──────────────────────┴───────────────┴──────────────────┘

At this point, the maintenance work on the machine or the database can be carried out...

Afterwards, BOTH states must be cleared again, provided both were set:

shell> maxctrl clear server shard2 maintenance
OK

shell> maxctrl clear server shard2 drain
OK

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID          │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 0           │ Running │ 0-3364-240612 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 0           │ Running │ 0-3365-289873 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 0           │ Running │ 0-3366-119848 │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The difference between drain and maintenance is that with drain no new connections to the shard are allowed, while existing connections are waited on until they are closed. With maintenance, the connections are forcibly terminated immediately.
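When drain is used, a small watch loop can be helpful to wait until all connections to the shard have been closed before the actual maintenance starts. A sketch (an assumption, not part of the original procedure; field $5 corresponds to the Connections column of the maxctrl list servers output shown above):

shell> until [ "$(maxctrl list servers | awk -F'│' '/shard2/ { gsub(/ /, "", $5); print $5 }')" = "0" ]
do
  sleep 5
done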

Observing a MariaDB MaxScale sharding system

The state of the MariaDB MaxScale load balancer can be queried with the MaxScale CLI client maxctrl. There are numerous commands for this, mainly list and show:

shell> maxctrl show module schemarouter | head -n 12
┌─────────────┬────────────────────────────────────────────────┐
│ Module      │ schemarouter                                   │
├─────────────┼────────────────────────────────────────────────┤
│ Type        │ Router                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Version     │ V1.0.0                                         │
├─────────────┼────────────────────────────────────────────────┤
│ Maturity    │ Beta                                           │
├─────────────┼────────────────────────────────────────────────┤
│ Description │ A database sharding router for simple sharding │
├─────────────┼────────────────────────────────────────────────┤
│ Parameters  │ ...                                            │

shell> maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────┬───────────────┬──────────────────┐
│ Server │ Address      │ Port │ Connections │ State   │ GTID          │ Monitor          │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard2 │ 10.139.158.1 │ 3364 │ 4           │ Running │ 0-3364-290859 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard3 │ 10.139.158.1 │ 3365 │ 4           │ Running │ 0-3365-322671 │ Sharding-Monitor │
├────────┼──────────────┼──────┼─────────────┼─────────┼───────────────┼──────────────────┤
│ shard4 │ 10.139.158.1 │ 3366 │ 4           │ Running │ 0-3366-140018 │ Sharding-Monitor │
└────────┴──────────────┴──────┴─────────────┴─────────┴───────────────┴──────────────────┘

The information in the Connections column is confusing, since with this sharding system we in this case actually only have 1, 1 and 2 connections open on the individual shards.

But if you look at the situation on the individual shard with SHOW PROCESSLIST, you can see that MaxScale opens a connection to EVERY shard for each incoming connection. So the display above is actually technically correct, just not what you would expect:

SQL> SHOW PROCESSLIST;
+--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+
| Id     | User             | Host                 | db            | Command | Time | State    | Info                                                            | Progress |
+--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+
| 123    | root             | localhost            | customer_0021 | Query   | 0    | starting | show processlist                                                | 0.000    |
| 68107  | maxscale_monitor | 10.139.158.211:35418 | NULL          | Sleep   | 0    |          | NULL                                                            | 0.000    |
| 113372 | app              | 10.139.158.1:47548   | NULL          | Sleep   | 47   |          | NULL                                                            | 0.000    |
| 113538 | app              | 10.139.158.1:49058   | NULL          | Sleep   | 41   |          | NULL                                                            | 0.000    |
| 113662 | app              | 10.139.158.1:47072   | NULL          | Sleep   | 37   |          | NULL                                                            | 0.000    |
| 114789 | app              | 10.139.158.1:39574   | customer_0022 | Query   | 0    | Updating | UPDATE sales SET product = 'Prepare to delete' WHERE id = 15622 | 0.000    |
+--------+------------------+----------------------+---------------+---------+------+----------+-----------------------------------------------------------------+----------+

This does not scale for large systems with hundreds or thousands of tenants! Possibly the MariaDB thread pool feature can be used in this case.

According to the MaxScale developer, this is desired/intended behaviour...
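How the thread pool mentioned above could be enabled on a shard is sketched here; these are standard MariaDB server options, and the size value is only an example, not part of the original set-up:

# my.cnf on the shard (sketch)
[mariadb]
thread_handling = pool-of-threads   # use the thread pool instead of one thread per connection
#thread_pool_size = 8               # defaults to the number of CPUs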

shell> maxctrl list services ┌─────────────────┬──────────────┬─────────────┬───────────────────┬────────────────────────┐ │ Service │ Router │ Connections │ Total Connections │ Targets │ ├─────────────────┼──────────────┼─────────────┼───────────────────┼────────────────────────┤ │ Sharded-Service │ schemarouter │ 4 │ 82776 │ shard2, shard3, shard4 │ └─────────────────┴──────────────┴─────────────┴───────────────────┴────────────────────────┘ shell> maxctrl list listeners ┌──────────────────────────┬──────┬──────┬─────────┬─────────────────┐ │ Name │ Port │ Host │ State │ Service │ ├──────────────────────────┼──────┼──────┼─────────┼─────────────────┤ │ Sharded-Service-Listener │ 3306 │ :: │ Running │ Sharded-Service │ └──────────────────────────┴──────┴──────┴─────────┴─────────────────┘ shell> maxctrl list monitors ┌──────────────────┬─────────┬────────────────────────┐ │ Monitor │ State │ Servers │ ├──────────────────┼─────────┼────────────────────────┤ │ Sharding-Monitor │ Running │ shard2, shard3, shard4 │ └──────────────────┴─────────┴────────────────────────┘ shell> maxctrl show server shard2 | head -n 20 ┌─────────────────────┬──────────────────────────────────────────────┐ │ Server │ shard2 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Source │ /etc/maxscale.cnf │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Address │ 10.139.158.1 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Port │ 3364 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ State │ Running │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Version │ 10.11.7-MariaDB-log │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Uptime │ 178960 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Last Event │ server_down │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Triggered At │ Sun, 04 Feb 2024 07:37:17 GMT │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Services │ Sharded-Service │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Monitors │ Sharding-Monitor │ ├─────────────────────┼──────────────────────────────────────────────┤ ... 
├─────────────────────┼──────────────────────────────────────────────┤ │ Current Connections │ 5 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Total Connections │ 27 │ ├─────────────────────┼──────────────────────────────────────────────┤ │ Max Connections │ 5 │ shell> maxctrl show service Sharded-Service ┌─────────────────────┬──────────────────────────────────────────────────────┐ │ Service │ Sharded-Service │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Source │ /var/lib/maxscale/maxscale.cnf.d/Sharded-Service.cnf │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Router │ schemarouter │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ State │ Started │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Started At │ 3/18/2024, 1:52:30 PM │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Users Loaded At │ 3/18/2024, 1:52:30 PM │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Current Connections │ 4 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Total Connections │ 84590 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Max Connections │ 5 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Cluster │ │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Servers │ shard2 │ │ │ shard3 │ │ │ shard4 │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Services │ │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Filters │ │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Parameters │ { │ │ │ "auth_all_servers": true, │ │ │ "connection_keepalive": "300000ms", │ │ │ "debug": true, │ │ │ "disable_sescmd_history": false, │ │ │ "enable_root_user": false, │ │ │ "force_connection_keepalive": false, │ │ │ "idle_session_pool_time": "-1ms", │ │ │ "ignore_tables": [], │ │ │ "ignore_tables_regex": null, │ │ │ "localhost_match_wildcard_host": true, │ │ │ "log_auth_warnings": true, │ │ │ "log_debug": true, │ │ │ "log_info": false, │ │ │ "log_notice": false, │ │ │ "log_warning": false, │ │ │ "max_connections": 0, │ │ │ "max_sescmd_history": 50, │ │ │ "max_staleness": "150000ms", │ │ │ "multiplex_timeout": "60000ms", │ │ │ "net_write_timeout": "0ms", │ │ │ "password": "*****", │ │ │ "prune_sescmd_history": true, │ │ │ "rank": "primary", │ │ │ "refresh_databases": false, │ │ │ "refresh_interval": "10000ms", │ │ │ "retain_last_statements": -1, │ │ │ "router": "schemarouter", │ │ │ "session_trace": false, │ │ │ "strip_db_esc": true, │ │ │ "type": "service", │ │ │ "user": "maxscale_admin", │ │ │ "user_accounts_file": null, │ │ │ "user_accounts_file_usage": "add_when_load_ok", │ │ │ "version_string": null, │ │ │ "wait_timeout": "0ms" │ │ │ } │ ├─────────────────────┼──────────────────────────────────────────────────────┤ │ Router Diagnostics │ { │ │ │ "average_session": 0.028822357131634554, │ │ │ "longest_sescmd_chain": 4, │ │ │ "longest_session": 50, │ │ │ "queries": 761134, │ │ │ "sescmd_percentage": 44.44342257736483, │ │ │ "shard_map_hits": 84356, │ │ │ "shard_map_misses": 5, │ │ │ "shard_map_stale": 229, │ │ │ "shard_map_updates": 216, │ │ │ "shortest_session": 0, │ │ │ "times_sescmd_limit_exceeded": 0 │ │ │ } │ 
└─────────────────────┴──────────────────────────────────────────────────────┘

On this, see also MaxScale SchemaRouter Router diagnostics.

shell> maxctrl show monitor Sharding-Monitor ┌─────────────────────┬──────────────────────────────────────────────────────────┐ │ Monitor │ Sharding-Monitor │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Source │ /etc/maxscale.cnf │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Module │ galeramon │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ State │ Running │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Servers │ shard1 │ │ │ shard2 │ │ │ shard3 │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Parameters │ { │ │ │ "available_when_donor": false, │ │ │ "backend_connect_attempts": 1, │ │ │ "backend_connect_timeout": "3000ms", │ │ │ "backend_read_timeout": "3000ms", │ │ │ "backend_write_timeout": "3000ms", │ │ │ "disable_master_failback": false, │ │ │ "disable_master_role_setting": false, │ │ │ "disk_space_check_interval": "0ms", │ │ │ "disk_space_threshold": null, │ │ │ "events": "all,master_down,master_up,...,new_donor", │ │ │ "journal_max_age": "28800000ms", │ │ │ "module": "galeramon", │ │ │ "monitor_interval": "1000ms", │ │ │ "password": "*****", │ │ │ "root_node_as_master": false, │ │ │ "script": null, │ │ │ "script_timeout": "90000ms", │ │ │ "set_donor_nodes": false, │ │ │ "type": "monitor", │ │ │ "use_priority": false, │ │ │ "user": "maxscale_monitor" │ │ │ } │ ├─────────────────────┼──────────────────────────────────────────────────────────┤ │ Monitor Diagnostics │ { │ │ │ "disable_master_failback": false, │ │ │ "disable_master_role_setting": false, │ │ │ "root_node_as_master": false, │ │ │ "server_info": [ │ │ │ { │ │ │ "gtid_binlog_pos": "0-3363-26014", │ │ │ "gtid_current_pos": "0-3363-26014", │ │ │ "master_id": 0, │ │ │ "name": "shard1", │ │ │ "read_only": false, │ │ │ "server_id": 3363 │ │ │ }, │ │ │ { │ │ │ "gtid_binlog_pos": "0-3364-240612", │ │ │ "gtid_current_pos": "0-3364-240612", │ │ │ "master_id": 0, │ │ │ "name": "shard2", │ │ │ "read_only": false, │ │ │ "server_id": 3364 │ │ │ }, │ │ │ { │ │ │ "gtid_binlog_pos": "0-3365-289873", │ │ │ "gtid_current_pos": "0-3365-289873", │ │ │ "master_id": 0, │ │ │ "name": "shard3", │ │ │ "read_only": false, │ │ │ "server_id": 3365 │ │ │ } │ │ │ ], │ │ │ "set_donor_nodes": false, │ │ │ "use_priority": false │ │ │ } │ └─────────────────────┴──────────────────────────────────────────────────────────┘ shell> maxctrl list sessions; ┌───────┬──────┬──────────────┬───────────────────────┬───────┬─────────────────┬────────┬──────────────┐ │ Id │ User │ Host │ Connected │ Idle │ Service │ Memory │ I/O-Activity │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 87240 │ app │ 10.139.158.1 │ 3/18/2024, 2:33:54 PM │ 0 │ Sharded-Service │ 68644 │ 33 │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 72654 │ app │ 10.139.158.1 │ 3/18/2024, 2:25:27 PM │ 506.3 │ Sharded-Service │ 199328 │ 0 │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 72364 │ app │ 10.139.158.1 │ 3/18/2024, 2:25:18 PM │ 516 │ Sharded-Service │ 199328 │ 0 │ ├───────┼──────┼──────────────┼───────────────────────┼───────┼─────────────────┼────────┼──────────────┤ │ 72530 │ app │ 10.139.158.1 │ 3/18/2024, 2:25:23 PM │ 510.5 │ Sharded-Service │ 199328 │ 0 │ 
└───────┴──────┴──────────────┴───────────────────────┴───────┴─────────────────┴────────┴──────────────┘ shell> maxctrl show session 26 ┌───────────────────────┬───────────────────────────────────────┐ │ Id │ 26 │ ├───────────────────────┼───────────────────────────────────────┤ │ Service │ Sharded-Service │ ├───────────────────────┼───────────────────────────────────────┤ │ State │ Session started │ ├───────────────────────┼───────────────────────────────────────┤ │ User │ app │ ├───────────────────────┼───────────────────────────────────────┤ │ Host │ 10.139.158.1 │ ├───────────────────────┼───────────────────────────────────────┤ │ Port │ 42854 │ ├───────────────────────┼───────────────────────────────────────┤ │ Database │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Connected │ 2/4/2024, 9:31:12 AM │ ├───────────────────────┼───────────────────────────────────────┤ │ Idle │ 610.4 │ ├───────────────────────┼───────────────────────────────────────┤ │ Parameters │ { │ │ │ "log_error": false, │ │ │ "log_info": false, │ │ │ "log_notice": false, │ │ │ "log_warning": false │ │ │ } │ ├───────────────────────┼───────────────────────────────────────┤ │ Client TLS Cipher │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Connection attributes │ { │ │ │ "_client_name": "libmariadb", │ │ │ "_client_version": "3.3.8", │ │ │ "_os": "Linux", │ │ │ "_pid": "251037", │ │ │ "_platform": "x86_64", │ │ │ "_server_host": "10.139.158.211", │ │ │ "program_name": "mysql" │ │ │ } │ ├───────────────────────┼───────────────────────────────────────┤ │ Connections │ shard1 │ │ │ shard2 │ │ │ shard3 │ ├───────────────────────┼───────────────────────────────────────┤ │ Connection IDs │ 666 │ │ │ 139 │ │ │ 138 │ ├───────────────────────┼───────────────────────────────────────┤ │ Queries │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Log │ │ ├───────────────────────┼───────────────────────────────────────┤ │ Memory │ { │ │ │ "connection_buffers": { │ │ │ "backends": { │ │ │ "shard1": { │ │ │ "misc": 678, │ │ │ "readq": 65536, │ │ │ "total": 66214, │ │ │ "writeq": 0 │ │ │ }, │ │ │ "shard2": { │ │ │ "misc": 662, │ │ │ "readq": 0, │ │ │ "total": 662, │ │ │ "writeq": 0 │ │ │ }, │ │ │ "shard3": { │ │ │ "misc": 678, │ │ │ "readq": 65536, │ │ │ "total": 66214, │ │ │ "writeq": 0 │ │ │ } │ │ │ }, │ │ │ "client": { │ │ │ "misc": 654, │ │ │ "readq": 65536, │ │ │ "total": 66190, │ │ │ "writeq": 0 │ │ │ }, │ │ │ "total": 199280 │ │ │ }, │ │ │ "exec_metadata": 0, │ │ │ "last_queries": 0, │ │ │ "sescmd_history": 48, │ │ │ "total": 199328, │ │ │ "variables": 0 │ │ │ } │ ├───────────────────────┼───────────────────────────────────────┤ │ I/O Activity │ 0 │ └───────────────────────┴───────────────────────────────────────┘ shell> maxctrl show listener Sharded-Service-Listener ┌────────────┬───────────────────────────────────────────┐ │ Name │ Sharded-Service-Listener │ ├────────────┼───────────────────────────────────────────┤ │ Source │ /etc/maxscale.cnf │ ├────────────┼───────────────────────────────────────────┤ │ Service │ Sharded-Service │ ├────────────┼───────────────────────────────────────────┤ │ Parameters │ { │ │ │ "MariaDBProtocol": { │ │ │ "allow_replication": true │ │ │ }, │ │ │ "address": "::", │ │ │ "authenticator": null, │ │ │ "authenticator_options": null, │ │ │ "connection_init_sql_file": null, │ │ │ "connection_metadata": [ │ │ │ "character_set_client=auto", │ │ │ "character_set_connection=auto", │ │ │ "character_set_results=auto", │ │ │ 
"max_allowed_packet=auto", │ │ │ "system_time_zone=auto", │ │ │ "time_zone=auto", │ │ │ "tx_isolation=auto" │ │ │ ], │ │ │ "port": 3306, │ │ │ "protocol": "MariaDBProtocol", │ │ │ "proxy_protocol_networks": null, │ │ │ "service": "Sharded-Service", │ │ │ "socket": null, │ │ │ "sql_mode": "default", │ │ │ "ssl": false, │ │ │ "ssl_ca": null, │ │ │ "ssl_cert": null, │ │ │ "ssl_cert_verify_depth": 9, │ │ │ "ssl_cipher": null, │ │ │ "ssl_crl": null, │ │ │ "ssl_key": null, │ │ │ "ssl_verify_peer_certificate": false, │ │ │ "ssl_verify_peer_host": false, │ │ │ "ssl_version": "MAX", │ │ │ "type": "listener", │ │ │ "user_mapping_file": null │ │ │ } │ └────────────┴───────────────────────────────────────────┘ shell> maxctrl show module schemarouter ┌─────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ Module │ schemarouter │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Type │ Router │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Version │ V1.0.0 │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Maturity │ Beta │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Description │ A database sharding router for simple sharding │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Parameters │ [ │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Enable debug mode", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "debug", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": [], │ │ │ "description": "List of tables to ignore when checking for duplicates", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "ignore_tables", │ │ │ "type": "stringlist" │ │ │ }, │ │ │ { │ │ │ "description": "Regex of tables to ignore when checking for duplicates", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "ignore_tables_regex", │ │ │ "type": "regex" │ │ │ }, │ │ │ { │ │ │ "default_value": "150000ms", │ │ │ "description": "Maximum allowed staleness of a database map entry before clients block and wait for an update", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "max_staleness", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Refresh database mapping information", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "refresh_databases", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "300000ms", │ │ │ "description": "How often to refresh the database mapping information", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "refresh_interval", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Retrieve users from all backend servers instead of only one", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "auth_all_servers", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "300000ms", │ │ │ "description": "How ofted idle connections are pinged", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": 
"connection_keepalive", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "deprecated": true, │ │ │ "description": "Alias for 'wait_timeout'", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "connection_timeout", │ │ │ "type": "duration" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Disable session command history", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "disable_sescmd_history", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Allow the root user to connect to this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "enable_root_user", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Ping connections unconditionally", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "force_connection_keepalive", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "-1ms", │ │ │ "description": "Put connections into pool after session has been idle for this long", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "idle_session_pool_time", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "description": "Match localhost to wildcard host", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "localhost_match_wildcard_host", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "description": "Log a warning when client authentication fails", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_auth_warnings", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log debug messages for this service (debug builds only)", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_debug", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log info messages for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_info", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log notice messages for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_notice", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Log warning messages for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "log_warning", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": 0, │ │ │ "description": "Maximum number of connections", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "max_connections", │ │ │ "type": "count" │ │ │ }, │ │ │ { │ │ │ "default_value": 50, │ │ │ "description": "Session command history size", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "max_sescmd_history", │ │ │ "type": "count" │ │ │ }, │ │ │ { │ │ │ "default_value": "60000ms", │ │ │ "description": "How long a session can wait for a connection to become available", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "multiplex_timeout", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "default_value": "0ms", │ │ │ "description": "Network write timeout", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "net_write_timeout", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ }, │ │ │ { │ │ │ "description": "Password for the user used to retrieve database users", │ │ │ "mandatory": true, │ │ │ 
"modifiable": true, │ │ │ "name": "password", │ │ │ "type": "password" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "description": "Prune old session command history if the limit is exceeded", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "prune_sescmd_history", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": "primary", │ │ │ "description": "Service rank", │ │ │ "enum_values": [ │ │ │ "primary", │ │ │ "secondary" │ │ │ ], │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "rank", │ │ │ "type": "enum" │ │ │ }, │ │ │ { │ │ │ "default_value": -1, │ │ │ "description": "Number of statements kept in memory", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "retain_last_statements", │ │ │ "type": "int" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "description": "Enable session tracing for this service", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "session_trace", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": false, │ │ │ "deprecated": true, │ │ │ "description": "Track session state using server responses", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "session_track_trx_state", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "default_value": true, │ │ │ "deprecated": true, │ │ │ "description": "Strip escape characters from database names", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "strip_db_esc", │ │ │ "type": "bool" │ │ │ }, │ │ │ { │ │ │ "description": "Username used to retrieve database users", │ │ │ "mandatory": true, │ │ │ "modifiable": true, │ │ │ "name": "user", │ │ │ "type": "string" │ │ │ }, │ │ │ { │ │ │ "description": "Load additional users from a file", │ │ │ "mandatory": false, │ │ │ "modifiable": false, │ │ │ "name": "user_accounts_file", │ │ │ "type": "path" │ │ │ }, │ │ │ { │ │ │ "default_value": "add_when_load_ok", │ │ │ "description": "When and how the user accounts file is used", │ │ │ "enum_values": [ │ │ │ "add_when_load_ok", │ │ │ "file_only_always" │ │ │ ], │ │ │ "mandatory": false, │ │ │ "modifiable": false, │ │ │ "name": "user_accounts_file_usage", │ │ │ "type": "enum" │ │ │ }, │ │ │ { │ │ │ "description": "Custom version string to use", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "version_string", │ │ │ "type": "string" │ │ │ }, │ │ │ { │ │ │ "default_value": "0ms", │ │ │ "description": "Connection idle timeout", │ │ │ "mandatory": false, │ │ │ "modifiable": true, │ │ │ "name": "wait_timeout", │ │ │ "type": "duration", │ │ │ "unit": "ms" │ │ │ } │ │ │ ] │ ├─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Commands │ [ │ │ │ { │ │ │ "attributes": { │ │ │ "arg_max": 1, │ │ │ "arg_min": 1, │ │ │ "description": "Clear schemarouter shard map cache", │ │ │ "method": "POST", │ │ │ "parameters": [ │ │ │ { │ │ │ "description": "The schemarouter service", │ │ │ "required": true, │ │ │ "type": "SERVICE" │ │ │ } │ │ │ ] │ │ │ }, │ │ │ "id": "clear", │ │ │ "links": { │ │ │ "self": "http://127.0.0.1:8989/v1/modules/schemarouter/clear/" │ │ │ }, │ │ │ "type": "module_command" │ │ │ }, │ │ │ { │ │ │ "attributes": { │ │ │ "arg_max": 1, │ │ │ "arg_min": 1, │ │ │ "description": "Invalidate schemarouter shard map cache", │ │ │ "method": "POST", │ │ │ "parameters": [ │ │ │ { │ │ │ "description": "The schemarouter service", │ │ │ "required": true, │ │ │ "type": "SERVICE" │ │ │ } │ │ │ ] │ │ │ }, │ │ │ "id": "invalidate", │ │ │ 
"links": { │ │ │ "self": "http://127.0.0.1:8989/v1/modules/schemarouter/invalidate/" │ │ │ }, │ │ │ "type": "module_command" │ │ │ } │ │ │ ] │ └─────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ shell> maxctrl show commands schemarouter ┌────────────┬────────────┬──────────────────────────┐ │ Command │ Parameters │ Descriptions │ ├────────────┼────────────┼──────────────────────────┤ │ clear │ SERVICE │ The schemarouter service │ ├────────────┼────────────┼──────────────────────────┤ │ invalidate │ SERVICE │ The schemarouter service │ └────────────┴────────────┴──────────────────────────┘ shell> maxctrl show dbusers Sharded-Service ┌───────────────────────┬────────────────┬───────────────────────┬───────┬───────┬────────┬───────┬──────┐ │ User │ Host │ Plugin │ TLS │ Super │ Global │ Proxy │ Role │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ PUBLIC │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ app │ 10.139.158.% │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ app_role │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ mariadb.sys │ localhost │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_admin │ 10.139.158.210 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_admin │ 10.139.158.211 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_admin_role │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_monitor │ 10.139.158.210 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_monitor │ 10.139.158.211 │ mysql_native_password │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ maxscale_monitor_role │ │ │ false │ false │ false │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ mysql │ localhost │ mysql_native_password │ false │ true │ true │ false │ │ ├───────────────────────┼────────────────┼───────────────────────┼───────┼───────┼────────┼───────┼──────┤ │ root │ localhost │ mysql_native_password │ false │ true │ true │ false │ │ └───────────────────────┴────────────────┴───────────────────────┴───────┴───────┴────────┴───────┴──────┘ shell> maxctrl show commands mariadbmon ┌───────────────────────────┬─────────────────────────────┬───────────────────────────────────────────────────────────────────────────────┐ │ Command │ Parameters │ Descriptions │ 
├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ switchover │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ switchover-force │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-switchover │ MONITOR, [SERVER], [SERVER] │ Monitor name, New primary (optional), Current primary (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ failover │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-failover │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ rejoin │ MONITOR, SERVER │ Monitor name, Joining server │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-rejoin │ MONITOR, SERVER │ Monitor name, Joining server │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ reset-replication │ MONITOR, [SERVER] │ Monitor name, Primary server (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-reset-replication │ MONITOR, [SERVER] │ Monitor name, Primary server (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ release-locks │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-release-locks │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ fetch-cmd-result │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ cancel-cmd │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-add-node │ MONITOR, STRING, STRING │ Monitor name, Hostname/IP of node to add to ColumnStore cluster, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-remove-node │ MONITOR, STRING, STRING │ Monitor name, Hostname/IP of node to remove from ColumnStore cluster, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ cs-get-status │ MONITOR │ Monitor name │ 
├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-get-status │ MONITOR │ Monitor name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-start-cluster │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-stop-cluster │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-set-readonly │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-cs-set-readwrite │ MONITOR, STRING │ Monitor name, Timeout │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-rebuild-server │ MONITOR, SERVER, [SERVER] │ Monitor name, Target server, Source server (optional) │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-create-backup │ MONITOR, SERVER, STRING │ Monitor name, Source server, Backup name │ ├───────────────────────────┼─────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤ │ async-restore-from-backup │ MONITOR, SERVER, STRING │ Monitor name, Target server, Backup name │ └───────────────────────────┴─────────────────────────────┴───────────────────────────────────────────────────────────────────────────────┘
Literature / Sources
Taxonomy upgrade extras: sharding, maxscale, schemarouter, load balancer, multi-tenant

dbstat for MariaDB (and MySQL)

Shinguz - Thu, 2024-03-14 15:36

An idea that I have been thinking about for a long time and have now, thanks to a customer, finally tackled is dbstat for MariaDB/MySQL. The idea is based on sar/sysstat by Sebastien Godard:

sar - Collect, report, or save system activity information.

and Oracle Statspack:

Statspack is a performance tuning tool ... to quickly gather detailed analysis of the performance of that database instance.

Functionality of dbstat

Although we have had the performance schema for some time, it does not cover some points that we see as a problem in practice and that are requested by customers:

  • The table_size module collects data on the growth of tables. This allows statements to be made about the growth of individual tables, databases, future MariaDB Catalogs or the entire instance. This is interesting for users who are using multi-tenant systems or are otherwise struggling with uncontrolled growth.
  • The processlist module takes a snapshot of the process list at regular intervals and saves it. This information is useful for post-mortem analyses if the user was too slow to save his process list or to understand how a problem has built up.
  • The problem is often caused by long-running transactions, row locks or metadata locks. These are recorded and saved by the trx_and_lck and metadata_lock modules. This means that we can see problems that we did not even notice before or we can see what led to the problem after the accident (analogous to a tachograph in a vehicle).
  • Another question that we sometimes encounter in practice is: When was which database variable changed and what did it look like before? This is covered by the global_variables module. Unfortunately, it is not possible to find out who changed the variable or why. Operational processes are required for this.
  • The last module, global_status, actually covers what sar/sysstat does. It collects the values from SHOW GLOBAL STATUS and saves them for later analysis or simply for creating graphs.

How does dbstat work

dbstat uses the database's Event Scheduler as its scheduler. On MariaDB it must first be switched on (event_scheduler = ON); on MySQL it is already switched on by default. The Event Scheduler has the advantage that jobs can be scheduled at a finer granularity, for example every 10 s, which would not be possible with the crontab.

The Event Scheduler then executes SQL/PSM code to collect the data on the one hand and to delete the data on the other, so that the dbstat database does not grow immeasurably.
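
As a minimal sketch (illustrative only, not the actual dbstat code), a collect/purge event pair for the processlist module, following the schedule in the table below, could look like this:

SET GLOBAL event_scheduler = ON;    -- required on MariaDB, default on MySQL

-- Collect: snapshot the process list once per minute.
CREATE EVENT IF NOT EXISTS dbstat.processlist_collect
ON SCHEDULE EVERY 1 MINUTE
DO
  INSERT INTO dbstat.processlist
    (ts, connection_id, user, host, db, command, time, state, query)
  SELECT NOW(), ID, USER, HOST, DB, COMMAND, TIME, STATE, INFO
    FROM information_schema.PROCESSLIST;

-- Purge: delete at most 1000 rows per run that are older than 7 days.
CREATE EVENT IF NOT EXISTS dbstat.processlist_purge
ON SCHEDULE EVERY 1 MINUTE
DO
  DELETE FROM dbstat.processlist
   WHERE ts < NOW() - INTERVAL 7 DAY
   LIMIT 1000;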

The following jobs are currently planned:

+------------------+--------------+--------------------------+--------------------------------------+------------------------------------------------+
| Module           | Collect      | Delete                   | Quantity structure                   | Remarks                                        |
+------------------+--------------+--------------------------+--------------------------------------+------------------------------------------------+
| table_size       | 1/d at 02:04 | 12/h, 1000 rows, > 31 d  | 1000 tab × 31 d = 31k rows           | Should work up to 288k tables.                 |
| processlist      | 1/min        | 1/min, 1000 rows, > 7 d  | 1000 con × 1440 min × 7 d = 10M rows | Should work up to 1000 concurrent connections. |
| trx_and_lck      | 1/min        | 1/min, 1000 rows, > 7 d  | 100 lck × 1440 min × 7 d = 1M rows   | Depends very much on the application.          |
| metadata_lock    | 1/min        | 12/h, 1000 rows, > 30 d  | 100 mdl × 1440 × 30 d = 4M rows      | Depends very much on the application.          |
| global_variables | 1/min        | never                    | 1000 rows                            | Normally this table should not grow.           |
| global_status    | 1/min        | 1/min, 1000 rows, > 30 d | 1000 rows × 1440 × 30 d = 40M rows   | Can become large?                              |
+------------------+--------------+--------------------------+--------------------------------------+------------------------------------------------+
How to install dbstat

dbstat can be downloaded from GitHub and is licensed under the GPLv2.

The installation is simple: First execute the SQL file create_user_and_db.sql. Then execute the corresponding create_*.sql files for the respective modules in the dbstat database. There are currently no direct dependencies between the modules. If you want to use a different user or a different database than dbstat, you have to take care of this yourself.
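
Assuming the default dbstat user and database, the installation could look like the following sketch. The module file name create_processlist.sql is only an example of the create_*.sql pattern mentioned above:

shell> mariadb --user=root --password < create_user_and_db.sql
shell> mariadb --user=dbstat --password dbstat < create_processlist.sql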

Query dbstat

Some possible queries on the data have already been prepared. They can be found in the query_*.sql files. Here are a few examples:

table_size

SELECT `table_schema`, `table_name`, `ts`, `table_rows`, `data_length`, `index_length`
  FROM `table_size`
 WHERE `table_catalog` = 'def'
   AND `table_schema` = 'dbstat'
   AND `table_name` = 'table_size'
 ORDER BY `ts` ASC
;
+--------------+------------+---------------------+------------+-------------+--------------+
| table_schema | table_name | ts                  | table_rows | data_length | index_length |
+--------------+------------+---------------------+------------+-------------+--------------+
| dbstat       | table_size | 2024-03-09 20:01:00 |          0 |       16384 |        16384 |
| dbstat       | table_size | 2024-03-10 17:26:33 |        310 |       65536 |        16384 |
| dbstat       | table_size | 2024-03-11 08:28:12 |        622 |      114688 |        49152 |
| dbstat       | table_size | 2024-03-12 08:02:38 |        934 |      114688 |        49152 |
| dbstat       | table_size | 2024-03-13 08:08:55 |       1247 |      278528 |        81920 |
+--------------+------------+---------------------+------------+-------------+--------------+
processlist

SELECT connection_id, ts, time, state,
       SUBSTR(REGEXP_REPLACE(REPLACE(query, "\n", ' '), '\ +', ' '), 1, 64) AS query
  FROM processlist
 WHERE command != 'Sleep'
   AND connection_id = @connection_id
 ORDER BY ts ASC
 LIMIT 5
;
+---------------+---------------------+---------+---------------------------------+---------------------------------------------+
| connection_id | ts                  | time    | state                           | query                                       |
+---------------+---------------------+---------+---------------------------------+---------------------------------------------+
|         14956 | 2024-03-09 20:21:12 |  13.042 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
|         14956 | 2024-03-09 20:22:12 |  73.045 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
|         14956 | 2024-03-09 20:23:12 | 133.044 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
|         14956 | 2024-03-09 20:24:12 | 193.044 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
|         14956 | 2024-03-09 20:25:12 | 253.041 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
+---------------+---------------------+---------+---------------------------------+---------------------------------------------+
trx_and_lck

SELECT * FROM trx_and_lck\G
*************************** 1. row ***************************
         machine_name:
        connection_id: 14815
               trx_id: 269766
                   ts: 2024-03-09 20:05:57
                 user: root
                 host: localhost
                   db: test
              command: Query
                 time: 41.000
        running_since: 2024-03-09 20:05:16
                state: Statistics
                 info: select * from test where id = 6 for update
            trx_state: LOCK WAIT
          trx_started: 2024-03-09 20:05:15
trx_requested_lock_id: 269766:821:5:7
    trx_tables_in_use: 1
    trx_tables_locked: 1
     trx_lock_structs: 2
      trx_rows_locked: 1
    trx_rows_modified: 0
            lock_mode: X
            lock_type: RECORD
    lock_table_schema: test
      lock_table_name: test
           lock_index: PRIMARY
           lock_space: 821
            lock_page: 5
             lock_rec: 7
            lock_data: 6
*************************** 2. row ***************************
         machine_name:
        connection_id: 14817
               trx_id: 269760
                   ts: 2024-03-09 20:05:57
                 user: root
                 host: localhost
                   db: test
              command: Sleep
                 time: 60.000
        running_since: 2024-03-09 20:04:57
                state:
                 info:
            trx_state: RUNNING
          trx_started: 2024-03-09 20:04:56
trx_requested_lock_id: NULL
    trx_tables_in_use: 0
    trx_tables_locked: 1
     trx_lock_structs: 2
      trx_rows_locked: 1
    trx_rows_modified: 1
            lock_mode: X
            lock_type: RECORD
    lock_table_schema: test
      lock_table_name: test
           lock_index: PRIMARY
           lock_space: 821
            lock_page: 5
             lock_rec: 7
            lock_data: 6
metadata_lock

SELECT lock_mode, ts, user, host, lock_type, table_schema, table_name, time, started, state, query
  FROM metadata_lock
 WHERE connection_id = 14347
 ORDER BY started DESC
 LIMIT 5
;
+-------------------------+---------------------+------+-----------+----------------------+--------------+------------+-------+---------------------+----------------+------------------------------------------------------+
| lock_mode               | ts                  | user | host      | lock_type            | table_schema | table_name | time  | started             | state          | query                                                |
+-------------------------+---------------------+------+-----------+----------------------+--------------+------------+-------+---------------------+----------------+------------------------------------------------------+
| MDL_SHARED_WRITE        | 2024-03-13 10:27:33 | root | localhost | Table metadata lock  | test         | test       | 1.000 | 2024-03-13 10:27:32 | Updating       | UPDATE test set data3 = MD5(id)                      |
| MDL_BACKUP_TRANS_DML    | 2024-03-13 10:27:33 | root | localhost | Backup lock          |              |            | 1.000 | 2024-03-13 10:27:32 | Updating       | UPDATE test set data3 = MD5(id)                      |
| MDL_BACKUP_ALTER_COPY   | 2024-03-13 10:22:33 | root | localhost | Backup lock          |              |            | 0.000 | 2024-03-13 10:22:33 | altering table | ALTER TABLE test DROP INDEX ts, ADD INDEX (ts, data) |
| MDL_SHARED_UPGRADABLE   | 2024-03-13 10:22:33 | root | localhost | Table metadata lock  | test         | test       | 0.000 | 2024-03-13 10:22:33 | altering table | ALTER TABLE test DROP INDEX ts, ADD INDEX (ts, data) |
| MDL_INTENTION_EXCLUSIVE | 2024-03-13 10:22:33 | root | localhost | Schema metadata lock | test         |            | 0.000 | 2024-03-13 10:22:33 | altering table | ALTER TABLE test DROP INDEX ts, ADD INDEX (ts, data) |
+-------------------------+---------------------+------+-----------+----------------------+--------------+------------+-------+---------------------+----------------+------------------------------------------------------+
global_variables

SELECT variable_name, COUNT(*) AS cnt
  FROM global_variables
 GROUP BY variable_name
HAVING COUNT(*) > 1
;
+-------------------------+-----+
| variable_name           | cnt |
+-------------------------+-----+
| innodb_buffer_pool_size |   7 |
+-------------------------+-----+

SELECT variable_name, ts, variable_value
  FROM global_variables
 WHERE variable_name = 'innodb_buffer_pool_size'
;
+-------------------------+---------------------+----------------+
| variable_name           | ts                  | variable_value |
+-------------------------+---------------------+----------------+
| innodb_buffer_pool_size | 2024-03-09 21:36:28 | 134217728      |
| innodb_buffer_pool_size | 2024-03-09 21:40:25 | 268435456      |
| innodb_buffer_pool_size | 2024-03-09 21:41:25 | 268435456      |
| innodb_buffer_pool_size | 2024-03-09 21:42:25 | 268435456      |
| innodb_buffer_pool_size | 2024-03-09 21:43:25 | 268435456      |
| innodb_buffer_pool_size | 2024-03-09 21:44:25 | 268435456      |
| innodb_buffer_pool_size | 2024-03-09 21:48:14 | 134217728      |
+-------------------------+---------------------+----------------+
global_status

SELECT s1.ts
     , s1.variable_value AS 'table_open_cache_misses'
     , s2.variable_value AS 'table_open_cache_hits'
  FROM global_status AS s1
  JOIN global_status AS s2 ON s1.ts = s2.ts
 WHERE s1.variable_name = 'table_open_cache_misses'
   AND s2.variable_name = 'table_open_cache_hits'
   AND s1.ts BETWEEN '2024-03-13 11:55:00' AND '2024-03-13 12:05:00'
 ORDER BY ts ASC
;
+---------------------+-------------------------+-----------------------+
| ts                  | table_open_cache_misses | table_open_cache_hits |
+---------------------+-------------------------+-----------------------+
| 2024-03-13 11:55:47 | 1001                    | 60711                 |
| 2024-03-13 11:56:47 | 1008                    | 61418                 |
| 2024-03-13 11:57:47 | 1015                    | 62125                 |
| 2024-03-13 11:58:47 | 1022                    | 62829                 |
| 2024-03-13 11:59:47 | 1029                    | 63533                 |
| 2024-03-13 12:00:47 | 1036                    | 64237                 |
| 2024-03-13 12:01:47 | 1043                    | 64944                 |
| 2024-03-13 12:02:47 | 1050                    | 65651                 |
| 2024-03-13 12:03:47 | 1057                    | 66355                 |
| 2024-03-13 12:04:47 | 1064                    | 67059                 |
+---------------------+-------------------------+-----------------------+
Testing

We have rolled out dbstat on our test and production systems to test it and to see whether our assumptions regarding stability and the quantity-structure calculations hold. Using it ourselves is also the best way to find out whether something is missing or the handling is impractical (eat your own dog food).

Taxonomy upgrade extras: performance, monitoring, performance monitoring, metadata lock, lock, locking, performance_schema

MariaDB/MySQL Environment MyEnv 2.1.0 has been released

Shinguz - Wed, 2024-02-28 17:22

FromDual has the pleasure to announce the release of the new version 2.1.0 of its popular MariaDB, Galera Cluster and MySQL multi-instance environment MyEnv.

The new MyEnv can be downloaded here. How to install MyEnv is described in the MyEnv Installation Guide.

In the inconceivable case that you find a bug in the MyEnv please report it to the FromDual bug tracker.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Upgrade from 1.1.x to 2.0

Please look at the MyEnv 2.0.0 Release Notes.

Upgrade from 2.0.x to 2.1.0

shell> cd ${HOME}/product
shell> tar xf /download/myenv-2.1.0.tar.gz
shell> rm -f myenv
shell> ln -s myenv-2.1.0 myenv
Plug-ins

If you are using plug-ins for showMyEnvStatus create all the links in the new directory structure:

shell> cd ${HOME}/product/myenv
shell> ln -s ../../utl/oem_agent.php plg/showMyEnvStatus/
Upgrade of the instance directory structure

From MyEnv 1.0 to 2.0 the directory structure of instances has fundamentally changed. Nevertheless MyEnv 2.0 works fine with MyEnv 1.0 directory structures.

Changes in MyEnv 2.1.0

MyEnv
  • Removed hard coded parts for running MyEnv under O/S user mariadb.
  • Function substitute_path was refactored.
  • Branch guessing improved.
  • Warnings and errors are in color now.
  • MyEnv log file is now touched to avoid problems with O/S user root.
  • O/S user mysql removed in start/stop script.
  • Checks for DB start improved.
  • /var/run replaced by the more modern location /run.
  • Should now be completely MariaDB compatible (mariadbd vs. mysqld).
  • Wrapper mysqld_safe was extended to mariadbd-safe.
  • Replaced getVersionFromMysqld by getVersionAndBranchFromDaemon and extended functionality of this function.
  • LD_LIBRARY_PATH was set to the wrong directory.
  • Reverted commit fcc93c5 from v2.0.3 related to CDPATH; it broke commands like cd log or cd etc.
  • Database mysql_innodb_cluster_metadata is hidden now.
  • Database #innodb_redo is suppressed now as well for MySQL 8.0, and hideschema is not added to every new instance any more to not overwrite the default.
  • Bug while stopping instance with missing my.cnf fixed.
  • Function getDistribution cleaned-up.
  • MySQL should now also be detected correctly from Ubuntu repository.
  • Function my_exec rewritten.
  • Debian GNU/Linux tag added for distros.
  • Function extractBranch made better to work on Debian and Ubuntu with distribution packages.
  • Oracle Linux is considered as well now.
  • Made scripts ready for new MariaDB behaviour.
  • my.cnf template adapted to newest knowledge.
  • Directory changed from /tmp to /var/tmp, code cleaned-up and renewal, PID file code and message improved in stopInstance.
  • Distributions cleaned-up and cloudlinux, rocky linux and almalinux added as centos compatible distros.

MyEnv Installer
  • Debian 10 and 11 do not support PHP 8.0 yet, fixed.
  • Unit file is copied now correctly.
  • MyEnv instance installation can now be automated.
  • Instance creation automation added.
  • my.cnf template together with installMyenv should now work without errors or warnings for MariaDB 10.5 - 11.2 and MySQL 8.0 - 8.3.
  • Command yum replaced by dnf.
  • Command apt-get comments replaced by apt.

MyEnv Utilities
  • Client utility adapted in *monitor scripts.
  • InnoDB cluster monitor added.
  • wsrep_last_committed was added in galera_monitor.sh.
  • AWR added, sharding stuff added, lock and trx analysis scripts added.
  • Memory analysis added, NUMA maps output made ready for new variables.
  • connect_maxout utility added.

For subscriptions for commercial use of MyEnv please get in contact with us.

Taxonomy upgrade extras: MyEnv, multi-instance, virtualization, consolidation, SaaS, Operations, release, mysqld_multi

We build a data warehouse from the General Query Log

Shinguz - Wed, 2024-01-31 16:41

The design of a data warehouse differs from relational design. Data warehouses are often designed according to the concept of the star schema.

When building a data warehouse, you usually work backwards, starting from the questions you want answered:

  • What questions should my data warehouse be able to answer?
  • How do I have to design my model so that my questions can be answered easily?
  • Where do I get the data to populate the model?
  • How do I fill my model with the data?

For training purposes, we have investigated an issue that arises from time to time with our support team: a system suddenly and unexpectedly starts to behave unusually, nobody has done anything and nobody knows why. An example from a customer last week: the system starts to become unstable at 3 pm, is then hard-restarted and stabilises again from 4 pm...

The easiest thing to do in such a case would be to take a quick look at the database with SHOW PROCESSLIST, which often makes it immediately clear where the problem lies. But customers often forget this, or they are not fast enough. The General Query Log was already switched on for this customer, so this is a great case for our General Query Log data warehouse!
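
For reference, the General Query Log can be switched on at runtime as follows (be aware that it can grow very quickly on busy systems):

SET GLOBAL log_output  = 'FILE';
SET GLOBAL general_log = ON;
SHOW GLOBAL VARIABLES LIKE 'general_log%';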

What questions should my data warehouse be able to answer?

The generic question for this problem should be something like: "Who or what caused my system to behave abnormally?"

In technical terms, the question would be something like:

  • Who: Which user or account was on the database with how many connections at the time in question? What was unusual about it?
  • What: Which queries were running in which schema on the system at the time in question? Which of these queries were unusual?

What should my model look like?

We can already derive some facts and dimensions from the question:

  • User or account (user + host)
  • Time
  • Connections
  • Schema
  • Queries

And this results in the fact table query_fact and the dimension tables connection_dim, date_dim, time_dim, schema_dim and statement_dim:
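
Since the model diagram is not reproduced here, the following is a minimal sketch of the tables as they can be reconstructed from the columns used in the queries below; the data types and keys are assumptions, not the original DDL:

CREATE TABLE connection_dim (
  connection_id BIGINT UNSIGNED NOT NULL PRIMARY KEY,   -- connection id from the log
  user          VARCHAR(128)    NOT NULL,
  hostname      VARCHAR(255)    NOT NULL
);

CREATE TABLE date_dim (
  date_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  date    DATE         NOT NULL,
  UNIQUE KEY (date)
);

CREATE TABLE time_dim (
  time_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  time    TIME         NOT NULL,
  UNIQUE KEY (time)
);

CREATE TABLE schema_dim (
  schema_id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  schema_name VARCHAR(64)  NOT NULL,
  UNIQUE KEY (schema_name)
);

CREATE TABLE statement_dim (
  statement_id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  command        VARCHAR(32)     NOT NULL,   -- e.g. 'Query'
  statement_text TEXT            NOT NULL    -- normalised statement
);

CREATE TABLE query_fact (
  connection_id BIGINT UNSIGNED NOT NULL,   -- FK to connection_dim
  date_id       INT UNSIGNED    NOT NULL,   -- FK to date_dim
  time_id       INT UNSIGNED    NOT NULL,   -- FK to time_dim
  schema_id     INT UNSIGNED    NOT NULL,   -- FK to schema_dim
  statement_id  BIGINT UNSIGNED NOT NULL,   -- FK to statement_dim
  KEY (date_id, time_id)
);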

Data source

Where the data comes from is relatively easy to answer in this case: the customer provides their General Query Logs, or we use the General Query Logs of our own systems for testing purposes.

How is the model populated?

Technically, this is known as an ETL process (Extract-Transform-Load). In our case, we have built a General Query Log parser that reads the General Query Log, prepares the data accordingly and saves it in the model.
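
The load step for a single parsed log line could be sketched in SQL as follows. It assumes the table sketch above with unique keys on the dimension attributes; the LAST_INSERT_ID() upsert pattern is an assumption about the implementation, not the actual parser code:

-- @ts, @connection_id, @schema_name, @command and @statement come from the parser.
INSERT INTO date_dim (date) VALUES (DATE(@ts))
    ON DUPLICATE KEY UPDATE date_id = LAST_INSERT_ID(date_id);
SET @date_id = LAST_INSERT_ID();

INSERT INTO time_dim (time) VALUES (TIME(@ts))
    ON DUPLICATE KEY UPDATE time_id = LAST_INSERT_ID(time_id);
SET @time_id = LAST_INSERT_ID();

INSERT INTO schema_dim (schema_name) VALUES (@schema_name)
    ON DUPLICATE KEY UPDATE schema_id = LAST_INSERT_ID(schema_id);
SET @schema_id = LAST_INSERT_ID();

-- Statement de-duplication is omitted for brevity.
INSERT INTO statement_dim (command, statement_text) VALUES (@command, @statement);
SET @statement_id = LAST_INSERT_ID();

INSERT INTO query_fact (connection_id, date_id, time_id, schema_id, statement_id)
VALUES (@connection_id, @date_id, @time_id, @schema_id, @statement_id);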

Checking the model

And then we come to checking the model. We used test data from one of our systems for this:

  • Which user was on the system at the time in question?
  • Which user had how many connections open at the time in question?

SELECT td.time, cd.user, COUNT(*) AS count
  FROM connection_dim cd
  JOIN query_fact AS qf ON qf.connection_id = cd.connection_id
  JOIN time_dim AS td ON td.time_id = qf.time_id
  JOIN date_dim AS dd ON dd.date_id = qf.date_id
 WHERE td.time BETWEEN '17:00' AND '18:30'
   AND dd.date = '2019-08-02'
 GROUP BY td.time, cd.user
 ORDER BY td.time ASC, cd.user
;
+----------+---------------+-------+
| time     | user          | count |
+----------+---------------+-------+
| 17:58:00 | UNKNOWN USER  |     1 |
| 17:59:00 | brman         |    58 |
| 17:59:00 | brman_catalog |    18 |
| 17:59:00 | root          |     5 |
| 18:00:00 | brman         |   296 |
| 18:00:00 | brman_catalog |     7 |
| 18:00:00 | root          |     3 |
| 18:01:00 | brman_catalog |    18 |
| 18:01:00 | root          |     3 |
| 18:06:00 | brman         |   266 |
| 18:06:00 | brman_catalog |     6 |
| 18:07:00 | brman         |    88 |
| 18:07:00 | brman_catalog |     7 |
| 18:10:00 | brman         |   211 |
| 18:10:00 | brman_catalog |    18 |
| 18:10:00 | root          |     4 |
| 18:11:00 | brman         |   141 |
| 18:11:00 | root          |     3 |
| 18:13:00 | brman         |     4 |
| 18:14:00 | brman         |   348 |
| 18:17:00 | brman         |   354 |
| 18:17:00 | brman_catalog |    12 |
| 18:17:00 | root          |     1 |
+----------+---------------+-------+
  • Which account was on the system at the time in question?
  • Which account had how many connections open at the time in question?

SELECT td.time, cd.user, cd.hostname, COUNT(*) AS count
  FROM connection_dim cd
  JOIN query_fact AS qf ON qf.connection_id = cd.connection_id
  JOIN time_dim AS td ON td.time_id = qf.time_id
  JOIN date_dim AS dd ON dd.date_id = qf.date_id
 WHERE td.time BETWEEN '17:00' AND '18:30'
   AND dd.date = '2019-08-02'
 GROUP BY td.time, cd.user, cd.hostname
 ORDER BY td.time ASC, cd.user
;
+----------+---------------+--------------+-------+
| time     | user          | hostname     | count |
+----------+---------------+--------------+-------+
| 17:58:00 | UNKNOWN USER  | UNKNOWN HOST |     1 |
| 17:59:00 | brman         | localhost    |    58 |
| 17:59:00 | brman_catalog | localhost    |    18 |
| 17:59:00 | root          | localhost    |     5 |
| 18:00:00 | brman         | localhost    |   296 |
| 18:00:00 | brman_catalog | localhost    |     7 |
| 18:00:00 | root          | localhost    |     3 |
| 18:01:00 | brman_catalog | localhost    |    18 |
| 18:01:00 | root          | localhost    |     3 |
| 18:06:00 | brman         | localhost    |   266 |
| 18:06:00 | brman_catalog | localhost    |     6 |
| 18:07:00 | brman         | localhost    |    88 |
| 18:07:00 | brman_catalog | localhost    |     7 |
| 18:10:00 | brman         | localhost    |   211 |
| 18:10:00 | brman_catalog | localhost    |    18 |
| 18:10:00 | root          | localhost    |     4 |
| 18:11:00 | brman         | localhost    |   141 |
| 18:11:00 | root          | localhost    |     3 |
| 18:13:00 | brman         | localhost    |     4 |
| 18:14:00 | brman         | localhost    |   348 |
| 18:17:00 | brman         | localhost    |   354 |
| 18:17:00 | brman_catalog | localhost    |    12 |
| 18:17:00 | root          | localhost    |     1 |
+----------+---------------+--------------+-------+
  • What was unusual about it?

SELECT cd.user, td.time, COUNT(*) AS count
  FROM connection_dim cd
  JOIN query_fact AS qf ON qf.connection_id = cd.connection_id
  JOIN time_dim AS td ON td.time_id = qf.time_id
  JOIN date_dim AS dd ON dd.date_id = qf.date_id
 WHERE td.time BETWEEN '17:00' AND '18:30'
   AND dd.date = '2019-08-02'
 GROUP BY td.time, cd.user
 ORDER BY cd.user ASC, td.time ASC
;
+---------------+----------+-------+
| user          | time     | count |
+---------------+----------+-------+
| brman         | 17:59:00 |    58 |
| brman         | 18:00:00 |   296 |
| brman         | 18:06:00 |   266 |
| brman         | 18:07:00 |    88 |
| brman         | 18:10:00 |   211 |
| brman         | 18:11:00 |   141 |
| brman         | 18:13:00 |     4 |
| brman         | 18:14:00 |   348 |
| brman         | 18:17:00 |   354 |
| brman_catalog | 17:59:00 |    18 |
| brman_catalog | 18:00:00 |     7 |
| brman_catalog | 18:01:00 |    18 |
| brman_catalog | 18:06:00 |     6 |
| brman_catalog | 18:07:00 |     7 |
| brman_catalog | 18:10:00 |    18 |
| brman_catalog | 18:17:00 |    12 |
| root          | 17:59:00 |     5 |
| root          | 18:00:00 |     3 |
| root          | 18:01:00 |     3 |
| root          | 18:10:00 |     4 |
| root          | 18:11:00 |     3 |
| root          | 18:17:00 |     1 |
| UNKNOWN USER  | 17:58:00 |     1 |
+---------------+----------+-------+

One could deduce here, for example, that the user brman had a relatively large number of open connections during the period in question. Whether this is unusual is hard to say: we either have too little data, or the observed time period is too short.
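
To make "unusual" measurable at all, one could, as a hedged sketch on top of this model, flag every minute in which a user's query count exceeds a multiple of that user's own average (the factor 3 is an arbitrary assumption):

WITH per_minute AS (
  SELECT cd.user, td.time, COUNT(*) AS cnt
    FROM connection_dim AS cd
    JOIN query_fact AS qf ON qf.connection_id = cd.connection_id
    JOIN time_dim AS td ON td.time_id = qf.time_id
   GROUP BY cd.user, td.time
)
-- Flag minutes that are far above the user's own average
SELECT pm.user, pm.time, pm.cnt
  FROM per_minute AS pm
 WHERE pm.cnt > 3 * (SELECT AVG(a.cnt) FROM per_minute AS a WHERE a.user = pm.user)
 ORDER BY pm.user, pm.time;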

  • Which queries were running on the system at the time in question and in which schema?
  • Which of these queries were unusual?

SELECT sd.schema_name, td.time, SUBSTR(std.statement_text, 1, 128) AS query
  FROM query_fact AS qf
  JOIN time_dim AS td ON td.time_id = qf.time_id
  JOIN schema_dim AS sd ON sd.schema_id = qf.schema_id
  JOIN statement_dim AS std ON std.statement_id = qf.statement_id
 WHERE td.time BETWEEN '17:00' AND '18:30'
   AND sd.schema_name = 'brman_catalog'
   AND std.command = 'Query'
 ORDER BY td.time, qf.statement_id
 LIMIT 10
;
+---------------+----------+----------------------------------------------------------------------------------------------------------------------------------+
| schema_name   | time     | query                                                                                                                            |
+---------------+----------+----------------------------------------------------------------------------------------------------------------------------------+
| brman_catalog | 17:59:00 | SET NAMES `utf8`                                                                                                                 |
| brman_catalog | 17:59:00 | SELECT COUNT ( * ) AS `cnt` FROM `information_schema` . `tables` WHERE `table_schema` = ? AND TABLE_NAME = ?                     |
| brman_catalog | 17:59:00 | SELECT COUNT ( * ) AS `cnt` FROM `information_schema` . `tables` WHERE `table_schema` = ? AND TABLE_NAME = ?                     |
| brman_catalog | 17:59:00 | CREATE TABLE `metadata` ( `id` TINYINT UNSIGNED NOT NULL AUTO_INCREMENT , `key` VARCHARACTER (?) NOT NULL , `value` VARCHARACTER |
| brman_catalog | 17:59:00 | INSERT INTO `metadata` ( `key` , `value` ) VALUES (...)                                                                         |
| brman_catalog | 17:59:00 | INSERT INTO `metadata` ( `key` , `value` ) VALUES (...)                                                                         |
| brman_catalog | 17:59:00 | CREATE TABLE `backups` ( `id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT , `instance_name` VARCHARACTER (?) NOT NULL , `start_ts`  |
| brman_catalog | 17:59:00 | CREATE TABLE `backup_details` ( `backup_id` INTEGER UNSIGNED NOT NULL , `hostname` VARCHARACTER (?) NULL , `binlog_file` VARCHAR |
| brman_catalog | 17:59:00 | CREATE TABLE `files` ( `id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT , `schema_name` VARCHARACTER (?) NULL , `original_name` VAR |
| brman_catalog | 17:59:00 | CREATE TABLE `binary_logs` ( `id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT , `filename` VARCHARACTER (?) NOT NULL , `begin_ts` I |
+---------------+----------+----------------------------------------------------------------------------------------------------------------------------------+
Suggestions for improvement

Based on this first iteration of the model, you can already see which questions the model cannot yet answer, or where it is still too imprecise. This can then be improved in a second iteration...

Examples of this are:

  • The granularity of the time dimension, at one-minute accuracy, may be too coarse. Would it make more sense to use seconds (see the sketch after this list)?
  • The question of how long a connection was open is not so easy to answer. Perhaps a further fact table would be appropriate here? For now, the duration has to be pieced together from the Connect and Quit events:

SELECT cd.connection_number, cd.user, cd.hostname
     , tdf.time AS time_from, tdt.time AS time_to
     , (UNIX_TIMESTAMP(tdt.time) - UNIX_TIMESTAMP(tdf.time)) AS duration
  FROM connection_dim AS cd
  JOIN query_fact AS qf1 ON cd.connection_id = qf1.connection_id
  JOIN time_dim AS tdf ON tdf.time_id = qf1.time_id
  JOIN statement_dim AS sdf ON sdf.statement_id = qf1.statement_id
  JOIN query_fact AS qf2 ON cd.connection_id = qf2.connection_id
  JOIN time_dim AS tdt ON tdt.time_id = qf2.time_id
  JOIN statement_dim AS sdt ON sdt.statement_id = qf2.statement_id
 WHERE tdf.time BETWEEN '17:00' AND '18:30'
   AND sdf.command = 'Connect'
   AND sdt.command = 'Quit'
   AND (UNIX_TIMESTAMP(tdt.time) - UNIX_TIMESTAMP(tdf.time)) > 0
 ORDER BY tdf.time
;
+-------------------+-------+-----------+-----------+----------+----------+
| connection_number | user  | hostname  | time_from | time_to  | duration |
+-------------------+-------+-----------+-----------+----------+----------+
|               211 | brman | localhost | 17:59:00  | 18:00:00 |       60 |
|               215 | root  | localhost | 18:00:00  | 18:17:00 |     1020 |
|               219 | brman | localhost | 18:06:00  | 18:07:00 |       60 |
|               225 | brman | localhost | 18:10:00  | 18:11:00 |       60 |
|               226 | brman | localhost | 18:13:00  | 18:14:00 |       60 |
+-------------------+-------+-----------+-----------+----------+----------+
  • Of course, it would also be exciting to let an AI loose on the problem. How do you train it correctly, and does it actually find the problem once it has been trained?
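
For the first point, the change itself would be small; the price is only a larger time dimension (86400 rows for one row per second instead of 1440 for one per minute). A sketch, assuming the time_dim layout from the DDL sketch above and MariaDB's Sequence storage engine:

-- One row per second of the day instead of one per minute
-- (seq_0_to_86399 is a virtual table provided by the Sequence engine).
INSERT INTO time_dim (time)
SELECT SEC_TO_TIME(seq) FROM seq_0_to_86399;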

So much for this little exercise in building a data warehouse...

Taxonomy upgrade extras: data warehouse, general query log
