dbstat for MariaDB (and MySQL)
An idea I have been mulling over for a long time, and have now finally tackled thanks to a customer, is dbstat for MariaDB/MySQL. The idea is based on sar/sysstat by Sebastien Godard:
sar - Collect, report, or save system activity information.
and Oracle Statspack:
Statspack is a performance tuning tool ... to quickly gather detailed analysis of the performance of that database instance.
Functionality of dbstat
Although we have had the Performance Schema for some time now, it does not cover some points that we consider a problem in practice and that customers ask for:
- The table_size module collects data on the growth of tables. This allows statements to be made about the growth of individual tables, of databases, of future MariaDB Catalogs or of the entire instance. This is interesting for users who run multi-tenant systems or are otherwise struggling with uncontrolled growth.
- The processlist module takes a snapshot of the process list at regular intervals and saves it. This information is useful for post-mortem analyses if the user was too slow to save their process list, or to understand how a problem built up over time.
- The problem is often caused by long-running transactions, row locks or metadata locks. These are recorded and saved by the trx_and_lck and metadata_lock modules. This means we can see problems that we did not even notice before, or we can see what led to the problem after the accident (analogous to a tachograph in a vehicle).
- Another question that we sometimes encounter in practice is: when was which database variable changed, and what did it look like before? This is covered by the global_variables module. Unfortunately, it is not possible to find out who changed the variable or why; operational processes are required for that.
- The last module, global_status, covers roughly what sar/sysstat does: it collects the values from SHOW GLOBAL STATUS and saves them for later analysis or simply to create graphs.
How does dbstat work
dbstat uses the database's Event Scheduler as its scheduler. On MariaDB this must first be switched on (event_scheduler = ON); on MySQL it is already on by default. The Event Scheduler has the advantage that jobs can be scheduled at a finer granularity, for example every 10 s, which would not be possible with the crontab.
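Enabling the scheduler is a one-liner; to survive a restart, the setting also belongs in the configuration file:

-- Switch the Event Scheduler on at runtime (MariaDB; not persistent across restarts)
SET GLOBAL event_scheduler = ON;

-- Verify that the scheduler is running
SHOW GLOBAL VARIABLES LIKE 'event_scheduler';

-- To make it permanent, additionally set event_scheduler = ON
-- in the [mysqld] section of the MariaDB configuration file.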
The Event Scheduler then executes SQL/PSM code that collects the data on the one hand and purges old data on the other, so that the dbstat database does not grow without bounds.
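Purely to illustrate the mechanism, a collect/purge pair for the processlist module could look roughly like the sketch below. The event names, the column list and the INSERT source are assumptions made for illustration; the real events ship with dbstat's create_*.sql files and may differ:

-- Hypothetical sketch only, not dbstat's actual event definitions.
-- Collect: snapshot the current process list once per minute.
CREATE EVENT IF NOT EXISTS processlist_collect_sketch
ON SCHEDULE EVERY 1 MINUTE
DO
  INSERT INTO processlist (connection_id, ts, command, time, state, query)
  SELECT id, NOW(), command, time, state, info
    FROM information_schema.PROCESSLIST;

-- Purge: delete at most 1000 rows per run that are older than 7 days.
CREATE EVENT IF NOT EXISTS processlist_purge_sketch
ON SCHEDULE EVERY 1 MINUTE
DO
  DELETE FROM processlist
  WHERE ts < NOW() - INTERVAL 7 DAY
  LIMIT 1000;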
The following jobs are currently planned:
| Module | Collect | Delete | Quantity structure | Remarks |
|---|---|---|---|---|
| table_size | 1/d at 02:04 | 12/h, 1000 rows, > 31 d | 1000 tab × 31 d = 31k rows | Should work up to 288k tables. |
| processlist | 1/min | 1/min, 1000 rows, > 7 d | 1000 con × 1440 min × 7 d = 10M rows | Should work up to 1000 concurrent connections. |
| trx_and_lck | 1/min | 1/min, 1000 rows, > 7 d | 100 lck × 1440 min × 7 d = 1M rows | Depends very much on the application. |
| metadata_lock | 1/min | 12/h, 1000 rows, > 30 d | 100 mdl × 1440 min × 30 d = 4M rows | Depends very much on the application. |
| global_variables | 1/min | never | 1000 rows | Normally this table should not grow. |
| global_status | 1/min | 1/min, 1000 rows, > 30 d | 1000 rows × 1440 min × 30 d = 40M rows | The number of rows can become large. |
How to install dbstat
dbstat can be downloaded from GitHub and is licensed under the GPLv2.
The installation is simple: first execute the SQL file create_user_and_db.sql, then execute the corresponding create_*.sql files for the respective modules in the dbstat database. There are currently no direct dependencies between the modules. If you want to use a different user or a different database than dbstat, you have to take care of this yourself.
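In the mariadb/mysql command-line client, the whole procedure could look roughly like this. The per-module file names below merely follow the create_*.sql pattern and are assumed for illustration, so check the actual names in the repository:

-- Create the dbstat user and the dbstat database
SOURCE create_user_and_db.sql

-- Load the desired modules into the dbstat database
-- (module file names below are assumed)
USE dbstat
SOURCE create_table_size.sql
SOURCE create_processlist.sql
SOURCE create_global_status.sql

Afterwards, SHOW EVENTS FROM dbstat should list the collect and purge jobs from the table above, provided the Event Scheduler is switched on.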
Query dbstat
Some possible queries on the data have already been prepared. They can be found in the query_*.sql files. Here are a few examples:
table_size
SELECT `table_schema`, `table_name`, `ts`, `table_rows`, `data_length`, `index_length`
FROM `table_size`
WHERE `table_catalog` = 'def'
AND `table_schema` = 'dbstat'
AND `table_name` = 'table_size'
ORDER BY `ts` ASC
;
+--------------+------------+---------------------+------------+-------------+--------------+
| table_schema | table_name | ts | table_rows | data_length | index_length |
+--------------+------------+---------------------+------------+-------------+--------------+
| dbstat | table_size | 2024-03-09 20:01:00 | 0 | 16384 | 16384 |
| dbstat | table_size | 2024-03-10 17:26:33 | 310 | 65536 | 16384 |
| dbstat | table_size | 2024-03-11 08:28:12 | 622 | 114688 | 49152 |
| dbstat | table_size | 2024-03-12 08:02:38 | 934 | 114688 | 49152 |
| dbstat | table_size | 2024-03-13 08:08:55 | 1247 | 278528 | 81920 |
+--------------+------------+---------------------+------------+-------------+--------------+
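Since the snapshots are taken once a day, growth rates can be derived directly from them. One possible variant, assuming MariaDB 10.2 or newer for the window function:

SELECT `table_schema`, `table_name`, `ts`
     , `data_length` + `index_length` AS total_bytes
     , `data_length` + `index_length`
       - LAG(`data_length` + `index_length`)
           OVER (PARTITION BY `table_schema`, `table_name` ORDER BY `ts`) AS growth_bytes
FROM `table_size`
WHERE `table_catalog` = 'def'
AND `table_schema` = 'dbstat'
ORDER BY `table_schema`, `table_name`, `ts` ASC
;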
processlist
SELECT connection_id, ts, time, state
     , SUBSTR(REGEXP_REPLACE(REPLACE(query, "\n", ' '), '\ +', ' '), 1, 64) AS query
FROM processlist
WHERE command != 'Sleep'
AND connection_id = @connection_id
ORDER BY ts ASC
LIMIT 5
;
+---------------+---------------------+---------+---------------------------------+----------------------------------------------+
| connection_id | ts                  | time    | state                           | query                                        |
+---------------+---------------------+---------+---------------------------------+----------------------------------------------+
|         14956 | 2024-03-09 20:21:12 |  13.042 | Waiting for table metadata lock | update test set data = 'bla' where id = 100  |
|         14956 | 2024-03-09 20:22:12 |  73.045 | Waiting for table metadata lock | update test set data = 'bla' where id = 100  |
|         14956 | 2024-03-09 20:23:12 | 133.044 | Waiting for table metadata lock | update test set data = 'bla' where id = 100  |
|         14956 | 2024-03-09 20:24:12 | 193.044 | Waiting for table metadata lock | update test set data = 'bla' where id = 100  |
|         14956 | 2024-03-09 20:25:12 | 253.041 | Waiting for table metadata lock | update test set data = 'bla' where id = 100  |
+---------------+---------------------+---------+---------------------------------+----------------------------------------------+
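The same table can also be scanned without knowing a connection id beforehand, for example to find the longest-running statements in the retained history (same columns as above):

SELECT ts, connection_id, time, state
     , SUBSTR(REGEXP_REPLACE(REPLACE(query, "\n", ' '), '\ +', ' '), 1, 64) AS query
FROM processlist
WHERE command != 'Sleep'
ORDER BY time DESC
LIMIT 10
;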
trx_and_lck
SELECT * FROM trx_and_lck\G
*************************** 1. row ***************************
         machine_name:
        connection_id: 14815
               trx_id: 269766
                   ts: 2024-03-09 20:05:57
                 user: root
                 host: localhost
                   db: test
              command: Query
                 time: 41.000
        running_since: 2024-03-09 20:05:16
                state: Statistics
                 info: select * from test where id = 6 for update
            trx_state: LOCK WAIT
          trx_started: 2024-03-09 20:05:15
trx_requested_lock_id: 269766:821:5:7
    trx_tables_in_use: 1
    trx_tables_locked: 1
     trx_lock_structs: 2
      trx_rows_locked: 1
    trx_rows_modified: 0
            lock_mode: X
            lock_type: RECORD
    lock_table_schema: test
      lock_table_name: test
           lock_index: PRIMARY
           lock_space: 821
            lock_page: 5
             lock_rec: 7
            lock_data: 6
*************************** 2. row ***************************
         machine_name:
        connection_id: 14817
               trx_id: 269760
                   ts: 2024-03-09 20:05:57
                 user: root
                 host: localhost
                   db: test
              command: Sleep
                 time: 60.000
        running_since: 2024-03-09 20:04:57
                state:
                 info:
            trx_state: RUNNING
          trx_started: 2024-03-09 20:04:56
trx_requested_lock_id: NULL
    trx_tables_in_use: 0
    trx_tables_locked: 1
     trx_lock_structs: 2
      trx_rows_locked: 1
    trx_rows_modified: 1
            lock_mode: X
            lock_type: RECORD
    lock_table_schema: test
      lock_table_name: test
           lock_index: PRIMARY
           lock_space: 821
            lock_page: 5
             lock_rec: 7
            lock_data: 6
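Because a snapshot contains both the waiting and the holding transaction with the same lock coordinates (lock_space, lock_page, lock_rec), the pair can be matched up again afterwards. A sketch based only on the columns shown above, assuming both rows come from the same snapshot (same ts):

SELECT w.ts
     , w.connection_id AS waiting_connection
     , b.connection_id AS blocking_connection
     , w.info          AS waiting_query
     , b.trx_started   AS blocking_trx_started
FROM trx_and_lck AS w
JOIN trx_and_lck AS b
  ON b.ts = w.ts
 AND b.lock_space = w.lock_space
 AND b.lock_page = w.lock_page
 AND b.lock_rec = w.lock_rec
 AND b.connection_id != w.connection_id
WHERE w.trx_state = 'LOCK WAIT'
;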
metadata_lock
SELECT lock_mode, ts, user, host, lock_type, table_schema, table_name, time, started, state, query
FROM metadata_lock
WHERE connection_id = 14347
ORDER BY started DESC
LIMIT 5
;
+-------------------------+---------------------+------+-----------+----------------------+--------------+------------+-------+---------------------+----------------+------------------------------------------------------+
| lock_mode               | ts                  | user | host      | lock_type            | table_schema | table_name | time  | started             | state          | query                                                |
+-------------------------+---------------------+------+-----------+----------------------+--------------+------------+-------+---------------------+----------------+------------------------------------------------------+
| MDL_SHARED_WRITE        | 2024-03-13 10:27:33 | root | localhost | Table metadata lock  | test         | test       | 1.000 | 2024-03-13 10:27:32 | Updating       | UPDATE test set data3 = MD5(id)                      |
| MDL_BACKUP_TRANS_DML    | 2024-03-13 10:27:33 | root | localhost | Backup lock          |              |            | 1.000 | 2024-03-13 10:27:32 | Updating       | UPDATE test set data3 = MD5(id)                      |
| MDL_BACKUP_ALTER_COPY   | 2024-03-13 10:22:33 | root | localhost | Backup lock          |              |            | 0.000 | 2024-03-13 10:22:33 | altering table | ALTER TABLE test DROP INDEX ts, ADD INDEX (ts, data) |
| MDL_SHARED_UPGRADABLE   | 2024-03-13 10:22:33 | root | localhost | Table metadata lock  | test         | test       | 0.000 | 2024-03-13 10:22:33 | altering table | ALTER TABLE test DROP INDEX ts, ADD INDEX (ts, data) |
| MDL_INTENTION_EXCLUSIVE | 2024-03-13 10:22:33 | root | localhost | Schema metadata lock | test         |            | 0.000 | 2024-03-13 10:22:33 | altering table | ALTER TABLE test DROP INDEX ts, ADD INDEX (ts, data) |
+-------------------------+---------------------+------+-----------+----------------------+--------------+------------+-------+---------------------+----------------+------------------------------------------------------+
global_variables
SELECT variable_name, COUNT(*) AS cnt
FROM global_variables
GROUP BY variable_name
HAVING COUNT(*) > 1
;
+-------------------------+-----+
| variable_name           | cnt |
+-------------------------+-----+
| innodb_buffer_pool_size |   7 |
+-------------------------+-----+

SELECT variable_name, ts, variable_value
FROM global_variables
WHERE variable_name = 'innodb_buffer_pool_size'
;
+-------------------------+---------------------+----------------+
| variable_name           | ts                  | variable_value |
+-------------------------+---------------------+----------------+
| innodb_buffer_pool_size | 2024-03-09 21:36:28 | 134217728      |
| innodb_buffer_pool_size | 2024-03-09 21:40:25 | 268435456      |
| innodb_buffer_pool_size | 2024-03-09 21:48:14 | 134217728      |
+-------------------------+---------------------+----------------+
global_status
SELECT s1.ts
     , s1.variable_value AS 'table_open_cache_misses'
     , s2.variable_value AS 'table_open_cache_hits'
FROM global_status AS s1
JOIN global_status AS s2 ON s1.ts = s2.ts
WHERE s1.variable_name = 'table_open_cache_misses'
AND s2.variable_name = 'table_open_cache_hits'
AND s1.ts BETWEEN '2024-03-13 11:55:00' AND '2024-03-13 12:05:00'
ORDER BY ts ASC
;
+---------------------+-------------------------+-----------------------+
| ts                  | table_open_cache_misses | table_open_cache_hits |
+---------------------+-------------------------+-----------------------+
| 2024-03-13 11:55:47 | 1001                    | 60711                 |
| 2024-03-13 11:56:47 | 1008                    | 61418                 |
| 2024-03-13 11:57:47 | 1015                    | 62125                 |
| 2024-03-13 11:58:47 | 1022                    | 62829                 |
| 2024-03-13 11:59:47 | 1029                    | 63533                 |
| 2024-03-13 12:00:47 | 1036                    | 64237                 |
| 2024-03-13 12:01:47 | 1043                    | 64944                 |
| 2024-03-13 12:02:47 | 1050                    | 65651                 |
| 2024-03-13 12:03:47 | 1057                    | 66355                 |
| 2024-03-13 12:04:47 | 1064                    | 67059                 |
+---------------------+-------------------------+-----------------------+
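Most GLOBAL STATUS values are cumulative counters, so for graphing the difference between consecutive snapshots is usually more useful than the absolute value. A possible variant, again assuming MariaDB 10.2 or newer for the window function:

SELECT ts
     , CAST(variable_value AS UNSIGNED) AS misses
     , CAST(variable_value AS UNSIGNED)
       - LAG(CAST(variable_value AS UNSIGNED)) OVER (ORDER BY ts) AS misses_per_interval
FROM global_status
WHERE variable_name = 'table_open_cache_misses'
AND ts BETWEEN '2024-03-13 11:55:00' AND '2024-03-13 12:05:00'
ORDER BY ts ASC
;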
Testing
We have now rolled out dbstat on our test and production systems in order to test it and to see whether our assumptions regarding stability and the calculated quantity structure are correct. In addition, using it ourselves is the best way to find out whether something is missing or the handling is impractical (eat your own dog food).