You are here

innodb_deadlock_detect - Rather Hands off!

Recently we had a new customer who has had from time to time massive database problems which he did not understand. When we reviewed the MySQL configuration file (my.cnf) we found, that this customer had disabled the InnoDB Deadlock detection (innodb_deadlock_detect).

Because we have advised against doing this so far, but I never stumbled upon this problem in practice, I have investigated a bit more about the MySQL variable innodb_deadlock_detect.

The MySQL documentation tells us the following [1]:

Disabling Deadlock Detection
On high concurrency systems, deadlock detection can cause a slowdown when numerous threads wait for the same lock. At times, it may be more efficient to disable deadlock detection and rely on the innodb_lock_wait_timeout setting for transaction rollback when a deadlock occurs. Deadlock detection can be disabled using the innodb_deadlock_detect configuration option.

And about the parameter innodb_deadlock_detect itself [2] itself:

This option is used to disable deadlock detection. On high concurrency systems, deadlock detection can cause a slowdown when numerous threads wait for the same lock. At times, it may be more efficient to disable deadlock detection and rely on the innodb_lock_wait_timeout setting for transaction rollback when a deadlock occurs.

The problem is, that every time, when MySQL is doing a (Row) Lock or a Table Lock, it checks, if the Lock causes a Deadlock. This check is quite expensive. By the way: The feature disabling InnoDB Deadlock detection was developed by Facebook for WebScaleSQL [3].

The relevant functions can be found in [4]:

class DeadlockChecker, method check_and_resolve (DeadlockChecker::check_and_resolve)

Every InnoDB (row) Lock (for mode LOCK_S or LOCK_X) and type ORed with LOCK_GAP or
LOCK_REC_NOT_GAP, ORed with LOCK_INSERT_INTENTION

Enqueue a waiting request for a lock which cannot be granted immediately.

lock_rec_enqueue_waiting()

and

Every (InnoDB) Table Lock

Enqueues a waiting request for a table lock which cannot be granted immediately. Checks for deadlocks.

lock_table_enqueue_waiting()

This means if the variable innodb_deadlock_detect is enabled (= default) for every Lock (Row or Table) it is checked, if it causes a Deadlock. If the variable is disabled, the check is NOT done (which is faster) and the transaction hangs in (Dead-)Lock until the Lock is freed or the time innodb_lock_wait_timeout (default 50 seconds) is exceeded. Then the InnoDB Lock Wait Timeout (detector?) strikes and kills the transaction.

SQL> SHOW GLOBAL VARIABLES LIKE 'innodb_lock_wait%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_lock_wait_timeout | 50    |
+--------------------------+-------+

This means, deactivating InnoDB Deadlock detection is interesting, if you have many (like Facebook!?!) short and small transactions where you expect little to now conflicts.
Further it is recommended, to set the MySQL variable innodb_lock_wait_timeout to a very small value (some seconds).

Because most of our customers do not have the size of Facebook and because they have rather not so many concurrent short and small transactions but few but long transactions (with probably many Locks and therefore a high Deadlock probability), I can imagine, disabling this parameter was responsible for the hickup (Locks are piling up) of the customer system. Which leads to exceeding max_connections and finally the whole system sticks.

Therefore I strongly recommend, to let InnoDB Deadlock detection enabled. Except you know exactly what your are doing (after about 2 weeks of extensive testing and measuring).

Literature

Comments

The InnoDB deadlock detector was rewritten in MySQL 5.7.1, in a merge commit that does not mention any WL or bug number, nor any reviewer. I did not review those changes, and I cannot say whether the algorithmic complexity of the deadlock checker was reduced. Possibly the rewrite replaced procedural recursion with iteration over a data structure that explicitly represents a stack.

I think that the InnoDB locking performance could be significantly improved by extending the use of implicit locking to UPDATE and DELETE operations, as noted in MDEV-16232. It could also be helpful to merge multiple record log bitmaps into a single bitmap per page, with 4 bits per record, as noted in MDEV-16406.

Marko Mäkelä, Lead Developer InnoDB
MariaDB Corporation

Marko Mäkeläcomment