You are here

Beware of large MySQL max_sort_length parameter

Today we had a very interesting phenomena at a customer. He complained that MySQL always get some errors of the following type:

[ERROR] mysqld: Sort aborted: Error writing file '/tmp/MYGbBrpA' (Errcode: 28 - No space left on device)

After a first investigation we found that df -h /tmp shows from time to time a full disk but we could not see any file with ls -la /tmp/MY*.

After some more investigation we found even the query from the Slow Query Log which was producing the same problem. It looked similar to this query:

SELECT * FROM test ORDER BY field5, field4, field3, field2, field1;

Now we were capable to simulate the problem at will with the following table:

CREATE TABLE `test` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `data` varchar(64) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `field1` varchar(16) DEFAULT NULL,
  `field2` varchar(16) DEFAULT NULL,
  `field3` varchar(255) DEFAULT NULL,
  `field4` varchar(255) DEFAULT NULL,
  `field5` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8912746 DEFAULT CHARSET=utf8
;

An we have seen the query in SHOW PROCESSLIST:

| Query   |   26 | Creating sort index | select * from test order by field5, field4, field3, field2, field1 |

But we were still not capable to see who or better how the hell mysqld is filling our disk!

I remembered further that I have seen some strange settings in the my.cnf before when we did the review of the database configuration. But I ignored them somehow.

[mysqld]
max_sort_length  = 8M
sort_buffer_size = 20M

Now I remembered again these settings. We changed max_sort_length back to default 1k and suddenly our space problems disappeared!

We played a bit around with different values of max_sort_length and got the following execution times for our query:

max_sort_lengthexecution time [s]comment
64 8.8 s
128 8.2 s
256 9.3 s
512 11.8 s
1k 14.9 s
2k 20.0 s
8k129.0 s
8M 75.0 sdisk full (50 G)

Conclusion

We set the values of max_sort_length back to the defaults. Our problems disappeared and we got working and much faster SELECT queries.

Do not needlessly change default values of MySQL without proving the impact. It can become worse than before!!!

The default value of max_sort_length is a good compromise between performance and an appropriate sort length.

Addendum

What I really did not like on this solution was, that I did not understand the way the problem occurred. So I did some more investigation in this. We were discussing forth and back if this could be because of XFS, because of sparse files or some kind of memory mapped files (see also man mmap).

At the end I had the idea to look at the lsof command during my running query:

mysql> SELECT * FROM test ORDER BY field5, field4, field3, field2, field1;
ERROR 3 (HY000): Error writing file '/tmp/MYBuWcXP' (Errcode: 28 - No space left on device)

shell> lsof -p 14733

COMMAND   PID  USER   FD   TYPE             DEVICE   SIZE/OFF     NODE NAME
mysqld  14733 mysql   32u   REG               8,18 9705619456 30147474 /tmp/MYck8vf4 (deleted)
mysqld  14733 mysql   49u   REG               8,18  749797376 30147596 /tmp/MYBuWcXP (deleted)

So it looks like that there were some deleted files which were growing!

Further information from the IRC channel led me to the libc temporary files (see also man 3 tmpfile).

And some hints from MadMerlin|work pointed me to:

shell> ls /proc//fd

Where you can also see those temporary files.

Thanks to MadMerlin|work for the hints!

Taxonomy upgrade extras: 

Comments

For this phenomenon, the relevant documentation is here: deleting files. Close to the top, see this sentence:
The unlink function deletes the file name filename.
If this is a file’s sole name, the file itself is also deleted.
(Actually, if any process has the file open when this happens,
deletion is postponed until all processes have closed the file.)

This is the standard programming technique to ensure that temporary files are automatically cleaned up on a program or machine crash. Often this subject comes up as the difference in the output of "du" (follows directory entries) and "df" (checks file system space information).

Jörg Brühe

joergcomment

Another idea we discussed (but does not fit to the problem) is lost space due to over-mounted directories. If you have some files in this directory they are not shown any more (= hidden) because they are over-mounted but they are still there an used space. To look into an over-mounted directory to see the files in there the following recipe might help:
cd /mnt
touch 1 2 3
cd
mount /dev/sdd1 /mnt -> files are hidden now

mount --bind / /mnt2
ls /mnt2/mnt
1  2  3
Shinguzcomment