MySQL Cluster overview
This is a chaotic collection of my MySQL Cluster experience…
Table of Contents
- config.ini template
- my.cnf template
- General Rules and/or experience
- MySQL Cluster restore
- Skript for converting tables to NDB (alter_engine.pl)
- MySQL Cluster memory sizing
config.ini template
A generic MySQL Cluster configuration file (config.ini) to start with. It is pretty much what the MySQL Cluster experts recommend right now:
#
# config.ini
#
# This configuration file is fore MySQL Clusters 6.2 and above...
# ----------------------------------------------------------------------
[TCP DEFAULT]
# Default is too small!
SendBufferMemory = 2M
ReceiveBufferMemory = 2M
# When this is configured together with section above ndb_mgmd will
# return with erro -1 (255). This is a bug and should be fixed earlier
# or later!
# When you move this section to the bottom it should work.
# You need one TCP section for EACH cluster node pair!
# (for example: 4 nodes = 6 sections)
# [TCP]
#
# NodeId1: 10
# NodeId2: 11
# Hostname1: 10.40.219.161
# Hostname2: 10.40.219.162
# ----------------------------------------------------------------------
[NDB_MGMD DEFAULT]
DataDir = /var/lib/mysql/cluster
# ----------------------------------------------------------------------
[NDB_MGMD]
Id = 1
HostName = 192.168.1.33
#[NDB_MGMD]
#
# Id = 2
# HostName = 192.168.1.34
# ----------------------------------------------------------------------
[NDBD DEFAULT]
# needs root privileges?
LockPagesInMainMemory = 1
MemReportFrequency = 600
# Everything else is not supported
NoOfReplicas = 2
# Should be the same on all data node for eas of maintenance
DataDir = /var/lib/mysql/cluster
# needs root privileges?
ODirect = 1
# Memory sizing either from ndb_size.pl, calculation or ALL DUMP 1000
DataMemory = 128M
# IndexMemory = DataMemory / (5 - 10)
IndexMemory = 16M
# REDO log sizing
# 256M is the new recommendation
FragmentLogFileSize = 256M
# 6 x DataMemory / (4 x FragmentLogFileSize)
# >= 6.4
# 6 x DataMemory x NO_OF_LQH_THREADS / (4 * FragmentLogFileSize)
NoOfFragmentLogFiles = 3
# should be around 16 - 64 MB default is too small (128 MB in extreme cases)
RedoBuffer = 32M
# No calculation available
SharedGlobalMemory = 64M
# 1 second data loss in max.
TimeBetweenGlobalCheckpoints = 1000
# leave as default to start with
# this is too big under low load (better 6-10)
TimeBetweenLocalCheckpoints = 20
# On a good I/O system you can set this value higher
Diskcheckpointspeed = 10M
# Backup and LCP related parameters
# This means full speed
Diskcheckpointspeedinrestart = 100M
BackupMaxWriteSize = 1M
BackupLogBufferSize = 4M
BackupDataBufferSize = 16M
BackupMemory = 20M
# # of expected concurrent trx / # data nodes
# This memory is pre-allocated, thus keep it low!
# leave as default to start with
MaxNoOfConcurrentTransactions = 256
# This memory is pre-allocated, thus keep it low! The default (32k) is much too high!
# Every INSERT/UPDATE/DELETE (Record = Operation) uses 1 kbyte!!!
# MaxNoOfConcurrentOperations > Operations/s (10k - 100k) / number data nodes
# MaxNoOfConcurrentOperations > MaxNoOfConcurrentTransactions x Operations/Transaction / 4
# Leave as default to start with. Use LIMIT <n> to limit number of operation records
# If you want to be on the safe side set it to 100k - 250k
MaxNoOfConcurrentOperations = 2048
# Defaults of 64 can be fine.
# Each index consumes approx. 15 kbyte per node!
MaxNoOfUniqueHashIndexes = 160
# Each object consumes approx. 10 kbyte per node!
# Use ndb_size.pl or calculate the number of OI indexes
# Also PK and UK create 1 OI each. Thus this values is at least as big
# as MaxNoOfTables. Usually 2 - 4 times bigger.
# The default of 128 can be too small soon.
MaxNoOfOrderedIndexes = 256
# Not that each OI, UI and BLOB counts towards one table object
# Usually safe
# Each table consumes approx. 20 kbyte per node!
MaxNoOfTables = 128
# Each attribute consumes around 200 bytes of storage per node
# a) 6 x MaxNoOfTables
# b) NoOfTables x AvgAttributsPerTable
MaxNoOfAttributes = 1024
# Default of 768 is often fine.
# Minmal value = #NoOfTables + #OI + 3 x #UI
MaxNoOfTriggers = 768
# Default too big.
MaxNoOfLocalScans = 64
# TimeBetweenEpochs = 100
# CGE >= 6.3.7
# CompressedLCP = 1
# CompressedBackup = 1
# GCE >= 6.3.4
# RealTimeScheduler = 1
# SchedulerSpinTimer = 400
# SchedulerExecutionTimer = 100
# ----------------------------------------------------------------------
[NDBD]
Id = 10
HostName = 192.168.1.37
# CGE >= 6.3.7
# LockExecuteThreadToCPU = 1
# LockMaintThreadsToCPU = 0 # on same CPU as eth interupts occur
[NDBD]
Id = 11
HostName = 192.168.1.36
# CGE >= 6.3.7
# LockExecuteThreadToCPU = 1
# LockMaintThreadsToCPU = 0 # on same CPU as eth interupts occur
# ----------------------------------------------------------------------
[MYSQLD DEFAULT]
[MYSQLD]
Id = 20
HostName = 192.168.1.35
# Keep some spare for later
# Prevents cluster restart
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
my.cnf template
A generic MySQL configuration file (my.cnf) to start with MySQL Cluster.
[mysqld]
# NDB Cluster related parameter
ndbcluster = 1
# Use IP no DNS name
ndb_connectstring = "host1;host2"
ndb_cluster_connection_pool = 5
# Recommended default setting changes:
ndb_use_exact_count = 0
# this is default
# Some of these parameter are performance relevant but have some
# drawbacks
# multi_range_count = 256
# 0, 1, 10, 10 is usually too verbouse
# ndb_extra_logging = 1
# ndb_cache_check_time = 0
# Increase if you are doing many inserts
ndb_autoincrement_prefetch_sz = 1024
engine_condition_pushdown = 1
ndb_force_send = 1
ndb_index_stat_enable = 0
# ndb_index_stat_cache_entries = 32
# Statistics cost more than they give
# ndb_index_stat_enable = 0
# ndb_index_stat_update_freq = 20
# Leave as default
# ndb_report_thresh_binlog_epoch_slip = 3
# ndb_report_thresh_binlog_mem_usage = 10
# ndb_use_copying_alter_table = 0
# ndb_use_transactions = 1
Genral rules and/or experience
MySQL Cluster v5.0
- Avoid ALTER TABLE in MySQL Cluster.
It needs 2 x the RAM of a table (memory is reclaimed only after a node restart).
Do
ALTER TABLE <table> ENGINE=MyISAM;instead, then the changes and then alter back to the cluster. DROP/TRUNCATEreclaims pages.UPDATE/INSERT/DELETEcauses memory de-fragmentation (same symptoms like a memory leak) ⇒ requires a node restart from time to time.- For production use is still and only v5.0 (exception CGE)!
MySQL Cluster v5.1
On disk table ⇒ is stable now.
Parallel node recovery is not yet implemented in v5.1 ⇒ maybe 5.2/6.0?
Partitioning does not work from the SQL interface (mysqld) but from the NDB-API.
No more memory de-fragmentation is done (see above).
Cluster to MySQL replication is still buggy! But should work.
Cluster to Cluster replication is stable.
In Cluster replication:
binlog_cache_size = TimeBetweenGlobalCheckpoints/1000 (default 2000) (default 32 kbyte) * total traffic (Mbyte/s) (for example 5 Mbyte/s) = 2000 ms / 1000 ms/s * 5 Mbyte/s ≌ 10 Mybtepdflush(kswapd)flushes every 5 seconds pages to disk ⇒ Causes troubles with cluster. The kernel parametersvm.page-clusterandvm.lower-zone-protection(later one does not exist on my maybe non-NUMA system) can help: /# cat /proc/sys/vm/lower_zone_protection # cat /proc/meminfo # sysctl vm | egrep 'page-cluster|lower-zone-protection'For MySQL Cluster on AMD Hardware set NUMA to OFF in BIOS! ⇒ Preferred hardware: Intel!
File system: Use ext3 or XFS.

