MySQL 8.0 Reference Manual Including MySQL NDB Cluster 8.0

25.4.3.6 Defining NDB Cluster Data Nodes

The [ndbd] and [ndbd default] sections are used to configure the behavior of the cluster's data nodes.

[ndbd] and [ndbd default] are always used as the section names whether you are using ndbd or ndbmtd binaries for the data node processes.

There are many parameters which control buffer sizes, pool sizes, timeouts, and so forth. The only mandatory parameter is HostName; this must be defined in the local [ndbd] section.

The parameter NoOfReplicas should be defined in the [ndbd default] section, as it is common to all Cluster data nodes. It is not strictly necessary to set NoOfReplicas, but it is good practice to set it explicitly.

Most data node parameters are set in the [ndbd default] section. Only those parameters explicitly stated as being able to set local values are permitted to be changed in the [ndbd] section. Where present, HostName and NodeId must be defined in the local [ndbd] section, and not in any other section of config.ini. In other words, settings for these parameters are specific to one data node.

For those parameters affecting memory usage or buffer sizes, it is possible to use K, M, or G as a suffix to indicate units of 1024, 1024×1024, or 1024×1024×1024. (For example, 100K means 100 × 1024 = 102400.)

Parameter names and values are case-insensitive, unless used in a MySQL Server my.cnf or my.ini file, in which case they are case-sensitive.

Information about configuration parameters specific to NDB Cluster Disk Data tables can be found later in this section (see Disk Data Configuration Parameters).

All of these parameters also apply to ndbmtd (the multithreaded version of ndbd). Three additional data node configuration parameters—MaxNoOfExecutionThreads, ThreadConfig, and NoOfFragmentLogParts—apply to ndbmtd only; these have no effect when used with ndbd. For more information, see Multi-Threading Configuration Parameters (ndbmtd). See also Section 25.5.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”.

Identifying data nodes.  The NodeId or Id value (that is, the data node identifier) can be allocated on the command line when the node is started or in the configuration file.

Data Memory, Index Memory, and String Memory

DataMemory and IndexMemory are [ndbd] parameters specifying the size of memory segments used to store the actual records and their indexes. In setting values for these, it is important to understand how DataMemory is used, as it usually needs to be updated to reflect actual usage by the cluster.

Note

IndexMemory is deprecated, and subject to removal in a future version of NDB Cluster. See the descriptions that follow for further information.

The following example illustrates how memory is used for a table. Consider this table definition:

CREATE TABLE example (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL,
  PRIMARY KEY(a),
  UNIQUE(b)
) ENGINE=NDBCLUSTER;

For each record, there are 12 bytes of data plus 12 bytes overhead. Having no nullable columns saves 4 bytes of overhead. In addition, we have two ordered indexes on columns a and b consuming roughly 10 bytes each per record. There is a primary key hash index on the base table using roughly 29 bytes per record. The unique constraint is implemented by a separate table with b as primary key and a as a column. This other table consumes an additional 29 bytes of index memory per record in the example table as well 8 bytes of record data plus 12 bytes of overhead.

Thus, for one million records, we need 58MB for index memory to handle the hash indexes for the primary key and the unique constraint. We also need 64MB for the records of the base table and the unique index table, plus the two ordered index tables.

You can see that hash indexes takes up a fair amount of memory space; however, they provide very fast access to the data in return. They are also used in NDB Cluster to handle uniqueness constraints.

Currently, the only partitioning algorithm is hashing and ordered indexes are local to each node. Thus, ordered indexes cannot be used to handle uniqueness constraints in the general case.

An important point for both IndexMemory and DataMemory is that the total database size is the sum of all data memory and all index memory for each node group. Each node group is used to store replicated information, so if there are four nodes with two fragment replicas, there are two node groups. Thus, the total data memory available is 2 × DataMemory for each data node.

It is highly recommended that DataMemory and IndexMemory be set to the same values for all nodes. Data distribution is even over all nodes in the cluster, so the maximum amount of space available for any node can be no greater than that of the smallest node in the cluster.

DataMemory can be changed, but decreasing it can be risky; doing so can easily lead to a node or even an entire NDB Cluster that is unable to restart due to there being insufficient memory space. Increasing these values should be acceptable, but it is recommended that such upgrades are performed in the same manner as a software upgrade, beginning with an update of the configuration file, and then restarting the management server followed by restarting each data node in turn.

MinFreePct.  A proportion (5% by default) of data node resources including DataMemory is kept in reserve to insure that the data node does not exhaust its memory when performing a restart. This can be adjusted using the MinFreePct data node configuration parameter (default 5).

Version (or later) NDB 8.0.13
Type or units unsigned
Default 5
Range 0 - 100
Restart Type

Node Restart: Requires a rolling restart of the cluster. (NDB 8.0.13)

Updates do not increase the amount of index memory used. Inserts take effect immediately; however, rows are not actually deleted until the transaction is committed.

Transaction parameters.  The next few [ndbd] parameters that we discuss are important because they affect the number of parallel transactions and the sizes of transactions that can be handled by the system. MaxNoOfConcurrentTransactions sets the number of parallel transactions possible in a node. MaxNoOfConcurrentOperations sets the number of records that can be in update phase or locked simultaneously.

Both of these parameters (especially MaxNoOfConcurrentOperations) are likely targets for users setting specific values and not using the default value. The default value is set for systems using small transactions, to ensure that these do not use excessive memory.

MaxDMLOperationsPerTransaction sets the maximum number of DML operations that can be performed in a given transaction.

Transaction temporary storage.  The next set of [ndbd] parameters is used to determine temporary storage when executing a statement that is part of a Cluster transaction. All records are released when the statement is completed and the cluster is waiting for the commit or rollback.

The default values for these parameters are adequate for most situations. However, users with a need to support transactions involving large numbers of rows or operations may need to increase these values to enable better parallelism in the system, whereas users whose applications require relatively small transactions can decrease the values to save memory.

Transaction resource allocation parameters.  The parameters in the following list are used to allocate transaction resources in the transaction coordinator (DBTC). Leaving any one of these set to the default (0) dedicates transaction memory for 25% of estimated total data node usage for the corresponding resource. The actual maximum possible values for these parameters are typically limited by the amount of memory available to the data node; setting them has no impact on the total amount of memory allocated to the data node. In addition, you should keep in mind that they control numbers of reserved internal records for the data node independent of any settings for MaxDMLOperationsPerTransaction, MaxNoOfConcurrentIndexOperations, MaxNoOfConcurrentOperations, MaxNoOfConcurrentScans, MaxNoOfConcurrentTransactions, MaxNoOfFiredTriggers, MaxNoOfLocalScans, or TransactionBufferMemory (see Transaction parameters and Transaction temporary storage).

Scans and buffering.  There are additional [ndbd] parameters in the Dblqh module (in ndb/src/kernel/blocks/Dblqh/Dblqh.hpp) that affect reads and updates. These include ZATTRINBUF_FILESIZE, set by default to 10000 × 128 bytes (1250KB) and ZDATABUF_FILE_SIZE, set by default to 10000*16 bytes (roughly 156KB) of buffer space. To date, there have been neither any reports from users nor any results from our own extensive tests suggesting that either of these compile-time limits should be increased.

Memory Allocation

MaxAllocate

Version (or later) NDB 8.0.13
Type or units unsigned
Default 32M
Range 1M - 1G
Deprecated NDB 8.0.27
Restart Type

Node Restart: Requires a rolling restart of the cluster. (NDB 8.0.13)

This parameter was used in older versions of NDB Cluster, but has no effect in NDB 8.0. It is deprecated as of NDB 8.0.27, and subject to removal in a future release.

Multiple Transporters

Beginning with version 8.0.20, NDB allocates multiple transporters for communication between pairs of data nodes. The number of transporters so allocated can be influenced by setting an appropriate value for the NodeGroupTransporters parameter introduced in that release.

NodeGroupTransporters

Version (or later) NDB 8.0.20
Type or units integer
Default 0
Range 0 - 32
Added NDB 8.0.20
Restart Type

Node Restart: Requires a rolling restart of the cluster. (NDB 8.0.13)

This parameter determines the number of transporters used between nodes in the same node group. The default value (0) means that the number of transporters used is the same as the number of LDMs in the node. This should be sufficient for most use cases; thus it should seldom be necessary to change this value from its default.

Setting NodeGroupTransporters to a number greater than the number of LDM threads or the number of TC threads, whichever is higher, causes NDB to use the maximum of these two numbers of threads. This means that a value greater than this is effectively ignored.

Hash Map Size

DefaultHashMapSize

Version (or later) NDB 8.0.13
Type or units LDM threads
Default 240
Range 0 - 3840
Restart Type

Node Restart: Requires a rolling restart of the cluster. (NDB 8.0.13)

The original intended use for this parameter was to facilitate upgrades and especially downgrades to and from very old releases with differing default hash map sizes. This is not an issue when upgrading from NDB Cluster 7.3 (or later) to later versions.

Decreasing this parameter online after any tables have been created or modified with DefaultHashMapSize equal to 3840 is not currently supported.

Logging and checkpointing.  The following [ndbd] parameters control log and checkpoint behavior.

Metadata objects.  The next set of [ndbd] parameters defines pool sizes for metadata objects, used to define the maximum number of attributes, tables, indexes, and trigger objects used by indexes, events, and replication between clusters.

Note

These act merely as suggestions to the cluster, and any that are not specified revert to the default values shown.

Boolean parameters.  The behavior of data nodes is also affected by a set of [ndbd] parameters taking on boolean values. These parameters can each be specified as TRUE by setting them equal to 1 or Y, and as FALSE by setting them equal to 0 or N.

Controlling Timeouts, Intervals, and Disk Paging

There are a number of [ndbd] parameters specifying timeouts and intervals between various actions in Cluster data nodes. Most of the timeout values are specified in milliseconds. Any exceptions to this are mentioned where applicable.

The heartbeat interval between management nodes and data nodes is always 100 milliseconds, and is not configurable.

Buffering and logging.  Several [ndbd] configuration parameters enable the advanced user to have more control over the resources used by node processes and to adjust various buffer sizes at need.

These buffers are used as front ends to the file system when writing log records to disk. If the node is running in diskless mode, these parameters can be set to their minimum values without penalty due to the fact that disk writes are faked by the NDB storage engine's file system abstraction layer.

Controlling log messages.  In managing the cluster, it is very important to be able to control the number of log messages sent for various event types to stdout. For each event category, there are 16 possible event levels (numbered 0 through 15). Setting event reporting for a given event category to level 15 means all event reports in that category are sent to stdout; setting it to 0 means that no event reports in that category are made.

By default, only the startup message is sent to stdout, with the remaining event reporting level defaults being set to 0. The reason for this is that these messages are also sent to the management server's cluster log.

An analogous set of levels can be set for the management client to determine which event levels to record in the cluster log.

Data Node Debugging Parameters

The following parameters are intended for use during testing or debugging of data nodes, and not for use in production.

Backup parameters.  The [ndbd] parameters discussed in this section define memory buffers set aside for execution of online backups.

Note

The location of the backup files is determined by the BackupDataDir data node configuration parameter.

Additional requirements.  When specifying these parameters, the following relationships must hold true. Otherwise, the data node cannot start.

NDB Cluster Realtime Performance Parameters

The [ndbd] parameters discussed in this section are used in scheduling and locking of threads to specific CPUs on multiprocessor data node hosts.

Note

To make use of these parameters, the data node process must be run as system root.

Multi-Threading Configuration Parameters (ndbmtd).  ndbmtd runs by default as a single-threaded process and must be configured to use multiple threads, using either of two methods, both of which require setting configuration parameters in the config.ini file. The first method is simply to set an appropriate value for the MaxNoOfExecutionThreads configuration parameter. A second method makes it possible to set up more complex rules for ndbmtd multithreading using ThreadConfig. The next few paragraphs provide information about these parameters and their use with multithreaded data nodes.

Note

A backup using parallelism on the data nodes requires that multiple LDMs are in use on all data nodes in the cluster prior to taking the backup. For more information, see Section 25.6.8.5, “Taking an NDB Backup with Parallel Data Nodes”, as well as Section 25.5.23.3, “Restoring from a backup taken in parallel”.

Disk Data Configuration Parameters.  Configuration parameters affecting Disk Data behavior include the following:

Disk Data and GCP Stop errors.  Errors encountered when using Disk Data tables such as Node nodeid killed this node because GCP stop was detected (error 2303) are often referred to as GCP stop errors. Such errors occur when the redo log is not flushed to disk quickly enough; this is usually due to slow disks and insufficient disk throughput.

You can help prevent these errors from occurring by using faster disks, and by placing Disk Data files on a separate disk from the data node file system. Reducing the value of TimeBetweenGlobalCheckpoints tends to decrease the amount of data to be written for each global checkpoint, and so may provide some protection against redo log buffer overflows when trying to write a global checkpoint; however, reducing this value also permits less time in which to write the GCP, so this must be done with caution.

In addition to the considerations given for DiskPageBufferMemory as explained previously, it is also very important that the DiskIOThreadPool configuration parameter be set correctly; having DiskIOThreadPool set too high is very likely to cause GCP stop errors (Bug #37227).

GCP stops can be caused by save or commit timeouts; the TimeBetweenEpochsTimeout data node configuration parameter determines the timeout for commits. However, it is possible to disable both types of timeouts by setting this parameter to 0.

Parameters for configuring send buffer memory allocation.  Send buffer memory is allocated dynamically from a memory pool shared between all transporters, which means that the size of the send buffer can be adjusted as necessary. (Previously, the NDB kernel used a fixed-size send buffer for every node in the cluster, which was allocated when the node started and could not be changed while the node was running.) The TotalSendBufferMemory and OverLoadLimit data node configuration parameters permit the setting of limits on this memory allocation. For more information about the use of these parameters (as well as SendBufferMemory), see Section 25.4.3.14, “Configuring NDB Cluster Send Buffer Parameters”.

See also Section 25.6.7, “Adding NDB Cluster Data Nodes Online”.

Redo log over-commit handling.  It is possible to control a data node's handling of operations when too much time is taken flushing redo logs to disk. This occurs when a given redo log flush takes longer than RedoOverCommitLimit seconds, more than RedoOverCommitCounter times, causing any pending transactions to be aborted. When this happens, the API node that sent the transaction can handle the operations that should have been committed either by queuing the operations and re-trying them, or by aborting them, as determined by DefaultOperationRedoProblemAction. The data node configuration parameters for setting the timeout and number of times it may be exceeded before the API node takes this action are described in the following list:

Controlling restart attempts.  It is possible to exercise finely-grained control over restart attempts by data nodes when they fail to start using the MaxStartFailRetries and StartFailRetryDelay data node configuration parameters.

MaxStartFailRetries limits the total number of retries made before giving up on starting the data node, StartFailRetryDelay sets the number of seconds between retry attempts. These parameters are listed here:

NDB index statistics parameters.  The parameters in the following list relate to NDB index statistics generation.

Restart types.  Information about the restart types used by the parameter descriptions in this section is shown in the following table:

Table 25.16 NDB Cluster restart types

Symbol Restart Type Description
N Node The parameter can be updated using a rolling restart (see Section 25.6.5, “Performing a Rolling Restart of an NDB Cluster”)
S System All cluster nodes must be shut down completely, then restarted, to effect a change in this parameter
I Initial Data nodes must be restarted using the --initial option