Determine the maximum host data transfer rate, R_{max}.
Determine this value empirically, because it depends on the network and host hardware. Note that this is different from the maximum HADB data transfer rate, R_{dt}, determined in the previous section.
Determine the number of hosts needed to accommodate this data.
Updating a volume of data V distributed over a number of hosts N_{HOSTS} causes each host to receive approximately 4V/N_{HOSTS} of data. Determine the number of hosts needed to accommodate this volume of data with the following formula:
N_{HOSTS} = 4 \cdot R_{dt} / R_{max}
Round this value up to the nearest even number to get the same number of hosts for each DRU.
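The formula and the rounding rule can be sketched as a small helper. This is a minimal illustration, not part of HADB: the function name `hosts_needed` and its parameters are hypothetical, and it assumes R_{dt} and R_{max} are expressed in the same units (for example, MB/s).

```python
import math


def hosts_needed(r_dt: float, r_max: float) -> int:
    """Return N_HOSTS = 4 * R_dt / R_max, rounded up to the nearest even number.

    r_dt  -- maximum HADB data transfer rate (R_dt)
    r_max -- empirically measured maximum host data transfer rate (R_max)
    Both rates are assumed to use the same units.
    """
    if r_dt <= 0 or r_max <= 0:
        raise ValueError("both rates must be positive")
    raw = 4.0 * r_dt / r_max
    # Round up to an even count so that each DRU gets the same number of hosts.
    return 2 * math.ceil(raw / 2.0)


if __name__ == "__main__":
    # Illustrative values only: R_dt = 10, R_max = 8 gives 4 * 10 / 8 = 5,
    # which rounds up to 6 hosts.
    print(hosts_needed(10, 8))
```

With these illustrative rates, the raw value of 5 is rounded up to 6 so that the hosts split evenly across the DRUs.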
Add one host on each DRU for spare nodes.
If each of the other hosts runs N data nodes, let each spare host run N spare nodes. This allows for a single-machine failure that takes down N data nodes.
Each host needs to run at least one node, so if the number of nodes is less than the number of hosts (N_{NODES} < N_{HOSTS}), adjust N_{NODES} to be equal to N_{HOSTS}. If the number of nodes is greater than the number of hosts (N_{NODES} > N_{HOSTS}), several nodes can run on the same host.
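The spare-host and node-adjustment rules can be combined into a rough planning sketch. The function `plan_hosts_and_nodes` and the keys it returns are hypothetical names chosen for illustration. The sketch assumes two DRUs and, when the nodes do not divide evenly among the hosts, assumes that N is taken from the most heavily loaded host; neither detail is stated in the text above.

```python
import math

DRU_COUNT = 2  # assumption: two DRUs, as implied by rounding N_HOSTS to an even number


def plan_hosts_and_nodes(n_hosts: int, n_nodes: int) -> dict:
    """Apply the spare-host and node-count adjustments to a sizing estimate.

    n_hosts -- even number of data hosts (N_HOSTS) from the previous step
    n_nodes -- number of data nodes (N_NODES) from earlier sizing
    """
    if n_hosts <= 0 or n_hosts % DRU_COUNT != 0:
        raise ValueError("n_hosts must be a positive even number")
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")

    # Each host runs at least one node: raise N_NODES to N_HOSTS if needed.
    data_nodes = max(n_nodes, n_hosts)
    # Assumption: with uneven division, N is the count on the busiest host.
    nodes_per_host = math.ceil(data_nodes / n_hosts)

    spare_hosts = DRU_COUNT  # one spare host on each DRU
    return {
        "data_hosts": n_hosts,
        "spare_hosts": spare_hosts,
        "total_hosts": n_hosts + spare_hosts,
        "data_nodes": data_nodes,
        "nodes_per_host": nodes_per_host,
        # Each spare host runs N spare nodes to cover one failed machine.
        "spare_nodes": spare_hosts * nodes_per_host,
    }


if __name__ == "__main__":
    print(plan_hosts_and_nodes(n_hosts=6, n_nodes=4))
```

In this example, 4 nodes on 6 hosts is adjusted to 6 nodes (one per host), and the two spare hosts each run one spare node, for 8 hosts in total.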