This chapter provides information on the pre-installation tasks you must complete on Cassandra nodes before you can install Messaging Server software.
The following list summarizes the general pre-installation tasks you must complete before installing any Messaging Server component.
Create a UNIX system user and group for Messaging Server, and set permissions for the directories and files owned by that user.
Check that DNS is running and configured properly for the Messaging Server host.
On Linux, check the maximum number of open file descriptors. If the value is less than 16384, increase it.
Install Oracle Directory Server Enterprise Edition, if your site does not currently have Directory Server deployed.
See the chapter titled "Messaging Server Pre-Installation Tasks" in Messaging Server Installation and Configuration Guide for detailed information.
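The file descriptor check in the list above can be sketched as a small shell script. This is a minimal illustration, not part of the product: the function name fd_limit_ok and the variable REQUIRED_FDS are hypothetical, and the script only reports the limit of the current shell session as printed by ulimit -n.

```shell
#!/bin/sh
# Minimum number of open file descriptors named in the task list above.
REQUIRED_FDS=16384

# fd_limit_ok LIMIT: succeed if LIMIT (as printed by `ulimit -n`) meets the minimum.
# (Hypothetical helper name, used only in this sketch.)
fd_limit_ok() {
    limit=$1
    # `ulimit -n` prints "unlimited" when no cap is set.
    if [ "$limit" = "unlimited" ]; then
        return 0
    fi
    [ "$limit" -ge "$REQUIRED_FDS" ]
}

# Report the limit that applies to this shell session.
current=$(ulimit -n)
if fd_limit_ok "$current"; then
    echo "open files limit OK: $current"
else
    echo "open files limit too low: $current (need at least $REQUIRED_FDS)"
fi
```

Run the script as the Messaging Server user, because the limit can differ from one user to another.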
The following list summarizes the pre-installation tasks you must complete on Cassandra nodes:
To install Java, see "Installing Oracle JDK on RHEL-based Systems" on the DataStax web site at:
http://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/install/installJdkRHEL.html
Note:
The JAVA_HOME/bin directory must be in the PATH environment variable.
To install Python, see the Python documentation at:
https://docs.python.org/2/installing/
Be sure to use the version of Python that is supported by the version of DataStax Enterprise Max that you are installing.
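The requirement that JAVA_HOME/bin be in PATH can be verified with a short shell check. The sketch below assumes a POSIX shell; the helper name path_contains is hypothetical and exists only for this example.

```shell
#!/bin/sh
# path_contains DIR PATHVALUE: succeed if DIR is one of the colon-separated
# entries of PATHVALUE. (Hypothetical helper name, used only in this sketch.)
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report only; JAVA_HOME may be unset on a host where Java is not installed yet.
if [ -n "${JAVA_HOME:-}" ] && path_contains "$JAVA_HOME/bin" "$PATH"; then
    echo "JAVA_HOME/bin is in PATH"
else
    echo "JAVA_HOME/bin is not in PATH (or JAVA_HOME is unset)"
fi
```

The comparison is a literal string match, so a PATH entry written with a trailing slash or through a symbolic link is not recognized.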
The tasks to install DataStax Enterprise are:
To download the DataStax Enterprise Max software:
Register with DataStax and download the DSE software from the DataStax download site, located at:
Copy the installer file to your Cassandra message store hosts.
To install DataStax Enterprise software:
On each Cassandra/Solr node, you configure a datastax.repo file, install the DataStax Enterprise packages, start the DSE software, and verify that DSE is running.
For more information, see the DataStax Enterprise installation documentation at:
http://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/install/installTOC.html
On all Solr nodes, enable Solr by setting the following option in the /etc/default/dse file:
SOLR_ENABLED=1
Ensure that for Oracle Linux 6.x and later, the 32-bit versions of the glibc libraries are installed.
For more information, see the DataStax Enterprise documentation at:
https://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/install/installDseInstallGlibc.html
Optionally, install OpsCenter, a visual management and monitoring solution for DataStax Enterprise. For more information, see the OpsCenter installation documentation at:
https://docs.datastax.com/en/latest-opscenter/opsc/online_help/opscOverview_c.html
To set up the Cassandra cluster, see the following DataStax documentation:
For a single data center, see:
http://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/production/singleDCperWorkloadType.html
For multiple data centers, see:
http://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/production/multiDCperWorkloadType.html
When setting up multiple data centers, the Messaging Server recommendation, which minimizes the overhead in replicating and repairing DataStax keyspaces across all data centers, is to configure four data centers in three clusters with keyspaces arranged as shown in Table 5-1.
Table 5-1 Recommended Multiple Data Centers and Clusters Configuration
Data Center Name and Node Types | Keyspaces | Cluster Configuration |
---|---|---|
DC_MSG, Cassandra nodes | ms_msg | Cluster Content |
DC_META, Cassandra nodes | ms_mbox, ms_index | Combined with DC_INDEX into Cluster Metadata |
DC_INDEX, Cassandra/Solr nodes | ms_index | Combined with DC_META into Cluster Metadata |
DC_CACHE, Cassandra nodes | ms_cache | Cluster Cache |
Cluster settings, such as the cluster name and seed nodes, are defined in the cassandra.yaml file. See the following section for more information.
To support more concurrent index updates, the ratio of DC_META nodes to DC_INDEX nodes should be at least 1 to 2.
On each Cassandra node, optimize the DataStax Enterprise installation by following the recommendations in the DataStax documentation at:
https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configRecommendedSettings.html
On each Cassandra node, change the configuration files described in this section so that the node operates correctly in the Cassandra message store deployment.
To optimize Cassandra on Linux, see the DataStax recommendations at:
https://docs.datastax.com/en/landing_page/doc/landing_page/recommendedSettingsLinux.html
For DC_INDEX nodes, which run Solr, make the following change to the /etc/default/dse file:
SOLR_ENABLED=1
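One way to apply the SOLR_ENABLED change repeatably is a small shell function. This is a sketch under stated assumptions: the function name enable_solr is hypothetical, the demonstration runs against a scratch file rather than the live /etc/default/dse, and OTHER_OPTION is a placeholder line, not a real DSE option.

```shell
#!/bin/sh
# enable_solr FILE: make sure FILE sets SOLR_ENABLED=1.
# (Hypothetical helper name, used only in this sketch.)
enable_solr() {
    file=$1
    if grep -q '^SOLR_ENABLED=' "$file"; then
        # Rewrite the existing assignment, whatever its current value.
        sed 's/^SOLR_ENABLED=.*/SOLR_ENABLED=1/' "$file" > "$file.tmp" || return 1
        # Copy the content back so the original file keeps its owner and mode.
        cat "$file.tmp" > "$file" && rm -f "$file.tmp"
    else
        # No assignment yet: append one.
        echo 'SOLR_ENABLED=1' >> "$file"
    fi
}

# Demonstration on a scratch file with placeholder content.
demo=$(mktemp)
printf 'SOLR_ENABLED=0\nOTHER_OPTION=0\n' > "$demo"
enable_solr "$demo"
cat "$demo"
rm -f "$demo"
```

On a real DC_INDEX node, the call would be enable_solr /etc/default/dse, run with sufficient privileges and followed by a restart of DSE.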
For DC_INDEX (Solr) nodes, make the following changes to the /etc/dse/cassandra/dse.yaml file to improve performance:
max_solr_concurrency_per_core: 6
back_pressure_threshold_per_core: 5000
cql_slow_log_option:
    enabled: false
Make the changes in this section to the /etc/dse/cassandra/cassandra.yaml file.
For all nodes, to enable separate clusters for better performance, specify cluster_name.
Make the following changes to the num_tokens setting:
DC_MSG, DC_META, and DC_CACHE nodes:
num_tokens: 256
DC_INDEX (Solr) nodes:
num_tokens: 16
On DC_INDEX (Solr) nodes, make the following change to the allocate_tokens_for_local_replication_factor setting:
allocate_tokens_for_local_replication_factor: replication_factor
where replication_factor is derived from the store.cassolrrf configuration option, and by default has a value of 2.
Note:
This recommendation is for the DataStax Enterprise 5.1.0 release.
To improve performance, locate data on SSD drives:
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
saved_caches_directory: /var/lib/cassandra/saved_caches
hints_directory: /var/lib/cassandra/hints
To support large mailboxes and messages, increase the commit log segment size:
commitlog_segment_size_in_mb: 256
To specify seed nodes, use two nodes from each data center in the cluster, preferably located on different racks. Each cluster must have its own seeds. For example:
DC_MSG cluster:
seeds: "192.0.2.12,192.0.2.24"
DC_META/DC_INDEX cluster:
seeds: "192.0.2.1,192.0.2.2,192.0.2.10,192.0.2.3"
DC_CACHE cluster:
seeds: "192.0.2.14,192.0.2.7"
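The rule of two seed nodes per data center can be checked mechanically before a seeds value is written to cassandra.yaml. The following sketch assumes a POSIX shell; the helper names seed_count and seeds_ok are hypothetical, and the check only counts entries, it does not verify rack placement.

```shell
#!/bin/sh
# seed_count SEEDS: print the number of entries in a comma-separated seeds string.
# (Hypothetical helper name, used only in this sketch.)
seed_count() {
    printf '%s\n' "$1" | tr ',' '\n' | grep -c '[^[:space:]]'
}

# seeds_ok SEEDS DATACENTERS: succeed if SEEDS holds exactly two entries
# for each of the DATACENTERS data centers in the cluster.
seeds_ok() {
    [ "$(seed_count "$1")" -eq $(( 2 * $2 )) ]
}

# The DC_META/DC_INDEX cluster spans two data centers, so it needs four seeds.
if seeds_ok "192.0.2.1,192.0.2.2,192.0.2.10,192.0.2.3" 2; then
    echo "seed list matches two seeds per data center"
fi
```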
For DC_INDEX nodes, make the following change to improve performance:
memtable_heap_space_in_mb: 2048
For all nodes, make the following change to improve performance:
memtable_flush_writers: 8
For all nodes, to specify listen_address, rpc_address, and so on, make the following changes:
listen_address: 10.128.128.12
rpc_address: 10.128.128.12
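After editing cassandra.yaml, the values can be read back and compared with the recommendations in this section. The script below is a minimal sketch: the helper names yaml_top_level and check_setting are hypothetical, the demonstration uses a scratch file instead of /etc/dse/cassandra/cassandra.yaml, and the parser only handles unindented "key: value" lines without trailing comments.

```shell
#!/bin/sh
# yaml_top_level KEY FILE: print the value of the first unindented "KEY: value" line.
# (Hypothetical helper name, used only in this sketch.)
yaml_top_level() {
    sed -n "s/^$1:[[:space:]]*//p" "$2" | head -n 1
}

# check_setting KEY EXPECTED FILE: compare the value in FILE with EXPECTED.
check_setting() {
    actual=$(yaml_top_level "$1" "$3")
    if [ "$actual" = "$2" ]; then
        echo "ok: $1 = $actual"
    else
        echo "MISMATCH: $1 is '$actual', expected '$2'"
        return 1
    fi
}

# Demonstration with the values recommended for a DC_MSG node.
demo=$(mktemp)
printf 'num_tokens: 256\ncommitlog_segment_size_in_mb: 256\nmemtable_flush_writers: 8\n' > "$demo"
check_setting num_tokens 256 "$demo"
check_setting commitlog_segment_size_in_mb 256 "$demo"
check_setting memtable_flush_writers 8 "$demo"
rm -f "$demo"
```

For a DC_INDEX node, the expected num_tokens value would be 16 instead of 256.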
For all nodes, to specify the location of the heap dump, make the following change to the /etc/dse/cassandra/cassandra-env.sh file:
export CASSANDRA_HEAPDUMP_DIR=/scratch/heapdump
For DC_MSG, DC_INDEX, and DC_CACHE nodes, make the following heap size changes:
-Xms32G -Xmx32G
For DC_META nodes, make the following heap size changes to improve performance:
-Xms16G -Xmx16G
For all nodes, make the following changes to improve performance:
-XX:InitiatingHeapOccupancyPercent=70
-XX:ParallelGCThreads=12
-XX:ConcGCThreads=12
For all nodes, print garbage collection measurements, which are useful for monitoring system performance:
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-XX:PrintFLSStatistics=1
-Xloggc:/var/log/cassandra/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=10M
For all nodes, make the following changes to the cassandra-rackdc.properties file.
Configure the endpoint snitch:
endpoint_snitch: GossipingPropertyFileSnitch
Set the data center and rack names as appropriate:
dc=mydc
rack=myrac
For example, for a node in DC_CACHE in a physical rack in one location, set dc=DC_CACHE and rack=RAC1. And, for another node in DC_CACHE in a physical rack in another location, set dc=DC_CACHE and rack=RAC2.
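Because a misspelled or differently cased name silently creates a new data center, it can help to generate the dc and rack lines from a script that validates the name first. This sketch assumes the deployment uses the four data center names from Table 5-1; the helper names valid_dc and rackdc_properties are hypothetical.

```shell
#!/bin/sh
# Data center names from Table 5-1. A deployment that uses other names
# would need to change this list (assumption made for this sketch).
valid_dc() {
    case "$1" in
        DC_MSG|DC_META|DC_INDEX|DC_CACHE) return 0 ;;
        *) return 1 ;;
    esac
}

# rackdc_properties DC RACK: print the dc= and rack= lines for one node,
# or fail if DC is not an exact, case-sensitive match for a known name.
rackdc_properties() {
    if ! valid_dc "$1"; then
        echo "unknown data center name: $1" >&2
        return 1
    fi
    printf 'dc=%s\nrack=%s\n' "$1" "$2"
}

# Lines for the first DC_CACHE node in the example above.
rackdc_properties DC_CACHE RAC1
```

The output can be redirected into the cassandra-rackdc.properties file on the node once it has been reviewed.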
Note:
Data center and rack names are case sensitive.
On each Cassandra/Solr node, you might need to make changes to the solrconfig.xml file, which is the configuration file with the most tuning parameters affecting Solr itself. For more information about Solr tuning parameters, see the DataStax documentation at:
http://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/search/performanceTuningTOC.html
Note:
DSE Search has a live indexing feature to increase indexing throughput, which is turned off by default. Enabling this feature causes sporadic search failures under load. This is a known DSE bug (DSP-12600) as of DSE 5.0.4.
For more information about the live indexing feature, see the DataStax documentation at:
http://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/srch/tuningIndexing.html
To make changes to the solrconfig.xml file:
Use the dsetool read_resource keyspace.table name=resfilename command to read the xml.
where:
keyspace.table is ks-preindex (ks-pre is the prefix configured by the store.caskeyspaceprefix option; the default is ms_)
resfilename is solrconfig.xml
Edit the xml.
Use the dsetool write_resource keyspace.table name=resfilename file=path_to_file_to_upload command to write changes to the xml.
where:
file=path_to_file_to_upload is the name and path of the resource file to upload
Use the dsetool reload_core keyspace.table command to reload the Solr core.
Example:
dsetool read_resource ms_index.msgindex name=solrconfig.xml > /tmp/solrconfig.xml
vi /tmp/solrconfig.xml
##### Make required edits to the file #####
dsetool write_resource ms_index.msgindex name=solrconfig.xml file=/tmp/solrconfig.xml
dsetool reload_core ms_index.msgindex
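The read, edit, write, and reload steps can be wrapped in one function so that a failed step stops the sequence. This is a sketch, not a supported tool: the function name update_solrconfig and the DRY_RUN variable are hypothetical, and only the dsetool subcommands shown in the example above are used. With DRY_RUN set to 1 (the default here), the commands are printed instead of run.

```shell
#!/bin/sh
# DRY_RUN=1 prints the commands; DRY_RUN=0 runs them against the live node.
# (Hypothetical variable, used only in this sketch.)
DRY_RUN=${DRY_RUN:-1}

# update_solrconfig CORE WORKFILE: read solrconfig.xml for CORE into WORKFILE,
# open it in an editor, upload the result, and reload the Solr core.
update_solrconfig() {
    core=$1
    work=$2
    if [ "$DRY_RUN" = "1" ]; then
        echo "dsetool read_resource $core name=solrconfig.xml > $work"
        echo "${EDITOR:-vi} $work"
        echo "dsetool write_resource $core name=solrconfig.xml file=$work"
        echo "dsetool reload_core $core"
        return 0
    fi
    # Stop at the first failing step so a bad file is never uploaded or reloaded.
    dsetool read_resource "$core" name=solrconfig.xml > "$work" || return 1
    "${EDITOR:-vi}" "$work" || return 1
    dsetool write_resource "$core" name=solrconfig.xml file="$work" || return 1
    dsetool reload_core "$core"
}

# Core name from the example above (default ms_ keyspace prefix).
update_solrconfig ms_index.msgindex /tmp/solrconfig.xml
```

If the keyspace prefix was changed with the store.caskeyspaceprefix option, pass the matching core name instead of ms_index.msgindex.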