6Data Integrity and Capacity Planning
Data Integrity and Capacity Planning
This chapter provides information about planning for database capacity and data integrity in your Siebel CRM deployment. It includes the following topics:
Sizing the Database for a Siebel CRM Deployment
As with most client-server applications, the overall performance of Siebel CRM applications is largely dependent on the input and output (I/O) performance of the database server. To promote optimal performance for I/O, you must arrange the tables and indexes in the database across available disk devices in a way that evenly distributes the I/O load.
The mechanism for distributing database objects varies by RDBMS, depending on the way storage space is allocated. Most databases can force a given object to be created on a specific disk.
To verify which RDBMS products, versions, and patch levels are supported, see the Certifications tab on My Oracle Support.
In your planning, you must allocate space for multiple purposes, including system storage space, undo or rollback space, temporary table space, and space for logs and system files, as well as space for Siebel data and indexes. If you allocate too little space for your system storage, then you reduce performance. If you allocate too much space, then you waste disk space.
The space the RDBMS needs varies primarily based on the total number and types of users supported, as well as the transaction mix and rate. Consult the RDBMS vendor’s documentation for more information about these requirements.
The space required for Siebel data and indexes varies depending on what Siebel CRM functionality you implement and the amount and nature of data supporting that functionality.
To determine the size of the database required for a Siebel CRM deployment
Determine the total number and types of users of Siebel CRM applications (for example, 500 sales representatives and 75 sales managers).
Determine the Siebel CRM functionality that you implement and the entities that are required to support them. Usually, the largest entities are as follows:
Accounts
Activities
Contacts
Forecasts
Opportunities
Service requests
Estimate the average number of entities for each user (for example, 100 accounts for each sales representative) and calculate an estimated total number of records for each entity for your total user base.
Using standard sizing procedures for your specific database, calculate the average record size for each entity and multiply by the total number of records.
Typically, these entities span multiple physical tables, all of which you must include in the row size calculation in order to determine the estimated data size for the largest entities.
Add additional space for the storage of other Siebel data. A rough guideline for this additional amount would be half the storage required for these key entities.
Indexes typically require approximately the same amount of space as data.
Factor growth rates into your total size calculation.
Factor a margin of error into your total size calculation.
Database Table Planning
In most enterprise deployments, the following Siebel tables, and their corresponding indexes, are the most commonly used tables or are very large.
For example, the tables S_EVT_ACT, S_CONTACT, and S_ORG_EXT are large in all enterprise-level deployments of Siebel CRM. Separate these tables and indexes across devices.
As a general rule, put indexes in a different table space and, if possible, on different physical devices from the tables on which they are created.
For Siebel table spaces on an IBM DB2 database, use database-managed table spaces (DMS) rather than system-managed table spaces (SMS).
S_ACCNT_CHRCTR
S_ACCNT_CO_MSTR
S_ACCNT_POSTN
S_ADDR_ORG
S_ADDR_PER
S_ASSET
S_CALL_LST_CON
S_CON_CHRCTR
S_CON_TERR
S_CRSE_TSTRUN
S_CRSE_TSTRUN_A
S_CS_RUN
S_CS_RUN_ANSWR
S_CTLGCAT_PATH
S_CYC_CNT_ASSET
S_DNB_CON_MRC
S_DNB_ORG
S_DNB_ORG_SIC
S_DNB_UPDATE
S_DOCK_INIT_ITEM
S_DOCK_TXN_LOG
S_DOCK_TXN_LOGT
S_DOCK_TXN_SET
S_DOCK_TXN_SETT
S_ESCL_ACTN_REQ
S_ESCL_LOG
S_ESCL_REQ
S_EVT_ACT
S_EXP_ITEM
S_EXP_RPT
S_EXP_RPT_APPR
S_IC_CALC
S_IC_CALC_IT
S_IC_CMPNT_EARN
S_IC_TXN
S_IC_TXN_IT
S_IC_TXN_POSTN
S_INVC_ITM_DTL
S_INVLOC_ROLLUP
S_INVOICE
S_INVOICE_ITEM
S_INV_LGR_ENTRY
S_OPTY_POSTN
S_OPTY_PROD
S_OPTY_TERR
S_ORG_EXT
S_ORG_TERR
S_PARTY
S_PARTY_PER
S_PARTY_REL
S_PARTY_RPT_REL
S_POSTN_CON
S_PROC_REQ
S_PROD_BASELINE
S_PROD_CONSUME
S_PROD_SHIPMENT
S_PROD_TARGET
S_QUOTE_ITEM
S_SRM_REPLY
S_SRM_REQUEST
S_SRM_REQ_PARAM
S_SRV_REQ
Database Recovery Planning
Follow the RDBMS vendor’s recommendations on configuring the database for recovery in case of data corruption, hardware failure, or disaster.
Oracle Database Recovery Planning
Many companies today use RAID storage systems that make Oracle Database online redo log mirroring unnecessary.
If your organization does not use RAID storage systems, then mirror the redo log, at a minimum, because the redo log is essential when a database goes through failure recovery.
Also, when redo logs are mirrored at the RAID storage system level (usually RAID 1 or RAID 0+1), there is usually no need to mirror them at the Oracle Database level, since the RAID controller assures that these volumes can always be recovered. Mirroring at the RAID level usually improves database performance (especially beneficial for read operation).
If you have the resources, then mirror the Oracle Database control files as well. Otherwise, you can put the Oracle Database control files into a RAID 5 device, as it is not heavily accessed and disk performance is not a concern. The information it records, though, is very critical for Oracle Database. Any updates to the control file, such as the current System Change Number (SCN) or transaction tables, ripple across all of the members of the control file specification.
IBM DB2 Recovery Planning
Mirror the transaction log to guarantee database recovery if a single device fails. You must mirror the instance home directory, if resources are available. Hardware or operating system mirroring generally provides the best performance.
Database Physical Device Planning
To make sure that your database performs well, create at least one container for each available logical or physical disk device. You can use table spaces to place objects on multiple physical containers to promote parallel input and output (I/O). Spreading the data and index information across several containers (physical devices) can improve the performance of queries.
IBM DB2 Physical Device Planning
Locate data and log devices on different disk spindles to reduce contention between random and serial I/O. For IBM DB2, locate these devices on different disk spindles to minimize I/O contention. When this approach is not possible, spread devices containing database objects that are often used together across different spindles. These objects include tables, their indexes, and commonly joined tables.
If you are using a high performance disk subsystem, then you might choose a different physical device layout. Consult your DBA and the disk subsystem vendor for the optimal setup.
Physical Device Planning for UNIX Deployments
For UNIX database servers, locate all of the containers on raw UNIX disk partitions, except the containers used for LONG VARCHAR
data. Locate containers for LONG VARCHAR
data on the UNIX file system to take advantage of the operating system’s buffering capabilities. To make sure that your database performs well, create one container for each available logical or physical disk device. Locate data and log devices on different disk spindles to reduce contention between random and serial I/O.
Microsoft SQL Server Physical Device Planning
Use filegroups for assigning database objects to one or more files within a filegroup for maximum performance of the Siebel database. When you group objects, you have the ability to distribute a filegroup across multiple disks, thereby causing less resource contention.
If your enterprise does not require very high performance, based on the number of concurrent users, for example, then using RAID devices and Microsoft’s default setting might suffice. A database administrator must do the necessary sizing calculations to assess the performance requirements during the planning process.
Database RAID Array Planning
A database RAID array (redundant array of independent drives) can provide large amounts of input and output (I/O) throughput and capacity, while appearing to the operating system and RDBMS as a single large disk (or multiple disks, as desired, for manageability). The use of RAIDs can greatly simplify the database layout process by providing an abstraction layer above the physical disks, while promoting high performance.
Performance of the RAID feature provided by the operating system might not be satisfactory. To obtain the best RAID performance, use the RAID support provided by your RAID vendor.
If a RAID Array Is Not Used
If a RAID device is not in use, even if space is at a premium, then you must separate indexes with names ending in _P1 from the tables on which they are created. These tables are heavily used in joins.
If you make frequent use of Siebel Enterprise Integration Manager (EIM), then you might want to put the EIM tables and indexes (names starting with EIM_) on different devices from the Siebel base tables. Both tables are accessed simultaneously during Siebel EIM operations.
Microsoft SQL Server RAID Array Planning
The following table describes a sample disk layout for a server dedicated to Microsoft SQL Server, where the database uses a single filegroup residing on a disk array. The use of a single RAID array for the database devices provides satisfactory performance in many cases without the administrative overhead of using individual filegroups.
Disk | Objects | Comments |
---|---|---|
Single mirrored |
Windows OS |
N/A |
Single disk |
Windows pagefile |
Segregate for maximum performance. |
Single mirrored |
SQL Server logfile |
Segregate sequential I/O for database performance. |
3 to 5 disks (minimum) in a RAID configuration |
Siebel database data and indexes |
Add as many spindles as required for performance and storage capacity. |
If your enterprise requires the highest performance standards, then place heavily used tables and their corresponding indexes, such as those listed under Sizing the Database for a Siebel CRM Deployment, in a specific SQL Server filegroup within your database. By creating a filegroup on a specific disk or on multiple disks, you can control where tables and indexes in your database are physically located. For more information, see Database Physical Device Planning.
When separating database objects into filegroups, you can avoid complex calculations by using Microsoft’s recommended RAID disk layouts.
Your choice to use RAID devices or multiple filegroups to distribute database objects depends solely on how great your performance needs are. It is recommended that you work with your hardware vendor to determine the optimal RAID configuration for your specific requirements.