Schema Optimization

Optimization is an important consideration when you design your database schema and object model. Most performance issues occur when the object model or database schema is too complex, which can make the database slow and difficult to query. This is most likely to happen if you derive your database schema directly from a complex object model.

To optimize performance, design the object model and database schema together. However, allow each model to be designed optimally: do not require a direct one-to-one correlation between the two.

This section includes the following schema optimization examples:

Schema Case 1: Aggregation of Two Tables into One
Schema Case 2: Splitting One Table Into Many
Schema Case 3: Collapsed Hierarchy
Schema Case 4: Choosing One out of Many

Schema Case 1: Aggregation of Two Tables into One

A common schema optimization technique is to aggregate two tables into a single table. This improves read and write performance by requiring only one database operation instead of two.

Table 11-3 and Table 11-4 illustrate the table aggregation technique.

Table 11-3 Original Schema (Aggregation of Two Tables Case)

Elements	Details
Title	ACME Member Location Tracking System
Classes	Member, Address
Tables	MEMBER, ADDRESS
Relationships	Source, Instance Variable, Mapping, Target, Member, address, one-to-one, Address

The nature of this application dictates that developers always look up employees and addresses together. As a result, querying a member based on address information requires a database join, and reading a member and its address requires two read statements. Writing a member requires two write statements. This adds unnecessary complexity to the system, and results in poor performance.

A better solution is to combine the MEMBER and ADDRESS tables into a single table, and change the one-to-one relationship to an aggregate relationship. This lets you read all information with a single operation, and doubles the update and insert speed, because only a single row in one table requires modifications.

Table 11-4 Optimized Schema (Aggregation of Two Tables Case)

Elements	Details
Classes	Member, Address
Tables	MEMBER
Relationships	Source, Instance Variable, Mapping, Target, Member, address, aggregate, Address

Schema Case 2: Splitting One Table Into Many

To improve overall performance of the system, split large tables into two or more smaller tables. This significantly reduces the amount of data traffic required to query the database.

For example, the system illustrated in Table 11-5 assigns employees to projects within an organization. The most common operation reads a set of employees and projects, assigns employees to projects, and update the employees. The employee's address or job classification is also occasionally used to determine the project on which the employee is placed.

Table 11-5 Original Schema (Splitting One Table into Many Case)

Elements	Details	Instance Variable	Mapping	Target
Title	ACME Employee Workflow System
Classes	Employee, Address, PhoneNumber, EmailAddress, JobClassification, Project
Tables	EMPLOYEE, PROJECT, PROJ_EMP
Relationships	Employee	address	aggregate	Address
	Employee	phoneNumber	aggregate	EmailAddress
	Employee	emailAddress	aggregate	EmailAddress
	Employee	job	aggregate	JobClassification
	Employee	projects	many-to-many	Project

When you read a large volume of employee records from the database, you must also read their aggregate parts. Because of this, the system suffers from general read performance issues. To resolve this, break the EMPLOYEE table into the EMPLOYEE, ADDRESS, PHONE, EMAIL, and JOB tables, as illustrated in Table 11-6.

Because you usually read only the employee information, splitting the table reduces the amount of data transferred from the database to the client. This improves your read performance by reducing the amount of data traffic by 25 percent.

Table 11-6 Optimized Schema (Splitting One Table into Many Case)

Elements	Details	Instance Variable	Mapping	Target
Title	ACME Employee Workflow System
Classes	Employee, Address, PhoneNumber, EmailAddress, JobClassification, Project
Tables	EMPLOYEE, ADDRESS, PHONE, EMAIL, JOB, PROJECT, PROJ_EMP
Relationships	Employee	address	one-to-one	Address
	Employee	phoneNumber	one-to-one	EmailAddress
	Employee	emailAddress	one-to-one	EmailAddress
	Employee	job	one-to-one	JobClassification
	Employee	projects	many-to-many	Project

Schema Case 3: Collapsed Hierarchy

A common mistake when you transform an object-oriented design into a relational model, is to build a large hierarchy of tables on the database. This makes querying difficult, because queries against this type of design can require a large number of joins. It is usually a good idea to collapse some of the levels in your inheritance hierarchy into a single table.

Table 11-7 represents a system that assigns clients to a company's sales representatives. The managers also track the sales representatives that report to them.

Table 11-7 Original Schema (Collapsed Hierarchy Case)

Elements	Details
Title	ACME Sales Force System
Classes	Tables
Person	PERSON
Employee	PERSON, EMPLOYEE
SalesRep	PERSON, EMPLOYEE, REP
Staff	PERSON, EMPLOYEE, STAFF
Client	PERSON, CLIENT
Contact	PERSON, CONTACT

The system suffers from complexity issues that hinder system development and performance. Nearly all queries against the database require large, resource-intensive joins. If you collapse the three-level table hierarchy into a single table, as illustrated in Table 11-8, you substantially reduce system complexity. You eliminate joins from the system, and simplify queries.

Table 11-8 Optimized Schema (Collapsed Hierarchy Case)

Elements	Details
Classes	Tables
Person	none
Employee	EMPLOYEE
SalesRep	EMPLOYEE
Staff	EMPLOYEE
Client	CLIENT
Contact	CLIENT

Schema Case 4: Choosing One out of Many

In a one-to-many relationship, a single source object has a collection of other objects. In some cases, the source object frequently requires one particular object in the collection, but requires the other objects only infrequently. You can reduce the size of the returned result set in this type of case by adding an instance variable for the frequently required object. This lets you access the object without instantiating the other objects in the collection.

Table 11-9 represents a system by which an international shipping company tracks the location of packages in transit. When a package moves from one location to another, the system creates a new a location entry for the package in the database. The most common query against any given package is for its current location.

Table 11-9 Original Schema (Choosing One out of Many Case)

Elements	Details	Instance Variable	Mapping	Target
Title	ACME Shipping Package Location Tracking System
Classes	Package, Location
Tables	PACKAGE, LOCATION
Relationships	Package	locations	one-to-many	Location

A package in this system can accumulate several location values in its LOCATION collection as it travels to its destination. Reading all locations from the database is resource intensive, especially when the only location of interest is the current location.

To resolve this type of problem, add a specific instance variable that represents the current location. You then add a one-to-one mapping for the instance variable, and use the instance variable to query for the current location. As illustrated in Table 11-10, because you can now query for the current location without reading all locations associated with the package, this dramatically improves the performance of the system.

Table 11-10 Optimized Schema (Choosing One out of Many Case)

Elements	Details	Instance Variable	Mapping	Target
Classes	Package, Location
Tables	PACKAGE, LOCATION
Relationships	Package	locations	one-to-many	Location
	Package	currentLocation	one-to-one	Location