Sun Java Enterprise System Deployment Planning Guide

Chapter 3 Technical Requirements

During the technical requirements phase of the solution life cycle you perform a usage analysis, identify use cases, and determine quality of service requirements for the proposed deployment solution.

This chapter contains the following sections:

About Technical Requirements

Technical requirements analysis begins with the business requirements documents created during the business analysis phase of the solution life cycle. Using the business analysis as a basis, you perform a usage analysis, identify use cases, and determine quality of service requirements.

Usage Analysis

Usage analysis involves identifying the various users of the solution you are designing and determining the usage patterns for those users. The information you gather provides a basis for estimating the load conditions on the system. Usage analysis information is also useful when assigning weights to use cases, as described in Use Cases.

During usage analysis, you should interview users whenever possible, research existing data on usage patterns, and interview builders and administrators of previous systems. The following table lists factors to consider when performing a usage analysis.

Table 3–1 Usage Analysis Factors

Topic 

Description 

Number and type of users 

Identify how many users your solution must support, and categorize those users, if necessary. 

For example: 

  • A Business to Customer (B2C) solution might have a large number of visitors, but only a small number of users who register and engage in business transactions.

  • A Business to Employee (B2E) solution typically accommodates each employee, although some employees might need access from outside the corporate network. In a B2E solution, managers might need authorization to areas that regular employees cannot access.

Active and inactive users 

Identify the usage patterns and ratios of active and inactive users. 

Active users are those users who are logged into the system and interact with the system’s services. Inactive users can be users who are not logged in, users who log in but do not interact with the system’s components, or users who are in the database but never log in. 

Administrative users 

Identify users that access the deployed system to monitor, update, and support the deployment. 

Determine any specific administrative usage patterns that might affect technical requirements (for example, administration of the deployment from outside the firewall). 

Usage patterns 

Identify how various types of users access the system and provide targets for expected usage. 

For example: 

  • Are there peak times when usage spikes?

  • What are normal business hours?

  • Are users distributed globally?

  • What is the expected duration of user connectivity?

User growth 

Determine if the size of the user base is fixed or if the deployment expects growth in the number of users. 

If the user base is expected to grow, try to create reasonable projections of the growth. 

User transactions 

Identify the type of user transactions that must be supported. These user transactions can be translated into use cases. 

For example: 

  • What tasks do users perform?

  • When users log in, do they remain logged in? Do they typically perform a few tasks and log out?

  • Will significant collaboration between users require common calendars, web-conferences, and deployment of internal web pages?

User studies and statistical data 

Use pre-existing user studies and other sources to determine patterns of user behavior. 

Often, enterprises or industry organizations have user research studies from which you can extract useful information about users. Log files for existing applications might contain statistical data useful in making estimates for a system. 
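The factors in Table 3–1 can be combined into a rough estimate of load conditions. The following Python sketch is illustrative only. The function name, its parameters, and the sample figures are assumptions made for this example and do not come from this guide.

```python
def estimate_peak_concurrent_users(registered_users, active_ratio,
                                   peak_factor, annual_growth, years):
    """Rough estimate of peak concurrent users after a period of growth.

    All inputs are hypothetical planning figures gathered during usage analysis:
      registered_users  total size of the user base today
      active_ratio      fraction of users active at a typical moment
      peak_factor       how much higher the peak is than typical load
      annual_growth     projected yearly growth of the user base (0.20 = 20%)
      years             planning horizon in years
    """
    projected_users = registered_users * (1 + annual_growth) ** years
    typical_active = projected_users * active_ratio
    return round(typical_active * peak_factor)


# Hypothetical example: 50,000 users, 10% active at a typical moment,
# peaks at twice the typical load, 20% yearly growth over 2 years.
peak = estimate_peak_concurrent_users(50_000, 0.10, 2.0, 0.20, 2)
print(peak)
```

An estimate of this kind is only as reliable as the ratios behind it, so revisit the inputs whenever log files or user studies provide better data.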

Use Cases

Use cases model typical user interaction with the solution that you are designing, and describe the complete flow of an operation from the perspective of an end user. Prioritizing the design around a complete set of use cases ensures a continual focus on the delivery of expected functionality. Use cases are the principal input to logical design.

Assign relative weights to use cases, with the highest weighted use cases representing the most common user tasks. The weighting of use cases allows you to focus your design decisions on the system services that are used the most.
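One possible way to derive relative weights is to normalize the estimated frequency of each task. The following Python sketch assumes hypothetical use case names and daily task counts. This guide does not prescribe a weighting method, so treat this as one option among several.

```python
def normalize_weights(task_counts):
    """Convert estimated task frequencies into relative weights that sum to 1."""
    total = sum(task_counts.values())
    if total <= 0:
        raise ValueError("at least one use case must have a positive count")
    return {name: count / total for name, count in task_counts.items()}


# Hypothetical daily task counts taken from a usage analysis.
task_counts = {"read mail": 6000, "update calendar": 3000, "edit profile": 1000}

weights = normalize_weights(task_counts)

# Use cases ordered from most to least common.
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked)
```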

Use cases can be described at two levels.

Quality of Service Requirements

Quality of service (QoS) requirements are technical specifications that define the required system qualities, such as performance, availability, scalability, and serviceability. QoS requirements are driven by business needs specified in the business requirements. For example, if services must be available 24 hours a day throughout the year, the availability requirement must address this business requirement.

The following table lists the system qualities that typically form a basis for QoS requirements.

Table 3–2 System Qualities Affecting QoS Requirements

System Quality 

Description 

Performance 

The measurement of response time and throughput with respect to user load conditions. 

Availability 

A measure of how often a system’s resources and services are accessible to end users, often expressed as the uptime of a system.

Scalability 

The ability to add capacity (and users) to a deployed system over time. Scalability typically involves adding resources to the system but should not require changes to the deployment architecture. 

Security 

A complex combination of factors that describe the integrity of a system and its users. Security includes authentication and authorization of users, security of data, and secure access to a deployed system. 

Latent capacity 

The ability of a system to handle unusual peak loads without additional resources. Latent capacity is a factor in availability, performance, and scalability qualities. 

Serviceability 

The ease by which a deployed system can be maintained, including monitoring the system, repairing problems that arise, and upgrading hardware and software components. 

System qualities are closely interrelated. Requirements for one system quality might affect the requirements and design for other system qualities. For example, higher levels of security might affect performance, which in turn might affect availability. Adding servers to address availability issues affects serviceability (maintenance costs).

Understanding how system qualities are interrelated and the trade-offs that must be made is the key to designing a system that successfully satisfies both business requirements and business constraints.

The following sections describe further the system qualities that impact deployment design, providing guidance on factors to consider when formulating QoS requirements. A section on service level requirements, which form the basis of service level agreements, is also included.

Performance

Business requirements typically express performance in nontechnical terms that specify response time. For example, a business requirement for web-based access might state the following:

Users expect a reasonable response time upon login, typically no greater than four seconds.

Starting with this business requirement, examine all use cases to determine how to express this requirement at a system level. In some cases, you might want to include user load conditions determined during usage analysis. Express the performance requirement for each use case in terms of response time under specified load conditions or response time plus throughput. You might also specify the allowable number of errors.

Here are two examples of how to specify system requirements for performance:

Performance requirements are closely related to availability requirements (how failover impacts performance) and latent capacity (how much capacity is available to handle unusual peak loads).
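A system-level performance requirement can be recorded as structured data so that measurements can be checked against it. The following Python sketch is a minimal illustration. The class, its field names, and every figure except the four-second login response time are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceRequirement:
    use_case: str
    max_response_seconds: float       # response time limit
    concurrent_users: int             # load condition under which it applies
    min_throughput_per_second: float  # required operations per second
    max_error_rate: float             # allowable fraction of failed operations

    def is_met(self, response_seconds, throughput_per_second, error_rate):
        """Check measurements taken at the specified load against the limits."""
        return (response_seconds <= self.max_response_seconds
                and throughput_per_second >= self.min_throughput_per_second
                and error_rate <= self.max_error_rate)


# Hypothetical requirement derived from the four-second login expectation.
login = PerformanceRequirement(
    use_case="login",
    max_response_seconds=4.0,
    concurrent_users=5000,
    min_throughput_per_second=50.0,
    max_error_rate=0.001,
)

print(login.is_met(response_seconds=3.2, throughput_per_second=60.0, error_rate=0.0005))
```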

Availability

Availability is a way to specify the uptime of a system and is typically measured as the percentage of time that the system is accessible to users. The time that the system is not accessible (downtime) can be due to the failure of hardware, software, the network, or any other factor (such as loss of power) that causes the system to be down. Scheduled downtime for service (maintenance and upgrades) is not considered downtime. A basic equation to calculate system availability in terms of percentage of uptime is:

Availability = uptime / (uptime + downtime) * 100%

Typically you measure availability by the number of “nines” you can achieve. For example, 99% availability is two nines. Specifying additional nines significantly affects the deployment design. The following table quantifies the unscheduled downtime for additional nines of availability to a system that is running 24x7 year-round (a total of 8,760 hours).

Table 3–3 Unscheduled Downtime for a System Running Year-Round (8,760 hours)

Number of Nines    Percentage Available    Unscheduled Downtime
Two                99%                     88 hours
Three              99.9%                   9 hours
Four               99.99%                  53 minutes
Five               99.999%                 5 minutes
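The availability equation and the downtime allowed for each additional nine can be worked through in a few lines. The following Python sketch computes unrounded values for a system running 8,760 hours a year. The function names are chosen for this example only.

```python
HOURS_PER_YEAR = 8760  # 24x7 operation for 365 days


def availability_percent(uptime_hours, downtime_hours):
    """Availability = uptime / (uptime + downtime) * 100%."""
    return uptime_hours / (uptime_hours + downtime_hours) * 100


def allowed_downtime_hours(nines):
    """Yearly unscheduled downtime permitted at the given number of nines.

    Two nines is 99%, three nines is 99.9%, and so on.
    """
    unavailable_fraction = 10 ** -nines
    return HOURS_PER_YEAR * unavailable_fraction


for nines in range(2, 6):
    hours = allowed_downtime_hours(nines)
    print(f"{nines} nines: {hours:.2f} hours ({hours * 60:.1f} minutes)")
```

Each additional nine divides the allowed downtime by ten, which is why higher availability targets have such a large effect on the deployment design.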

Fault-Tolerant Systems

Availability requirements of four or five nines typically require a system that is fault-tolerant. A fault-tolerant system must be able to continue service even during a hardware or software failure. Typically, fault tolerance is achieved by redundancy in both hardware (such as CPUs, memory, and network devices) and the software that provides key services.

A single point of failure is a hardware or software component that is part of a critical path but is not backed up by redundant components. The failure of this component results in the loss of service for the system. When designing a fault-tolerant system, you must identify and eliminate potential single points of failure.
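A first pass at finding single points of failure can be as simple as listing the components on a critical path and counting their redundant instances. The following Python sketch uses a hypothetical critical path and hypothetical instance counts. A real analysis must also cover shared dependencies such as power and network links.

```python
def single_points_of_failure(critical_path, instance_counts):
    """Return critical-path components that have no redundant instance."""
    return [name for name in critical_path
            if instance_counts.get(name, 0) < 2]


# Hypothetical deployment: components a login request passes through,
# and the number of instances deployed for each.
critical_path = ["load balancer", "web server", "directory server"]
instance_counts = {"load balancer": 1, "web server": 3, "directory server": 2}

spofs = single_points_of_failure(critical_path, instance_counts)
print(spofs)
```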

Fault-tolerant systems can be expensive to implement and maintain. Make sure you understand the nature of the business requirements for availability and consider the strategies and costs of availability solutions that meet those requirements.

Prioritizing Service Availability

From a user perspective, availability often matters more on a service-by-service basis than for the system as a whole. For example, the unavailability of instant messaging services usually has little or no impact on the availability of other services. However, the unavailability of services upon which many other services depend (such as Directory Server) has a much wider impact. Higher availability specifications should clearly reference the specific use cases and usage analysis that require the increased availability.

It is helpful to list availability needs according to an ordered set of priorities. The following table prioritizes the availability of different types of services.

Table 3–4 Availability of Services by Priority

Priority    Service Type         Description
1           Mission critical     Services that must be available at all times. For example, database services (such as LDAP directories) to applications.
2           Must be available    Services that must be available, but can be available at reduced performance. For example, messaging service availability might not be critical in some business environments.
3           Can be postponed     Services that must be available within a given time period. For example, calendar services availability might not be essential in some business environments.
4           Optional             Services that can be postponed indefinitely. For example, in some environments instant messaging services can be considered useful but not necessary.

Loss of Services

Availability design includes consideration for what happens when availability is compromised or when a component is lost. This includes considering whether users connected must restart sessions and how a failure in one area affects other areas of a system. QoS requirements should consider these scenarios and specify how the deployment reacts to these situations.

Scalability

Scalability is the ability to add capacity to a system so the system can support additional load from existing users or from an increased user base. Scalability usually requires the addition of resources, but should not require changes in the design of the deployment architecture or cause loss of service while resources are being added.

As with availability, scalability applies more to the individual services provided by a system than to the entire system. However, for services upon which other services depend, such as Directory Server, scalability can have system-wide impact.

You do not necessarily specify scalability requirements with QoS requirements unless projected growth of the deployment is clearly stated in the business requirements. However, during the deployment design phase of the solution life cycle, the deployment architecture should always add some tolerance for scaling the system even if no QoS requirements for scalability have been specified.

Estimating Growth

Estimating the growth of a system to determine scalability requirements involves working with projections, estimates, and guesses that might not be fulfilled. Three keys to developing requirements for a scalable system are the following.

The following table lists factors to consider for determining scalability requirements.

Table 3–5 Scalability Factors

Topic 

Description 

Analyze usage patterns 

Understand the usage patterns of the current (or projected) user base by studying existing data. In the absence of current data, analyze industry data or market estimates. 

Design for reasonable maximum scale 

Design with a goal towards the maximum required scale for both known demand and possible demand. 

Often, this is a 24-month estimate based on performance evaluation of the existing user load and reasonable expectations of future load. The time period for the estimate depends largely on the reliability of projections. 

Set appropriate milestones 

Implement the deployment design in increments to meet short-term requirements with a buffer to allow for unexpected growth. Set milestones for adding system resources. 

For example: 

  • Capital acquisition (such as quarterly or yearly)

  • Lead time to purchase hardware and software (such as one to six weeks)

  • Buffer (10% to 100%, depending on growth expectations)

Incorporate emerging technology 

Understand emerging technology, such as faster processors and Web servers, and how this technology can affect the performance of the underlying architecture. 
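The milestone factors in Table 3–5 (acquisition interval, lead time, and buffer) can be combined into a capacity target for each milestone. The following Python sketch assumes compound monthly growth, which is only one possible growth model. The function, its parameters, and the sample figures are hypothetical.

```python
import math


def capacity_target(current_peak_users, monthly_growth,
                    months_to_next_milestone, lead_time_months, buffer):
    """Users the system should support once this milestone's resources arrive.

    Capacity must last until the next milestone plus the lead time needed
    to purchase hardware and software, with a buffer for unexpected growth.
    """
    horizon = months_to_next_milestone + lead_time_months
    projected = current_peak_users * (1 + monthly_growth) ** horizon
    return math.ceil(projected * (1 + buffer))


# Hypothetical example: 10,000 peak users, 5% monthly growth,
# quarterly acquisition, one month of lead time, 25% buffer.
target = capacity_target(10_000, 0.05, 3, 1, 0.25)
print(target)
```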

Security Requirements

Security is a complex topic that involves all levels of a deployed system. Developing security requirements revolves around identifying the security threats and developing a strategy to combat them. This security analysis includes the following steps:

  1. Identifying critical assets

  2. Identifying threats to those assets

  3. Identifying vulnerabilities that expose those assets to the threats and create risk to the organization

  4. Developing a security plan that mitigates the risk to the organization

The analysis of security requirements should involve a cross-section of stakeholders from your organization, including managers, business analysts, and information technology personnel. Often, an organization appoints a security architect to take the lead in the design and implementation of security measures.

The following section describes some of the areas that are covered in security planning.

Elements of a Security Plan

Planning for security of a system is part of deployment design that is essential to successful implementation. Consider the following when planning for security:

Latent Capacity

Latent capacity is the ability of a deployment to handle unusual peak load usage without the addition of resources. Typically, you do not specify QoS requirements directly around latent capacity, but this system quality is a factor in the availability, performance, and scalability of the system.

Serviceability Requirements

Serviceability is the ease with which a deployed system can be maintained, including tasks such as monitoring the system, repairing problems that arise, adding and removing users from the system, and upgrading hardware and software components.

When planning requirements for serviceability, consider the topics listed in the following table.

Table 3–6 Topics for Serviceability Requirements

Topic 

Description 

Downtime planning 

Identify maintenance tasks that require specific services to be unavailable or partially unavailable. 

Some maintenance and upgrades can occur seamlessly to users, while others require interruption of service. When possible, schedule with users those maintenance activities that require downtime, allowing the users to plan for the downtime. 

Usage patterns 

Identify the usage patterns to determine the best time to schedule maintenance. 

For example, on systems where peak usage is during normal business hours, schedule maintenance in the evening or on weekends. For geographically distributed systems, identifying these times can be more challenging. 

Availability 

Serviceability is often a reflection of your availability design. Strategies for minimizing downtime for maintenance and upgrades revolve around your availability strategy. Systems that require a high degree of availability have limited opportunities for maintenance, upgrades, and repair. 

Strategies for handling availability requirements affect how you handle maintenance and upgrades. For example, on systems that are distributed geographically, servicing can depend on the ability to route workloads to remote servers during maintenance periods. 

Also, systems requiring a high degree of availability might require more sophisticated solutions that automate restarting of systems with little human intervention. 

Diagnostics and monitoring 

You can improve the stability of a system by regularly running diagnostic and monitoring tools to identify problem areas. 

Regular monitoring of a system can help you prevent problems before they occur, balance workloads according to availability strategies, and improve planning for maintenance and downtime. 
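Usage patterns gathered from monitoring can suggest when to schedule maintenance. The following Python sketch picks the quietest contiguous window from a hypothetical 24-hour load profile. The function name and the load figures are assumptions, and for geographically distributed systems the profile would need to combine all regions.

```python
def quietest_window(hourly_load, window_hours):
    """Return the start hour of the contiguous window with the lowest total load.

    The window may wrap past midnight. If several windows tie, the earliest
    start hour is returned.
    """
    n = len(hourly_load)

    def window_total(start):
        return sum(hourly_load[(start + i) % n] for i in range(window_hours))

    return min(range(n), key=window_total)


# Hypothetical average load for each hour of the day (index 0 = midnight).
hourly_load = [5, 3, 1, 2, 4, 6, 15, 40, 80, 95, 100, 98,
               90, 96, 99, 92, 85, 60, 40, 30, 20, 15, 10, 7]

start_hour = quietest_window(hourly_load, window_hours=3)
print(start_hour)
```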

Service Level Requirements

A service level agreement (SLA) specifies minimum performance requirements and, upon failure to meet those requirements, the level and extent of customer support that must be provided. Service level requirements are system requirements that specify the conditions upon which the SLA is based.

As with QoS requirements, service level requirements derive from business requirements and represent a guarantee about the overall system quality that the deployed system must meet. Because the service level agreement is considered to be a contract, specification of service level requirements should be unambiguous. The service level requirements define exactly under what conditions the requirements are tested and precisely what constitutes failure to meet the requirements.