This chapter describes the results of scalability studies. You can refer to these studies for a sample of how the server performs, and how you can configure your system to best take advantage of Web Server’s strengths.
This chapter includes the following topics:
The goal of the tests in the study was to show how well Sun Java System Web Server 7 scales. The tests also helped determine the configuration and tuning requirements for different types of content.
The studies were conducted with the following content:
100% static
100% C CGI
100% Perl CGI
100% NSAPI
100% Java servlets
100% PHP/FastCGI
E-commerce web application with large inventory
When tuned, Sun Java System Web Server 7.0 scaled almost linearly in performance for dynamic and static content.
With the exception of the e-commerce study, these studies were conducted using the following hardware. For hardware information for the e-commerce study, see Hardware for E-Commerce Test.
Web Server system configuration for static content:
Sun Microsystems Sun Fire T2000 (1200 MHz, 8 cores); only six cores were used for this test
16256 Megabytes of memory
Solaris 10 operating system
Three Sun StorEdge 3510 arrays
Web Server system configuration:
Sun Microsystems Sun Fire T2000 (1000 MHz, 6 cores)
16376 Megabytes of memory
Solaris 10 operating system
Driver system configuration:
Three Sun Microsystems Sun Fire X4100
Four Sun Microsystems Sun Fire V490 (2 x 1050 MHz US-IV)
Three Sun Fire T1000
Sun Fire 880 (990 MHz US-III+)
8192 Megabytes of memory
Solaris 10 operating system
Network configuration:
The Web Server and the driver machines were connected with multiple gigabit Ethernet links
The load driver for these tests was an internally-developed Java application framework called the Faban driver.
The following tuning settings are common to all the tests in this study. Individual studies have additional configuration and tuning information.
`/etc/system` settings:

```
set rlim_fd_max=500000
set rlim_fd_cur=500000
set sq_max_size=0
set consistent_coloring=2
set autoup=60
set ip:ip_squeue_bind=0
set ip:ip_soft_rings_cnt=0
set ip:ip_squeue_fanout=1
set ip:ip_squeue_enter=3
set ip:ip_squeue_worker_wait=0
set segmap_percent=6
set bufhwm=32768
set maxphys=1048576
set maxpgio=128
set ufs:smallfile=6000000

* For ipge driver
set ipge:ipge_tx_ring_size=2048
set ipge:ipge_tx_syncq=1
set ipge:ipge_srv_fifo_depth=16000
set ipge:ipge_reclaim_pending=32
set ipge:ipge_bcopy_thresh=512
set ipge:ipge_dvma_thresh=1
set pcie:pcie_aer_ce_mask=0x1

* For e1000g driver
set pcie:pcie_aer_ce_mask=0x1
```
TCP/IP settings applied with `ndd`:

```
ndd -set /dev/tcp tcp_conn_req_max_q 102400
ndd -set /dev/tcp tcp_conn_req_max_q0 102400
ndd -set /dev/tcp tcp_max_buf 4194304
ndd -set /dev/tcp tcp_cwnd_max 2097152
ndd -set /dev/tcp tcp_recv_hiwat 400000
ndd -set /dev/tcp tcp_xmit_hiwat 400000
```
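Because `ndd` settings do not survive a reboot, they are typically reapplied from a startup script. The following sketch loops over the parameter table above and emits the exact `ndd` commands; the `DRYRUN` switch is an addition here (not part of the study) so the script can be reviewed safely on any system before being run for real on Solaris.

```shell
#!/bin/sh
# Emit (or apply) the study's TCP/IP tunings. With DRYRUN=1 (the default)
# the commands are only printed; set DRYRUN=0 on a Solaris host to apply.
DRYRUN=${DRYRUN:-1}

cmds=""
while read param value; do
    cmd="ndd -set /dev/tcp $param $value"
    cmds="$cmds$cmd\n"
    if [ "$DRYRUN" -eq 1 ]; then
        echo "$cmd"
    else
        $cmd
    fi
done <<EOF
tcp_conn_req_max_q 102400
tcp_conn_req_max_q0 102400
tcp_max_buf 4194304
tcp_cwnd_max 2097152
tcp_recv_hiwat 400000
tcp_xmit_hiwat 400000
EOF
```

The here-document (rather than a pipe) keeps the loop in the current shell, so the list is easy to extend without restructuring the script.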
Because the tests use multiple network interfaces, it is important to ensure that the interfaces do not all send their interrupts to the same core. Network interrupts were enabled on one strand and disabled on the remaining three strands of each core using the following script:
```
allpsr=`/usr/sbin/psrinfo | grep -v off-line | awk '{ print $1 }'`
set $allpsr
numpsr=$#
while [ $numpsr -gt 0 ]; do
    shift
    numpsr=`expr $numpsr - 1`
    tmp=1
    while [ $tmp -ne 4 ]; do
        /usr/sbin/psradm -i $1
        shift
        numpsr=`expr $numpsr - 1`
        tmp=`expr $tmp + 1`
    done
done
```
The following example shows psrinfo output before running the script:
```
# psrinfo | more
0       on-line   since 12/06/2006 14:28:34
1       on-line   since 12/06/2006 14:28:35
2       on-line   since 12/06/2006 14:28:35
3       on-line   since 12/06/2006 14:28:35
4       on-line   since 12/06/2006 14:28:35
5       on-line   since 12/06/2006 14:28:35
...
```
The following example shows psrinfo output after running the script:
```
0       on-line   since 12/06/2006 14:28:34
1       no-intr   since 12/07/2006 09:17:04
2       no-intr   since 12/07/2006 09:17:04
3       no-intr   since 12/07/2006 09:17:04
4       on-line   since 12/06/2006 14:28:35
5       no-intr   since 12/07/2006 09:17:04
...
```
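A quick way to confirm the script had the intended effect is to tally the strand states in the `psrinfo` output. The sketch below parses an embedded sample matching the post-script state shown above; on a live Solaris system you would pipe `/usr/sbin/psrinfo` into the same `awk` program instead.

```shell
#!/bin/sh
# Count strands that still take interrupts (on-line) versus those fenced
# off from interrupts (no-intr) in psrinfo-style output.
summary=$(awk '
    $2 == "on-line" { online++ }
    $2 == "no-intr" { nointr++ }
    END { printf "online=%d no-intr=%d", online, nointr }
' <<EOF
0 on-line since 12/06/2006 14:28:34
1 no-intr since 12/07/2006 09:17:04
2 no-intr since 12/07/2006 09:17:04
3 no-intr since 12/07/2006 09:17:04
4 on-line since 12/06/2006 14:28:35
5 no-intr since 12/07/2006 09:17:04
EOF
)
echo "$summary"
```

For this sample the summary reports two interrupt-handling strands and four fenced strands, matching one enabled strand per core as the script intends.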
The following table shows the tuning settings used for the Web Server.
Table 6–1 Web Server Tuning Settings
| Component | Default | Tuned |
|---|---|---|
| Access logging | enabled=true | enabled=false |
| Thread pool | min-threads=16, max-threads=128, stack-size=131072, queue-size=1024 | min-threads=128, max-threads=200, stack-size=262144, queue-size=15000 |
| HTTP listener | Non-secure listener on port 80; listen-queue-size=128 | Non-secure listener on port 80; secure listener on port 443; listen-queue-size=15000 |
| Keep alive | enabled=true, threads=1, max-connections=200, timeout=30 sec | enabled=true, threads=2, max-connections=15000, timeout=180 sec |
| default-web.xml | JSP compilation turned on | JSP compilation turned off |
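In Web Server 7, settings like these are normally changed through the `wadm` administration CLI. The following sketch prints the commands that would apply the tuned thread-pool and keep-alive values; the configuration name `perf` and the password file path are assumptions for illustration, and the exact `wadm` subcommand and property names should be verified against your installed version before use.

```shell
#!/bin/sh
# Dry-run sketch (hedged): print wadm commands for the tuned values above
# rather than executing them, so the plan can be reviewed first.
WADM="wadm --user=admin --password-file=admin.pwd"   # assumed credentials
CONFIG=perf                                          # assumed config name

emit() { echo "$@"; }   # swap for eval "$@" to actually run the commands

emit $WADM set-thread-pool-prop --config=$CONFIG \
    min-threads=128 max-threads=200 stack-size=262144 queue-size=15000
emit $WADM set-keep-alive-prop --config=$CONFIG \
    threads=2 max-connections=15000 timeout=180
emit $WADM deploy-config $CONFIG
```

The final `deploy-config` step is needed because `wadm` edits a stored configuration that must be deployed before the running instances pick it up.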
The following table shows the SSL session cache tuning settings used for the SSL tests.
Table 6–2 SSL Session Cache Tuning Settings
| Component | Default |
|---|---|
| SSL session cache | enabled=true, max-entries=10000, max-ssl2-session-age=100, max-ssl3-tls-session-age=86400 |
This section contains the test-specific configuration, tuning, and results for the following tests:
The following metrics were used to characterize performance:
Operations per second (ops/sec) = successful transactions per second
Response time for single transaction (round-trip time) in milliseconds
The performance and scalability diagrams show throughput (ops/sec) against the number of cores enabled on the system.
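Both metrics can be derived from a per-transaction log. The sketch below assumes a simple line format of `<epoch-seconds> <round-trip-ms> <status>` with embedded sample values; this is an illustration only, not the Faban driver's actual log format.

```shell
#!/bin/sh
# Compute ops/sec (successful transactions per elapsed second) and the
# average round-trip time in milliseconds from a hypothetical driver log.
metrics=$(awk '
    $3 == "OK" { ok++; rt += $2 }            # count successes, sum latency
    { if (NR == 1) start = $1; end = $1 }    # track first/last timestamps
    END {
        secs = end - start; if (secs == 0) secs = 1
        printf "ops/sec=%.1f avg-rt-ms=%.1f", ok / secs, rt / ok
    }
' <<EOF
100 180 OK
100 200 OK
101 190 OK
101 500 ERR
102 210 OK
EOF
)
echo "$metrics"
```

Note that the failed transaction contributes to elapsed time but is excluded from both the throughput count and the response-time average, matching the "successful transactions per second" definition above.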
This test was performed with a static download of a randomly selected file from a pool of 10,000 directories, each containing 36 files ranging in size from 1 KB to 1000 KB. The goal of the static content test was to saturate the cores and measure the resulting throughput and response time.
This test used the following configuration:
Static files were created on a striped disk array (Sun StorEdge 3510).
Multiple network interfaces were configured.
Web Server was configured in 64-bit mode.
File-cache was enabled with the tuning settings described in the following table.
Table 6–3 File Cache Configuration

| Default | Tuned |
|---|---|
| enabled=true, max-age=30 sec, max-entries=1024, sendfile=false, max-heap-file-size=524288, max-heap-space=10485760, max-mmap-file-size=0, max-mmap-space=0 | enabled=true, max-age=3600, max-entries=1048576, sendfile=true, max-heap-file-size=1200000, max-heap-space=8000000000, max-mmap-file-size=1048576, max-mmap-space= l, max-open-files=1048576 |
The following table shows the static content scalability results.
Table 6–4 Static Content Scalability
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 10365 | 184 |
| 4 | 19729 | 199 |
| 6 | 27649 | 201 |
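The near-linear scaling claim can be checked directly from these numbers. The sketch below takes the 2-core result as the per-core baseline and computes each configuration's throughput as a percentage of ideal linear scaling.

```shell
#!/bin/sh
# Scaling efficiency for the static content results, relative to a linear
# extrapolation of the 2-core throughput.
eff=$(awk 'BEGIN {
    cores[1] = 2; tput[1] = 10365
    cores[2] = 4; tput[2] = 19729
    cores[3] = 6; tput[3] = 27649
    per_core = tput[1] / cores[1]            # baseline ops/sec per core
    for (i = 2; i <= 3; i++)
        printf "%d cores: %.1f%% of linear\n", cores[i],
               100 * tput[i] / (per_core * cores[i])
}')
echo "$eff"
```

With the table's values this prints 95.2% of linear for four cores and 88.9% for six, consistent with the "almost linear" scaling described earlier in the chapter.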
The following is a graphical representation of static content scalability results.
This test was conducted using a servlet that prints its initialization arguments, environment, request headers, connection and client information, URL information, and remote user information. JVM tuning settings were applied to the server. The goal was to saturate the cores on the server and measure the resulting throughput and response time.
The following table shows the JVM tuning settings used in the test.
Table 6–5 JVM Tuning Settings
| Default | Tuned |
|---|---|
| -Xmx128m -Xms256m | -server -Xrs -Xmx2048m -Xms2048m -Xmn2024m -XX:+AggressiveHeap -XX:LargePageSizeInBytes=256m -XX:+UseParallelOldGC -XX:+UseParallelGC -XX:ParallelGCThreads=&lt;number of cores&gt; -XX:+DisableExplicitGC |
The following table shows the results for the dynamic content servlet test.
Table 6–6 Dynamic Content Test: Servlet Scalability
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 5287 | 19 |
| 4 | 10492 | 19 |
| 6 | 15579 | 19 |
The following is a graphical representation of servlet scalability results.
This test was performed by accessing a C executable called printenv, which outputs environment variable information. CGI tuning settings were applied to the server. The goal was to saturate the cores on the server and measure the resulting throughput and response time.
The following table describes the CGI tuning settings used in this test.
Table 6–7 CGI Tuning Settings
| Default | Tuned |
|---|---|
| idle-timeout=300, cgistub-idle-timeout=30, min-cgistubs=0, max-cgistubs=16 | idle-timeout=300, cgistub-idle-timeout=1000, min-cgistubs=100, max-cgistubs=100 |
The following table shows the results of the dynamic content test for C CGI.
Table 6–8 Dynamic Content Test: C CGI Scalability
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 892 | 112 |
| 4 | 1681 | 119 |
| 6 | 2320 | 129 |
The following is a graphical representation of C CGI scalability results.
This test was conducted with a Perl script called printenv.pl, which prints the CGI environment. CGI tuning settings were applied to the server. The goal was to saturate the cores on the server and measure the resulting throughput and response time.
The following table shows the CGI tuning settings used in the dynamic content test for Perl CGI.
Table 6–9 CGI Tuning Settings
| Default | Tuned |
|---|---|
| idle-timeout=300, cgistub-idle-timeout=30, min-cgistubs=0, max-cgistubs=16 | idle-timeout=300, cgistub-idle-timeout=1000, min-cgistubs=100, max-cgistubs=100 |
The following table shows the results for the dynamic content test of Perl CGI.
Table 6–10 Dynamic Content Test: Perl CGI Scalability
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 322 | 310 |
| 4 | 611 | 327 |
| 6 | 873 | 343 |
The following is a graphical representation of Perl CGI scalability results.
The NSAPI module used in this test was printenv2.so. It prints the NSAPI environment variables along with some text to make the entire response 2 KB. The goal was to saturate the cores on the server and measure the resulting throughput and response time.
The only tuning for this test was optimizing the path checks in obj.conf by removing unused path checks.
The following table shows the results of the dynamic content test for NSAPI.
Table 6–11 Dynamic Content Test: NSAPI Scalability
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 6264 | 14 |
| 4 | 12520 | 15 |
| 6 | 18417 | 16 |
The following is a graphical representation of NSAPI scalability results.
PHP is a widely used scripting language well suited to creating dynamic web content. It is among the most rapidly expanding scripting languages on the Internet because of its simplicity, accessibility, wide range of available modules, and large number of readily available applications.
The scalability of Web Server combined with the versatility of the PHP engine provides a high-performing and versatile web deployment platform for dynamic content. These tests used PHP version 5.1.6.
The tests were performed in two modes:
An out-of-process fastcgi-php application invoked through the FastCGI plug-in.
An in-process PHP NSAPI plug-in.
The test executed the phpinfo() query. The goal was to saturate the cores on the server and measure the resulting throughput and response time.
The following table shows the Web Server tuning settings used for the FastCGI plug-in test.
Table 6–12 Tuning Settings for FastCGI Plug-in Test
The following table shows the results of the PHP with FastCGI test.
Table 6–13 PHP Scalability with FastCGI
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 876 | 114 |
| 4 | 1706 | 117 |
| 6 | 2475 | 121 |
The following is a graphical representation of PHP scalability with FastCGI.
The following table shows the Web Server tuning settings for the PHP with NSAPI test.
Table 6–14 NSAPI Plug-in Configuration for PHP
The following table shows the results of the PHP with NSAPI test.
Table 6–15 PHP Scalability with NSAPI
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 950 | 105 |
| 4 | 1846 | 108 |
| 6 | 2600 | 115 |
The following is a graphical representation of PHP scalability with NSAPI.
This test was performed with a static download of a randomly selected file from a pool of 10,000 directories, each containing 36 files ranging in size from 1 KB to 1000 KB. The goal of the SSL static content tests was to saturate the cores and measure the resulting throughput and response time. Only four cores of the T2000 were used for this test.
This test used the following configuration:
Static files were created on a striped disk array (Sun StorEdge 3510).
Multiple network interfaces were configured.
The file cache was enabled and tuned using the settings in Table 6–3.
The SSL session cache was tuned using the settings in Table 6–2.
Web Server was configured in 64-bit mode.
The following table shows the SSL static content test results.
Table 6–16 SSL Performance Test: Static Content Scalability
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 2284 | 379 |
| 4 | 4538 | 387 |
| 6 | 6799 | 387 |
The following is a graphical representation of static content scalability with SSL.
This test was conducted with a Perl script called printenv.pl, which prints the CGI environment. The test was run in SSL mode with the SSL session cache enabled. The goal was to saturate the cores on the server and measure the resulting throughput and response time.
The following table shows the SSL Perl CGI test results.
Table 6–17 SSL Performance Test: Perl CGI Scalability
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 303 | 329 |
| 4 | 580 | 344 |
| 6 | 830 | 361 |
The following is a graphical representation of Perl scalability with SSL.
This test was performed by accessing a C executable called printenv, which outputs environment variable information. The test was run in SSL mode with the SSL session cache enabled. The goal was to saturate the cores on the server and measure the resulting throughput and response time.
The following table shows the SSL CGI test results.
Table 6–18 SSL Performance Test: C CGI Scalability
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 792 | 126 |
| 4 | 1499 | 133 |
| 6 | 2127 | 141 |
The following is a graphical representation of C CGI scalability with SSL.
The NSAPI module used in this test was printenv2.so. It prints the NSAPI environment variables along with some text to make the entire response 2 KB. The test was run in SSL mode with the SSL session cache enabled. The goal was to saturate the cores on the server and measure the resulting throughput and response time.
The following table shows the SSL NSAPI test results.
Table 6–19 SSL Performance Test: NSAPI Scalability
| Number of Cores | Average Throughput (ops/sec) | Average Response Time (ms) |
|---|---|---|
| 2 | 2729 | 29 |
| 4 | 5508 | 30 |
| 6 | 7982 | 32 |
The following is a graphical representation of NSAPI scalability with SSL.
The e-commerce test uses a more complex application that relies on a database to simulate online shopping.
The e-commerce studies were conducted using the following hardware.
Web Server system configuration:
Sun Microsystems Sun Fire 880 (900 MHz US-III+); only four CPUs were used for this test
16384 Megabytes of memory
Solaris 10 operating system
Database system configuration:
Sun Microsystems Sun Fire 880 (900 MHz US-III+)
16384 Megabytes of memory
Solaris 10 operating system
Oracle 10.1.0.2.0
Driver system configuration:
Sun Microsystems Sun Fire 880 (900 MHz US-III+)
Solaris 10 operating system
Network configuration:
The Web Server, database, and the driver machines were connected with a gigabit Ethernet link.
The e-commerce test was run with the following tuning settings.
JDBC tuning:
```
<jdbc-resource>
  <jndi-name>jdbc/jwebapp</jndi-name>
  <datasource-class>oracle.jdbc.pool.OracleDataSource</datasource-class>
  <max-connections>200</max-connections>
  <idle-timeout>0</idle-timeout>
  <wait-timeout>5</wait-timeout>
  <connection-validation>auto-commit</connection-validation>
  <property>
    <name>username</name>
    <value>db_user</value>
  </property>
  <property>
    <name>password</name>
    <value>db_password</value>
  </property>
  <property>
    <name>url</name>
    <value>jdbc:oracle:thin:@db_host_name:1521:oracle_sid</value>
  </property>
  <property>
    <name>ImplicitCachingEnabled</name>
    <value>true</value>
  </property>
  <property>
    <name>MaxStatements</name>
    <value>200</value>
  </property>
</jdbc-resource>
```
JVM tuning:
```
-server -Xmx1500m -Xms1500m -Xss128k -XX:+DisableExplicitGC
```
This test models an e-commerce web site that sells items from a large inventory. It uses the standard web application model-view-controller design pattern for its implementation: the user interface (that is, the view) is handled by 16 different JSP pages which interface with a single master control servlet. The servlet maintains JDBC connections to the database, which serves as the model and handles 27 different queries. The JSP pages make extensive use of JSP tag libraries and comprise almost 2000 lines of logic.
The database contains 1000 orderable items (which have two related tables which also have a cardinality of 1000), 72000 customers (with two related tables), and 1.9 million orders (with two related tables). Standard JDBC connections handle database connection using prepared statements and following standard JDBC design principles.
A randomly selected user performs the online shopping. The workload mix consisted of the following operations: Home, AdminConfirm, AdminRequest, BestSellers, BuyConfirm, BuyRequest, CustomerRegistration, NewProducts, OrderDisplay, OrderInquiry, ProductDetail, SearchRequest, SearchResults, and ShoppingCart.
The Faban driver was used to drive the load. Think time was chosen from a negative exponential distribution, with a minimum of 7.5 seconds and a maximum of 75 seconds. The maximum number of concurrent users the system could support was determined by the following pass criteria.
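The think-time model can be sketched as a shifted exponential sample clipped to the stated window. The mean of 25 seconds below is an assumption chosen for illustration; the study states only the minimum and maximum.

```shell
#!/bin/sh
# Draw five think times from a negative exponential distribution shifted
# to start at 7.5 s and truncated at 75 s (the driver's stated bounds).
samples=$(awk 'BEGIN {
    srand(1); min = 7.5; max = 75; mean = 25    # mean is an assumption
    for (i = 0; i < 5; i++) {
        t = min - (mean - min) * log(1 - rand())   # inverse-CDF sampling
        if (t > max) t = max                       # truncate at the cap
        printf "%.1f\n", t
    }
}')
echo "$samples"
```

Every sample is guaranteed to fall in the 7.5 to 75 second window, which is the property the driver relies on when pacing user requests.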
Table 6–20 Performance Test Pass Criteria
| Transaction | 90th Percentile Response Time (Seconds) |
|---|---|
| HomeStart | 3 |
| AdminConfirm | 20 |
| AdminRequest | 3 |
| BestSellers | 5 |
| BuyConfirm | 5 |
| BuyRequest | 3 |
| CustomerRegistration | 3 |
| Home | 3 |
| NewProducts | 5 |
| OrderDisplay | 3 |
| OrderInquiry | 3 |
| ProductDetail | 3 |
| SearchRequest | 3 |
| SearchResults | 10 |
| ShoppingCart | 3 |
The following table shows the e-commerce web application test results.
Table 6–21 E-Commerce Web Application Scalability
| Number of CPUs | Users | Throughput (ops/sec) |
|---|---|---|
| 2 | 7000 | 790 |
| 4 | 11200 | 1350 |
The following is a graphical representation of e-commerce web application scalability.