The following system performance features and enhancements have been added to the Solaris 10 8/07 release.
The PCI Express Interface Units (PIU) on UltraSPARC T2 systems have built-in performance counters, which can be dumped by using the busstat command. On such systems, the output of the busstat -l command lists the following devices:
where # is an instance number.
These built-in performance counters are intended mainly for use by Sun field service personnel.
Hashed Cache Index mode is a new hardware feature available in UltraSPARC T2 processors. The hardware uses many more address bits to compute an L2 cache index. As a result, there are more page colors for large pages.
To achieve optimum performance, the Solaris kernel must maximize the number of page colors used by all the threads sharing a cache. The Solaris virtual memory subsystem has been extended to support this new hardware feature. Correct color calculation improves the performance and throughput consistency of application programs on UltraSPARC T2 systems.
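The effect of hashing on page colors can be illustrated with a small model. The sketch below is only an illustration under stated assumptions: the cache geometry (64-byte lines, 1024 sets), the XOR-based hash, and the names direct_index, hashed_index, and count_page_colors are all hypothetical and do not describe the actual UltraSPARC T2 index function or any Solaris kernel interface. It counts how many distinct cache index values the first line of a page can land on, which is one way to think of the number of page colors.

```python
LINE_BITS = 6                      # assumed 64-byte cache lines
INDEX_BITS = 10                    # assumed 1024 sets
INDEX_MASK = (1 << INDEX_BITS) - 1


def direct_index(pa):
    """Conventional index: the address bits just above the line offset."""
    return (pa >> LINE_BITS) & INDEX_MASK


def hashed_index(pa, hash_shift=22):
    """Toy hashed index: XOR the low index bits with higher address bits.

    The choice of hash_shift is an assumption made for illustration only.
    """
    return ((pa >> LINE_BITS) ^ (pa >> hash_shift)) & INDEX_MASK


def count_page_colors(index_fn, page_size, num_pages=4096):
    """Count the distinct index values at which pages of page_size can start."""
    return len({index_fn(n * page_size) for n in range(num_pages)})


LARGE_PAGE = 4 * 1024 * 1024       # a 4-Mbyte large page

direct_colors = count_page_colors(direct_index, LARGE_PAGE)
hashed_colors = count_page_colors(hashed_index, LARGE_PAGE)
```

In this toy geometry, every 4-Mbyte page starts at the same index under direct indexing, so there is a single color. Once higher address bits feed the index, the same page size spreads over many colors, which is why the kernel's color calculation has to take the hash into account.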
The multilevel Chip Multi-Threaded (CMT) scheduling optimizations feature provides the Solaris kernel with a platform-independent mechanism for discovering and optimizing the performance-relevant hardware-sharing relationships that exist between CPUs on current and emerging CMT processor architectures, including Niagara II.
This feature also enhances the kernel's thread scheduler or dispatcher with a multilevel CMT load-balancing policy that benefits system performance on various multithreaded, multicore, and multisocket processor-based systems.
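The idea of balancing load level by level can be sketched as follows. This is a minimal illustration, not the Solaris dispatcher: the (socket, core, strand) topology model and the names build_topology and pick_cpu are assumptions introduced here. At each level of sharing, the candidates are narrowed to the group that currently has the fewest busy CPUs, so that threads spread across sockets first, then across cores, and share a core only when no idle core remains.

```python
from itertools import product


def build_topology(sockets, cores, strands):
    """Return every CPU as a (socket, core, strand) tuple."""
    return list(product(range(sockets), range(cores), range(strands)))


def pick_cpu(cpus, busy):
    """Pick an idle CPU, preferring the least-loaded socket, then core.

    busy is the set of CPUs that already run a thread. Returns None when
    every CPU is busy.
    """
    candidates = [c for c in cpus if c not in busy]
    if not candidates:
        return None
    # Prefix length 1 identifies the socket, 2 identifies (socket, core).
    for level in (1, 2):
        def load(cpu):
            return sum(1 for b in busy if b[:level] == cpu[:level])
        least = min(load(c) for c in candidates)
        candidates = [c for c in candidates if load(c) == least]
    return candidates[0]


def place_threads(cpus, count):
    """Place count threads one at a time and return the CPUs chosen."""
    busy = set()
    placed = []
    for _ in range(count):
        cpu = pick_cpu(cpus, busy)
        if cpu is None:
            break
        busy.add(cpu)
        placed.append(cpu)
    return placed


placement = place_threads(build_topology(2, 2, 2), 4)
```

With two sockets of two cores and two strands each, the four threads in this sketch land on four different cores, two per socket, rather than doubling up on the strands of one core.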
For more information on this feature, see the OpenSolaris performance community website, http://www.opensolaris.org/os/community/performance.
The process count scalability feature improves the process count scalability of the Solaris OS. Currently, all UltraSPARC systems support a maximum of 8192 contexts. When the number of processes exceeds 8192, the kernel steals contexts to keep the processes running. Stealing a context from a process involves the following tasks:
Cross-calling all CPUs that the process ran on
Invalidating the context for CPUs that are running threads of the process
Flushing the context from the TLBs of all CPUs that are running threads of the process
This procedure is very expensive and becomes more expensive as the number of processes rises beyond 8K. The process count scalability feature completely redesigns context management. Contexts are managed on a per-MMU basis rather than on a global basis, which enables efficient TLB flushing and greatly improves the scalability of context management.
The process count scalability feature also improves throughput on workloads that consist of more than 8K active processes, or that create and destroy processes at a high rate. The feature is most beneficial on systems with many CPUs.
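One way to picture per-MMU context management is a small allocator that each MMU owns independently. The sketch below is a toy model under stated assumptions, not the Solaris implementation: the class MmuContextDomain, its generation counter, and the flush counter are hypothetical names and mechanisms chosen for illustration. When an MMU runs out of context numbers, it invalidates all of its own contexts in one step and starts over, without involving any other MMU.

```python
class MmuContextDomain:
    """Toy per-MMU context allocator (hypothetical, for illustration)."""

    def __init__(self, num_contexts):
        self.num_contexts = num_contexts
        self.generation = 0      # bumped each time the MMU recycles contexts
        self.next_ctx = 0        # next unused context number
        self.flushes = 0         # local TLB flushes performed so far
        self.assigned = {}       # pid -> (generation, context number)

    def get_context(self, pid):
        """Return a context number for pid that is valid on this MMU."""
        entry = self.assigned.get(pid)
        if entry is not None and entry[0] == self.generation:
            return entry[1]      # still valid in the current generation
        if self.next_ctx == self.num_contexts:
            # Out of contexts: invalidate all contexts on this MMU only.
            self.generation += 1
            self.next_ctx = 0
            self.flushes += 1
        ctx = self.next_ctx
        self.next_ctx += 1
        self.assigned[pid] = (self.generation, ctx)
        return ctx


mmu0 = MmuContextDomain(num_contexts=4)
mmu1 = MmuContextDomain(num_contexts=4)
for pid in range(5):             # one more process than mmu0 has contexts
    mmu0.get_context(pid)
```

In this model, the fifth process causes a single local flush on mmu0, and mmu1 is not touched at all. The contrast with the global scheme is that no cross-calls to CPUs behind other MMUs are needed.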
The multiple page size support (MPSS) for shared memory feature adds large page support for mapping shared memory and provides an out-of-box (OOB) policy for the use of large pages for shared memory. The MPSS support applies to shared memory created by using mmap(2) on /dev/zero or with the MAP_ANON flag, and to System V shared memory. This feature also adds support for using memcntl(2) to change the page size of these shared memory segments.
MPSS support is also extended to the use of large pages for private memory created by using mmap(2) with the MAP_PRIVATE flag on /dev/zero.
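The kind of mapping that this feature covers can be created as follows. This is a minimal sketch that uses Python's standard mmap module to request an anonymous shared mapping, which corresponds to mmap(2) with the MAP_SHARED and MAP_ANON flags. The helper name make_shared_anon and the 4-Mbyte length are assumptions for illustration. The sketch does not request a page size: under the OOB policy that choice is left to the kernel, and memcntl(2) is not exposed by the Python standard library.

```python
import mmap


def make_shared_anon(length):
    """Create an anonymous shared mapping, rounded up to whole pages."""
    pages = -(-length // mmap.PAGESIZE)          # ceiling division
    return mmap.mmap(-1, pages * mmap.PAGESIZE,
                     flags=mmap.MAP_SHARED | mmap.MAP_ANON)


# A region large enough that a kernel could back it with large pages.
region = make_shared_anon(4 * 1024 * 1024)
region[:5] = b"hello"                             # touch the first page
```

A process that forks after creating the region shares it with its children, which is the usage pattern for which large-page backing matters.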