Oracle Solaris Studio 12.2: Simple Performance Optimization Tool (SPOT) User's Guide
1. The Simple Performance Optimization Tool (SPOT)
2. Running SPOT on Your Application
Runtime System and Build Information
Analysis of Application Stall Behavior Section
Maximum Resources Used By The Process Section
Pairs of Top Four Stall Counters Section
Application HW Counter Profile Output
Summary of Key Experiment Metrics Table
The following test program was run with SPOT to generate the reports discussed in this chapter. The program has three routines, each of which targets a different kind of event:
The fp_routine routine does floating point computation on three 80 MB arrays. In addition to floating point operations, the routine generates a significant amount of memory traffic because of the size of the arrays. This traffic appears as read and write memory bandwidth consumption.
The cache_miss routine is a test of memory latency. Each pointer chase in the key loop brings in another cache line, resulting in many cache misses and a significant amount of memory read bandwidth.
The tlb_miss routine is identical to the cache_miss routine except for the way it is called. The code is duplicated so that the reports show clearly where in the code the events are happening. Because it is called with a stride of 8192 bytes, this routine touches a new page on every pointer chase in the key loop, so it encounters both cache misses and TLB misses.
#include <stdio.h>
#include <stdlib.h>

/* Floating point work: element-wise add over three large arrays. */
void fp_routine(double *out, double *in1, double *in2, int n)
{
  for (int i=0; i<n; i++) {out[i]=in1[i]+in2[i];}
}

/* Link the array into a pointer chain with the given stride, then chase it. */
int** cache_miss(int **array, int size, int step)
{
  for (int i=0; i<size-step; i++) {array[i]=(int*)&array[i+step];}
  for (int i=size-step; i<size; i++) {array[i]=(int*)&array[i-size+step];}
  int ** cp=(int**)array[0];
  for (int i=0; i<size*16; i++) {cp=(int**)*cp;}
  return cp;
}

/* Same code as cache_miss; called with a page-sized stride. */
int** tlb_miss(int **array, int size, int step)
{
  for (int i=0; i<size-step; i++) {array[i]=(int*)&array[i+step];}
  for (int i=size-step; i<size; i++) {array[i]=(int*)&array[i-size+step];}
  int ** cp=(int**)array[0];
  for (int i=0; i<size*16; i++) {cp=(int**)*cp;}
  return cp;
}

int main(void)
{
  double *out, *in1, *in2;
  int **array;

  out=(double*) calloc(sizeof(double),10*1024*1024);
  in1=(double*) calloc(sizeof(double),10*1024*1024);
  in2=(double*) calloc(sizeof(double),10*1024*1024);
  for (int rpt=0; rpt<100; rpt++) fp_routine(out,in1,in2,10*1024*1024);
  free(out);
  free(in1);
  free(in2);

  array=(int**)calloc(sizeof(int*),10*1024*1024);
  cache_miss(array,10*1024*1024,64/sizeof(int*));   /* 64-byte stride  */
  tlb_miss(array,10*1024*1024,8192/sizeof(int*));   /* 8192-byte stride */
  free(array);
  return 0;
}
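The stride arguments that main passes to cache_miss and tlb_miss are given in array elements, not bytes, so their values depend on the pointer size of the build. The following is a minimal sketch, not part of the original program, that reproduces this arithmetic and counts how many distinct elements a pointer chase visits before it returns to its starting point. The helper names array_bytes, stride_elems, and ring_length are hypothetical and were chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes occupied by an array of n elements. */
size_t array_bytes(size_t n, size_t elem_size)
{
    return n * elem_size;
}

/* Hypothetical helper: converts a stride in bytes into a stride in
   pointer-sized elements, as main() does with 64/sizeof(int*). */
size_t stride_elems(size_t stride_bytes)
{
    return stride_bytes / sizeof(int *);
}

/* Hypothetical helper: links the array the same way cache_miss and
   tlb_miss do, then counts the hops needed for the chase to arrive
   back at element 0. Assumes 1 <= step <= size. */
size_t ring_length(int **array, size_t size, size_t step)
{
    assert(step >= 1 && step <= size);

    /* Element i points at element i+step, wrapping at the end. */
    for (size_t i = 0; i < size - step; i++)
        array[i] = (int *)&array[i + step];
    for (size_t i = size - step; i < size; i++)
        array[i] = (int *)&array[i + step - size];

    size_t hops = 1;
    int **cp = (int **)array[0];
    while (cp != array) {          /* stop when back at &array[0] */
        cp = (int **)*cp;
        hops++;
    }
    return hops;
}
```

For a 16-element array and a step of 4, ring_length returns 4: the chase visits elements 0, 4, 8, and 12, then wraps to 0. More generally, when step divides size evenly the chase touches size/step distinct elements, each one stride apart, and the size*16 iterations in the key loop go around that ring many times.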
The program was compiled with the Oracle Solaris Studio 12.2 C compiler:
cc -g -O -o test test.c