In addition to using er_kernel -F=regexp, you can profile the kernel and a load together by running er_kernel with a target of collect load instead of load. Only one of the two commands, collect or er_kernel, can specify hardware counters. If er_kernel is using the hardware counters, the collect command cannot.
The advantage of this technique is that collect records data on the user processes even when they are not running on a CPU, whereas the user experiment collected by er_kernel includes only User CPU Time and System CPU Time. Furthermore, when you use collect, you get the data for OpenMP and Java profiling in user mode. With er_kernel, you can get only machine mode for either, and a Java experiment contains no information about HotSpot compilations.
% er_kernel collect load
% analyzer ktest.1.er test.1.er
Performance Analyzer displays both the kernel profile from ktest.1.er and the user profile from test.1.er. The Timeline view enables you to see correlations between the two experiments.
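The hardware counter restriction described above can be illustrated with a small shell sketch. The function build_profile_cmd is a hypothetical helper, not part of the Oracle tools. It only prints a command line and refuses to place -h on both er_kernel and collect. The counter name cycles is an assumption, because the counters actually available depend on the machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Oracle tools): builds the combined
# er_kernel/collect command line and rejects requests that would put
# hardware counters (-h) on both commands at once.
#   $1  counter for er_kernel (empty string for none)
#   $2  counter for collect   (empty string for none)
#   $3  the load command
build_profile_cmd() {
    kctr=$1
    uctr=$2
    load=$3
    if [ -n "$kctr" ] && [ -n "$uctr" ]; then
        echo "error: only one of er_kernel and collect can use hardware counters" >&2
        return 1
    fi
    cmd="er_kernel"
    if [ -n "$kctr" ]; then
        cmd="$cmd -h $kctr"
    fi
    cmd="$cmd collect"
    if [ -n "$uctr" ]; then
        cmd="$cmd -h $uctr"
    fi
    echo "$cmd $load"
}

# Print (do not run) a command in which only collect uses a counter.
# "cycles" is an assumed counter name; substitute one your system reports.
build_profile_cmd "" "cycles" "load"
```

Because the helper only echoes the command, you can inspect the result before running it with the privileges that er_kernel requires.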