Sun Studio 12 Update 1: C++ User's Guide

Part II Writing C++ Programs

Chapter 4 Language Extensions

This chapter documents the language extensions specific to this compiler. The compiler does not recognize some of the features described in this chapter unless you specify certain compiler options on the command line. The relevant compiler options are listed in each section as appropriate.

The -features=extensions option enables you to compile nonstandard code that is commonly accepted by other C++ compilers. You can use this option when you must compile invalid code and you are not permitted to modify the code to make it valid.

This chapter describes the language extensions that the compiler supports when you use the -features=extensions option.


Note –

You can easily turn each supported instance of invalid code into valid code that all compilers will accept. If you are allowed to make the code valid, you should do so instead of using this option. Using the -features=extensions option perpetuates invalid code that will be rejected by some compilers.


4.1 Linker Scoping

Use the following declaration specifiers to help constrain declarations and definitions of extern symbols. The scoping constraints you specify for a static archive or an object file will not take effect until the file is linked into a shared library or an executable. Despite this, the compiler can still perform some optimization given the presence of the linker scoping specifiers.

By using these specifiers, you no longer need to use mapfiles for linker scoping. You can also control the default setting for variable scoping by specifying -xldscope on the command line.

For more information, see A.2.136 -xldscope={v}.

Table 4–1 Linker Scoping Declaration Specifiers

Value 

Meaning 

__global

Symbol definitions have global linker scoping, which is the least restrictive linker scoping. All references to the symbol bind to the definition in the first dynamic load module that defines the symbol. This linker scoping is the current linker scoping for extern symbols.

__symbolic

Symbol definitions have symbolic linker scoping, which is more restrictive than global linker scoping. All references to the symbol from within the dynamic load module being linked bind to the symbol defined within the module. Outside of the module, the symbol appears as though it were global. This linker scoping corresponds to the linker option -Bsymbolic. Although you cannot use -Bsymbolic with C++ libraries, you can use the __symbolic specifier without causing problems. See ld(1) for more information on the linker.

__hidden

Symbol definitions have hidden linker scoping. Hidden linker scoping is more restrictive than symbolic and global linker scoping. All references within a dynamic load module bind to a definition within that module. The symbol will not be visible outside of the module.

A symbol definition may be redeclared with a more restrictive specifier, but may not be redeclared with a less restrictive specifier. A symbol may not be declared with a different specifier once the symbol has been defined.

__global is the least restrictive scoping, __symbolic is more restrictive, and __hidden is the most restrictive scoping.

All virtual functions must be visible to all compilation units that include the class definition because the declaration of virtual functions affects the construction and interpretation of virtual tables.

You can apply the linker scoping specifiers to struct, class, and union declarations and definitions because C++ classes may require generation of implicit information, such as virtual tables and run-time type information. The specifier, in this case, follows the struct, class, or union keyword. Such an application implies the same linker scoping for all its implicit members.
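As an illustration of how the specifiers might be used in portable source, the following sketch hides them behind macros so that the same code compiles with other compilers. The macro names MYLIB_EXPORT and MYLIB_LOCAL and both functions are hypothetical, and the sketch assumes that the predefined macro __SUNPRO_CC identifies this compiler.

```cpp
// Hypothetical wrapper macros for an imaginary library "mylib".
// Assumption: __SUNPRO_CC is predefined only by the Sun C++ compiler.
#if defined(__SUNPRO_CC)
#  define MYLIB_EXPORT __symbolic   /* visible outside the module */
#  define MYLIB_LOCAL  __hidden     /* not visible outside the module */
#else
#  define MYLIB_EXPORT              /* other compilers: no scoping specifier */
#  define MYLIB_LOCAL
#endif

// Internal helper: clients of the library are not meant to see this symbol.
MYLIB_LOCAL int mylib_helper(int x) { return x * 2; }

// Public entry point: references from inside the module bind locally.
MYLIB_EXPORT int mylib_api(int x) { return mylib_helper(x) + 1; }
```

With other compilers the macros expand to nothing, so the functions keep their default scoping.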

4.1.1 Compatibility with Microsoft Windows

For compatibility with similar scoping features in Microsoft Visual C++ (MSVC++) for dynamic libraries, the following syntax is also supported:

__declspec(dllexport) is equivalent to __symbolic

__declspec(dllimport) is equivalent to __global

When taking advantage of this syntax with Sun C++, you should add the option -xldscope=hidden to CC command lines. The result will be comparable to the results using MSVC++. With MSVC++, __declspec(dllimport) is supposed to be used only on declarations of external symbols, not on definitions. Example:


__declspec(dllimport) int foo(); // OK 
__declspec(dllimport) int bar() { ... } // not OK  

MSVC++ is lax about allowing dllimport on definitions, and the results using Sun C++ will be different. In particular, using dllimport on a definition using Sun C++ results in the symbol having global linkage instead of symbolic linkage. Dynamic libraries on Microsoft Windows do not support global linkage of symbols. If you run into this problem, you can change the source code to use dllexport instead of dllimport on definitions. You will then get the same results with MSVC++ and Sun C++.

4.2 Thread-Local Storage

Take advantage of thread-local storage by declaring thread-local variables. A thread-local variable declaration consists of a normal variable declaration with the addition of the declaration specifier __thread. For more information, see A.2.182 -xthreadvar[=o].

You must include the __thread specifier in the first declaration of the thread variable. Variables that you declare with the __thread specifier are bound as they would be without the __thread specifier.

You can declare variables only of static duration with the __thread specifier. Variables with static duration include file global, file static, function local static, and class static member. You should not declare variables with dynamic or automatic duration with the __thread specifier. A thread variable can have a static initializer, but it cannot have a dynamic initializer or destructors. For example, __thread int x = 4; is permitted, but __thread int x = f(); is not. A thread variable should not have a type with non-trivial constructors and destructors. In particular, a thread variable may not have type std::string.

The address-of operator (&) for a thread variable is evaluated at run time and returns the address of the current thread’s variable. Therefore, the address of a thread variable is not a constant.

The address of a thread variable is stable for the lifetime of the corresponding thread. Any thread in the process can freely use the address of a thread variable during the variable’s lifetime. You cannot use a thread variable’s address after its thread terminates. All addresses of a thread’s variables are invalid after the thread’s termination.
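The rules above can be sketched with a small per-thread counter. The names call_count, bump_call_count, and call_count_address are hypothetical, and the sketch assumes a compiler and platform that accept the __thread specifier.

```cpp
// File-global variable of static duration with a static initializer:
// both are permitted for a thread variable.
__thread int call_count = 0;

// Each thread that calls this function increments its own copy.
int bump_call_count() { return ++call_count; }

// The address is computed at run time for the calling thread and
// stays the same for as long as that thread is alive.
int* call_count_address() { return &call_count; }
```

A thread that starts later would see its own call_count beginning at 0, and the pointer returned by call_count_address in one thread must not be used after that thread terminates.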

4.3 Overriding With Less Restrictive Virtual Functions

The C++ standard says that an overriding virtual function must not be less restrictive in the exceptions it allows than any function it overrides. It can have the same restrictions or be more restrictive. Note that the absence of an exception specification allows any exception.

Suppose, for example, that you call a function through a pointer to a base class. If the function has an exception specification, you can count on no other exceptions being thrown. If the overriding function has a less-restrictive specification, an unexpected exception could be thrown, which can result in bizarre program behavior followed by a program abort. This is the reason for the rule.

When you use -features=extensions, the compiler will allow overriding functions with less-restrictive exception specifications.

4.4 Making Forward Declarations of enum Types and Variables

When you use -features=extensions, the compiler allows the forward declaration of enum types and variables. In addition, the compiler allows the declaration of a variable with an incomplete enum type. The compiler will always assume an incomplete enum type to have the same size and range as type int on the current platform.

The following two lines show an example of invalid code that will compile when you use the -features=extensions option.


enum E; // invalid: forward declaration of enum not allowed
E e;    // invalid: type E is incomplete

Because enum definitions cannot reference one another, and no enum definition can cross-reference another type, the forward declaration of an enumeration type is never necessary. To make the code valid, you can always provide the full definition of the enum before it is used.


Note –

On 64-bit architectures, it is possible for an enum to require a size that is larger than type int. If that is the case, and if the forward declaration and the definition are visible in the same compilation, the compiler will emit an error. If the actual size is not the assumed size and the compiler does not see the discrepancy, the code will compile and link, but might not run properly. Mysterious program behavior can occur, particularly if an 8-byte value is stored in a 4-byte variable.


4.5 Using Incomplete enum Types

When you use -features=extensions, incomplete enum types are taken as forward declarations. For example, the following invalid code will compile when you use the -features=extensions option.


typedef enum E F; // invalid, E is incomplete

As noted previously, you can always include the definition of an enum type before it is used.

4.6 Using an enum Name as a Scope Qualifier

Because an enum declaration does not introduce a scope, an enum name cannot be used as a scope qualifier. For example, the following code is invalid.


enum E {e1, e2, e3};
int i = E::e1; // invalid: E is not a scope name

To compile this invalid code, use the -features=extensions option. The -features=extensions option instructs the compiler to ignore a scope qualifier if it is the name of an enum type.

To make the code valid, remove the invalid qualifier E::.


Note –

Use of this option increases the possibility of typographical errors yielding incorrect programs that compile without error messages.


4.7 Using Anonymous struct Declarations

An anonymous struct declaration is a declaration that declares neither a tag for the struct, nor an object or typedef name. Anonymous structs are not allowed in C++.

The -features=extensions option allows the use of an anonymous struct declaration, but only as a member of a union.

The following code is an example of an invalid anonymous struct declaration that compiles when you use the -features=extensions option.


union U {
  struct {
    int a;
    double b;
  };  // invalid: anonymous struct
  struct {
    char* c;
    unsigned d;
  };  // invalid: anonymous struct
};

The names of the struct members are visible without qualification by a struct member name. Given the definition of U in this code example, you can write:


U u;
u.a = 1;

Anonymous structs are subject to the same limitations as anonymous unions.

Note that you can make the code valid by giving a name to each struct, such as:


union U {
  struct {
    int a;
    double b;
  } A;
  struct {
    char* c;
    unsigned d;
  } B;
};
U u;
u.A.a = 1;

4.8 Passing the Address of an Anonymous Class Instance

You are not allowed to take the address of a temporary variable. For example, the following code is invalid because it takes the address of a variable created by a constructor call. However, the compiler accepts this invalid code when you use the -features=extensions option.


class C {
  public:
    C(int);
    ...
};
void f1(C*);
int main()
{
  f1(&C(2)); // invalid
}

Note that you can make this code valid by using an explicit variable.


C c(2);
f1(&c);

The temporary object is destroyed when the function returns. Ensuring that the address of the temporary variable is not retained is the programmer’s responsibility. In addition, the data that is stored in the temporary variable (for example, by f1) is lost when the temporary variable is destroyed.

4.9 Declaring a Static Namespace-Scope Function as a Class Friend

The following code is invalid.


class A {
  friend static void foo(<args>);
  ...
};

Because a class name has external linkage and all definitions must be identical, friend functions must also have external linkage. However, when you use the -features=extensions option, the compiler accepts this code.

Presumably the programmer’s intent with this invalid code was to provide a nonmember “helper” function in the implementation file for class A. You can get the same effect by making foo a static member function. You can make it private if you do not want clients to call the function.
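A minimal sketch of the valid alternative follows. The member names value_, helper, and doubled are hypothetical; the point is only that a private static member function has the same access to the class as a friend would, without being callable by clients.

```cpp
class A {
  public:
    explicit A(int v) : value_(v) {}
    // Public interface that relies on the private helper.
    int doubled() const { return helper(*this); }
  private:
    // Replaces "friend static void foo(...)": a static member function
    // can read private data and is not accessible to clients.
    static int helper(const A& a) { return a.value_ * 2; }
    int value_;
};
```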


Note –

If you use this extension, your class can be “hijacked” by any client. Any client can include the class header, then define its own static function foo, which will automatically be a friend of the class. The effect will be as if you made all members of the class public.


4.10 Using the Predefined __func__ Symbol for Function Name

When you use -features=extensions, the compiler implicitly declares the identifier __func__ in each function as a static array of const char. If the program uses the identifier, the compiler also provides the following definition where function-name is the unadorned name of the function. Class membership, namespaces, and overloading are not reflected in the name.


static const char __func__[] = "function-name";

For example, consider the following code fragment.


#include <stdio.h>
void myfunc(void)
{
  printf("%s\n", __func__);
}

Each time the function is called, it will print the following to the standard output stream.


myfunc

4.11 The __packed__ Attribute

This attribute, attached to a struct or union type definition, specifies that each member (other than zero-width bitfields) of the structure or union is placed to minimize the memory required. When attached to an enum definition, it indicates that the smallest integral type should be used.

Specifying this attribute for struct and union types is equivalent to specifying the packed attribute on each of the structure or union members.

In the following example struct my_packed_struct's members are packed closely together, but the internal layout of its s member is not packed. To do that, struct my_unpacked_struct would also need to be packed.


struct my_unpacked_struct
{
   char c;
   int i;
};

struct __attribute__ ((__packed__)) my_packed_struct
{
   char c;
   int  i;
   struct my_unpacked_struct s;
};

You may only specify this attribute on the definition of an enum, struct or union, and not on a typedef that does not also define the enumerated type, structure or union.
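The effect of the attribute can be observed with sizeof. The following sketch uses the hypothetical type names PlainPair and PackedPair and assumes a compiler that accepts the attribute syntax shown above.

```cpp
// Without the attribute, the compiler typically inserts padding
// after c so that i is aligned.
struct PlainPair
{
   char c;
   int  i;
};

// With the attribute, the members are placed with no padding between them.
struct __attribute__ ((__packed__)) PackedPair
{
   char c;
   int  i;
};
```

Here sizeof(PackedPair) equals sizeof(char) + sizeof(int), whereas sizeof(PlainPair) is usually larger because of alignment padding.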

Chapter 5 Program Organization

The file organization of a C++ program requires more care than is typical for a C program. This chapter describes how to set up your header files and your template definitions.

5.1 Header Files

Creating an effective header file can be difficult. Often your header file must adapt to different versions of both C and C++. To accommodate templates, make sure your header file is tolerant of multiple inclusions (idempotent).

5.1.1 Language-Adaptable Header Files

You might need to develop header files for inclusion in both C and C++ programs. However, Kernighan and Ritchie C (K&R C), also known as “classic C,” ANSI C, Annotated Reference Manual C++ (ARM C++), and ISO C++ sometimes require different declarations or definitions for the same program element within a single header file. (See the C++ Migration Guide for additional information on the variations between languages and versions.) To make header files acceptable to all these standards, you might need to use conditional compilation based on the existence or value of the preprocessor macros __STDC__ and __cplusplus.

The macro __STDC__ is not defined in K&R C, but is defined in both ANSI C and C++. Use this macro to separate K&R C code from ANSI C or C++ code. This macro is most useful for separating prototyped from nonprototyped function definitions.


#ifdef __STDC__
int function(char*,...);      /* C++ & ANSI C declaration */
#else
int function();               /* K&R C */
#endif

The macro __cplusplus is not defined in C, but is defined in C++.


Note –

Early versions of C++ defined the macro c_plusplus instead of __cplusplus. The macro c_plusplus is no longer defined.


Use the definition of the __cplusplus macro to separate C and C++. This macro is most useful in guarding the specification of an extern "C" interface for function declarations, as shown in the following example. To prevent inconsistent specification of extern "C", never place an #include directive within the scope of an extern "C" linkage specification.


#include "header.h"
...                     // ... other include files...
#if defined(__cplusplus)
extern "C" {
#endif
  int g1();
  int g2();
  int g3();
#if defined(__cplusplus)
}
#endif

In ARM C++, the __cplusplus macro has a value of 1. In ISO C++, the macro has the value 199711L (the year and month of the standard expressed as a long constant). Use the value of this macro to separate ARM C++ from ISO C++. The macro value is most useful for guarding changes in template syntax.


// template function specialization
#if __cplusplus < 199711L
int power(int,int);                       // ARM C++
#else
template <> int power(int,int);           // ISO C++
#endif

5.1.2 Idempotent Header Files

Your header files should be idempotent. That is, the effect of including a header file many times should be exactly the same as including the header file only once. This property is especially important for templates. You can best accomplish idempotency by setting preprocessor conditions that prevent the body of your header file from appearing more than once.


#ifndef HEADER_H
#define HEADER_H
/* contents of header file */
#endif
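
To see the guard at work within a single file, the following sketch pastes the same guarded text twice, which mimics what happens when a header is included a second time. The names POINT_H, Point, and sum are hypothetical.

```cpp
// First inclusion: POINT_H is not yet defined, so the body is compiled.
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif

// Second inclusion of the same text: POINT_H is now defined, so the
// preprocessor skips the body and Point is not defined a second time.
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif

// Code that uses the header sees exactly one definition of Point.
int sum(const Point& p) { return p.x + p.y; }
```

Without the guard, the second copy would be rejected as a redefinition of Point.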

5.2 Template Definitions

You can organize your template definitions in two ways: with definitions included and with definitions separated. The definitions-included organization allows greater control over template compilation.

5.2.1 Template Definitions Included

When you put the declarations and definitions for a template within the file that uses the template, the organization is definitions-included. For example:

main.cc


template <class Number> Number twice(Number original);
template <class Number> Number twice(Number original )
    { return original + original; }
int main()
    { return twice<int>(-3); }

When a file using a template includes a file that contains both the template’s declaration and the template’s definition, the file that uses the template also has the definitions-included organization. For example:

twice.h


#ifndef TWICE_H
#define TWICE_H
template <class Number>
Number twice(Number original);
template <class Number> Number twice( Number original )
    { return original + original; }
#endif

main.cc


#include "twice.h"
int main()
    { return twice(-3); }

Note –

It is very important to make your template headers idempotent. (See 5.1.2 Idempotent Header Files.)


5.2.2 Template Definitions Separate

Another way to organize template definitions is to keep the definitions in template definition files, as shown in the following example.

twice.h


#ifndef TWICE_H
#define TWICE_H
template <class Number>
Number twice(Number original);
#endif // TWICE_H

twice.cc


template <class Number>
Number twice( Number original )
    { return original + original; }

main.cc


#include "twice.h"
int main( )
    { return twice<int>( -3 ); }

Template definition files must not include any non-idempotent header files and often need not include any header files at all. (See 5.1.2 Idempotent Header Files.) Note that not all compilers support the definitions-separate model for templates.

Because a separate definitions file is a header file, it might be included implicitly in many files. It therefore should not contain any function or variable definitions, unless they are part of a template definition. A separate definitions file can include type definitions, including typedefs.


Note –

Although source-file extensions for template definition files are commonly used (that is, .c, .C, .cc, .cpp, .cxx, or .c++), template definition files are header files. The compiler includes them automatically if necessary. Template definition files should not be compiled independently.


If you place template declarations in one file and template definitions in another file, you have to be very careful how you construct the definition file, what you name it, and where you put it. You might also need to identify explicitly to the compiler the location of the definitions. Refer to 7.5 Template Definition Searching for information about the template definition search rules.

Chapter 6 Creating and Using Templates

Templates make it possible for you to write a single body of code that applies to a wide range of types in a type-safe manner. This chapter introduces template concepts and terminology in the context of function templates, discusses the more complicated (and more powerful) class templates, and describes the composition of templates. Also discussed are template instantiation, default template parameters, and template specialization. The chapter concludes with a discussion of potential problem areas for templates.

6.1 Function Templates

A function template describes a set of related functions that differ only by the types of their arguments or return values.

6.1.1 Function Template Declaration

You must declare a template before you can use it. A declaration, as in the following example, provides enough information to use the template, but not enough information to implement the template.


template <class Number> Number twice( Number original );

In this example, Number is a template parameter; it specifies the range of functions that the template describes. More specifically, Number is a template type parameter, and its use within the template definition stands for a type determined at the location where the template is used.

6.1.2 Function Template Definition

If you declare a template, you must also define it. A definition provides enough information to implement the template. The following example defines the template declared in the previous example.


template <class Number> Number twice( Number original )
    { return original + original; }

Because template definitions often appear in header files, a template definition might be repeated in several compilation units. All definitions, however, must be the same. This restriction is called the One-Definition Rule.

6.1.3 Function Template Use

Once declared, a function template can be used like any other function. Its use consists of naming the template and providing function arguments. The compiler can infer the template type arguments from the function argument types. For example, you can use the previously declared template as follows.


double twicedouble( double item )
    { return twice( item ); }

If a template argument cannot be inferred from the function argument types, it must be supplied where the function is called. For example:


template<class T> T func(); // no function arguments
int k = func<int>(); // template argument supplied explicitly

6.2 Class Templates

A class template describes a set of related classes or data types that differ only by types, by integral values, by pointers or references to variables with global linkage, or by a combination thereof. Class templates are particularly useful in describing generic, but type-safe, data structures.

6.2.1 Class Template Declaration

A class template declaration provides only the name of the class and its template arguments. Such a declaration is an incomplete class template.

The following example is a template declaration for a class named Array that takes any type as an argument.


template <class Elem> class Array;

This template is for a class named String that takes an unsigned int as an argument.


template <unsigned Size> class String;

6.2.2 Class Template Definition

A class template definition must declare the class data and function members, as in the following examples.


template <class Elem> class Array {
        Elem* data;
        int size;
    public:
        Array( int sz );
        int GetSize();
        Elem& operator[]( int idx );
};

template <unsigned Size> class String {
        char data[Size];
        static int overflows;
    public:
        String( char *initial );
        int length();
};

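Unlike function templates, class templates can have both type parameters (such as class Elem) and expression parameters (such as unsigned Size). As noted in 6.2 Class Templates, an expression parameter can be an integral value or a pointer or reference to a variable with global linkage.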

6.2.3 Class Template Member Definitions

The full definition of a class template requires definitions for its function members and static data members. Dynamic (nonstatic) data members are sufficiently defined by the class template declaration.

6.2.3.1 Function Member Definitions

The definition of a template function member consists of the template parameter specification followed by a function definition. The function identifier is qualified by the class template’s class name and the template arguments. The following example shows definitions of two function members of the Array class template, which has a template parameter specification of template <class Elem>. Each function identifier is qualified by the template class name and the template argument Array<Elem>.


template <class Elem> Array<Elem>::Array( int sz )
    {size = sz; data = new Elem[size];}

template <class Elem> int Array<Elem>::GetSize()
    { return size; }
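
The class template in 6.2.2 also declares operator[], which the definitions above do not cover. The following is a minimal sketch of how it might be defined, assuming plain indexing with no bounds check; the class and the first two member definitions are repeated from the text so that the example is complete.

```cpp
// Class template as declared in 6.2.2.
template <class Elem> class Array {
        Elem* data;
        int size;
    public:
        Array( int sz );
        int GetSize();
        Elem& operator[]( int idx );
};

template <class Elem> Array<Elem>::Array( int sz )
    {size = sz; data = new Elem[size];}

template <class Elem> int Array<Elem>::GetSize()
    { return size; }

// Assumed definition (not in the original text): returns a reference
// to the element so that it can be read or assigned. No bounds check.
template <class Elem> Elem& Array<Elem>::operator[]( int idx )
    { return data[idx]; }
```

As in the text, the class has no destructor, so the storage is never released; production code would add one.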

This example shows definitions of function members of the String class template.


#include <string.h>
template <unsigned Size> int String<Size>::length( )
    {int len = 0;
      while (len < Size && data[len]!= '\0') len++;
      return len;}

template <unsigned Size> String<Size>::String(char *initial)
    {strncpy(data, initial, Size);
      if (length( ) == Size) overflows++;}

6.2.3.2 Static Data Member Definitions

The definition of a template static data member consists of the template parameter specification followed by a variable definition, where the variable identifier is qualified by the class template name and its actual template arguments.


template <unsigned Size> int String<Size>::overflows = 0;

6.2.4 Class Template Use

A template class can be used wherever a type can be used. Specifying a template class consists of providing the values for the template name and arguments. The declaration in the following example creates the variable int_array based upon the Array template. The variable’s class declaration and its set of methods are just like those in the Array template except that Elem is replaced with int (see 6.3 Template Instantiation).


Array<int> int_array(100);

The declaration in this example creates the short_string variable using the String template.


String<8> short_string("hello");

You can use template class member functions as you would any other member function.


int x = int_array.GetSize( );

int y = short_string.length( );

6.3 Template Instantiation

Template instantiation involves generating a concrete class or function (instance) for a particular combination of template arguments. For example, the compiler generates a class for Array<int> and a different class for Array<double>. The new classes are defined by substituting the template arguments for the template parameters in the definition of the template class. In the Array<int> example, shown in the preceding “Class Templates” section, the compiler substitutes int wherever Elem appears.

6.3.1 Implicit Template Instantiation

The use of a template function or template class introduces the need for an instance. If that instance does not already exist, the compiler implicitly instantiates the template for that combination of template arguments.
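One way to observe that each combination of template arguments yields a distinct instance is a static data member, because every instance gets its own copy. The template name Tally and its members are hypothetical.

```cpp
// Each instantiation of Tally has its own static counter.
template <class T> struct Tally {
    static int count;
    static int bump() { return ++count; }
};

// Definition of the static data member for every instantiation.
template <class T> int Tally<T>::count = 0;
```

The first use of Tally<int>::bump() causes the compiler to instantiate Tally<int> implicitly. A later use of Tally<double>::bump() instantiates a second, unrelated class whose counter starts again at 0.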

6.3.2 Explicit Template Instantiation

The compiler implicitly instantiates templates only for those combinations of template arguments that are actually used. This approach may be inappropriate for the construction of libraries that provide templates. C++ provides a facility to explicitly instantiate templates, as seen in the following examples.

6.3.2.1 Explicit Instantiation of Template Functions

To instantiate a template function explicitly, follow the template keyword by a declaration (not definition) for the function, with the function identifier followed by the template arguments.


template float twice<float>(float original);

Template arguments may be omitted when the compiler can infer them.


template int twice(int original);

6.3.2.2 Explicit Instantiation of Template Classes

To instantiate a template class explicitly, follow the template keyword by a declaration (not definition) for the class, with the class identifier followed by the template arguments.


template class Array<char>;

template class String<19>;

When you explicitly instantiate a class, all of its members are also instantiated.

6.3.2.3 Explicit Instantiation of Template Class Function Members

To explicitly instantiate a template class function member, follow the template keyword by a declaration (not definition) for the function, with the function identifier qualified by the template class, followed by the template arguments.


template int Array<char>::GetSize();

template int String<19>::length();

6.3.2.4 Explicit Instantiation of Template Class Static Data Members

To explicitly instantiate a template class static data member, follow the template keyword by a declaration (not definition) for the member, with the member identifier qualified by the template class, followed by the template argument.


template int String<19>::overflows;

6.4 Template Composition

You can use templates in a nested manner. This is particularly useful when defining generic functions over generic data structures, as in the standard C++ library. For example, a template sort function may be declared over a template array class:


template <class Elem> void sort(Array<Elem>);

and defined as:


template <class Elem> void sort(Array<Elem> store)
    {int num_elems = store.GetSize();
      for (int i = 0; i < num_elems-1; i++)
          for (int j = 1; j < num_elems-i; j++)
              if (store[j-1] > store[j])
                  {Elem temp = store[j];
                    store[j] = store[j-1];
                    store[j-1] = temp;}}

The preceding example defines a sort function over the predeclared Array class template objects. The next example shows the actual use of the sort function.


Array<int> int_array(100);   // construct an array of ints
sort(int_array);             // sort it
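
The following self-contained sketch puts the pieces together. It makes several assumptions that are not in the text: the function is named sort_in_place, it takes the array by reference so that the caller's object is clearly the one being modified, and operator[] is defined as plain indexing.

```cpp
// Array class template from 6.2.2, with assumed member definitions.
template <class Elem> class Array {
        Elem* data;
        int size;
    public:
        Array( int sz ) { size = sz; data = new Elem[size]; }
        int GetSize() { return size; }
        Elem& operator[]( int idx ) { return data[idx]; }  // assumed
};

// Hypothetical variant of sort: a bubble sort over any Array<Elem>
// whose element type supports operator>.
template <class Elem> void sort_in_place(Array<Elem>& store)
{
    int num_elems = store.GetSize();
    for (int i = 0; i < num_elems - 1; i++)
        // After pass i, the largest i+1 elements are in final position.
        for (int j = 1; j < num_elems - i; j++)
            if (store[j-1] > store[j]) {
                Elem temp = store[j];
                store[j] = store[j-1];
                store[j-1] = temp;
            }
}
```

The same function template serves Array<int>, Array<double>, and any other element type that can be compared with operator>.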

6.5 Default Template Parameters

You can give default values to template parameters for class templates (but not function templates).


template <class Elem = int> class Array;
template <unsigned Size = 100> class String;

If a template parameter has a default value, all parameters after it must also have default values. A template parameter can have only one default value.
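A short sketch of how defaults are picked up at the point of use follows. The class template Buffer and its member capacity are hypothetical.

```cpp
// Both parameters have defaults; because Elem has a default,
// Size (which comes after it) must have one too.
template <class Elem = int, unsigned Size = 100> class Buffer {
        Elem data[Size];
    public:
        unsigned capacity() const { return Size; }
};

Buffer<> ints;              // same as Buffer<int, 100>
Buffer<char> chars;         // same as Buffer<char, 100>
Buffer<char, 8> small;      // no defaults used
```

Note that the empty angle brackets are still required when every argument is defaulted.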

6.6 Template Specialization

There may be performance advantages to treating some combinations of template arguments as a special case, as in the following examples for twice. Alternatively, a template description might fail to work for a set of its possible arguments, as in the following examples for sort. Template specialization allows you to define alternative implementations for a given combination of actual template arguments. The template specialization overrides the default instantiation.

6.6.1 Template Specialization Declaration

You must declare a specialization before any use of that combination of template arguments. The following examples declare specialized implementations of twice and sort.


template <> unsigned twice<unsigned>( unsigned original );

template <> void sort<char*>(Array<char*> store);

You can omit the template arguments if the compiler can unambiguously determine them. For example:


template <> unsigned twice(unsigned original);

template <> void sort(Array<char*> store);

6.6.2 Template Specialization Definition

You must define all template specializations that you declare. The following examples define the functions declared in the preceding section.


template <> unsigned twice<unsigned>(unsigned original)
    {return original << 1;}

#include <string.h>
template <> void sort<char*>(Array<char*> store)
    {int num_elems = store.GetSize();
      for (int i = 0; i < num_elems-1; i++)
          for (int j = 1; j < num_elems - i; j++)
              if (strcmp(store[j-1], store[j]) > 0)
                  {char *temp = store[j];
                    store[j] = store[j-1];
                    store[j-1] = temp;}}

6.6.3 Template Specialization Use and Instantiation

A specialization is used and instantiated just as any other template, except that the definition of a completely specialized template is also an instantiation.
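
The following self-contained sketch puts declaration, definition, and use together in one file. The primary template for twice is an assumed form of the template used in this chapter, and in_order is a hypothetical template introduced only for this illustration.

```cpp
#include <cassert>
#include <cstring>

// Primary template (assumed form of twice).
template <class Number> Number twice(Number original)
    { return original + original; }

// Specialization, declared and defined before any use of twice<unsigned>.
template <> unsigned twice<unsigned>(unsigned original)
    { return original << 1; }

// Hypothetical primary template: true if a does not come after b.
template <class T> bool in_order(T a, T b)
    { return !(b < a); }

// For C strings, comparing the pointers with < is not meaningful,
// so the specialization compares the characters instead.
template <> bool in_order<const char*>(const char* a, const char* b)
    { assert(a != 0 && b != 0);
      return std::strcmp(a, b) <= 0; }
```

A call such as twice(21u) selects the specialization, while twice(3.5) uses the default instantiation of the primary template.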

6.6.4 Partial Specialization

In the previous examples, the templates are fully specialized. That is, they define an implementation for specific template arguments. A template can also be partially specialized, meaning that only some of the template parameters are specified, or that one or more parameters are limited to certain categories of type. The resulting partial specialization is itself still a template. For example, the following code sample shows a primary template and a full specialization of that template.


template<class T, class U> class A {...}; //primary template
template<> class A<int, double> {...};    //specialization

The following code shows examples of partial specialization of the primary template.


template<class U> class A<int, U> {...};          // Example 1
template<class T, class U> class A<T*, U> {...};  // Example 2
template<class T> class A<T**, char> {...};   // Example 3
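
As a rough illustration of how the compiler chooses among a primary template, a full specialization, and partial specializations, the following sketch gives each version a hypothetical which() member that reports the definition selected. The numbering is an assumption of this sketch only.

```cpp
#include <cassert>

template <class T, class U> struct A             // primary template
    { static int which() { return 0; } };
template <> struct A<int, double>                // full specialization
    { static int which() { return 1; } };
template <class U> struct A<int, U>              // first argument is int
    { static int which() { return 2; } };
template <class T, class U> struct A<T*, U>      // first argument is a pointer
    { static int which() { return 3; } };
template <class T> struct A<T**, char>           // pointer to pointer, and char
    { static int which() { return 4; } };

// Usage: each assertion names the definition that is expected to be chosen.
inline void check_selection() {
    assert((A<long, long>::which() == 0));
    assert((A<int, double>::which() == 1));
    assert((A<int, char>::which() == 2));
    assert((A<char*, int>::which() == 3));
    assert((A<int**, char>::which() == 4));   // more specialized than A<T*, U>
}
```

When more than one partial specialization matches, as for A<int**, char>, the most specialized one is used.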

6.7 Template Problem Areas

This section describes problems you might encounter when using templates.

6.7.1 Nonlocal Name Resolution and Instantiation

Sometimes a template definition uses names that are not defined by the template arguments or within the template itself. If so, the compiler resolves the name from the scope enclosing the template, which could be the context at the point of definition, or at the point of instantiation. A name can have different meanings in different places, yielding different resolutions.

Name resolution is complex. Consequently, you should not rely on nonlocal names, except those provided in a pervasive global environment. That is, use only nonlocal names that are declared and defined the same way everywhere. In the following example, the template function converter uses the nonlocal names intermediary and temporary. These names have different definitions in use1.cc and use2.cc, and will probably yield different results under different compilers. For templates to work reliably, all nonlocal names (intermediary and temporary in this case) must have the same definition everywhere.


use_common.h
// Common template definition
template <class Source, class Target>
Target converter(Source source)
       {temporary = (intermediary)source;
       return (Target)temporary;}
use1.cc
typedef int intermediary;
int temporary;

#include "use_common.h"
use2.cc
typedef double intermediary;
unsigned int temporary;

#include "use_common.h"

A common use of nonlocal names is the use of the cin and cout streams within a template. Few programmers really want to pass the stream as a template parameter, so they refer to a global variable. However, cin and cout must have the same definition everywhere.
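
One way to avoid the problem, sketched here under the assumption that you are free to change the template, is to pass the intermediate type as a template parameter and keep the temporary local to the function. This variant of converter and the check_converter function are illustrative only.

```cpp
#include <cassert>

// Variant of converter that uses no nonlocal names: the intermediate
// type is a template parameter and the temporary is a local variable.
template <class Intermediary, class Target, class Source>
Target converter(Source source)
    { Intermediary temporary = static_cast<Intermediary>(source);
      return static_cast<Target>(temporary); }

// Usage: the caller states the intermediate type explicitly, so the
// result no longer depends on what the including file happens to define.
inline void check_converter() {
    assert((converter<int, double>(3.9) == 3.0));     // fraction lost in the int
    assert((converter<double, double>(3.9) == 3.9));  // value preserved
}
```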

6.7.2 Local Types as Template Arguments

The template instantiation system relies on type-name equivalence to determine which templates need to be instantiated or reinstantiated. Thus local types can cause serious problems when used as template arguments. Beware of creating similar problems in your code. For example:


Example 6–1 Example of Local Type as Template Argument Problem


array.h
template <class Type> class Array {
        Type* data;
        int   size;
    public:
        Array(int sz);
        int GetSize();
};

array.cc
template <class Type> Array<Type>::Array(int sz)
    {size = sz; data = new Type[size];}
template <class Type> int Array<Type>::GetSize()
    {return size;}

file1.cc
#include "array.h"
struct Foo {int data;};
Array<Foo> File1Data(10);

file2.cc
#include "array.h"
struct Foo {double data;};
Array<Foo> File2Data(20);

The Foo type as registered in file1.cc is not the same as the Foo type registered in file2.cc. Using local types in this way could lead to errors and unexpected results.
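
One possible remedy, shown here as a sketch rather than a prescribed fix, is to give each type a distinct name, for example by placing it in a named namespace. The namespaces file1 and file2 are hypothetical stand-ins for the two source files, and the Array template is condensed into a single listing so that the sketch is self-contained.

```cpp
#include <cassert>

template <class Type> class Array {
    Type* data;
    int   size;
    Array(const Array&);             // copying disabled to keep the sketch short
    Array& operator=(const Array&);
public:
    explicit Array(int sz) : data(0), size(sz)
        { assert(sz >= 0); data = new Type[sz]; }
    ~Array() { delete [] data; }
    int GetSize() const { return size; }
};

// Hypothetical namespaces standing in for file1.cc and file2.cc.
namespace file1 { struct Foo { int data; }; }
namespace file2 { struct Foo { double data; }; }

// Array<file1::Foo> and Array<file2::Foo> are distinct instantiations
// because the two argument types now have different names.
inline int total_size() {
    Array<file1::Foo> File1Data(10);
    Array<file2::Foo> File2Data(20);
    return File1Data.GetSize() + File2Data.GetSize();
}
```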

6.7.3 Friend Declarations of Template Functions

Templates must be declared before they are used. A friend declaration constitutes a use of the template, not a declaration of the template. A true template declaration must precede the friend declaration. For example, when the compilation system attempts to link the produced object file for the following example, it generates an undefined symbol error for the operator<< function, which is not instantiated.


Example 6–2 Example of Friend Declaration Problem


array.h
// generates undefined symbol error for the operator<< function
#ifndef ARRAY_H
#define ARRAY_H
#include <iosfwd>

template<class T> class array {
    int size;
public:
    array();
    friend std::ostream&
        operator<<(std::ostream&, const array<T>&);
};
#endif

array.cc
#include <stdlib.h>
#include <iostream>

template<class T> array<T>::array() {size = 1024;}

template<class T>
std::ostream&
operator<<(std::ostream& out, const array<T>& rhs)
    {return out << '[' << rhs.size << ']';}

main.cc
#include <iostream>
#include "array.h"

int main()
{
    std::cout
      << "creating an array of int... " << std::flush;
    array<int> foo;
    std::cout << "done\n";
    std::cout << foo << std::endl;
    return 0;
}

Note that there is no error message during compilation because the compiler reads the following as the declaration of a normal function that is a friend of the array class.


friend std::ostream& operator<<(std::ostream&, const array<T>&);

Because operator<< is really a template function, you need to supply a template declaration for it prior to the declaration of the template class array. However, because operator<< has a parameter of type array<T>, you must precede the function declaration with a declaration of array<T>. The file array.h must look like this:


#ifndef ARRAY_H
#define ARRAY_H
#include <iosfwd>

// the next two lines declare operator<< as a template function
template<class T> class array;
template<class T>
    std::ostream& operator<<(std::ostream&, const array<T>&);

template<class T> class array {
    int size;
public:
    array();
    friend std::ostream&
      operator<< <T> (std::ostream&, const array<T>&);
};
#endif
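
For reference, the corrected declarations can be combined with the definitions into a single file, as in the following sketch. The check_array_output function is a hypothetical addition that writes an array<int> to a string stream to confirm that the friend template is found and instantiated.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Declare the class template and the operator<< template first.
template<class T> class array;
template<class T>
    std::ostream& operator<<(std::ostream&, const array<T>&);

template<class T> class array {
    int size;
public:
    array() : size(1024) { }
    // The <T> marks the friend as an instance of the template above.
    friend std::ostream&
      operator<< <T> (std::ostream&, const array<T>&);
};

template<class T>
std::ostream&
operator<<(std::ostream& out, const array<T>& rhs)
    {return out << '[' << rhs.size << ']';}

// Usage: operator<< is instantiated for array<int> and can read size.
inline void check_array_output() {
    array<int> foo;
    std::ostringstream text;
    text << foo;
    assert(text.str() == "[1024]");
}
```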

6.7.4 Using Qualified Names Within Template Definitions

The C++ standard requires types with qualified names that depend upon template arguments to be explicitly noted as type names with the typename keyword. This is true even if the compiler can “know” that it should be a type. The comments in the following example show the types with qualified names that require the typename keyword.


struct simple {
  typedef int a_type;
  static int a_datum;
};
int simple::a_datum = 0; // not a type
template <class T> struct parametric {
  typedef T a_type;
  static T a_datum;
};
template <class T> T parametric<T>::a_datum = 0;   // not a type
template <class T> struct example {
  static typename T::a_type variable1;             // dependent
  static typename parametric<T>::a_type variable2; // dependent
  static simple::a_type variable3;                 // not dependent
};
template <class T> typename T::a_type             // dependent
  example<T>::variable1 = 0;                      // not a type
template <class T> typename parametric<T>::a_type // dependent
  example<T>::variable2 = 0;                      // not a type
template <class T> simple::a_type   // not dependent
example<T>::variable3 = 0;          // not a type
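
The same rule applies inside function templates. The following sketch uses a hypothetical function template, first_or, whose return type, parameter type, and local variable type all depend on the template parameter and therefore need the typename keyword.

```cpp
#include <cassert>
#include <vector>

// Container::value_type and Container::const_iterator are qualified
// names that depend on the template parameter, so each one is marked
// with typename.
template <class Container>
typename Container::value_type
first_or(const Container& items, typename Container::value_type fallback)
{
    typename Container::const_iterator first = items.begin();
    return first == items.end() ? fallback : *first;
}

// Usage with a standard container.
inline void check_first_or() {
    std::vector<int> values;
    assert(first_or(values, -1) == -1);   // empty: the fallback is returned
    values.push_back(7);
    assert(first_or(values, -1) == 7);
}
```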

6.7.5 Nesting Template Names

Because the “>>” character sequence is interpreted as the right-shift operator, you must be careful when you use one template name inside another. Make sure you separate adjacent “>” characters with at least one blank space.

For example, the following ill-formed statement:


Array<String<10>> short_string_array(100); // >> = right-shift

is interpreted as:


Array<String<10 >> short_string_array(100);

The correct syntax is:


Array<String<10> > short_string_array(100);

6.7.6 Referencing Static Variables and Static Functions

Within a template definition, the compiler does not support referencing an object or function that is declared static at global scope or in a namespace. If multiple instances are generated, the One-Definition Rule (C++ standard section 3.2) is violated, because each instance refers to a different object. The usual failure indication is missing symbols at link time.

If you want a single object to be shared by all template instantiations, then make the object a nonstatic member of a named namespace. If you want a different object for each instantiation of a template class, then make the object a static member of the template class. If you want a different object for each instantiation of a template function, then make the object local to the function.
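
The three alternatives can be sketched as follows. The names counters, Tracker, and next_id are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// One object shared by every instantiation: a nonstatic member of a
// named namespace.
namespace counters { int shared_count = 0; }

// A different object for each instantiation of a template class:
// a static member of the class.
template <class T> struct Tracker {
    static int per_type_count;
    Tracker() { ++per_type_count; ++counters::shared_count; }
};
template <class T> int Tracker<T>::per_type_count = 0;

// A different object for each instantiation of a template function:
// an object local to the function.
template <class T> int next_id() {
    static int id = 0;
    return ++id;
}

// Usage: the expected values assume this function is called once,
// before anything else touches the counters.
inline void check_counters() {
    Tracker<int> a, b;
    Tracker<char> c;
    assert(Tracker<int>::per_type_count == 2);
    assert(Tracker<char>::per_type_count == 1);
    assert(counters::shared_count == 3);
    assert(next_id<int>() == 1);
    assert(next_id<int>() == 2);
    assert(next_id<double>() == 1);
}
```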

6.7.7 Building Multiple Programs Using Templates in the Same Directory

If you are building more than one program or library by specifying -instances=extern, it’s advisable to build them in separate directories. If you want to build in the same directory then you should clean the repository between the different builds. This avoids any unpredictable errors. For more information see 7.4.4 Sharing Template Repositories.

Consider the following example with a makefile and the files a.cc, b.cc, x.h, and x.cc. Note that this example is meaningful only if you specify -instances=extern:


........
Makefile
........
CCC = CC

all: a b

a:
    $(CCC) -I. -instances=extern -c a.cc
    $(CCC) -instances=extern -o a a.o

b:
    $(CCC) -I. -instances=extern -c b.cc
    $(CCC) -instances=extern -o b b.o

clean:
    /bin/rm -rf SunWS_cache *.o a b

...
x.h
...
template <class T> class X {
public:
  int open();
  int create();
  static int variable;
};

...
x.cc
...
template <class T> int X<T>::create() {
  return variable;
}

template <class T> int X<T>::open() {
  return variable;
}

template <class T> int X<T>::variable = 1;

...
a.cc
...
#include "x.h"

int main()
{
  X<int> temp1;

  temp1.open();
  temp1.create();
}

...
b.cc
...
#include "x.h"

int main()
{
  X<int> temp1;

  temp1.create();
}

If you build both a and b, add a make clean between the two builds. The following commands result in an error:


example% make a
example% make b

The following commands will not produce any error:


example% make a
example% make clean
example% make b

Chapter 7 Compiling Templates

Template compilation requires the C++ compiler to do more than traditional UNIX compilers have done. The C++ compiler must generate object code for template instances on an as-needed basis. It might share template instances among separate compilations using a template repository. It might accept some template compilation options. It must locate template definitions in separate source files and maintain consistency between template instances and mainline code.

7.1 Verbose Compilation

When given the flag -verbose=template, the C++ compiler notifies you of significant events during template compilation. Conversely, the compiler does not notify you when given the default, -verbose=no%template. The +w option might give other indications of potential problems when template instantiation occurs.

7.2 Repository Administration

The CCadmin(1) command administers the template repository (used only with the option -instances=extern). For example, changes in your program can render some instantiations superfluous, thus wasting storage space. The CCadmin -clean command (formerly ptclean) clears out all instantiations and associated data. Instantiations are recreated only when needed.

7.2.1 Generated Instances

The compiler treats inline template functions as inline functions for the purposes of template instance generation. The compiler manages them as it does other inline functions, and the descriptions in this chapter do not apply to template inline functions.

7.2.2 Whole-Class Instantiation

The compiler usually instantiates members of template classes independently of other members, so that the compiler instantiates only members that are used within the program. Methods written solely for use through a debugger will therefore not normally be instantiated.

There are two means to ensure that debugging members are available to the debugger.

The ISO C++ Standard permits developers to write template classes for which all members may not be legal with a given template argument. As long as the illegal members are not instantiated, the program is still well formed. The ISO C++ Standard Library uses this technique. However, the -template=wholeclass option instantiates all members, and hence cannot be used with such template classes when instantiated with the problematic template arguments.
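
The following sketch shows such a template class. Wrapper is a hypothetical name. Its length() member is legal only when the template argument has a size() member, so Wrapper<int> compiles only because length() is never instantiated for it. With whole-class instantiation, Wrapper<int> would be expected to fail.

```cpp
#include <cassert>
#include <string>

template <class T> struct Wrapper {
    T value;
    explicit Wrapper(const T& v) : value(v) { }
    T get() const { return value; }
    // Legal only when T has a size() member, for example std::string.
    int length() const { return static_cast<int>(value.size()); }
};

// Usage: only the members that are actually called are instantiated.
inline void check_wrapper() {
    Wrapper<int> number(5);               // length() is never instantiated
    assert(number.get() == 5);
    Wrapper<std::string> word(std::string("abc"));
    assert(word.length() == 3);           // std::string has size()
}
```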

7.2.3 Compile-Time Instantiation

Instantiation is the process by which a C++ compiler creates a usable function or object from a template. The C++ compiler uses compile-time instantiation, which forces instantiations to occur when the reference to the template is being compiled.

The advantages of compile-time instantiation are:

Templates can be instantiated multiple times if source files reside in different directories or if you use libraries with template symbols.

7.2.4 Template Instance Placement and Linkage

By default, instances go into special address sections, and the linker recognizes and discards duplicates. You can instruct the compiler to use one of five instance placement and linkage methods: external, static, global, explicit, and semi-explicit.

This section discusses the five instance placement and linkage methods. Additional information about generating instances can be found in 6.3 Template Instantiation.

7.3 External Instances

With the external instances method, all instances are placed within the template repository. The compiler ensures that exactly one consistent template instance exists; instances are neither undefined nor multiply defined. Templates are reinstantiated only when necessary. For non-debug code, the total size of all object files (including any within the template cache) may be smaller with -instances=extern than with -instances=global.

Template instances receive global linkage in the repository. Instances are referenced from the current compilation unit with external linkage.


Note –

If you are compiling and linking in separate steps and you specify -instances=extern for the compilation step, you must also specify it for the link step.


The disadvantage of this method is that the cache must be cleared whenever you change programs or make significant program changes. The cache is a bottleneck for parallel compilation, as when using dmake, because access to the cache must be restricted to one compilation at a time. Also, you can build only one program within a directory.

It can take longer to determine whether a valid template instance is already in the cache than just to create the instance in the main object file and discard it later if needed.

Specify external linkage with the -instances=extern option.

Because instances are stored within the template repository, you must use the CC command to link C++ objects that use external instances into programs.

If you wish to create a library that contains all template instances that it uses, use the CC command with the -xar option. Do not use the ar command. For example:


example% CC -xar -instances=extern -o libmain.a a.o b.o c.o

See Table 15–3 for more information.

7.3.1 Possible Cache Conflicts

Do not run different compiler versions in the same directory due to possible cache conflicts when you specify -instances=extern. Consider the following when you use the -instances=extern template model:

7.3.2 Static Instances


Note –

The -instances=static option is deprecated. There is no longer any reason to use -instances=static, because -instances=global now gives you all the advantages of static without the disadvantages. This option was provided in earlier compilers to overcome problems that no longer exist.


With the static instances method, all instances are placed within the current compilation unit. As a consequence, templates are reinstantiated during each recompilation; instances are not saved to the template repository.

The disadvantage of this method is that it does not follow language semantics and makes substantially larger objects and executables.

Instances receive static linkage. These instances will not be visible or usable outside the current compilation unit. As a result, templates might have identical instantiations in several object files. Because multiple instances produce unnecessarily large programs, static instance linkage is suitable only for small programs, where templates are unlikely to be multiply instantiated.

Compilation is potentially faster with static instances, so this method might also be suitable during Fix-and-Continue debugging. (See Debugging a Program With dbx.)


Note –

If your program depends on sharing template instances (such as static data members of template classes or template functions) across compilation units, do not use the static instances method. Your program will not work properly.


Specify static instance linkage with the -instances=static compiler option.

7.3.3 Global Instances

Unlike with early compiler releases, you do not need to guard against multiple copies of a global instance.

The advantage of this method is that incorrect source code commonly accepted by other compilers is now also accepted in this mode. In particular, references to static variables from within a template instance are not legal, but are commonly accepted.

The disadvantage of this method is that individual object files may be larger, due to copies of template instances in multiple files. If you compile some object files for debug using the -g option, and some without, it is hard to predict whether you will get a debug or non-debug version of a template instance linked into the program.

Template instances receive global linkage. These instances are visible and usable outside the current compilation unit.

Specify global instances with the -instances=global option (this is the default).

7.3.4 Explicit Instances

In the explicit instances method, instances are generated only for templates that are explicitly instantiated. Implicit instantiations are not satisfied. Instances are placed within the current compilation unit.

The advantage of this method is that you have the least amount of template compilation and smallest object sizes.

The disadvantage is that you must perform all instantiation manually.

Template instances receive global linkage. These instances are visible and usable outside the current compilation unit. The linker recognizes and discards duplicates.

Specify explicit instances with the -instances=explicit option.
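
Manual instantiation uses the standard explicit instantiation syntax. The following sketch uses hypothetical templates, maximum and Pair, and explicitly instantiates each for int. It is an illustration of the syntax only, not a statement of exactly which instances a given program needs under -instances=explicit.

```cpp
#include <cassert>

template <class T> T maximum(T a, T b)
    { return a < b ? b : a; }

template <class T> class Pair {
    T first_, second_;
public:
    Pair(T first, T second) : first_(first), second_(second) { }
    T larger() const;
};

template <class T> T Pair<T>::larger() const
    { return maximum(first_, second_); }

// Explicit instantiation directives: request the instances by hand.
template int maximum<int>(int, int);
template class Pair<int>;

// Usage: the code below relies on the instances requested above.
inline void check_explicit() {
    Pair<int> p(3, 9);
    assert(p.larger() == 9);
    assert(maximum(2, 1) == 2);
}
```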

7.3.5 Semi-Explicit Instances

When you use the semi-explicit instances method, instances are generated only for templates that are explicitly instantiated or implicitly instantiated within the body of a template. Instances required by explicitly-created instances are generated automatically. Implicit instantiations in the mainline code are not satisfied. Instances are placed within the current compilation unit. As a consequence, templates are reinstantiated during each recompilation; instances receive global linkage and they are not saved to the template repository.

Specify semi-explicit instances with the -instances=semiexplicit option.

7.4 The Template Repository

The template repository stores template instances between separate compilations so that template instances are compiled only when it is necessary. The template repository contains all nonsource files needed for template instantiation when using the external instances method. The repository is not used for other kinds of instances.

7.4.1 Repository Structure

The template repository is contained, by default, within a cache directory called SunWS_cache.

The cache directory is contained within the directory in which the object files are placed. You can change the name of the cache directory by setting the SUNWS_CACHE_NAME environment variable. Note that the value of the SUNWS_CACHE_NAME variable must be a directory name and not a path name. This is because the compiler automatically places the template cache directory under the object file directory so the compiler already has a path.

7.4.2 Writing to the Template Repository

When the compiler must store template instances, it stores them within the template repository corresponding to the output file. For example, the following command line writes the object file to ./sub/a.o and writes template instances into the repository contained within ./sub/SunWS_cache. If the cache directory does not exist, and the compiler needs to instantiate a template, the compiler will create the directory.


example% CC -o sub/a.o a.cc

7.4.3 Reading From Multiple Template Repositories

The compiler reads from the template repositories corresponding to the object files that it reads. That is, the following command line reads from ./sub1/SunWS_cache and ./sub2/SunWS_cache, and, if necessary, writes to ./SunWS_cache.


example% CC sub1/a.o sub2/b.o

7.4.4 Sharing Template Repositories

Templates that are within a repository must not violate the one-definition rule of the ISO C++ standard. That is, a template must have the same source in all uses of the template. Violating this rule produces undefined behavior.

The simplest, though most conservative, way to ensure that the rule is not violated is to build only one program or library within any one directory. Two unrelated programs might use the same type name or external name to mean different things. If the programs share a template repository, template definitions could conflict, thus yielding unpredictable results.

7.4.5 Template Instance Automatic Consistency With -instances=extern

The template repository manager ensures that the states of the instances in the repository are consistent and up-to-date with your source files when you specify -instances=extern.

For example, if your source files are compiled with the -g option (debugging on), the files you need from the database are also compiled with -g.

In addition, the template repository tracks changes in your compilation. For example, if you have the -DDEBUG flag set to define the name DEBUG, the database tracks this. If you omit this flag on a subsequent compile, the compiler reinstantiates those templates on which this dependency is set.


Note –

If you remove the source code of a template or stop using a template, instances of the template remain in the cache. If you change the signature of a function template, instances using the old signature remain in the cache. If you run into strange behavior at compile or link time due to these issues, clear the template cache and rebuild the program.


7.5 Template Definition Searching

When you use the definitions-separate template organization, template definitions are not available in the current compilation unit, and the compiler must search for the definition. This section describes how the compiler locates the definition.

Definition searching is somewhat complex and prone to error. Therefore, you should use the definitions-included template file organization if possible. Doing so helps you avoid definition searching altogether. See 5.2.1 Template Definitions Included.


Note –

If you use the -template=no%extdef option, the compiler will not search for separate source files.


7.5.1 Source File Location Conventions

Without the specific directions provided with an options file, the compiler uses a Cfront-style method to locate template definition files. This method requires that the template definition file contain the same base name as the template declaration file. This method also requires that the template definition file be on the current include path. For example, if the template function foo() is located in foo.h, the matching template definition file should be named foo.cc or some other recognizable source-file extension (.C, .c, .cc, .cpp, .cxx, or .c++). The template definition file must be located in one of the normal include directories or in the same directory as its matching header file.

7.5.2 Definitions Search Path

As an alternative to the normal search path set with -I, you can specify a search directory for template definition files with the option -ptidirectory. Multiple -pti flags define multiple search directories, that is, a search path. If you use -ptidirectory, the compiler looks for template definition files on this path and ignores the -I flag. Since the -ptidirectory flag complicates the search rules for source files, use the -I option instead of the -ptidirectory option.

7.5.3 Troubleshooting a Problematic Search

Sometimes the compiler generates confusing warnings or error messages because it is looking for a file that you don’t intend to compile. Usually, the problem is that a file, for example foo.h, contains template declarations and another file, such as foo.cc, gets implicitly included.

If a header file, foo.h, has template declarations, the compiler searches for a file called foo with a C++ file extension (.C, .c, .cc, .cpp, .cxx, or .c++) by default. If the compiler finds such a file, it includes the file automatically. See 7.5 Template Definition Searching for more information on such searches.

If you have a file foo.cc that you don’t intend to be treated this way, you have two options:

Chapter 8 Exception Handling

This chapter discusses the C++ compiler’s implementation of exception handling. Additional information can be found in 11.2 Using Exceptions in a Multithreaded Program. For more information on exception handling, see The C++ Programming Language, Third Edition, by Bjarne Stroustrup (Addison-Wesley, 1997).

8.1 Synchronous and Asynchronous Exceptions

Exception handling is designed to support only synchronous exceptions, such as array range checks. The term synchronous exception means that exceptions can originate only from throw expressions.

The C++ standard supports synchronous exception handling with a termination model. Termination means that once an exception is thrown, control never returns to the throw point.

Exception handling is not designed to directly handle asynchronous exceptions such as keyboard interrupts. However, you can make exception handling work in the presence of asynchronous events if you are careful. For instance, to make exception handling work with signals, you can write a signal handler that sets a global variable, and create another routine that polls the value of that variable at regular intervals and throws an exception when the value changes. You cannot throw an exception from a signal handler.
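
The polling technique described above might be sketched as follows. The names interrupt_seen, on_interrupt, Interrupted, and poll_for_interrupt are hypothetical, and std::raise stands in for a real keyboard interrupt so that the sketch is self-contained.

```cpp
#include <cassert>
#include <csignal>

// Flag written by the signal handler and read by the polling routine.
static volatile std::sig_atomic_t interrupt_seen = 0;

// The handler only records the event. It does not throw.
extern "C" void on_interrupt(int) { interrupt_seen = 1; }

// Hypothetical exception type for this sketch.
struct Interrupted { };

// Called at regular intervals from ordinary code.
// Throws if the handler has set the flag since the last call.
void poll_for_interrupt() {
    if (interrupt_seen) {
        interrupt_seen = 0;
        throw Interrupted();
    }
}

// Usage: the signal becomes an exception at the next polling point.
inline void check_polling() {
    std::signal(SIGINT, on_interrupt);
    poll_for_interrupt();                 // no signal yet: returns normally
    std::raise(SIGINT);                   // stands in for a keyboard interrupt
    bool caught = false;
    try { poll_for_interrupt(); }
    catch (const Interrupted&) { caught = true; }
    assert(caught);
}
```

The exception is thrown from the polling routine, in ordinary synchronous code, and never from the handler itself.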

8.2 Specifying Runtime Errors

There are five runtime error messages associated with exceptions:

When errors are detected at runtime, the error message displays the type of the current exception and one of the five error messages. By default, the predefined function terminate() is called, which then calls abort().

The compiler uses the information provided in the exception specification to optimize code production. For example, table entries for functions that do not throw exceptions are suppressed, and runtime checking for exception specifications of functions is eliminated wherever possible.

8.3 Disabling Exceptions

If you know that exceptions are not used in a program, you can use the compiler option -features=no%except to suppress generation of code that supports exception handling. The use of the option results in slightly smaller code size and faster code execution. However, when files compiled with exceptions disabled are linked to files using exceptions, some local objects in the files compiled with exceptions disabled are not destroyed when exceptions occur. By default, the compiler generates code to support exception handling. Unless the time and space overhead is important, it is usually better to leave exceptions enabled.


Note –

Because the C++ standard library, dynamic_cast, and the default operator new require exceptions, you should not turn off exceptions when you compile in standard mode (the default mode).


8.4 Using Runtime Functions and Predefined Exceptions

The standard header <exception> provides the classes and exception-related functions specified in the C++ standard. You can access this header only when compiling in standard mode (compiler default mode, or with option -compat=5). The following excerpt shows the <exception> header file declarations.


// standard header <exception>
namespace std {
    class exception {
    public:
           exception() throw();
           exception(const exception&) throw();
           exception& operator=(const exception&) throw();
           virtual ~exception() throw();
           virtual const char* what() const throw();
    };
    class bad_exception: public exception {...};
    // Unexpected exception handling
       typedef void (*unexpected_handler)();
       unexpected_handler
         set_unexpected(unexpected_handler) throw();
       void unexpected();
    // Termination handling
       typedef void (*terminate_handler)();
       terminate_handler set_terminate(terminate_handler) throw();
       void terminate();
       bool uncaught_exception() throw();
}

The standard class exception is the base class for all exceptions thrown by selected language constructs or by the C++ standard library. An object of type exception can be constructed, copied, and destroyed without generating an exception. The virtual member function what() returns a character string that describes the exception.
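
As a minimal sketch of how what() is typically used, the following code throws a hypothetical RangeError, derived from the standard class std::runtime_error, and catches it through a reference to the base class exception. The names RangeError, checked_at, and describe_failure are assumptions of this sketch.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical exception class. std::runtime_error derives from
// std::exception and stores the message that what() returns.
class RangeError : public std::runtime_error {
public:
    explicit RangeError(const std::string& message)
        : std::runtime_error(message) { }
};

// Array access with a range check that reports failure by throwing.
int checked_at(const int* data, int size, int index) {
    assert(data != 0);
    if (index < 0 || index >= size)
        throw RangeError("index out of range");
    return data[index];
}

// Catching by reference to the base class still reaches the
// message of the derived exception through the virtual what().
std::string describe_failure() {
    int values[3] = {1, 2, 3};
    try { checked_at(values, 3, 7); }
    catch (const std::exception& e) { return e.what(); }
    return "no exception";
}
```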

For compatibility with exceptions as used in C++ release 4.2, the header <exception.h> is also provided for use in standard mode. This header allows for a transition to standard C++ code and contains declarations that are not part of standard C++. Update your code to follow the C++ standard (using <exception> instead of <exception.h>) as development schedules permit.


// header <exception.h>, used for transition
#include <exception>
#include <new>
using std::exception;
using std::bad_exception;
using std::set_unexpected;
using std::unexpected;
using std::set_terminate;
using std::terminate;
typedef std::exception xmsg;
typedef std::bad_exception xunexpected;
typedef std::bad_alloc xalloc;

In compatibility mode (-compat[=4]), header <exception> is not available, and header <exception.h> refers to the same header provided with C++ release 4.2. It is not reproduced here.

8.5 Mixing Exceptions With Signals and Setjmp/Longjmp

You can use the setjmp/longjmp functions in a program where exceptions can occur, as long as they do not interact.

All the rules for using exceptions and setjmp/longjmp separately apply. In addition, a longjmp from point A to point B is valid only if an exception thrown at A and caught at B would have the same effect. In particular, you must not longjmp into or out of a try-block or catch-block (directly or indirectly), or longjmp past the initialization or non-trivial destruction of auto variables or temporary variables.

You cannot throw an exception from a signal handler.

8.6 Building Shared Libraries That Have Exceptions

Never use -Bsymbolic with programs containing C++ code. Use linker map files or linker scoping options instead (see 4.1 Linker Scoping). With -Bsymbolic, references in different modules can bind to different copies of what is supposed to be one global object.

The exception mechanism relies on comparing addresses that are supposed to be unique. If you have two copies of something, their addresses do not compare equal, and the exception mechanism can fail.

Chapter 9 Cast Operations

This chapter discusses the newer cast operators in the C++ standard: const_cast, reinterpret_cast, static_cast, and dynamic_cast. A cast converts an object or value from one type to another.

These cast operations provide finer control than previous cast operations. The dynamic_cast<> operator provides a way to check the actual type of a pointer to a polymorphic class. You can search with a text editor for all new-style casts (search for _cast), whereas finding old-style casts required syntactic analysis.

Otherwise, the new casts all perform a subset of the casts allowed by the classic cast notation. For example, const_cast<int*>(v) could be written (int*)v. The new casts simply categorize the variety of operations available to express your intent more clearly and allow the compiler to provide better checking.

The cast operators are always enabled. They cannot be disabled.

9.1 const_cast

The expression const_cast<T>(v) can be used to change the const or volatile qualifiers of pointers or references. (Among new-style casts, only const_cast<> can remove const qualifiers.) T must be a pointer, reference, or pointer-to-member type.


class A
{
public:
  virtual void f();
  int i;
};
extern const volatile int* cvip;
extern int* ip;
void use_of_const_cast()
{
  const A a1;
  const_cast<A&>(a1).f();                // remove const
  ip = const_cast<int*>(cvip);           // remove const and volatile
}

9.2 reinterpret_cast

The expression reinterpret_cast<T>(v) changes the interpretation of the value of the expression v. It can be used to convert between pointer and integer types, between unrelated pointer types, between pointer-to-member types, and between pointer-to-function types.

Usage of the reinterpret_cast operator can have undefined or implementation-dependent results. The following points describe the only ensured behavior:


class A {int a; public: A();};
class B: public A {int b, c;};
void use_of_reinterpret_cast()
{
     A a1;
     long l = reinterpret_cast<long>(&a1);
     A* ap = reinterpret_cast<A*>(l);      // safe
     B* bp = reinterpret_cast<B*>(&a1);    // unsafe
     const A a2;
     ap = reinterpret_cast<A*>(&a2);  // error, const removed
}

9.3 static_cast

The expression static_cast<T>(v) converts the value of the expression v to type T. It can be used for any type conversion that is allowed implicitly. In addition, any value can be cast to void, and any implicit conversion can be reversed if that cast would be legal as an old-style cast.


class B            {...};
class C: public B {...};
enum E {first=1, second=2, third=3};
void use_of_static_cast(C* c1)
{
  B* bp = c1;                  // implicit conversion
  C* c2 = static_cast<C*>(bp); // reverse implicit conversion
  int i = second;              // implicit conversion
  E e = static_cast<E>(i);    // reverse implicit conversion
}

The static_cast operator cannot be used to cast away const. You can use static_cast to cast “down” a hierarchy (from a base to a derived pointer or reference), but the conversion is not checked; the result might not be usable. A static_cast cannot be used to cast down from a virtual base class.
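
As an illustration of the last restriction, the following sketch uses hypothetical classes V and D, where V is a virtual base of D. A static_cast from V* to D* is rejected at compile time, so the checked dynamic_cast (described in 9.4) is used instead.

```cpp
#include <cassert>

struct V { virtual ~V() {} };                    // polymorphic base
struct D : virtual V { int n; D() : n(0) {} };   // V is a virtual base

D* checked_down(V* v)
{
    // static_cast<D*>(v) would not compile here, because the cast goes
    // down from a virtual base class. dynamic_cast checks at run time
    // and returns a null pointer if v does not point into a D object.
    return dynamic_cast<D*>(v);
}
```

Calling checked_down with the address of a D object returns that object's address; calling it with the address of a plain V object returns a null pointer.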

9.4 Dynamic Casts

A pointer (or reference) to a class can actually point (refer) to any class derived from that class. Occasionally, it may be desirable to obtain a pointer to the fully derived class, or to some other subobject of the complete object. The dynamic cast provides this facility.


Note –

When compiling in compatibility mode (-compat[=4]), you must compile with -features=rtti if your program uses dynamic casts.


The dynamic type cast converts a pointer (or reference) to one class T1 into a pointer (reference) to another class T2. T1 and T2 must be part of the same hierarchy, the classes must be accessible (via public derivation), and the conversion must not be ambiguous. In addition, unless the conversion is from a derived class to one of its base classes, the smallest part of the hierarchy enclosing both T1 and T2 must be polymorphic (have at least one virtual function).

In the expression dynamic_cast<T>(v), v is the expression to be cast, and T is the type to which it should be cast. T must be a pointer or reference to a complete class type (one for which a definition is visible), or a pointer to cv void, where cv is an empty string, const, volatile, or const volatile.

9.4.1 Casting Up the Hierarchy

When casting up the hierarchy, if T points (or refers) to a base class of the type pointed (referred) to by v, the conversion is equivalent to static_cast<T>(v).

9.4.2 Casting to void*

If T is void*, the result is a pointer to the complete object. That is, v might point to one of the base classes of some complete object. In that case, the result of dynamic_cast<void*>(v) is the same as if you converted v down the hierarchy to the type of the complete object (whatever that is) and then to void*.

When casting to void*, the hierarchy must be polymorphic (have virtual functions).
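
The following minimal sketch shows the effect, assuming three hypothetical classes Base1, Base2, and Derived. A pointer to the second base-class subobject is converted back to the address of the complete object.

```cpp
#include <cassert>

struct Base1 { virtual ~Base1() {} int x; };
struct Base2 { virtual ~Base2() {} int y; };
struct Derived : Base1, Base2 { int z; };

// Returns true when dynamic_cast<void*> applied to a base-class pointer
// yields the address of the complete Derived object.
bool void_cast_finds_complete_object()
{
    Derived d;
    Base2* b2 = &d;                            // points at the Base2 subobject
    void* complete = dynamic_cast<void*>(b2);  // address of all of d
    return complete == static_cast<void*>(&d);
}
```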

9.4.3 Casting Down or Across the Hierarchy

When casting down or across the hierarchy, the hierarchy must be polymorphic (have virtual functions). The result is checked at runtime.

The conversion from v to T is not always possible when casting down or across a hierarchy. For example, the attempted conversion might be ambiguous, T might be inaccessible, or v might not point (or refer) to an object of the necessary type. If the runtime check fails and T is a pointer type, the value of the cast expression is a null pointer of type T. If T is a reference type, nothing is returned (there are no null references in C++), and the standard exception std::bad_cast is thrown.

The following example of public derivation succeeds:


#include <assert.h>
#include <stddef.h> // for NULL

class A {public: virtual void f();};
class B {public: virtual void g();};
class AB: public virtual A, public B {};

void simple_dynamic_casts()
{
  AB ab;
  B* bp = &ab;        // no casts needed
  A* ap = &ab;
  AB& abr = dynamic_cast<AB&>(*bp);  // succeeds
  ap = dynamic_cast<A*>(bp);         assert(ap!= NULL);
  bp = dynamic_cast<B*>(ap);         assert(bp!= NULL);
  ap = dynamic_cast<A*>(&abr);       assert(ap!= NULL);
  bp = dynamic_cast<B*>(&abr);       assert(bp!= NULL);
}

In contrast, the following example fails because base class B is inaccessible:


#include <assert.h>
#include <stddef.h> // for NULL
#include <typeinfo>

class A {public: virtual void f() {}};
class B {public: virtual void g() {}};
class AB: public virtual A, private B {};

void attempted_casts()
{
  AB ab;
  B* bp = (B*)&ab; // C-style cast needed to break protection
  A* ap = dynamic_cast<A*>(bp); // fails, B is inaccessible
  assert(ap == NULL);
  try {
    AB& abr = dynamic_cast<AB&>(*bp); // fails, B is inaccessible
  }
  catch(const std::bad_cast&) {
    return; // failed reference cast caught here
  }
  assert(0); // should not get here
}

In the presence of virtual inheritance and multiple inheritance of a single base class, the actual dynamic cast must be able to identify a unique match. If the match is not unique, the cast fails. For example, given the additional class definitions:


class AB_B:     public AB,        public B {};
class AB_B__AB: public AB_B,      public AB {};

Example:


void complex_dynamic_casts()
{
  AB_B__AB ab_b__ab;
  A*ap = &ab_b__ab;
                    // okay: finds unique A statically
  AB*abp = dynamic_cast<AB*>(ap);
                    // fails: ambiguous
  assert(abp == NULL);
                    // STATIC ERROR: AB_B* ab_bp = (AB_B*)ap;
                    // not a dynamic cast
  AB_B*ab_bp = dynamic_cast<AB_B*>(ap);
                    // dynamic one is okay
  assert(ab_bp!= NULL);
}

The null-pointer error return of dynamic_cast is useful as a condition between two bodies of code—one to handle the cast if the type guess is correct, and one if it is not.


void using_dynamic_cast(A* ap)
{
  if (AB *abp = dynamic_cast<AB*>(ap))
    {            // abp is non-null,
                 // so ap was a pointer to an AB object
                 // go ahead and use abp
      process_AB(abp);
    }
  else
    {          // abp is null,
               // so ap was NOT a pointer to an AB object
               // do not use abp
      process_not_AB(ap);
    }
}

In compatibility mode (-compat[=4]), if runtime type information has not been enabled with the -features=rtti compiler option, the compiler converts dynamic_cast to static_cast and issues a warning.

If exceptions have been disabled, the compiler converts dynamic_cast<T&> to static_cast<T&> and issues a warning. (A dynamic_cast to a reference type requires an exception to be thrown if the conversion is found at run time to be invalid.) For information about exceptions, see Chapter 8 Exception Handling.

Dynamic cast is necessarily slower than an appropriate design pattern, such as conversion by virtual functions. See Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (Addison-Wesley, 1994).

Chapter 10 Improving Program Performance

You can improve the performance of C++ functions by writing those functions in a manner that helps the compiler do a better job of optimizing them. Many books have been written on software performance in general and C++ in particular, and this chapter does not repeat such valuable information, but discusses only those performance techniques that strongly affect the C++ compiler.

10.1 Avoiding Temporary Objects

C++ functions often produce implicit temporary objects, each of which must be created and destroyed. For non-trivial classes, the creation and destruction of temporary objects can be expensive in terms of processing time and memory usage. The C++ compiler does eliminate some temporary objects, but it cannot eliminate all of them.

Write functions to minimize the number of temporary objects as long as your programs remain comprehensible. Techniques include using explicit variables rather than implicit temporary objects and using reference parameters rather than value parameters. Another technique is to implement and use operations such as += rather than implementing and using only + and =. For example, the first line below introduces a temporary object for the result of a + b, while the second line does not.


T x = a + b;
T x(a); x += b;
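
To make the difference concrete, here is a sketch under stated assumptions: Vec is a hypothetical value type that counts its copy constructions, and operator+ is written in terms of operator+=. Whether the result object of a + b is constructed in place depends on the compiler, so only the explicit-variable form is counted.

```cpp
#include <cassert>

// Hypothetical value type that counts copy constructions.
struct Vec {
    static int copies;
    int v;
    explicit Vec(int value) : v(value) {}
    Vec(const Vec& other) : v(other.v) { ++copies; }
    Vec& operator+=(const Vec& rhs) { v += rhs.v; return *this; }
};
int Vec::copies = 0;

// operator+ needs a separate result object for the sum.
Vec operator+(const Vec& a, const Vec& b)
{
    Vec result(a);
    result += b;
    return result;
}

// Returns the number of copies made by the explicit-variable form and
// stores the computed sum in the sum argument.
int copies_with_plus_equals(const Vec& a, const Vec& b, int& sum)
{
    Vec::copies = 0;
    Vec x(a);      // the only copy construction
    x += b;        // updates x in place; no temporary is needed
    sum = x.v;
    return Vec::copies;
}
```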

10.2 Using Inline Functions

Calls to small and quick functions can be smaller and quicker when expanded inline than when called normally. Conversely, calls to large or slow functions can be larger and slower when expanded inline than when branched to. Furthermore, all calls to an inline function must be recompiled whenever the function definition changes. Consequently, the decision to use inline functions requires considerable care.

Do not use inline functions when you anticipate changes to the function definition and recompiling all callers is expensive. Otherwise, use inline functions when the code to expand the function inline is smaller than the code to call the function or the application performs significantly faster with the function inline.

The compiler cannot inline all function calls, so making the most effective use of function inlining may require some source changes. Use the +w option to learn when function inlining does not occur. In the following situations, the compiler will not inline the function:

10.3 Using Default Operators

If a class definition does not declare a parameterless constructor, a copy constructor, a copy assignment operator, or a destructor, the compiler will implicitly declare them. These are called default operators. A C-like struct has these default operators. When the compiler builds a default operator, it knows a great deal about the work that needs to be done and can produce very good code. This code is often much faster than user-written code because the compiler can take advantage of assembly-level facilities while the programmer usually cannot. So, when the default operators do what is needed, the program should not declare user-defined versions of these operators.

Default operators are inline functions, so do not use default operators when inline functions are inappropriate (see the previous section). Otherwise, default operators are appropriate when:

Some C++ programming texts suggest that class programmers always define all operators so that any reader of the code will know that the class programmer did not forget to consider the semantics of the default operators. Obviously, this advice interferes with the optimization discussed above. The resolution of the conflict is to place a comment in the code stating that the class is using the default operator.
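
A minimal sketch of that convention follows, using a hypothetical Point class. The comment records that the omission of the copy operations and destructor is deliberate.

```cpp
#include <cassert>

// Hypothetical value class.
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    // Copy constructor, copy assignment operator, and destructor:
    // the compiler-generated defaults are intended and sufficient.
    int x() const { return x_; }
    int y() const { return y_; }
private:
    int x_;
    int y_;
};

// Exercises the implicitly declared copy operations.
bool defaults_copy_memberwise()
{
    Point a(1, 2);
    Point b(a);      // implicit copy constructor
    Point c(0, 0);
    c = a;           // implicit copy assignment operator
    return b.x() == 1 && b.y() == 2 && c.x() == 1 && c.y() == 2;
}
```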

10.4 Using Value Classes

C++ classes, including structures and unions, are passed and returned by value. For Plain-Old-Data (POD) classes, the C++ compiler is required to pass the struct as would the C compiler. Objects of these classes are passed directly. For objects of classes with user-defined copy constructors, the compiler is effectively required to construct a copy of the object, pass a pointer to the copy, and destruct the copy after the return. Objects of these classes are passed indirectly. For classes that fall between these two requirements, the compiler can choose. However, this choice affects binary compatibility, so the compiler must choose consistently for every class.

For most compilers, passing objects directly can result in faster execution. This execution improvement is particularly noticeable with small value classes, such as complex numbers or probability values. You can sometimes improve program efficiency by designing classes that are more likely to be passed directly than indirectly.

In compatibility mode (-compat[=4]), a class is passed indirectly if it has any one of the following:

Otherwise, the class is passed directly.

In standard mode (the default mode), a class is passed indirectly if it has any one of the following:

Otherwise, the class is passed directly.

10.4.1 Choosing to Pass Classes Directly

To maximize the chance that a class will be passed directly:

10.4.2 Passing Classes Directly on Various Processors

Classes (and unions) that are passed directly by the C++ compiler are passed exactly as the C compiler would pass a struct (or union). However, C++ structs and unions are passed differently on different architectures.

Table 10–1 Passing of Structs and Unions by Architecture

Architecture  

Description  

SPARC V7/V8 

Structs and unions are passed and returned by allocating storage within the caller and passing a pointer to that storage. (That is, all structs and unions are passed by reference.) 

SPARC V9 

Structs with a size no greater than 16 bytes (32 bytes) are passed (returned) in registers. Unions and all other structs are passed and returned by allocating storage within the caller and passing a pointer to that storage. (That is, small structs are passed in registers; unions and large structs are passed by reference.) As a consequence, small value classes are passed as efficiently as primitive types. 

x86 platforms 

Structs and unions are passed by allocating space on the stack and copying the argument onto the stack. Structs and unions are returned by allocating a temporary object in the caller’s frame and passing the address of the temporary object as an implicit first parameter. 

10.5 Cache Member Variables

Accessing member variables is a common operation in C++ member functions.

The compiler must often load member variables from memory through the this pointer. Because values are being loaded through a pointer, the compiler sometimes cannot determine when a second load must be performed or whether the value loaded before is still valid. In these cases, the compiler must choose the safe, but slow, approach and reload the member variable each time it is accessed.

You can avoid unnecessary memory reloads by explicitly caching the values of member variables in local variables, as follows:
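
A minimal sketch of the technique, using a hypothetical Scaler class that is not part of any library. Both member functions compute the same result; the second copies the member into a local variable before the loop.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class used only for illustration.
class Scaler {
public:
    explicit Scaler(int factor) : factor_(factor) {}

    // Reads factor_ through this on every iteration. As far as the
    // compiler can tell, out might alias factor_, so it might have to
    // reload the member after each store.
    void scale_reloading(const int* in, int* out, std::size_t n) const
    {
        for (std::size_t i = 0; i < n; ++i)
            out[i] = in[i] * factor_;
    }

    // Copies factor_ into a local once. Stores through out cannot
    // modify the local, so it can stay in a register.
    void scale_cached(const int* in, int* out, std::size_t n) const
    {
        const int factor = factor_;
        for (std::size_t i = 0; i < n; ++i)
            out[i] = in[i] * factor;
    }

private:
    int factor_;
};
```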

This optimization is most productive when the values can reside in registers, as is the case with primitive types. The optimization may also be productive for memory-based values because the reduced aliasing gives the compiler more opportunity to optimize.

This optimization may be counterproductive if the member variable is often passed by reference, either explicitly or implicitly.

On occasion, the desired semantics of a class requires explicit caching of member variables, for instance when there is a potential alias between the current object and one of the member function’s arguments. For example:


complex& operator*= (complex& left, complex& right)
{
  left.real = left.real * right.real - left.imag * right.imag;
  left.imag = left.real * right.imag + left.imag * right.real;
  return left;
}

will yield unintended results when called with:


x*=x;
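
One way to obtain the intended result is to cache all four operands in local variables before writing to either member. The sketch below assumes a minimal hypothetical complex struct with public real and imag members, and uses the standard complex product.

```cpp
#include <cassert>

// Hypothetical minimal complex type with public members.
struct complex {
    double real;
    double imag;
};

complex& operator*=(complex& left, const complex& right)
{
    // Copy the operands first, so that writing left.real cannot change
    // a value that is still needed, even when left and right alias.
    const double lr = left.real,  li = left.imag;
    const double rr = right.real, ri = right.imag;
    left.real = lr * rr - li * ri;
    left.imag = lr * ri + li * rr;
    return left;
}
```

With this version, x *= x for x = 1 + 2i yields -3 + 4i, which is the square of x.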

Chapter 11 Building Multithreaded Programs

This chapter explains how to build multithreaded programs. It also discusses the use of exceptions, explains how to share C++ Standard Library objects across threads, and describes how to use classic (old) iostreams in a multithreading environment.

For more information about multithreading, see the Multithreaded Programming Guide, the Tools.h++ User’s Guide, and the Standard C++ Library User’s Guide.

See also the OpenMP API User's Guide for information on using OpenMP shared memory parallelization directives to create multithreaded programs.

11.1 Building Multithreaded Programs

All libraries shipped with the C++ compiler are multithreading safe. If you want to build a multithreaded application, or if you want to link your application to a multithreaded library, you must compile and link your program with the -mt option. This option passes -D_REENTRANT to the preprocessor and passes -lthread in the correct order to ld. For compatibility mode (-compat[=4]), the -mt option ensures that libthread is linked before libC. For standard mode (the default mode), the -mt option ensures that libthread is linked before libCrun. Using -mt is recommended as a simpler and less error-prone alternative to specifying the macro and library yourself.

11.1.1 Indicating Multithreaded Compilation

You can check whether an application is linked to libthread or not by using the ldd command:


example% CC -mt myprog.cc
example% ldd a.out
libm.so.1 =>      /usr/lib/libm.so.1
libCrun.so.1 =>   /usr/lib/libCrun.so.1
libthread.so.1 => /usr/lib/libthread.so.1
libc.so.1 =>      /usr/lib/libc.so.1
libdl.so.1 =>     /usr/lib/libdl.so.1

11.1.2 Using C++ Support Libraries With Threads and Signals

The C++ support libraries, libCrun, libiostream, libCstd, and libC are multithread safe but are not async safe. This means that in a multithreaded application, functions available in the support libraries should not be used in signal handlers. Doing so can result in a deadlock situation.

It is not safe to use the following in a signal handler in a multithreaded application:

11.2 Using Exceptions in a Multithreaded Program

The current exception-handling implementation is safe for multithreading; exceptions in one thread do not interfere with exceptions in other threads. However, you cannot use exceptions to communicate across threads; an exception thrown from one thread cannot be caught in another.

Each thread can set its own terminate() or unexpected() function. Calling set_terminate() or set_unexpected() in one thread affects only the exceptions in that thread. The default function for terminate() is abort() for any thread (see 8.2 Specifying Runtime Errors).

11.2.1 Thread Cancellation

Thread cancellation through a call to pthread_cancel(3T) results in the destruction of automatic (local nonstatic) objects on the stack except when you specify -noex or -features=no%except.

pthread_cancel(3T) uses the same mechanism as exceptions. When a thread is cancelled, the execution of local destructors is interleaved with the execution of cleanup routines that the user has registered with pthread_cleanup_push(). The local objects for functions called after a particular cleanup routine is registered are destroyed before that routine is executed.

11.3 Sharing C++ Standard Library Objects Between Threads

The C++ Standard Library (libCstd -library=Cstd) is MT-Safe, with the exception of some locales, and it ensures that the internals of the library work properly in a multi-threaded environment. You still need to lock around any library objects that you yourself share between threads. See the man pages for setlocale(3C) and attributes(5).

For example, if you instantiate a string, then create a new thread and pass that string to the thread by reference, then you must lock around write access to that string, since you are explicitly sharing the one string object between threads. (The facilities provided by the library to accomplish this task are described below.)

On the other hand, if you pass the string to the new thread by value, you do not need to worry about locking, even though the strings in the two different threads may be sharing a representation through Rogue Wave’s “copy on write” technology. The library handles that locking automatically. You are only required to lock when making an object available to multiple threads explicitly, either by passing references between threads or by using global or static objects.

The following describes the locking (synchronization) mechanism used internally in the C++ Standard Library to ensure correct behavior in the presence of multiple threads.

Two synchronization classes provide mechanisms for achieving multithreaded safety: _RWSTDMutex and _RWSTDGuard.

The _RWSTDMutex class provides a platform-independent locking mechanism through the following member functions:


class _RWSTDMutex
{
public:
    _RWSTDMutex ();
    ~_RWSTDMutex ();
    void acquire ();
    void release ();
};

The _RWSTDGuard class is a convenience wrapper class that encapsulates an object of _RWSTDMutex class. An _RWSTDGuard object attempts to acquire the encapsulated mutex in its constructor (throwing an exception of type ::thread_error, derived from std::exception on error), and releases the mutex in its destructor (the destructor never throws an exception).


class _RWSTDGuard
{
public:
    _RWSTDGuard (_RWSTDMutex&);
    ~_RWSTDGuard ();
};

Additionally, you can use the macro _RWSTD_MT_GUARD(mutex) (formerly _STDGUARD) to conditionally create an object of the _RWSTDGuard class in multithread builds. The object guards the remainder of the code block in which it is defined from being executed by multiple threads simultaneously. In single-threaded builds the macro expands into an empty expression.

The following example illustrates the use of these mechanisms.


#include <rw/stdmutex.h>

//
// An integer shared among multiple threads.
//
int I;

//
// A mutex used to synchronize updates to I.
//
_RWSTDMutex I_mutex;

//
// Increment I by one. Uses an _RWSTDMutex directly.
//

void increment_I ()
{
   I_mutex.acquire(); // Lock the mutex.
   I++;
   I_mutex.release(); // Unlock the mutex.
}

//
// Decrement I by one. Uses an _RWSTDGuard.
//

void decrement_I ()
{
   _RWSTDGuard guard(I_mutex); // Acquire the lock on I_mutex.
   --I;
   //
   // The lock on I is released when destructor is called on guard.
   //
}

11.4 Using Classic iostreams in a Multithreading Environment

This section describes how to use the iostream classes of the libC and libiostream libraries for input-output (I/O) in a multithreaded environment. It also provides examples of how to extend functionality of the library by deriving from the iostream classes. This section is not a guide for writing multithreaded code in C++, however.

The discussion here applies only to the old iostreams (libC and libiostream) and does not apply to libCstd, the new iostream that is part of the C++ Standard Library.

The iostream library allows its interfaces to be used in a multithreaded environment by programs that utilize the multithreading capabilities of supported versions of the Solaris operating system. Applications that utilize the single-threaded capabilities of previous versions of the library are not affected.

A library is defined to be MT-safe if it works correctly in an environment with threads. Generally, this “correctness” means that all of its public functions are reentrant. The iostream library provides protection against multiple threads that attempt to modify the state of objects (that is, instances of a C++ class) shared by more than one thread. However, the scope of MT-safety for an iostream object is confined to the period in which the object’s public member function is executing.


Note –

An application is not automatically guaranteed to be MT-safe because it uses MT-safe objects from the libC library. An application is defined to be MT-safe only when it executes as expected in a multithreaded environment.


11.4.1 Organization of the MT-Safe iostream Library

The organization of the MT-safe iostream library is slightly different from other versions of the iostream library. The exported interface of the library refers to the public and protected member functions of the iostream classes and the set of base classes available, and is consistent with other versions; however, the class hierarchy is different. See 11.4.2 Interface Changes to the iostream Library for details.

The original core classes have been renamed with the prefix unsafe_. Table 11–1 lists the classes that are the core of the iostream package.

Table 11–1 iostream Original Core Classes

Class 

Description 

stream_MT

The base class for MT-safe classes. 

streambuf

The base class for buffers. 

unsafe_ios

A class that contains state variables that are common to the various stream classes; for example, error and formatting state. 

unsafe_istream

A class that supports formatted and unformatted conversion from sequences of characters retrieved from the streambufs.

unsafe_ostream

A class that supports formatted and unformatted conversion to sequences of characters stored into the streambufs.

unsafe_iostream

A class that combines unsafe_istream and unsafe_ostream classes for bidirectional operations.

Each MT-safe class is derived from the base class stream_MT. Each MT-safe class, except streambuf, is also derived from the existing unsafe_ base class. Here are some examples:


class streambuf: public stream_MT {...};
class ios: virtual public unsafe_ios, public stream_MT {...};
class istream: virtual public ios, public unsafe_istream {...};

The class stream_MT provides the mutual exclusion (mutex) locks required to make each iostream class MT-safe; it also provides a facility that dynamically enables and disables the locks so that the MT-safe property can be dynamically changed. The basic functionality for I/O conversion and buffer management are organized into the unsafe_ classes; the MT-safe additions to the library are confined to the derived classes. The MT-safe version of each class contains the same protected and public member functions as the unsafe_ base class. Each member function in the MT-safe version class acts as a wrapper that locks the object, calls the same function in the unsafe_ base class, and unlocks the object.
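
The lock, call, and unlock wrapper arrangement can be sketched as follows. This is an illustration of the pattern only, not the libC source: lock_base stands in for stream_MT and merely counts lock operations instead of holding a real mutex, and the accumulator classes are hypothetical.

```cpp
#include <cassert>

// Stand-in for stream_MT. A real implementation would hold a mutex;
// this sketch only counts lock operations so that it is self-contained.
class lock_base {
public:
    lock_base() : enabled_(true), lock_count_(0) {}
    void enable_locking(bool on) { enabled_ = on; }  // dynamic on/off switch
    int lock_count() const { return lock_count_; }
protected:
    void lock()   { if (enabled_) ++lock_count_; }   // would acquire the mutex
    void unlock() {}                                 // would release the mutex
private:
    bool enabled_;
    int lock_count_;
};

// The "unsafe_" class holds the actual functionality.
class unsafe_accumulator {
public:
    unsafe_accumulator() : total_(0) {}
    void add(int n) { total_ += n; }
    int total() const { return total_; }
private:
    int total_;
};

// The derived class offers the same member function as a wrapper that
// locks the object, calls the base version, and unlocks the object.
class accumulator : public unsafe_accumulator, public lock_base {
public:
    void add(int n)
    {
        lock();
        unsafe_accumulator::add(n);
        unlock();
    }
};
```

A caller can bypass the wrapper for a single call with the scope resolution operator, for example acc.unsafe_accumulator::add(3), or switch locking off for the whole object with enable_locking(false).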


Note –

The class streambuf is not derived from an unsafe class. The public and protected member functions of class streambuf are reentrant by locking. Unlocked versions, suffixed with _unlocked, are also provided.


11.4.1.1 Public Conversion Routines

A set of reentrant public functions that are MT-safe have been added to the iostream interface. A user-specified buffer is an additional argument to each function. These functions are described as follows.

Table 11–2 MT-Safe Reentrant Public Functions

Function 

Description  

char *oct_r (char *buf, int buflen, long num, int width)

Returns a pointer to the ASCII string that represents the number in octal. A nonzero width is assumed to be the field width for formatting. The returned value is not guaranteed to point to the beginning of the user-provided buffer. 

char *hex_r (char *buf, int buflen, long num, int width)

Returns a pointer to the ASCII string that represents the number in hexadecimal. A nonzero width is assumed to be the field width for formatting. The returned value is not guaranteed to point to the beginning of the user-provided buffer. 

char *dec_r (char *buf, int buflen, long num, int width)

Returns a pointer to the ASCII string that represents the number in decimal. A nonzero width is assumed to be the field width for formatting. The returned value is not guaranteed to point to the beginning of the user-provided buffer. 

char *chr_r (char *buf, int buflen, long num, int width)

Returns a pointer to the ASCII string that contains character chr. If the width is nonzero, the string contains width blanks followed by chr. The returned value is not guaranteed to point to the beginning of the user-provided buffer.

char *form_r (char *buf, int buflen, long num, int width)

Returns a pointer to the string formatted by sprintf, using the format string format and any remaining arguments. The buffer must have sufficient space to contain the formatted string.


Note –

The public conversion routines of the iostream library (oct, hex, dec, chr, and form) that are present to ensure compatibility with an earlier version of libC are not MT-safe.


11.4.1.2 Compiling and Linking With the MT-Safe libC Library

When you build an application that uses the iostream classes of the libC library to run in a multithreaded environment, compile and link the source code of the application using the -mt option. This option passes -D_REENTRANT to the preprocessor and -lthread to the linker.


Note –

Use -mt (rather than -lthread) to link with libC and libthread. This option ensures proper linking order of the libraries. Using -lthread improperly could cause your application to work incorrectly.


Single-threaded applications that use iostream classes do not require special compiler or linker options. By default, the compiler links with the libC library.

11.4.1.3 MT-Safe iostream Restrictions

The restricted definition of MT-safety for the iostream library means that a number of programming idioms used with iostream are unsafe in a multithreaded environment using shared iostream objects.

Checking Error State

To be MT-safe, error checking must occur in a critical region with the I/O operation that causes the error. The following example illustrates how to check for errors:


Example 11–1 Checking Error State


#include <iostream.h>
enum iostate {IOok, IOeof, IOfail};

iostate read_number(istream& istr, int& num)
{
    stream_locker sl(istr, stream_locker::lock_now);

    istr >> num;

    if (istr.eof()) return IOeof;
    if (istr.fail()) return IOfail;
    return IOok;
}

In this example, the constructor of the stream_locker object sl locks the istream object istr. The destructor of sl, called at the termination of read_number, unlocks istr.

Obtaining Characters Extracted by Last Unformatted Input Operation

To be MT-safe, the gcount function must be called within a thread that has exclusive use of the istream object for the period that includes the execution of the last input operation and gcount call. The following example shows a call to gcount:


Example 11–2 Calling gcount


#include <iostream.h>
#include <rlocks.h>
void fetch_line(istream& istr, char* line, int& linecount)
{
    stream_locker sl(istr, stream_locker::lock_defer);

    sl.lock(); // lock the stream istr
    istr >> line;
    linecount = istr.gcount();
    sl.unlock(); // unlock istr
    ...
}

In this example, the lock and unlock member functions of class stream_locker define a mutual exclusion region in the program.

User-Defined I/O Operations

To be MT-safe, I/O operations defined for a user-defined type that involve a specific ordering of separate operations must be locked to define a critical region. The following example shows a user-defined I/O operation:


Example 11–3 User-Defined I/O Operations


#include <rlocks.h>
#include <iostream.h>
class mystream: public istream {

    // other definitions...
    int getRecord(char* name, int& id, float& gpa);
};

int mystream::getRecord(char* name, int& id, float& gpa)
{
    stream_locker sl(this, stream_locker::lock_now);

    *this >> name;
    *this >> id;
    *this >> gpa;

    return this->fail() == 0;
}
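
The following sketch shows the same idea in standard C++, under the assumption that a std::mutex replaces stream_locker. The Record type, the get_record function, and the record_mutex name are hypothetical and are not part of libC.

```cpp
#include <cassert>
#include <istream>
#include <mutex>
#include <sstream>
#include <string>

struct Record {      // hypothetical record type
    std::string name;
    int id = 0;
    float gpa = 0.0f;
};

std::mutex record_mutex;  // hypothetical lock for the shared stream

// Extracts the three fields as one unit so that another thread using the
// same lock cannot consume a field in between. Returns true on success.
bool get_record(std::istream& in, Record& rec)
{
    std::lock_guard<std::mutex> guard(record_mutex);
    in >> rec.name;
    in >> rec.id;
    in >> rec.gpa;
    return !in.fail();
}
```

Without the lock, each extraction would be safe on its own, but two threads could interleave their extractions and each receive fields that belong to different records.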

11.4.1.4 Reducing Performance Overhead of MT-Safe Classes

Using the MT-safe classes in this version of the libC library results in some amount of performance overhead, even in a single-threaded application; however, if you use the unsafe_ classes of libC, this overhead can be avoided.

The scope resolution operator can be used to execute member functions of the base unsafe_ classes; for example:


    cout.unsafe_ostream::put('4');

    cin.unsafe_istream::read(buf, len);

Note –

The unsafe_ classes cannot be safely used in multithreaded applications.


Instead of using unsafe_ classes, you can make the cout and cin objects unsafe and then use the normal operations. A slight performance deterioration results. The following example shows how to use unsafe cout and cin:


Example 11–4 Disabling MT-Safety


#include <iostream.h>
//disable mt-safety
cout.set_safe_flag(stream_MT::unsafe_object);    
//disable mt-safety
cin.set_safe_flag(stream_MT::unsafe_object);    
cout.put('4');
cin.read(buf, len);

When an iostream object is MT-safe, mutex locking is provided to protect the object’s member variables. This locking adds unnecessary overhead to an application that only executes in a single-threaded environment. To improve performance, you can dynamically switch an iostream object to and from MT-safety. The following example makes an iostream object MT-unsafe:


Example 11–5 Switching to MT-Unsafe


fs.set_safe_flag(stream_MT::unsafe_object); // disable MT-safety
    // ... do various i/o operations

You can safely use an MT-unsafe stream in code where an iostream is not shared by threads; for example, in a program that has only one thread, or in a program where each iostream is private to a thread.

If you explicitly insert synchronization into the program, you can also safely use MT-unsafe iostreams in an environment where an iostream is shared by threads. The following example illustrates the technique:


Example 11–6 Using Synchronization With MT-Unsafe Objects


    generic_lock();
    fs.set_safe_flag(stream_MT::unsafe_object);
    // ... do various i/o operations
    generic_unlock();

where the generic_lock and generic_unlock functions can be any synchronization mechanism that uses such primitives as mutexes, semaphores, or reader/writer locks.
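
As one possible reading of generic_lock and generic_unlock, the sketch below implements them over a single std::mutex in standard C++. The mutex name fs_mutex and the write_pair function are assumptions made for illustration; any primitive with equivalent semantics would serve.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>

std::mutex fs_mutex;  // hypothetical: one mutex per shared stream

void generic_lock()   { fs_mutex.lock(); }
void generic_unlock() { fs_mutex.unlock(); }

// Every thread that touches the shared stream must go through the same
// lock/unlock pair; the stream itself does no locking here.
void write_pair(std::ostream& fs, int key, int value)
{
    generic_lock();
    fs << key << '=' << value << '\n';
    generic_unlock();
}
```

A paired lock and unlock call is not released if the code in between exits early or throws, which is one reason a scoped locking object is usually the safer choice.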


Note –

The stream_locker class provided by the libC library is the preferred mechanism for this purpose.


See 11.4.5 Object Locks for more information.

11.4.2 Interface Changes to the iostream Library

This section describes the interface changes made to the iostream library to make it MT-Safe.

11.4.2.1 The New Classes

The following example lists the new classes added to the libC interfaces.


Example 11–7 New Classes


    stream_MT
    stream_locker
    unsafe_ios
    unsafe_istream
    unsafe_ostream
    unsafe_iostream
    unsafe_fstreambase
    unsafe_strstreambase

11.4.2.2 The New Class Hierarchy

The following example lists the new class hierarchy added to the iostream interfaces.


Example 11–8 New Class Hierarchy


class streambuf: public stream_MT {...};
class unsafe_ios {...};
class ios: virtual public unsafe_ios, public stream_MT {...};
class unsafe_fstreambase: virtual public unsafe_ios {...};
class fstreambase: virtual public ios, public unsafe_fstreambase
  {...};
class unsafe_strstreambase: virtual public unsafe_ios {...};
class strstreambase: virtual public ios, public unsafe_strstreambase {...};
class unsafe_istream: virtual public unsafe_ios {...};
class unsafe_ostream: virtual public unsafe_ios {...};
class istream: virtual public ios, public unsafe_istream {...};
class ostream: virtual public ios, public unsafe_ostream {...};
class unsafe_iostream: public unsafe_istream, public unsafe_ostream {...};

11.4.2.3 The New Functions

The following example lists the new functions added to the iostream interfaces.


Example 11–9 New Functions


 class streambuf {
 public:
   int sgetc_unlocked();                
   void sgetn_unlocked(char *, int);
   int snextc_unlocked();
   int sbumpc_unlocked();
   void stossc_unlocked();
   int in_avail_unlocked();
   int sputbackc_unlocked(char);
   int sputc_unlocked(int);
   int sputn_unlocked(const char *, int);
   int out_waiting_unlocked();
 protected:
   char* base_unlocked();
   char* ebuf_unlocked();
   int blen_unlocked();
   char* pbase_unlocked();
   char* eback_unlocked();
   char* gptr_unlocked();
   char* egptr_unlocked();
   char* pptr_unlocked();
   void setp_unlocked(char*, char*);
   void setg_unlocked(char*, char*, char*);
   void pbump_unlocked(int);
   void gbump_unlocked(int);
   void setb_unlocked(char*, char*, int);
   int unbuffered_unlocked();
   char *epptr_unlocked();
   void unbuffered_unlocked(int);
   int allocate_unlocked(int);
 };

 class filebuf: public streambuf {
 public:
  int is_open_unlocked();
  filebuf* close_unlocked();
  filebuf* open_unlocked(const char*, int, int =
    filebuf::openprot);

  filebuf* attach_unlocked(int);
 };

 class strstreambuf: public streambuf {
 public:
  int freeze_unlocked();
  char* str_unlocked();
 };


 unsafe_ostream& endl(unsafe_ostream&);
 unsafe_ostream& ends(unsafe_ostream&);
 unsafe_ostream& flush(unsafe_ostream&);
 unsafe_istream& ws(unsafe_istream&);
 unsafe_ios& dec(unsafe_ios&);
 unsafe_ios& hex(unsafe_ios&);
 unsafe_ios& oct(unsafe_ios&);

 char* dec_r (char* buf, int buflen, long num, int width);
 char* hex_r (char* buf, int buflen, long num, int width);
 char* oct_r (char* buf, int buflen, long num, int width);
 char* chr_r (char* buf, int buflen, long chr, int width);
 char* str_r (char* buf, int buflen, const char* format, int width
    = 0);
 char* form_r (char* buf, int buflen, const char* format,...);

11.4.3 Global and Static Data

Global and static data in a multithreaded application are not safely shared among threads. Although threads execute independently, they share access to global and static objects within the process. If one thread modifies such a shared object, all the other threads within the process observe the change, making it difficult to maintain state over time. In C++, class objects (instances of a class) maintain state by the values in their member variables. If a class object is shared, it is vulnerable to changes made by other threads.

When a multithreaded application uses the iostream library and includes iostream.h, the standard streams (cout, cin, cerr, and clog) are, by default, defined as global shared objects. Since the iostream library is MT-safe, it protects the state of its shared objects from access or change by another thread while a member function of an iostream object is executing. However, the scope of MT-safety for an object is confined to the period in which the object’s public member function is executing. For example,


    char c;
    cin.get(c);

gets the next character in the get buffer and updates the buffer pointer in ThreadA. However, if the next instruction in ThreadA is another get call, the libC library does not guarantee that the call returns the next character in the sequence, because ThreadB, for example, may also have executed a get call in the interval between the two get calls made in ThreadA.

See 11.4.5 Object Locks for strategies for dealing with the problems of shared objects and multithreading.
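
The effect of such an interleaving can be reproduced deterministically without threads. The sketch below, written in standard C++, replays one possible schedule in a single thread; the Reads type and the interleaved_gets function are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

struct Reads { char a_first; char b_only; char a_second; };

// Replays, in one thread, the schedule described above: ThreadA reads,
// ThreadB reads, then ThreadA reads again from the same stream.
Reads interleaved_gets(std::istream& shared)
{
    Reads r{};
    shared.get(r.a_first);   // ThreadA: first get
    shared.get(r.b_only);    // ThreadB: runs between ThreadA's two calls
    shared.get(r.a_second);  // ThreadA: second get skips a character
    return r;
}
```

With the input "xyz", ThreadA would receive 'x' and 'z' while ThreadB receives 'y'. Each get call completes correctly, yet ThreadA does not see consecutive characters.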

11.4.4 Sequence Execution

Frequently, when iostream objects are used, a sequence of I/O operations must be MT-safe. For example, the code:


cout << " Error message:" << errstring[err_number] << "\n";

involves the execution of three member functions of the cout stream object. Since cout is a shared object, the sequence must be executed atomically as a critical section to work correctly in a multithreaded environment. To perform a sequence of operations on an iostream class object atomically, you must use some form of locking.

The libC library now provides the stream_locker class for locking operations on an iostream object. See 11.4.5 Object Locks for information about the stream_locker class.
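
One way to keep such a critical section short, sketched here in standard C++, is to format the message into a stream that is private to the caller and then write it to the shared stream in a single locked operation. This is an alternative technique, not the stream_locker approach; the names report_error and out_mutex are assumptions made for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

std::mutex out_mutex;  // hypothetical lock shared by all writers

void report_error(std::ostream& out, const std::string& errstring)
{
    // Formatting uses a stream private to this call, so no lock is needed.
    std::ostringstream line;
    line << " Error message:" << errstring << "\n";

    // The shared stream is touched once, inside the critical section.
    std::lock_guard<std::mutex> guard(out_mutex);
    out << line.str();
}
```

The trade-off is an extra copy of the formatted text in exchange for holding the lock only for the duration of one insertion.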

11.4.5 Object Locks

The simplest strategy for dealing with the problems of shared objects and multithreading is to avoid the issue by ensuring that iostream objects are local to a thread, for example by declaring them within the entry function of the thread.

However, in many cases, such as default shared standard stream objects, it is not possible to make the objects local to a thread, and an alternative strategy is required.

To perform a sequence of operations on an iostream class object atomically, you must use some form of locking. Locking adds some overhead even to a single-threaded application. The decision whether to add locking or make iostream objects private to a thread depends on the thread model chosen for the application: Are the threads to be independent or cooperating?

11.4.5.1 Class stream_locker

The iostream library provides the stream_locker class for locking a series of operations on an iostream object. You can, therefore, minimize the performance overhead incurred by dynamically enabling or disabling locking in iostream objects.

Objects of class stream_locker can be used to make a sequence of operations on a stream object atomic. For example, the code shown in the example below seeks to find a position in a file and reads the next block of data.


Example 11–10 Example of Using Locking Operations


#include <fstream.h>
#include <rlocks.h>

void lock_example (fstream& fs)
{
    const int len = 128;
    char buf[len];
    int offset = 48;
    stream_locker s_lock(fs, stream_locker::lock_now);
    .....// open file
    fs.seekg(offset, ios::beg);
    fs.read(buf, len);
}

In this example, the constructor for the stream_locker object defines the beginning of a mutual exclusion region in which only one thread can execute at a time. The destructor, called after the return from the function, defines the end of the mutual exclusion region. The stream_locker object ensures that both the seek to a particular offset in a file and the read from the file are performed together, atomically, and that ThreadB cannot change the file offset before the original ThreadA reads the file.
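
The seek-then-read sequence can also be sketched in standard C++, assuming a std::mutex in place of stream_locker. The read_block function and the file_mutex name are hypothetical; the sketch returns the block as a string so that a short read is visible to the caller.

```cpp
#include <cassert>
#include <istream>
#include <mutex>
#include <sstream>
#include <string>

std::mutex file_mutex;  // hypothetical lock for the shared stream

// Seeks to `offset` and reads up to `len` characters as one atomic step.
std::string read_block(std::istream& fs, std::streamoff offset,
                       std::streamsize len)
{
    assert(offset >= 0 && len >= 0);
    std::lock_guard<std::mutex> guard(file_mutex);

    std::string buf(static_cast<std::string::size_type>(len), '\0');
    fs.clear();                       // drop stale eof/fail state before seeking
    fs.seekg(offset, std::ios::beg);
    fs.read(&buf[0], len);
    // Keep only the characters that were actually read.
    buf.resize(static_cast<std::string::size_type>(fs.gcount()));
    return buf;
}
```

Because the lock covers both the seek and the read, a second thread that calls read_block cannot move the file position between the two operations.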

An alternative way to use a stream_locker object is to explicitly define the mutual exclusion region. In the following example, to make the I/O operation and subsequent error checking atomic, the lock and unlock member functions of a stream_locker object are called.


Example 11–11 Making I/O Operation and Error Checking Atomic


{
    ...
    stream_locker file_lck(openfile_stream,
                             stream_locker::lock_defer);
    ....
    file_lck.lock();  // lock openfile_stream
    openfile_stream << "Value: " << int_value << "\n";
    if(!openfile_stream) {
            file_error("Output of value failed\n");
            return;
    }
    file_lck.unlock(); // unlock openfile_stream
}

For more information, see the stream_locker(3CC4) man page.

11.4.6 MT-Safe Classes

You can extend or specialize the functionality of the iostream classes by deriving new classes. If objects instantiated from the derived classes will be used in a multithreaded environment, the classes must be MT-safe.

Considerations when deriving MT-safe classes include:

11.4.7 Object Destruction

Before an iostream object that is shared by several threads is deleted, the main thread must verify that the subthreads are finished with the shared object. The following example shows how to safely destroy a shared object.


Example 11–12 Destroying a Shared Object


#include <fstream.h>
#include <thread.h>
fstream* fp;

void *process_rtn(void*)
{
    // body of sub-threads which uses fp...
    return 0;
}

void multi_process(const char* filename, int numthreads)
{
    fp = new fstream(filename, ios::in); // create fstream object
                                         // before creating threads.
    // create threads
    for (int i=0; i<numthreads; i++)
            thr_create(0, STACKSIZE, process_rtn, 0, 0, 0);

        ...
    // wait for threads to finish
    for (int i=0; i<numthreads; i++)
            thr_join(0, 0, 0);

    delete fp;                          // delete fstream object after
    fp = NULL;                         // all threads have completed.
}

11.4.8 An Example Application

The following code provides an example of a multithreaded application that uses iostream objects from the libC library in an MT-safe way.

The example application creates up to 255 threads. Each thread reads a different input file, one line at a time, and outputs the line to an output file, using the standard output stream, cout. The output file, which is shared by all threads, is tagged with a value that indicates which thread performed the output operation.


Example 11–13 Using iostream Objects in an MT-Safe Way


// create tagged thread data
// the output file is of the form:
//         <tag><string of data>\n
// where tag is an integer value in an unsigned char.
// Allows up to 255 threads to be run in this application
// <string of data> is any printable characters
// Because tag is an integer value written as char,
// you need to use od to look at the output file, suggest:
//            od -c out.file |more

#include <stdlib.h>
#include <stdio.h>
#include <iostream.h>
#include <fstream.h>
#include <thread.h>

struct thread_args {
  char* filename;
  int thread_tag;
};

const int thread_bufsize = 256;

// entry routine for each thread
void* ThreadDuties(void* v) {
// obtain arguments for this thread
  thread_args* tt = (thread_args*)v;
  char ibuf[thread_bufsize];
  // open thread input file
  ifstream instr(tt->filename);
  stream_locker lockout(cout, stream_locker::lock_defer);
  while(1) {
  // read a line at a time
    instr.getline(ibuf, thread_bufsize - 1, '\n');
    if(instr.eof())
      break;
  // lock cout stream so the i/o operation is atomic
    lockout.lock();
  // tag line and send to cout
    cout << (unsigned char)tt->thread_tag << ibuf << "\n";
    lockout.unlock();
  }
  return 0;
}

int main(int argc, char** argv) {
  // argv: 1+ list of filenames per thread
   if(argc < 2) {
     cout << "usage: " << argv[0] << " <files..>\n";
     exit(1);
   }
  int num_threads = argc - 1;
  int total_tags = 0;

// array of thread_ids
  thread_t created_threads[thread_bufsize];
// array of arguments to thread entry routine
  thread_args thr_args[thread_bufsize];
  int i;
  for(i = 0; i < num_threads; i++) {
    thr_args[i].filename = argv[1 + i];
// assign a tag to a thread - a value less than 256
    thr_args[i].thread_tag = total_tags++;
// create threads
    thr_create(0, 0, ThreadDuties, &thr_args[i],
           THR_SUSPENDED, &created_threads[i]);
  }

  for(i = 0; i < num_threads; i++) {
    thr_continue(created_threads[i]);
  }
  for(i = 0; i < num_threads; i++) {
    thr_join(created_threads[i], 0, 0);
  }
  return 0;
}