An additional problem occurs with standard I/O. Programmers are accustomed to routines such as getc(3S) and putc(3S) being very quick--they are implemented as macros. Because of this, they can be used within the inner loop of a program with no concerns about efficiency.
However, when they are made thread-safe, they suddenly become more expensive: each call now requires (at least) two internal subroutine calls, to lock and unlock a mutex.
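To get around this problem, unlocked versions of these routines are supplied: getc_unlocked(3S) and putc_unlocked(3S). These do not acquire a mutex, so they are as quick as the original, non-thread-safe versions of getc(3S) and putc(3S). To use them in a thread-safe way, however, you must explicitly lock and release the mutexes that protect the standard I/O streams, using flockfile(3S) and funlockfile(3S). The calls to these latter routines are placed outside the loop, and the calls to getc_unlocked() or putc_unlocked() are placed inside the loop.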
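The pattern described above might look like the following minimal sketch. The helper name copy_stream() and its return value are assumptions made for illustration and are not part of the standard I/O library; only flockfile(), funlockfile(), getc_unlocked(), and putc_unlocked() are standard routines.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a library routine): copy every character
 * from `in` to `out` and return the number of characters copied.
 * Each stream is locked once, outside the loop, so the inner loop
 * can use the unlocked routines safely.
 */
static long copy_stream(FILE *in, FILE *out)
{
    long count = 0;
    int c;

    assert(in != NULL && out != NULL);

    flockfile(in);      /* acquire each stream's lock once */
    flockfile(out);

    while ((c = getc_unlocked(in)) != EOF) {
        if (putc_unlocked(c, out) == EOF)
            break;      /* write error: stop copying */
        count++;
    }

    funlockfile(out);   /* release in reverse order */
    funlockfile(in);
    return count;
}
```

Because the stream locks are held for the whole loop, other threads that use the same streams block until the copy finishes. Threads that lock more than one stream should also acquire them in a consistent order to avoid deadlock.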