Writing Device Drivers

Synchronizing Memory Objects

In the process of accessing the memory object, the driver might need to synchronize the memory object with respect to various caches. This section provides guidelines on when and how to synchronize memory objects.

Cache

The CPU cache is very high-speed memory that sits between the CPU and the system's main memory. The I/O cache sits between the device and the system's main memory, as shown in the following figure.

Figure 9–1 CPU and System I/O Caches

The diagram shows how caches are used to speed data transfers involving devices.

When an attempt is made to read data from main memory, the associated cache checks for the requested data. If the data is available, the cache supplies the data quickly. If the cache does not have the data, the cache retrieves the data from main memory. The cache then passes the data on to the requester and saves the data in case of a subsequent request.

Similarly, on a write cycle, the data is stored in the cache quickly, and the CPU or device can continue executing, that is, continue transferring data. Storing data in a cache takes much less time than waiting for the data to be written to main memory.

With this model, after a device transfer is complete, the data can still be in the I/O cache with no data in main memory. If the CPU accesses the memory, the CPU might read the wrong data from the CPU cache. The driver must call a synchronization routine to flush the data from the I/O cache and update the CPU cache with the new data. This action ensures a consistent view of the memory for the CPU. Similarly, a synchronization step is required if data modified by the CPU is to be accessed by a device.

Additional caches and buffers can exist between the device and memory, such as caches associated with bus extenders and bridges. Use ddi_dma_sync(9F) to synchronize all applicable caches.

ddi_dma_sync() Function

A memory object might have multiple mappings, such as for the CPU and for a device, by means of a DMA handle. A driver with multiple mappings needs to call ddi_dma_sync(9F) if any mapping is used to modify the memory object. Calling ddi_dma_sync() ensures that the modification of the memory object is complete before the object is accessed through a different mapping. The ddi_dma_sync() function can also inform other mappings of the object that any cached references to the object are now stale. Additionally, ddi_dma_sync() flushes or invalidates stale cache references as necessary.

Generally, the driver must call ddi_dma_sync() when a DMA transfer completes. The exception to this rule is when the DMA resources are deallocated with ddi_dma_unbind_handle(9F), which performs an implicit ddi_dma_sync() on behalf of the driver. The syntax for ddi_dma_sync() is as follows:

int ddi_dma_sync(ddi_dma_handle_t handle, off_t off,
size_t length, uint_t type);

If the object is going to be read by the DMA engine of the device, the device's view of the object must be synchronized by setting type to DDI_DMA_SYNC_FORDEV. If the DMA engine of the device has written to the memory object and the object is going to be read by the CPU, the CPU's view of the object must be synchronized by setting type to DDI_DMA_SYNC_FORCPU.
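For example, before starting a transfer in which the device reads the object, a driver could synchronize the device's view as in the following sketch. The xsp->handle and length names are placeholders for driver-specific values, and start_device() stands for a hypothetical routine that programs the device to begin the transfer:

/* the CPU has finished writing the outbound data */
if (ddi_dma_sync(xsp->handle, 0, length, DDI_DMA_SYNC_FORDEV)
    == DDI_SUCCESS) {
    /* the device can now read the object by DMA */
    start_device(xsp);    /* hypothetical device-start routine */
} else {
    /* error handling */
}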

The following example demonstrates synchronizing a DMA object for the CPU:

if (ddi_dma_sync(xsp->handle, 0, length, DDI_DMA_SYNC_FORCPU)
    == DDI_SUCCESS) {
    /* the CPU can now access the transferred data */
    /* ... */
} else {
    /* error handling */
}
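When the driver instead deallocates the DMA resources at the end of the transfer, no explicit call is required, because ddi_dma_unbind_handle(9F) performs the implicit ddi_dma_sync() described earlier. A minimal sketch, assuming the same xsp->handle:

/* ddi_dma_unbind_handle() does an implicit ddi_dma_sync(), */
/* so no explicit synchronization call is needed here */
if (ddi_dma_unbind_handle(xsp->handle) != DDI_SUCCESS) {
    /* error handling */
}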

Use the flag DDI_DMA_SYNC_FORKERNEL if the only mapping is for the kernel, as in the case of memory that is allocated by ddi_dma_mem_alloc(9F). The system tries to synchronize the kernel's view more quickly than the CPU's view. If the system cannot synchronize the kernel view faster, the system acts as if the DDI_DMA_SYNC_FORCPU flag were set.
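For illustration only, a driver whose object is mapped only by the kernel, such as memory obtained from ddi_dma_mem_alloc(9F), might synchronize as in the following sketch; the xsp->handle and length names are assumed placeholders:

/* the kernel mapping is the only mapping of this object */
if (ddi_dma_sync(xsp->handle, 0, length, DDI_DMA_SYNC_FORKERNEL)
    == DDI_SUCCESS) {
    /* the kernel can now access the transferred data */
    /* ... */
} else {
    /* error handling */
}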