A table scan is the reading of every row in a table and is caused by queries that don’t properly use indexes. Table scans on large tables take an excessive amount of time and cause performance problems.

Make sure that, for any queries against large tables, at least one WHERE clause condition:

You should be concerned primarily with queries against large tables. If you have a table with a few hundred rows, table scans are not a problem and are sometimes faster than indexed access.

During initialization, systems like Dynamo may front-load caches to avoid unnecessary database operations later. You may see queries with large results during this time, but that is okay. Within reason, lengthy database operations at startup are acceptable. However, if you see frequent, large, or slow queries issuing from Dynamo during the course of normal operation, then you have a design problem that must be addressed to achieve acceptable performance.

For example, suppose your database has a large table that holds products such as this:

CREATE table product
   (    sku            char(6)        not null,
        type           char(1)        not null,
        name           varchar(50)    not null,
        description    varchar(200)   null       )

and has these indexes:

CREATE unique index i1 on product(sku)
CREATE index i2 on product(name)
CREATE index i3 on product(type)

The following query is fine:

  FROM product
WHERE sku = 'a12345'

That query will not cause performance problems because the WHERE clause refers to a very specific condition on a column with an index.

Here is an example of a query that is likely to cause problems:

  FROM product
WHERE description LIKE '%shoes%'

This query causes a table scan, since the indexes can’t help the database to optimize the query. Queries like this on a large table will result in an unacceptable performance drag and therefore should not be allowed in a production system.

Here are some more queries that are likely to cause performance problems. The following query is inadvisable because, although it refers to the indexed sku column, it is not very selective and could return millions of rows:

  FROM product
WHERE sku > 'abc'

The following query is bad because, although it is relatively selective, it will cause a table scan (at least, on most DBMSs). A LIKE query with a leading wildcard typically cannot be optimized:

  FROM product
WHERE name LIKE '%stereo'
loading table of contents...