MySQL 8.4 Reference Manual Including MySQL NDB Cluster 8.4
      Full-text indexes are created on text-based columns
      (CHAR,
      VARCHAR, or
      TEXT columns) to speed up queries
      and DML operations on data contained within those columns.
    
      A full-text index is defined as part of a
      CREATE TABLE statement or added to
      an existing table using ALTER TABLE
      or CREATE INDEX.
    
      Full-text search is performed using MATCH()
      ... AGAINST syntax. For usage information, see
      Section 14.9, “Full-Text Search Functions”.
    
      InnoDB full-text indexes are described under
      the following topics in this section:
    
        InnoDB full-text indexes have an inverted
        index design. Inverted indexes store a list of words, and for
        each word, a list of documents that the word appears in. To
        support proximity search, position information for each word is
        also stored, as a byte offset.
      
        When an InnoDB full-text index is created, a
        set of index tables is created, as shown in the following
        example:
      
mysql>CREATE TABLE opening_lines (id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,opening_line TEXT(500),author VARCHAR(200),title VARCHAR(200),FULLTEXT idx (opening_line)) ENGINE=InnoDB;mysql>SELECT table_id, name, space from INFORMATION_SCHEMA.INNODB_TABLESWHERE name LIKE 'test/%';+----------+----------------------------------------------------+-------+ | table_id | name | space | +----------+----------------------------------------------------+-------+ | 333 | test/fts_0000000000000147_00000000000001c9_index_1 | 289 | | 334 | test/fts_0000000000000147_00000000000001c9_index_2 | 290 | | 335 | test/fts_0000000000000147_00000000000001c9_index_3 | 291 | | 336 | test/fts_0000000000000147_00000000000001c9_index_4 | 292 | | 337 | test/fts_0000000000000147_00000000000001c9_index_5 | 293 | | 338 | test/fts_0000000000000147_00000000000001c9_index_6 | 294 | | 330 | test/fts_0000000000000147_being_deleted | 286 | | 331 | test/fts_0000000000000147_being_deleted_cache | 287 | | 332 | test/fts_0000000000000147_config | 288 | | 328 | test/fts_0000000000000147_deleted | 284 | | 329 | test/fts_0000000000000147_deleted_cache | 285 | | 327 | test/opening_lines | 283 | +----------+----------------------------------------------------+-------+
        The first six index tables comprise the inverted index and are
        referred to as auxiliary index tables. When incoming documents
        are tokenized, the individual words (also referred to as
        “tokens”) are inserted into the index tables along
        with position information and an associated
        DOC_ID. The words are fully sorted and
        partitioned among the six index tables based on the character
        set sort weight of the word's first character.
      
        The inverted index is partitioned into six auxiliary index
        tables to support parallel index creation. By default, two
        threads tokenize, sort, and insert words and associated data
        into the index tables. The number of threads that perform this
        work is configurable using the
        innodb_ft_sort_pll_degree
        variable. Consider increasing the number of threads when
        creating full-text indexes on large tables.
      
        Auxiliary index table names are prefixed with
        fts_ and postfixed with
        index_. Each
        auxiliary index table is associated with the indexed table by a
        hex value in the auxiliary index table name that matches the
        #table_id of the indexed table. For example,
        the table_id of the
        test/opening_lines table is
        327, for which the hex value is 0x147. As
        shown in the preceding example, the “147” hex value
        appears in the names of auxiliary index tables that are
        associated with the test/opening_lines table.
      
        A hex value representing the index_id of the
        full-text index also appears in auxiliary index table names. For
        example, in the auxiliary table name
        test/fts_0000000000000147_00000000000001c9_index_1,
        the hex value 1c9 has a decimal value of 457.
        The index defined on the opening_lines table
        (idx) can be identified by querying the
        Information Schema INNODB_INDEXES
        table for this value (457).
      
mysql>SELECT index_id, name, table_id, space from INFORMATION_SCHEMA.INNODB_INDEXESWHERE index_id=457;+----------+------+----------+-------+ | index_id | name | table_id | space | +----------+------+----------+-------+ | 457 | idx | 327 | 283 | +----------+------+----------+-------+
Index tables are stored in their own tablespace if the primary table is created in a file-per-table tablespace. Otherwise, index tables are stored in the tablespace where the indexed table resides.
The other index tables shown in the preceding example are referred to as common index tables and are used for deletion handling and storing the internal state of full-text indexes. Unlike the inverted index tables, which are created for each full-text index, this set of tables is common to all full-text indexes created on a particular table.
        Common index tables are retained even if full-text indexes are
        dropped. When a full-text index is dropped, the
        FTS_DOC_ID column that was created for the
        index is retained, as removing the FTS_DOC_ID
        column would require rebuilding the previously indexed table.
        Common index tables are required to manage the
        FTS_DOC_ID column.
      
            fts_*_deleted and
            fts_*_deleted_cache
          
            Contain the document IDs (DOC_ID) for documents that are
            deleted but whose data is not yet removed from the full-text
            index. The fts_*_deleted_cache is the
            in-memory version of the fts_*_deleted
            table.
          
            fts_*_being_deleted and
            fts_*_being_deleted_cache
          
            Contain the document IDs (DOC_ID) for documents that are
            deleted and whose data is currently in the process of being
            removed from the full-text index. The
            fts_*_being_deleted_cache table is the
            in-memory version of the
            fts_*_being_deleted table.
          
            fts_*_config
          
            Stores information about the internal state of the full-text
            index. Most importantly, it stores the
            FTS_SYNCED_DOC_ID, which identifies
            documents that have been parsed and flushed to disk. In case
            of crash recovery, FTS_SYNCED_DOC_ID
            values are used to identify documents that have not been
            flushed to disk so that the documents can be re-parsed and
            added back to the full-text index cache. To view the data in
            this table, query the Information Schema
            INNODB_FT_CONFIG table.
          
        When a document is inserted, it is tokenized, and the individual
        words and associated data are inserted into the full-text index.
        This process, even for small documents, can result in numerous
        small insertions into the auxiliary index tables, making
        concurrent access to these tables a point of contention. To
        avoid this problem, InnoDB uses a full-text
        index cache to temporarily cache index table insertions for
        recently inserted rows. This in-memory cache structure holds
        insertions until the cache is full and then batch flushes them
        to disk (to the auxiliary index tables). You can query the
        Information Schema
        INNODB_FT_INDEX_CACHE table to view
        tokenized data for recently inserted rows.
      
The caching and batch flushing behavior avoids frequent updates to auxiliary index tables, which could result in concurrent access issues during busy insert and update times. The batching technique also avoids multiple insertions for the same word, and minimizes duplicate entries. Instead of flushing each word individually, insertions for the same word are merged and flushed to disk as a single entry, improving insertion efficiency while keeping auxiliary index tables as small as possible.
        The innodb_ft_cache_size
        variable is used to configure the full-text index cache size (on
        a per-table basis), which affects how often the full-text index
        cache is flushed. You can also define a global full-text index
        cache size limit for all tables in a given instance using the
        innodb_ft_total_cache_size
        variable.
      
The full-text index cache stores the same information as auxiliary index tables. However, the full-text index cache only caches tokenized data for recently inserted rows. The data that is already flushed to disk (to the auxiliary index tables) is not brought back into the full-text index cache when queried. The data in auxiliary index tables is queried directly, and results from the auxiliary index tables are merged with results from the full-text index cache before being returned.
        InnoDB uses a unique document identifier
        referred to as the DOC_ID to map words in the
        full-text index to document records where the word appears. The
        mapping requires an FTS_DOC_ID column on the
        indexed table. If an FTS_DOC_ID column is not
        defined, InnoDB automatically adds a hidden
        FTS_DOC_ID column when the full-text index is
        created. The following example demonstrates this behavior.
      
        The following table definition does not include an
        FTS_DOC_ID column:
      
mysql>CREATE TABLE opening_lines (id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,opening_line TEXT(500),author VARCHAR(200),title VARCHAR(200)) ENGINE=InnoDB;
        When you create a full-text index on the table using
        CREATE FULLTEXT INDEX syntax, a warning is
        returned which reports that InnoDB is
        rebuilding the table to add the FTS_DOC_ID
        column.
      
mysql>CREATE FULLTEXT INDEX idx ON opening_lines(opening_line);Query OK, 0 rows affected, 1 warning (0.19 sec) Records: 0 Duplicates: 0 Warnings: 1 mysql>SHOW WARNINGS;+---------+------+--------------------------------------------------+ | Level | Code | Message | +---------+------+--------------------------------------------------+ | Warning | 124 | InnoDB rebuilding table to add column FTS_DOC_ID | +---------+------+--------------------------------------------------+
        The same warning is returned when using
        ALTER TABLE to add a full-text
        index to a table that does not have an
        FTS_DOC_ID column. If you create a full-text
        index at CREATE TABLE time and do
        not specify an FTS_DOC_ID column,
        InnoDB adds a hidden
        FTS_DOC_ID column, without warning.
      
        Defining an FTS_DOC_ID column at
        CREATE TABLE time is less
        expensive than creating a full-text index on a table that is
        already loaded with data. If an FTS_DOC_ID
        column is defined on a table prior to loading data, the table
        and its indexes do not have to be rebuilt to add the new column.
        If you are not concerned with CREATE FULLTEXT
        INDEX performance, leave out the
        FTS_DOC_ID column to have
        InnoDB create it for you.
        InnoDB creates a hidden
        FTS_DOC_ID column along with a unique index
        (FTS_DOC_ID_INDEX) on the
        FTS_DOC_ID column. If you want to create your
        own FTS_DOC_ID column, the column must be
        defined as BIGINT UNSIGNED NOT NULL and named
        FTS_DOC_ID (all uppercase), as in the
        following example:
      
          The FTS_DOC_ID column does not need to be
          defined as an AUTO_INCREMENT column, but
          doing so could make loading data easier.
        
mysql>CREATE TABLE opening_lines (FTS_DOC_ID BIGINT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,opening_line TEXT(500),author VARCHAR(200),title VARCHAR(200)) ENGINE=InnoDB;
        If you choose to define the FTS_DOC_ID column
        yourself, you are responsible for managing the column to avoid
        empty or duplicate values. FTS_DOC_ID values
        cannot be reused, which means FTS_DOC_ID
        values must be ever increasing.
      
        Optionally, you can create the required unique
        FTS_DOC_ID_INDEX (all uppercase) on the
        FTS_DOC_ID column.
      
mysql> CREATE UNIQUE INDEX FTS_DOC_ID_INDEX on opening_lines(FTS_DOC_ID);
        If you do not create the FTS_DOC_ID_INDEX,
        InnoDB creates it automatically.
      
          FTS_DOC_ID_INDEX cannot be defined as a
          descending index because the InnoDB SQL
          parser does not use descending indexes.
        
        The permitted gap between the largest used
        FTS_DOC_ID value and new
        FTS_DOC_ID value is 65535.
      
        To avoid rebuilding the table, the FTS_DOC_ID
        column is retained when dropping a full-text index.
      
        Deleting a record that has a full-text index column could result
        in numerous small deletions in the auxiliary index tables,
        making concurrent access to these tables a point of contention.
        To avoid this problem, the DOC_ID of a
        deleted document is logged in a special
        FTS_*_DELETED table whenever a record is
        deleted from an indexed table, and the indexed record remains in
        the full-text index. Before returning query results, information
        in the FTS_*_DELETED table is used to filter
        out deleted DOC_IDs. The benefit of this
        design is that deletions are fast and inexpensive. The drawback
        is that the size of the index is not immediately reduced after
        deleting records. To remove full-text index entries for deleted
        records, run OPTIMIZE TABLE on the indexed
        table with
        innodb_optimize_fulltext_only=ON
        to rebuild the full-text index. For more information, see
        Optimizing InnoDB Full-Text Indexes.
      
        InnoDB full-text indexes have special
        transaction handling characteristics due its caching and batch
        processing behavior. Specifically, updates and insertions on a
        full-text index are processed at transaction commit time, which
        means that a full-text search can only see committed data. The
        following example demonstrates this behavior. The full-text
        search only returns a result after the inserted lines are
        committed.
      
mysql>CREATE TABLE opening_lines (id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,opening_line TEXT(500),author VARCHAR(200),title VARCHAR(200),FULLTEXT idx (opening_line)) ENGINE=InnoDB;mysql>BEGIN;mysql>INSERT INTO opening_lines(opening_line,author,title) VALUES('Call me Ishmael.','Herman Melville','Moby-Dick'),('A screaming comes across the sky.','Thomas Pynchon','Gravity\'s Rainbow'),('I am an invisible man.','Ralph Ellison','Invisible Man'),('Where now? Who now? When now?','Samuel Beckett','The Unnamable'),('It was love at first sight.','Joseph Heller','Catch-22'),('All this happened, more or less.','Kurt Vonnegut','Slaughterhouse-Five'),('Mrs. Dalloway said she would buy the flowers herself.','Virginia Woolf','Mrs. Dalloway'),('It was a pleasure to burn.','Ray Bradbury','Fahrenheit 451');mysql>SELECT COUNT(*) FROM opening_lines WHERE MATCH(opening_line) AGAINST('Ishmael');+----------+ | COUNT(*) | +----------+ | 0 | +----------+ mysql>COMMIT;mysql>SELECT COUNT(*) FROM opening_lines->WHERE MATCH(opening_line) AGAINST('Ishmael');+----------+ | COUNT(*) | +----------+ | 1 | +----------+
        You can monitor and examine the special text-processing aspects
        of InnoDB full-text indexes by querying the
        following INFORMATION_SCHEMA tables:
      
        You can also view basic information for full-text indexes and
        tables by querying INNODB_INDEXES
        and INNODB_TABLES.
      
For more information, see Section 17.15.4, “InnoDB INFORMATION_SCHEMA FULLTEXT Index Tables”.