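Migrate MongoDB to Oracle AI Database with zero downtime

Learn how to use OCI GoldenGate to migrate MongoDB to Oracle AI Database with zero downtime.

OCI GoldenGate Big Data supports zero-downtime migration from MongoDB and MongoDB Atlas to the following targets: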

Oracle Autonomous AI JSON Database
Oracle Autonomous AI Database
Oracle AI Database

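Autonomous AI JSON Database and Autonomous AI Database provide preconfigured Oracle API for MongoDB connection strings, which OCI GoldenGate uses to connect to the target Oracle AI Database system. For details, see Using Oracle Database API for MongoDB.

If your target is an on-premises Oracle AI Database, you can use Oracle REST Data Services (ORDS) to enable Oracle Database API for MongoDB with your on-premises Oracle AI Database.

Before you begin

To complete this quickstart successfully, ensure that you have:

A MongoDB source, version 5 or later
An OCI GoldenGate Big Data deployment, version 23.7 or later
MongoDB Database Tools, including mongodump and mongorestore, installed, with their directory added to the PATH environment variable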
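
The PATH prerequisite can be checked before starting. The following is a minimal sketch in POSIX shell; `missing_tools` is a hypothetical helper name, not part of MongoDB Database Tools or OCI GoldenGate. It prints the name of each tool that cannot be found on PATH.

```shell
# Hypothetical helper: print the name of every listed tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    # command -v succeeds only when the shell can resolve the tool.
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For example, `missing_tools mongodump mongorestore bsondump` prints nothing when all three utilities used in this quickstart are available.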

Set up and run the migration

On the source MongoDB, run the MongoDB dump utility.

Run mongodump with the --oplog option to create a snapshot of the source MongoDB database:

```
$ ./bin/mongodump --uri="mongodb://localhost:27021" --oplog -v
```

The command returns the following:

```
<date+timestamp>    getting most recent oplog timestamp
<date+timestamp>    writing admin.system.version to dump/admin/system.version.bson
<date+timestamp>    done dumping admin.system.version (1 document)
<date+timestamp>    dumping up to 4 collections in parallel
<date+timestamp>    writing testDB.coll1 to dump/testDB/coll1.bson
<date+timestamp>    writing testDB.coll2 to dump/testDB/coll2.bson
<date+timestamp>    writing testDB.coll3 to dump/testDB/coll3.bson
<date+timestamp>    done dumping testDB.coll3 (10000 documents)
<date+timestamp>    done dumping testDB.coll1 (10000 documents)
<date+timestamp>    done dumping testDB.coll2 (10000 documents)
<date+timestamp>    writing captured oplog to
<date+timestamp>    dumped 7 oplog entries
```

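This generates a dump folder that contains binary archive data files for all databases and collections, in the following location:

dump/database-name/collection-name.bson

It also creates oplog.bson directly under the dump folder.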
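
Before moving on, it can help to confirm that the dump has the expected shape. The sketch below assumes the layout seen in the mongodump output above (for example, dump/testDB/coll1.bson, with oplog.bson at the top level). `check_dump` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: verify that oplog.bson exists and list collection data files.
# Usage: check_dump [dump-directory]   (defaults to ./dump)
check_dump() {
  dump_dir=${1:-dump}
  if [ ! -f "$dump_dir/oplog.bson" ]; then
    echo "missing $dump_dir/oplog.bson (was mongodump run with --oplog?)" >&2
    return 1
  fi
  # Collection data files sit one level down: <dump>/<database>/<collection>.bson
  for f in "$dump_dir"/*/*.bson; do
    if [ -e "$f" ]; then
      echo "$f"
    fi
  done
}
```

Running `check_dump dump` after mongodump finishes should list one .bson file per dumped collection, and it fails early if the oplog was not captured.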

To inspect the contents of the oplog.bson file, which is in binary format, you can use the bsondump utility to convert it to JSON:
```
$ ./bin/bsondump --pretty --outFile /path/to/oplog.json dump/oplog.bson
```
The command returns the following:
```
<date+timestamp>    7 objects found
```

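Extract the first and last operation timestamps from oplog.bson:
1. Download the OplogLSN.sh script.
2. On the source MongoDB, run the OplogLSN.sh script (located in the same directory as mongodump), passing the location of oplog.bson as an argument, as follows: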
```
$ ./oplogLSN.sh /path/to/dump/oplog.bson
```
  The command returns the following:
```
<date+timestamp> 1 objects found
First LSN: 1740663867.1
Last LSN: 1740663946.211
```
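3. Inspect the oplog entries. If any operations arrived during the dump, the oplog.bson file contains an entry for each of them, and each entry has a timestamp. You can capture the first and last operation timestamps with OplogLSN.sh, or convert the file to JSON for manual inspection, as shown earlier with bsondump.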
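
For the manual inspection route, the timestamps can also be pulled out of the bsondump JSON with standard text tools. The following is a rough sketch, not the OplogLSN.sh script, whose contents are not shown here. It assumes that bsondump (without --pretty) writes one JSON object per line, that each entry carries its timestamp as `"ts":{"$timestamp":{"t":<seconds>,"i":<increment>}}`, and that an LSN is written as `seconds.increment`. The function names are hypothetical.

```shell
# Hypothetical helper: print "seconds.increment" for each oplog entry read on stdin.
# Assumes one JSON object per line with ts encoded as {"$timestamp":{"t":..,"i":..}}.
oplog_timestamps() {
  grep -oE '"ts":[[:space:]]*\{"\$timestamp":[[:space:]]*\{"t":[[:space:]]*[0-9]+,[[:space:]]*"i":[[:space:]]*[0-9]+' \
    | sed -E 's/.*"t":[[:space:]]*([0-9]+),[[:space:]]*"i":[[:space:]]*([0-9]+)$/\1.\2/'
}

# Hypothetical helper: report the earliest and latest timestamps found on stdin.
first_last_lsn() {
  # Sort numerically by seconds, then by increment, so input order does not matter.
  sorted=$(oplog_timestamps | sort -t. -k1,1n -k2,2n)
  if [ -z "$sorted" ]; then
    echo "no oplog timestamps found" >&2
    return 1
  fi
  echo "First LSN: $(printf '%s\n' "$sorted" | head -n 1)"
  echo "Last LSN: $(printf '%s\n' "$sorted" | tail -n 1)"
}
```

A possible invocation is `bsondump dump/oplog.bson | first_last_lsn`. Treat the result as a cross-check against the values reported by OplogLSN.sh rather than as a replacement for them.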

Run the MongoDB restore utility. Use the mongorestore utility to restore selected collections from the dump to the target MongoDB instance:

```
$ ./mongorestore --uri="mongodb://localhost:27021" --nsInclude=testDB.coll1 --nsInclude=testDB.coll2 /path/to/dump -v
```

The command returns the following:

```
<date+timestamp>    using write concern: &{majority <nil> 0s}
<date+timestamp>    using default 'dump' directory
<date+timestamp>    preparing collections to restore from
<date+timestamp>    found collection admin.system.version bson to restore to admin.system.version
<date+timestamp>    found collection metadata from admin.system.version to restore to admin.system.version
<date+timestamp>    don't know what to do with file "dump/oplog.json", skipping...
<date+timestamp>    found collection testDB.coll1 bson to restore to testDB.coll1
<date+timestamp>    found collection metadata from testDB.coll1 to restore to testDB.coll1
<date+timestamp>    found collection testDB.coll2 bson to restore to testDB.coll2
<date+timestamp>    found collection metadata from testDB.coll2 to restore to testDB.coll2
<date+timestamp>    reading metadata for testDB.coll1 from dump/testDB/coll1.metadata.json
<date+timestamp>    reading metadata for testDB.coll2 from dump/testDB/coll2.metadata.json
<date+timestamp>    creating collection testDB.coll1 with no metadata
<date+timestamp>    creating collection testDB.coll2 with no metadata
<date+timestamp>    restoring testDB.coll1 from dump/testDB/coll1.bson
<date+timestamp>    restoring testDB.coll2 from dump/testDB/coll2.bson
<date+timestamp>    finished restoring testDB.coll1 (10000 documents, 0 failures)
<date+timestamp>    finished restoring testDB.coll2 (10000 documents, 0 failures)
<date+timestamp>    no indexes to restore for collection testDB.coll1
<date+timestamp>    no indexes to restore for collection testDB.coll2
<date+timestamp>    20000 document(s) restored successfully. 0 document(s) failed to restore.
```

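The command restores the data, metadata, and index definitions of the specified collections to the target MongoDB/ORDS instance. In the output above, for example, these are coll1 and coll2 in the database testDB.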
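
In a scripted migration, the final summary line of the restore log is a convenient gate before starting replication. The sketch below parses the summary line in the format shown in the output above. The function names are hypothetical, and the log is assumed to have been saved to a file or piped in.

```shell
# Hypothetical helper: print the failure count from the mongorestore summary line on stdin.
restore_failed_count() {
  sed -n -E 's/.* ([0-9]+) document\(s\) failed to restore\..*/\1/p' | tail -n 1
}

# Hypothetical helper: succeed only when a summary line was found and it reports zero failures.
restore_ok() {
  failed=$(restore_failed_count)
  [ -n "$failed" ] && [ "$failed" -eq 0 ]
}
```

For example, after capturing the log with `mongorestore ... 2> restore.log`, the check `restore_ok < restore.log` returns a nonzero status if any document failed to restore or if no summary line is present.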

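Create and run a Change Data Capture (CDC) Extract on the source Big Data deployment for MongoDB. Start the MongoDB CDC Extract from the first operation timestamp (First LSN) returned by the OplogLSN.sh script. This ensures that the CDC Extract captures the operations that occurred after the dump process started.
Create and run a MongoDB Replicat.
1. Use the CDC trail file generated by the CDC Extract.
2. Set oplogReplayLastLsn to the last operation timestamp (Last LSN) returned by the OplogLSN.sh script, or to the path to oplog.bson, in which case the last LSN is obtained automatically. This ensures that the Replicat runs in oplog-replay mode, which avoids collisions and guarantees a precise start with no data loss or duplication. After the last timestamp is processed, the Replicat continues in normal mode.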
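
The hand-off between the two modes can be pictured as a comparison of each operation's timestamp against the Last LSN. The sketch below only illustrates that comparison. It is not how GoldenGate implements it, and it assumes that an LSN has the form `seconds.increment` and that the boundary is inclusive. Both function names are hypothetical.

```shell
# lsn_le A B: succeed when LSN A (seconds.increment) is at or before LSN B.
# The two parts are compared as integers: 1740663946.3 comes BEFORE 1740663946.211.
lsn_le() {
  a_sec=${1%%.*}; a_inc=${1#*.}
  b_sec=${2%%.*}; b_inc=${2#*.}
  if [ "$a_sec" -ne "$b_sec" ]; then
    [ "$a_sec" -lt "$b_sec" ]
  else
    [ "$a_inc" -le "$b_inc" ]
  fi
}

# replay_mode OP_LSN LAST_LSN: name the mode that would apply to an operation (illustrative only).
replay_mode() {
  if lsn_le "$1" "$2"; then
    echo "oplog-replay"
  else
    echo "normal"
  fi
}
```

With the sample values from this quickstart, `replay_mode 1740663900.5 1740663946.211` prints `oplog-replay`, while `replay_mode 1740663947.1 1740663946.211` prints `normal`. Comparing the two parts separately matters because treating an LSN as a decimal number would order the increments incorrectly.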

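Tips

Here are some tips to help your migration go smoothly:

It is recommended to use mongodb-database-tools version 100.10.0 or lower.
Before running mongodump, clean up any existing dump folder to remove inconsistent data.
MongoDB restore: You can copy multiple collections of a database by using multiple --nsInclude options in a single mongorestore command. However, with ORDS, multiple databases cannot be restored by using multiple --nsInclude options. You must use multiple restore commands, one for each database.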
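
The one-command-per-database rule for ORDS targets lends itself to a small loop. The following sketch only prints the commands (a dry run) so that they can be reviewed first. `restore_commands` is a hypothetical helper, the URI and database names are placeholders, and the `db.*` pattern relies on the wildcard support of --nsInclude to select every collection in a database.

```shell
# Hypothetical helper: print one mongorestore command per database; nothing is executed.
# Usage: restore_commands <target-uri> <dump-dir> <database>...
restore_commands() {
  uri=$1
  dump_dir=$2
  shift 2
  for db in "$@"; do
    # One command per database, each selecting all of that database's collections.
    printf 'mongorestore --uri="%s" --nsInclude="%s.*" %s\n' "$uri" "$db" "$dump_dir"
  done
}
```

For example, `restore_commands "mongodb://localhost:27021" /path/to/dump testDB salesDB` prints two separate mongorestore commands, which can then be run one after the other.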
