21.14 Errors That Migrator Configuration Alone Can't Fix

Even if you configure the migrator to not consider any fields or their values to be outliers, the migrator can detect other kinds of problems. A simple example shows detection of data contradiction between different document sets.

Trying to migrate document sets can sometimes uncover other potential data problems, besides the existence of outlier fields, which means that the migrator can raise errors even when converting with zero values for configuration fields minFieldFrequency and minTypeFrequency. This can be another reason you might want to begin a migration to duality views by using the simplified migration recipe: default behavior except no outliers.

An example of this is shown here: two document sets that apparently contradict each other, making it impossible to reconcile them without some data changes. Each document set is coherent on its own, both structurally and in terms of field types, but the two sets don't fit together.

(The migrator can help you discover some data coherency problems such as this, even if you're not migrating any data!)

The data used here is the same as that presented in other migrator topics, except for this difference: a course document says that Natalie C. teaches course MATH101 and a teacher document says that Abdul J. teaches it. The data used in other topics has Abdul J. as the teacher in the course document as well; it has Natalie C. teaching no courses.

Example 21-38 Course MATH101 Document with Teacher Natalie C.

This is the input course document that has Natalie C. as the teacher of MATH101. All other course documents are as in the example "Course Document Set (Migrator Input)" in School Administration Example, Migrator Input Documents, and the teacher and student documents are all as before.

{"courseId"         : "MATH101",
 "name"             : "Algebra",
 "creditHours"      : 3,
 "students"         : [ {"studentId" : 1, "name" : "Donald P."},
                        {"studentId" : 5, "name" : "Hye E."} ],
 "teacher"          : {"teacherId" : 104, "name" : "Natalie C."},
 "Notes"            : "Prerequisite for Advanced Algebra"}
...
  • The teacher document for Natalie C. shows no coursesTaught, but the course document for MATH101 shows Natalie C. as the teacher.

  • The teacher document for Abdul J. shows that Abdul teaches both MATH101 and MATH102, but the course document for MATH101 shows Natalie C., not Abdul, as the teacher.

The importer succeeds, as before (with zero values for minFieldFrequency and minTypeFrequency). But DBMS_JSON_DUALITY.validate_import_report finds and reports those contradictions.

Example 21-39 VALIDATE_IMPORT_REPORT for Contradictory Document Sets

See Migrating To Duality, Simplified Recipe for the simplified recipe that we assume is followed here as well. All of the steps are the same; the only difference is that the course document for MATH101 used here is as shown in the preceding example, "Course MATH101 Document with Teacher Natalie C".

It's the import validation report for the teacher document set that reports the error; the documents for teachers Abdul J. and Natalie C. are reported.

SELECT * FROM DBMS_JSON_DUALITY.validate_import_report(
                                  table_name => 'TEACHER_TAB',
                                  view_name => 'TEACHER');
DATA
----
{"_id":101,"name":"Abdul J.","phoneNumber":["222-555-011","222-555-012"],"salary
":200000,"department":"Mathematics","coursesTaught":[{"courseId":"MATH101","name
":"Algebra","classType":"Online"},{"courseId":"MATH102","name":"Calculus","class
Type":"In-person"}],"studentsAdvised":[{"studentId":4,"name":"Georgia D.","dormI
d":203},{"studentId":7,"name":"Jatin S.","dormId":204},{"studentId":9,"name":"Lu
is F.","dormId":201},{"studentId":10,"name":"Ming L.","dormId":202}]}

{"_id":104,"name":"Natalie C.","phoneNumber":"222-555-044","salary":180000,"depa
rtment":"Computer Science","coursesTaught":[],"studentsAdvised":[]}

ERRORS
------
[{"op":"remove","path":"/coursesTaught/0"}]
[{"op":"replace","path":"/coursesTaught","value":[{"name":"Algebra","courseId":"
MATH101","classType":"Online"}]}]

Compare this example with the example "VALIDATE_IMPORT_REPORT for No Outlier Use Case" in Migrating To Duality, Simplified Recipe.

The import validation report suggests fixing the problem by removing MATH101 from Abdul's teacher document and adding it to Natalie's.

  • In the first error-log row returned, column data has Abdul's teacher document; in the second row, column data has Natalie's document.

  • Column errors in the first row (which corresponds to Adbul's document) says to remove the first element of array coursesTaught (array indexing is zero-based, so the path is /coursesTaught/0). That's this object, which describes course MATH101: {"courseId":"MATH101","name":"Algebra","classType":"Online"}.

    Column errors in the second row (which corresponds to Natalie's document) says to replace the empty array that's the value of field coursesTaught with this array: [{"name":"Algebra","courseId":"MATH101","classType":"Online"}].

The validation report suggests that one remedy. But an alternative fix would be to change the course document for MATH101 to fit the teacher documents: change its teacher to Abdul. Only you know, for your application, whether some particular data is an anomaly, according to your use of it, and only you know which ways to reconcile misfits are more appropriate.

If in fact that seemingly conflicted data is correct, then presumably the teacher and course documents should not share their teacher-course assignments. In that case, the remedy would be to use separate underlying tables in the teacher and course duality-view definitions.