Performing Data Conversion
This section provides overviews of the conversion process and referential integrity and discusses how to:
Map data for conversion.
Prepare data for conversion.
Verify converted data.
Use data dictionaries.
Use data load programs.
Keep systems in sync.
Estimate disk usage space.
See PeopleTools: Data Management
Here is a review of the scope of the conversion process, so that you can gain an understanding of the phases involved:
Decide how much data to convert.
You may choose to convert a different amount of data for each application. For example, you may want to convert 10 years' worth of student records data, two years of admissions data or two years of financial aid data.
Map data to PeopleSoft data fields.
To effectively plan a conversion, dig into the old system and identify that one obscure field on that one obscure table and figure out how it maps into the new PeopleSoft system. In some cases, it is difficult, and at times impossible, to map all the codes and tables from the legacy system directly to PeopleSoft tables. In addition, think about the impact that these legacy fields or codes have on reports generated by the PeopleSoft system.
Export the data, using a Structured Query Report (SQR) or another method to interim tables.
Use the interim tables to review and clean up the data before moving it to the PeopleSoft system. Corruption of data often is the result of data entry errors. You do not want to move corrupt data from the old system to the new PeopleSoft system.
Import the data using an SQR.
An important topic to consider and test for as part of the conversion effort is referential integrity. You can convert a lot of data into the correct tables, but the overall system may not have the necessary connections. If you use the normal posting processes in the various applications to convert most of the legacy data, you should not have referential integrity problems because those processes contain many of the necessary referential integrity rules.
An example in Student Financials is third-party contract information. If you link a third party or a transaction to a contract, the contract needs to exist. However, if the conversion neglects to convert contracts, then a referential integrity problem can occur.
When you map data, you are identifying legacy system data and redefining it in PeopleSoft record definitions. This step can be performed while setting the conversion timeline; however, your data mapping efforts will be more focused if the conversion time line has been set.
Issues to consider when mapping data include defining the amount of historical data to be converted, code validation, workflow, reporting, and other processes. The amount of historical data converted may vary across PeopleSoft applications. The complexity of how the legacy system stores history may contribute to less history mapped into the PeopleSoft system. With code validation, valid codes for the institution may have changed over time, and this too must be considered in the conversion effort.
To map data for conversion:
Define each legacy data element.
Determine if the legacy data element is pertinent to the implementation scope.
Map the legacy data element to the Campus Solutions data element.
If the legacy data element cannot be mapped directly within your Campus Solutions product line, determine if the legacy data should be tracked any more or if it could be tracked in another PeopleSoft product line, such as PeopleSoft Human Resources Management.
Reformat the legacy data to fit, for example, the Campus Solutions field size, format or data type.
If certain legacy data cannot be reformatted, indicate those as gaps in the gap analysis.
Spreadsheets are helpful tools when mapping data. At a minimum, these spreadsheets should include the legacy system data source (if data comes from more than one source), the legacy system data element and data type, the specific Campus Solutions or Contributor Relations record, and the specific field and data type in that record. Optional description fields might include related prompt tables, conversion translation values, and legacy system data position.
Another option is to build a mini-application using PeopleTools to map data. You can build Structured Query Language (SQL) views to link various PeopleTools tables together, such as PSRECFIELD and PSRECDEFN, which creates a robust method of identifying field attributes. You can create special reports to identify key fields, required fields, or legacy system data not yet mapped.
After you have identified the legacy data to be converted, you must prepare it for conversion. For example, when coming from the legacy system, data may be packed in EBCDIC format. These fields should be reformatted into a conversion-ready unpacked flat file in ASCII standard format. Usually, a COBOL program is written and run on the database server to create a file placed on the legacy system.
Conversion-ready data does not have to reside in a single file. The program may be designed to create multiple files during the formatting process, with each file only containing data pertinent to a single or related set of Campus Solutions or Contributor Relations destination tables. Smaller file sizes are more manageable and result in faster conversion processing. Also, the data preparation program itself may be broken into multiple programs based on the destination table.
One of the largest time-consuming tasks of the conversion project can be data cleanup. One way to minimize this task is to clean up the data before importing it into the PeopleSoft system. This can be done by using the data preparation and extract program. This program contains more logic to ensure data validity and therefore takes a longer time to program and test, but the program reduces processing time during the data cleanup.
Another option to reduce the data cleanup effort for the conversion team is to create reports for review by appropriate functional experts, technical experts, employees, and managers. These reports can be used for data verification and correction before the data is entered in the database. Correct invalid legacy data before preparing for conversion.
Not all data conversion must occur through background programs. You may determine that online application data entry is necessary for prompt tables and other required PeopleSoft tables for which there was no legacy data.
After data has been converted to the PeopleSoft system, you must verify the data and check its integrity. Two methods are online verification and batch verification. Data verification must be performed to clean up invalid data that may not have been keyed, required, or validated in the preparation or conversion processes. This process can also identify and clean up duplicates in the legacy system before you finish converting and mapping the data to the new PeopleSoft tables.
You can write batch programs or SQL scripts to check for parent-child relationships and other validation checking. Writing the batch programs requires more time commitment but can quickly check more records.
Online data verification is less exact than the background process, because records are randomly accessed and require a longer amount of time. However, online data entry can be used to enter nonrequired data or supplemental data that was not available from the legacy system.
PeopleSoft does not issue a data dictionary as part of its documentation. PeopleTools provides the ability for online documentation of page and field relationships. The development team documents its material as much as is feasible online in PeopleTools. You can write an SQR that provides all the data dictionary information that you need.
After data preparation, design conversion programs that extract the appropriate file layout, to transfer the legacy data to Campus Solutions tables.
Often the input format for the conversion program is the flat file created by the data extract program on the legacy system and transferred—by using File Transfer Protocol—to the database server. Consider loading the data into temporary tables within the PeopleSoft system instead of directly into the "live" tables. By performing this task, you can use the tools already available to you to both clean up and properly convert data. The temporary tables would then be the input for the conversion program.
Conversion Program and Program Language
The conversion program transfers the legacy data from the input format to Campus Solutions tables, including reformatting, validating, and prompt table translating not encompassed in the data preparation programs. The conversion program must include defaults for Campus Solutions record keys and required fields whenever legacy data has no data map.
The two most commonly used languages for conversions are COBOL and SQR. Another option is a specific database management system (DBMS) import utility, if one exists—such as the Oracle SQL Loader. Determinants for making this decision include your skill set. For example, if you already know COBOL, there is no associated learning curve downtime, or increased resources if you use COBOL. Regardless of your skill set, consider if it would be advantageous to learn a language that is likely be used in the future for Campus Solutions modifications, conversion file size, level of translations and edits, and DBMS capability.
Field or Control
Has a more robust debugger, can process a larger number of records faster, and is more widely recognized in the industry.
Is more commonly used in reporting and minor modifications. In most cases, SQR code can be written in less time than COBOL code.
|DBMS Import Utility||
Requires the least conversion effort but is limited to no application-specific edits or translations.
Table Loading Sequence
Conversion programming may include a single program that incorporates all pertinent tables and their sequence or multiple programs that incorporate single tables or a few related tables. Multiple programs can run simultaneously—increasing efficiency and enabling the allocation of programming responsibility.
Errors and Exception Handling
Regardless of the number of conversion programs, there must be adequate and standard error and exception handling. Reports of database action failures, validation logic failures, and pertinent details help ensure a quality and thorough conversion.
In addition to errors and exceptions, totals controls also help in quality control. Totals controls include the number of records processed on both the source and destination sides, and dollar tallies whenever possible are pertinent for Financial Aid and Student Financials applications.
You can find out what fields are required for any table or page by searching PeopleSoft Application Designer in PeopleTools. You can also write an SQR to list all required data elements from tables that you want to use in the conversion. It is important that you also read and understand the PeopleCode. There are PeopleCode restrictions within records that can mean that in certain circumstances a field is required or a default value is required. Without understanding the PeopleCode, incorrect data could be entered.
Warning! If a field is not required, do not assume that you do not need to convert data into that field. It is recommended that when in doubt, you review the PeopleCode to determine whether you need data or default values for a particular field.
Note: Refer to individual discussions regarding each application for specific data conversion and data loading tips for applications within Campus Solutions.
Depending on the implementation plan, there are three scenarios for moving data processing from a legacy system to a PeopleSoft system. The first scenario is to cut over everything at once. When all students are fully implemented, you perform a cutover to the PeopleSoft system. Until that point, the legacy system is fully operational. No system synchronization is required.
The second scenario is the mutually exclusive implementation approach. For example, when a student is converted to an application in Campus Solutions, such as Financial Aid, the student is officially taken off of the legacy system's processing in that application. All of the student's other application processing—such as Student Records—is performed on the legacy system. There are two production systems. This approach increases the complexity of joining two data sources for interfaces and reports.
The third scenario is the parallel system approach. An alternative to having two production systems is to test each Campus Solutions application as necessary, but completely delete the application database and rerun a full conversion with the most up-to-date data from the legacy system until full cutover is achieved after full testing. This approach implies that the legacy system is in full use until cutover, but no new programs must be written—for example, the conversion program is reused and all reports continue to be produced from the legacy system. A variance of this approach is to write new data refresh programs that periodically update the various applications with student information entered in the legacy system. This variance provides for more up-to-date information for more accurate reporting and testing, but also implies the additional effort to program the refresh routine. Care must be taken in the refresh routine design to ensure that data in the PeopleSoft system is not duplicated or overlaid by the legacy data.
For Campus Solutions, PeopleSoft has done some preliminary research on the sizing of a production system. PeopleSoft realizes that the theoretical amount of disk space required by an application is different than the actual amount of disk space required by a database. The difference is due to comparing bytes with database blocks. For instance, if a particular table's average row length is 25 bytes and there are 1,000 rows in the table, theoretically the required amount of space is 25,000. However, if the database block size is 4K (4096), the number of database blocks required to hold the data is seven blocks or 28,673. Therefore, the actual amount of disk space required is greater than the theoretical. With this in mind and realizing that each installation may require a different block size, PeopleSoft states the size in theoretical terms.
For 1,000 student prospects, the amount of space needed for tables and indexes would be 40 megabytes (MB).
For enrollment, PeopleSoft needs to work with a specific, albeit theoretical, scenario. PeopleSoft's scenario was to enroll 1,000 undergraduate students over three terms into four classes per term. In the first term, there are no prerequisites to enroll in any of classes; the second term has one prerequisite for each of the classes; the third term has two prerequisites for each of the classes. During enrollment, the students have no time conflicts and there is no lack of facilities (rooms or seats). The amount of disk space excludes description information stored in long text fields. For 1,000 student enrollments, the amount of space needed for tables and indexes would be 20 MB.