Content Management
There are several important areas that need to be considered when deciding to use the Data Export REST API:
- Additional load on operational servers.
- Volume of data to be exported.
- Capacity of the external repository, for both storage and network resources.
Server Load
Of primary importance is the stability of the server running the export request. (As a reminder, one export request can contain multiple request items, each of which exports data from one table.) Because export requests run in real time, they share the processing resources of the operational server, and priority is given to all other operational processes. The number of export request items running simultaneously is therefore limited by the system and is not user configurable. Several export requests can be submitted at the same time, but the server manages its internal workload so that no more than a fixed number of long-running database queries (currently 3) run at any one time.
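The effect of a fixed concurrency limit can be illustrated with a small sketch. This is not the server's implementation, which is not described here. It only assumes the limit of 3 quoted above, and the names run_items and MAX_CONCURRENT_ITEMS are hypothetical.

```python
import threading
import time

# Fixed limit quoted in the text; on the real server it is not user configurable.
MAX_CONCURRENT_ITEMS = 3


def run_items(item_names, work_seconds=0.01):
    """Run each item in its own thread, allowing at most MAX_CONCURRENT_ITEMS at once.

    Returns the highest number of items observed running at the same time.
    """
    slots = threading.BoundedSemaphore(MAX_CONCURRENT_ITEMS)
    lock = threading.Lock()
    running = 0
    peak = 0

    def worker(name):
        nonlocal running, peak
        with slots:  # blocks while all slots are in use
            with lock:
                running += 1
                peak = max(peak, running)
            time.sleep(work_seconds)  # stand-in for a long-running database query
            with lock:
                running -= 1

    threads = [threading.Thread(target=worker, args=(n,)) for n in item_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

Submitting ten items to run_items never results in more than three running together; the remaining items simply wait for a free slot, which is the behavior a client should expect when it submits many request items at once.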
Data Volumes
A common use case for the API is expected to be the incremental export of recently completed data such as orders, shipments, and invoices. The data volumes for these loads are expected to be large, and the initial export of the historical data will be larger still. The export request process is therefore designed to handle tens of millions of rows of data. Querying, transmitting, and storing this much information in a single session would be extremely complex and would require very specialized tools.
The Data Export API supports such large exports by limiting the following:
- The total number of rows for any one table.
- The total number of rows for any one exported file.
- The maximum size (in GB) of any one exported file.
- The maximum length of time transmitting the contents of any one file.
Each completed export request item will show the total number of rows matching the query criteria. If this value exceeds the system limit, then a new request should be submitted for the remainder (and so on until the complete results are exported). See Using Request Offset for details.
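As a rough sketch of the "submit a new request for the remainder" loop, the helper below computes which offsets the follow-up requests would need. It assumes the offset is a count of rows to skip and that each request returns at most row_limit rows. The function name and the numbers used are hypothetical, and the actual property semantics are covered in Using Request Offset.

```python
def remaining_offsets(total_rows, row_limit):
    """Return the offsets for the follow-up requests needed after the first request.

    total_rows: total number of rows matching the query criteria, as reported
                on the completed export request item.
    row_limit:  system limit on rows per table for one request (illustrative
                value supplied by the caller; the real limit is system defined).
    """
    if row_limit <= 0:
        raise ValueError("row_limit must be positive")
    # The first request covers rows [0, row_limit); each later request starts
    # where the previous one stopped.
    return list(range(row_limit, total_rows, row_limit))
```

With an illustrative total of 25 rows and a limit of 10, the result is [10, 20]: two further requests are needed. When the total does not exceed the limit, the list is empty and no follow-up request is required.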
If any of the other limits are exceeded, then the export will continue but will now produce multiple "parts". See Using Part Limits for details. To support multiple parts, the full "part name" for each part will contain the part number for that part within its "parent" item number.
The standard format for part names is:
DATA_<request id>_<table name>_<request item number>_<part number>.csv[.gz | .zip]
The appropriate "gz" or "zip" suffix is added if content compression is used.
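For example, a request with ID 12345 containing two request items, where the first item (SHIPMENT) is split into two parts, produces the files listed below. If any request item exceeds the row limit, a new request (specifying the request "offset" property) creates files with a new request ID.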
- DATA_12345_SHIPMENT_1_1.csv
- DATA_12345_SHIPMENT_1_2.csv
- DATA_12345_SHIPMENT_REFNUM_2_1.csv
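A client that collects exported files may need to build or take apart these part names. The following is a minimal sketch based only on the standard format shown above. The PartName type and both function names are hypothetical, and it assumes the request ID contains no underscore while the table name may (as in SHIPMENT_REFNUM).

```python
import re
from typing import NamedTuple, Optional


class PartName(NamedTuple):
    request_id: str
    table_name: str
    item_number: int
    part_number: int
    compression: Optional[str]  # "gz", "zip", or None when uncompressed


# The table name is matched greedily so that underscores inside it are kept;
# the last two numeric fields are always the item number and the part number.
_PART_RE = re.compile(
    r"^DATA_(?P<request_id>[^_]+)_(?P<table>.+)_(?P<item>\d+)_(?P<part>\d+)"
    r"\.csv(?:\.(?P<comp>gz|zip))?$"
)


def format_part_name(request_id, table_name, item_number, part_number,
                     compression=None):
    """Build a part name following the standard format."""
    suffix = f".{compression}" if compression else ""
    return f"DATA_{request_id}_{table_name}_{item_number}_{part_number}.csv{suffix}"


def parse_part_name(name):
    """Split a part name into its fields; raise ValueError if it does not match."""
    m = _PART_RE.match(name)
    if m is None:
        raise ValueError(f"not a recognized part name: {name!r}")
    return PartName(m["request_id"], m["table"], int(m["item"]),
                    int(m["part"]), m["comp"])
```

Parsing DATA_12345_SHIPMENT_REFNUM_2_1.csv with this sketch yields request ID 12345, table SHIPMENT_REFNUM, item 2, and part 1, so files can be grouped by item and ordered by part number before loading.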