Whether you are using a random or deterministic distribution strategy, it is strongly recommended that you use a timestamp format as the naming scheme for the update source data files.
This format ensures that Forge processes the files in the proper order of their creation.
For both strategies, a Perl expression in the record manipulator can use the timestamp part of the filename for the name of the output record file.
The recommended format is:
YYYYMMDDHHNNSS.ext
where YYYY is the four-digit year, MM is the two-digit month, DD is the two-digit day, HH is the two-digit hour, NN is the two-digit minute, and SS is the two-digit second, as in this example:
20051023161408.txt
These files may contain new records that are distributed randomly to the Agraph partitions.
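The ordering property of this naming scheme can be illustrated with a short Python sketch. It is not part of Forge; the helper name update_file_name and the sample dates are hypothetical. It only shows that fixed-width timestamps written from year down to second sort as plain strings in the same order as the times they encode.

```python
from datetime import datetime


def update_file_name(created: datetime, ext: str = "txt") -> str:
    """Build a YYYYMMDDHHNNSS.ext name (hypothetical helper, not a Forge API)."""
    # %M is the minute field, which the documentation writes as NN.
    return f"{created.strftime('%Y%m%d%H%M%S')}.{ext}"


# Sample creation times, deliberately listed out of order.
names = [
    update_file_name(datetime(2005, 10, 23, 16, 14, 8)),
    update_file_name(datetime(2005, 7, 17, 15, 14, 8)),
    update_file_name(datetime(2005, 10, 23, 9, 0, 0)),
]

# Every field is zero-padded and the most significant field comes first,
# so an ordinary string sort matches chronological order.
ordered = sorted(names)
print(ordered)
```

A name whose fields are not zero-padded (for example, a one-digit month) would break this property, which is why the format fixes the width of each field.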
For the deterministic strategy, add the partition number to the filename:
YYYYMMDDHHNNSS-partX.ext
where X is the number of the Agraph partition for which these records are intended. For example, records in this source data file are intended for partition 3:
20050717151408-part3.txt
The Perl expression in the record manipulator parses the filename for the partition number and uses it to assign new records to that partition.
For example, the output record file for this source data file could be named:
20050717151408-part3.records.xml
Keep in mind that if you pre-partition your baseline source files, you should also pre-partition the records to be added. That is, all ADD (or ADD_OR_REPLACE) records for the partition 0 Dgraph should be in one file, records for the partition 1 Dgraph should be in a second file, and so on.
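The filename parsing described above can be sketched in Python. This is an illustration only, not the Perl expression used in the record manipulator. The function names, the regular expression, and the rule that the output name is the source name with its extension replaced by .records.xml are assumptions inferred from the examples in this section.

```python
import re
from typing import Optional, Tuple

# Accepts both YYYYMMDDHHNNSS.ext and YYYYMMDDHHNNSS-partX.ext.
NAME_RE = re.compile(r"(?P<stamp>\d{14})(?:-part(?P<part>\d+))?\.(?P<ext>\w+)")


def parse_update_name(filename: str) -> Tuple[str, Optional[int]]:
    """Return (timestamp, partition); partition is None for random distribution."""
    m = NAME_RE.fullmatch(filename)
    if m is None:
        raise ValueError(f"not a timestamp-style update file name: {filename!r}")
    part = m.group("part")
    return m.group("stamp"), (int(part) if part is not None else None)


def output_record_name(filename: str) -> str:
    """Derive an output record file name (assumed '.records.xml' suffix)."""
    stamp, part = parse_update_name(filename)
    suffix = f"-part{part}" if part is not None else ""
    return f"{stamp}{suffix}.records.xml"


print(parse_update_name("20050717151408-part3.txt"))   # ('20050717151408', 3)
print(output_record_name("20050717151408-part3.txt"))  # 20050717151408-part3.records.xml
```

Rejecting names that do not match the pattern, rather than guessing a partition, keeps a misnamed file from being assigned to the wrong partition without anyone noticing.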