It is important to understand the following security considerations when you give application access to administrators and users:
- The CIC administrator role is very powerful and therefore must be granted judiciously.
The CIC administrator role grants access to the CIC Administration application, which in turn gives CIC administrators access to the ML Workbench page. On the ML Workbench page, administrators can explore the models to be trained or retrained and the feature selections that can be made. When a model is retrained, any new data added to the training set could cause the current predictions to change. Therefore, access to the administration application and the ML Workbench page should be restricted.
Note: CIC users, however, have no access to the ML models, the model code, or the data used for training or testing. They also cannot change the actual models.
- Administrators should be cautious of input poisoning.
Data used in training shapes future predictions. Malicious or bad data can lead to bad future predictions. CIC administrators should be aware of which projects are opted into the system and which of those projects are used for training the models, because that selection affects prediction accuracy. Oracle recommends that you use Separation of Duty controls to ensure that those choosing the projects for CIC, which will also be used for training, opt in their target data appropriately. As with other Primavera applications, bad or misleading source data can affect outputs. CIC is delivered with multiple out-of-the-box (OOTB) Seed Models, which are trained with sample data. These are not the ideal models to use, but they give your organization a good starting point for enabling the system and for seeing a first round of predictions while you learn how to train with your data.
- Irrelevant features can precipitate confounding and spurious correlations.
It is important to understand how certain features affect your predictions and how your data is reflected in the feature set. For example, if your organization has no cost data, make sure that no cost features are selected. For a basic implementation, you can choose SeedModel customerData, which uses the Seed Model features with your data. Therefore, select only the features that are relevant to your data.
Note: No PII is used in training data.
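The feature-relevance check described above can be sketched outside of CIC. The following Python example is only an illustration under stated assumptions: the field names (`planned_duration`, `total_cost`, `activity_count`) and the helper `uninformative_features` are hypothetical and are not part of CIC or the ML Workbench. It flags features that are missing or constant across an export of your project data, which are candidates to leave unselected.

```python
def uninformative_features(rows, feature_names):
    """Return the features that are missing or constant across all rows.

    Hypothetical helper for illustration only; not a CIC API.
    """
    flagged = []
    for name in feature_names:
        # Collect the distinct non-missing values seen for this feature.
        values = {row.get(name) for row in rows}
        values.discard(None)
        # No values, or a single repeated value, carries no signal.
        if len(values) <= 1:
            flagged.append(name)
    return flagged


# Assumed sample export: an organization that does not record costs.
projects = [
    {"planned_duration": 120, "total_cost": None, "activity_count": 340},
    {"planned_duration": 95, "total_cost": None, "activity_count": 210},
    {"planned_duration": 200, "total_cost": None, "activity_count": 515},
]
features = ["planned_duration", "total_cost", "activity_count"]

print(uninformative_features(projects, features))  # prints ['total_cost']
```

In this sample, only the cost feature is flagged, which matches the guidance to deselect cost features when your data contains no costs.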
- Data used in training is not visible if the user does not have access to that data.
However, if predictions are consistently poor, it may be possible to infer that the training data is negatively skewed. For example, if all projects are predicted to be significantly delayed, the predictions are likely skewed. The models continuously learn and adapt based on the data fed into the system. Therefore, Oracle recommends keeping the CIC Administrator role restricted and selective to ensure that non-admin CIC users cannot see the projects being used for training. CIC users can see only the predictions and the data they have access to.
Note: CIC users cannot see data they do not have access to in the source systems. They also cannot see or access any of the training data in CIC.
- Administrators should be aware of model robustness attacks.
A malicious user may be able to cause bad predictions by modifying the associated input data imperceptibly, and with plausible deniability. If the source application data is manipulated in a way that skews the predictions, that skew can be reflected in CIC predictions as well. For example, if you select for CIC only projects that are far behind schedule and excessively over budget, the predictions will likely skew in a similar direction. Therefore, Oracle recommends keeping access to the CIC Administrator role restricted and selective.
Note: At no point are the models exposed through any user interface that would enable an organization to change, access, or inject malicious adjustments into the models.
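The selection skew described above (for example, opting in only projects that are far behind schedule) can be checked before training with a simple review step. The following Python sketch is an assumption-laden illustration, not a CIC feature: the `delay_days` field, the 30-day delay threshold, and the 75% cutoff are hypothetical values that you would replace with your own.

```python
def delayed_share(projects, delay_threshold_days=30):
    """Return the fraction of projects delayed beyond the threshold.

    Hypothetical helper; `delay_days` is an assumed field name.
    """
    if not projects:
        raise ValueError("no projects selected")
    delayed = sum(1 for p in projects if p["delay_days"] > delay_threshold_days)
    return delayed / len(projects)


def is_skewed(projects, delay_threshold_days=30, max_share=0.75):
    """Flag a selection in which delayed projects dominate.

    The 0.75 cutoff is an arbitrary illustrative value.
    """
    return delayed_share(projects, delay_threshold_days) > max_share


# Assumed sample: four of the five opted-in projects are badly delayed.
selected = [
    {"name": "A", "delay_days": 45},
    {"name": "B", "delay_days": 60},
    {"name": "C", "delay_days": 90},
    {"name": "D", "delay_days": 120},
    {"name": "E", "delay_days": 10},
]

print(delayed_share(selected))  # prints 0.8
print(is_skewed(selected))      # prints True
```

A check of this kind does not detect subtle manipulation of the source data, but it can help a reviewer notice when the opted-in projects are unrepresentative before they are used for training.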