Table 12 Cyrus - Input transformation process transparency framework

From: Automatic transparency evaluation for open knowledge extraction systems

Categories and their dimensions:

Model provenance information

1. Model title

2. A link or other access point to the model

3. Implementation information

   • Software/s used [15, 49]

   • Code [49, 67]

   • Documentation for the code [49]

   • If and when an algorithm is being employed [15]

4. High-level visualisation [49]

5. Model use

   • Primary intended use cases [42, 44, 68]

   • Out-of-scope use cases [42]

   • Primary intended users [42, 44, 49, 66]

   • Information on how to use the system [42, 66]

6. Model limitations [49, 63]

7. Paper or other resource for more information [42, 44]

8. Licensing information [42, 66]

9. The stakeholders^a [15, 38, 42, 44, 66]

10. A contact point [42, 66]

11. Model versions and dates [42, 44]

12. Metadata dates and versions [44]

13. Citation details [42]

14. Sources of funding [49, 63]
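
Because these provenance dimensions are meant to feed an automatic evaluation, the category lends itself to a machine-readable encoding. The following is a minimal sketch in Python; the `ModelProvenance` record and its field names are illustrative assumptions, not the paper's schema, with `None` marking an undocumented dimension.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class ModelProvenance:
    """Illustrative record mirroring the 'Model provenance information'
    dimensions above; None means the dimension is undocumented."""
    model_title: Optional[str] = None
    model_access_point: Optional[str] = None     # link or other access point
    implementation_info: Optional[str] = None    # software, code, documentation
    high_level_visualisation: Optional[str] = None
    model_use: Optional[str] = None              # intended uses, users, how-to
    model_limitations: Optional[str] = None
    further_resources: Optional[str] = None      # paper or other resource
    licensing_information: Optional[str] = None
    stakeholders: Optional[str] = None
    contact_point: Optional[str] = None
    versions_and_dates: Optional[str] = None
    metadata_versions_and_dates: Optional[str] = None
    citation_details: Optional[str] = None
    funding_sources: Optional[str] = None

    def documented(self) -> list[str]:
        """Names of the dimensions that have been filled in."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]
```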

Modelling information

1. Modelling method [42, 43, 44, 49, 62, 63, 65]

2. Information about the model output/s

   • Model output/s (model questions) [49, 63]

   • A link or other access point to the model output/s

   • A link or other access point to the outputs’ transparency information

3. Training dataset/s [15, 42, 44, 62, 67]

   • List of the training datasets [44]

   • Training dataset transparency information [44]

   • Dataset size information: Sample size [44, 63, 64], Rationale for the sample size [63]

   • Preprocessing techniques used [42]

4. Information about model input/s

   • Input data

   • Input data transparency information [44]

   • Dataset size information: Sample size [44, 63, 64], Rationale for the sample size [63]

   • Preprocessing techniques used [42]

   • Features or variables used in the algorithm [15]

   • Feature weights or regression coefficients [15, 43, 49, 63]

   • Modelling assumptions [15, 49]

   • Statistical analysis methods for attributes^b [42, 43, 63]

   • Other resources: list of other resources used as inputs for the system, and links or other access points to the provenance information for each resource

5. Information about model parameters

   • Model parameters and values [42, 49]

   • Model calibration (parameter estimation) [42, 49]

6. Model updates or adjustments [63, 64]

   • Any model updates or adjustments^c arising from the validation

   • Results from any model updating^d

7. Model evaluation [15, 42, 44, 49, 64]

   • Evaluation dataset/s [42, 44, 67]

   • Test and holdout data transparency information [44]

   • Dataset size information: Sample size [44, 63, 64], Rationale for the sample size [63]

   • Preprocessing techniques used [42]

   • Comparison between validation and development datasets [63, 64]

   • Methods used to evaluate model performance, e.g., cross-validation [64]

   • Performance measures results [15, 42, 43, 44, 49, 63, 64, 69]

   • Rationale for performance measures [44]

   • Benchmarking against standard datasets [44, 49]

   • Reliability analysis, e.g., baseline survival [64]

   • FAIR [57]

   • Third-party performance verifications [44]

   • Concept drift [44]

   • Interpretation of results [63, 64]

   • Whether the objectives are met given the results

   • Model limitations

   • Model generalisability

   • Sources of errors

8. Model explainability [42, 44, 62, 65, 66]

   • Explainability/interpretability approaches used [42, 44, 62, 66]

   • The target user of the explanation [44]

   • Any human validation of the explainability of the algorithms [44]
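
One plausible way to turn dimension lists such as the two categories above into an automatic score is per-category coverage: the fraction of dimensions for which documentation was found. The sketch below assumes the framework is stored as a plain mapping from category to dimension names; the abbreviated lists and the unweighted scoring rule are assumptions for illustration, not the paper's method.

```python
# Hypothetical, abbreviated encoding of the framework (full dimension lists elided).
FRAMEWORK: dict[str, list[str]] = {
    "model_provenance": ["model_title", "model_access_point", "licensing_information"],
    "modelling": ["modelling_method", "training_datasets",
                  "model_evaluation", "model_explainability"],
}

def coverage(found: dict[str, set[str]]) -> dict[str, float]:
    """Per-category transparency coverage: documented dimensions / all dimensions."""
    return {
        category: len(found.get(category, set()) & set(dims)) / len(dims)
        for category, dims in FRAMEWORK.items()
    }

# Example: documentation that mentions only some dimensions.
found = {
    "model_provenance": {"model_title", "licensing_information"},
    "modelling": {"modelling_method"},
}
print(coverage(found))  # approx {'model_provenance': 0.67, 'modelling': 0.25}
```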

Ethical information (bias/unfairness, privacy, security, risk-related factors, and conformance of the model/system)

1. Personal information

   • Personal information used [15]

   • Informed consent

   • Informed consent form

   • Options to withdraw personal data [44]

   • The withdrawal procedure [44]

2. Information about possible adverse outcomes [15, 38, 42, 43, 44, 62, 66, 68]

   • An analysis of possible adverse outcomes [63]

   • Impact of possible adverse outcomes

   • Unfairness and bias analysis

   • Sources of bias or unfairness [44]

   • Bias/unfairness measures

   • Remediation for possible adverse outcomes [38, 44, 68]

   • Remediation procedures

   • Value of bias estimates before and after remediation

   • Performance metric changes after remediation

3. Privacy and security management information [42, 44, 62]

   • Possible privacy and security weaknesses, i.e., ways the system can be attacked or abused

   • Privacy and security management approaches, e.g., ways to handle potential security breaches

4. Information about the filtered elements of a curated experience [15]

5. Potential conflicts of interests [49]
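
For the bias dimensions in this category ("Bias/unfairness measures" and "Value of bias estimates before and after remediation"), a transparency report might include a simple group metric. The sketch below uses demographic parity difference as an illustrative choice; the framework itself does not prescribe a particular measure, and the data here is invented.

```python
def demographic_parity_difference(preds: list[int], groups: list[int]) -> float:
    """Absolute gap in positive-prediction rate between groups 0 and 1."""
    def rate(g: int) -> float:
        members = [p for p, grp in zip(preds, groups) if grp == g]
        return sum(members) / len(members)
    return abs(rate(0) - rate(1))

groups = [0, 0, 0, 0, 1, 1, 1, 1]
before = demographic_parity_difference([1, 1, 1, 0, 0, 0, 0, 0], groups)
after = demographic_parity_difference([1, 1, 0, 0, 1, 0, 0, 0], groups)
print(f"bias before remediation: {before:.2f}, after: {after:.2f}")  # 0.75 -> 0.25
```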

Model review information

1. Plan for continuous monitoring [15, 38, 44, 68]

2. Retrospective analysis of disasters [38]

^a The owner/s, person, or organisation developing the model

^b Such as min, max, and median values at the top-10 and overall

^c Such as recalibration, predictor effects adjusted, or new predictors added

^d i.e., model specification, model performance
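
As a closing illustration of the "Plan for continuous monitoring" dimension (and the concept-drift dimension under model evaluation), a minimal post-deployment check could track accuracy over fixed windows and flag drops against the validation baseline. The window size, tolerance, and alerting rule here are assumptions, not a procedure from the paper.

```python
def rolling_accuracy(labels: list[int], preds: list[int],
                     window: int = 100) -> list[float]:
    """Accuracy over consecutive, non-overlapping windows of predictions."""
    return [
        sum(l == p for l, p in zip(labels[i:i + window], preds[i:i + window])) / window
        for i in range(0, len(labels) - window + 1, window)
    ]

def drift_alerts(acc_per_window: list[float], baseline: float,
                 tolerance: float = 0.05) -> list[int]:
    """Indices of windows whose accuracy fell more than `tolerance` below baseline."""
    return [i for i, acc in enumerate(acc_per_window) if baseline - acc > tolerance]
```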