Using Distribution Fitting for Assumptions

When historical data is available, you can use distribution fitting to select an appropriate distribution when defining an assumption. For an overview of distribution fitting, see Fitting Distributions to Historical Data.

  To use distribution fitting when creating or editing an assumption:

  1. Select the cell where you want to create an assumption.

    It can be blank or contain a simple value, not a formula.

  2. Click the lower half of the Define Assumption icon.

  3. Select Fit Distribution to specify the source of the data to be fitted.

    Note:

    You can also click the upper half of the Define Assumption icon and select Fit in the Distribution Gallery.

    The Fit Distribution dialog opens.

  4. Select a data location.

    • If the historical data is in a worksheet in the active workbook, select Range, and then enter the data’s cell range. If the range has a name, you can enter the name, preceded by an = sign.

    • If the historical data is in a separate text file, click Text File, and then either enter the path and name of the file or click Browse to search for the file. If you want, you can select Column and enter the number of columns in the text file.

      When you use a file as the source of data, each data value in the file must be separated by either a comma, a tab character, a space character, or a list separator defined in Windows’ Regional and Language Options panel. If actual values in the file contain commas or the designated list separator, those values must be enclosed in quotation marks. Allowable formats for values are identical to those allowed within the assumption parameter dialog, including date, time, currency, and numbers.
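As an illustration of the file layout described above, the following minimal Python sketch reads delimited values in which a field containing the separator is enclosed in quotation marks. It does not show how Crystal Ball itself reads the file. The function name read_values and the sample text are hypothetical, and the sketch assumes the only special format is a comma used as a thousands separator; dates, times, and currency symbols are not handled.

```python
import csv
import io

def read_values(text, delimiter=","):
    """Parse delimited numeric values, honoring fields in quotation marks."""
    values = []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    for row in reader:
        for field in row:
            field = field.strip()
            if not field:
                continue  # skip empty fields, e.g. from a trailing separator
            # Assumption: a comma inside a quoted field is a thousands separator.
            values.append(float(field.replace(",", "")))
    return values

# Hypothetical file contents: two values contain commas, so they are quoted.
sample = '12.5,"1,250.75",8\n3,4.25,"2,000"\n'
print(read_values(sample))  # [12.5, 1250.75, 8.0, 3.0, 4.25, 2000.0]
```

Passing delimiter="\t" or delimiter=" " covers the tab-separated and space-separated cases in the same way.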

  5. Specify which distributions are to be fitted:

    • AutoSelect performs a basic analysis of the data to select a distribution fitting option and ranking method. If the data includes only integers, fitting to all discrete distributions (with the exception of Yes-No) is completed using the Chi-square ranking statistic choice.

    • All Continuous fits the data to all of the built-in continuous distributions (these distributions are displayed as solid shapes on the Distribution Gallery).

    • All Discrete fits the data to all discrete distributions except Yes-No and uses the Chi-square ranking statistic.

    • Choose displays another dialog where you can select a subset of the distributions to include in the fitting.

    • The final setting selects the distribution that was highlighted on the Distribution Gallery when you clicked the Fit button.

      If you try to fit negative data to a distribution that can only accept positive data, that distribution will not be fitted to the data.
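Two of the rules in this step can be sketched in Python: the check for integer-only data, and the exclusion of distributions that accept only positive data when the data contains negative values. This is an illustrative sketch, not Crystal Ball's implementation. The function names and the candidate list, including which distributions are flagged as positive-only, are assumptions made for the example.

```python
def is_all_integers(data):
    """Return True when every value is a whole number."""
    return all(float(x).is_integer() for x in data)

def fittable(candidates, data):
    """Drop positive-only candidates when the data contains negative values."""
    has_negative = min(data) < 0
    return [name for name, positive_only in candidates
            if not (positive_only and has_negative)]

# Hypothetical candidate list: (name, accepts only positive data?)
candidates = [("Normal", False), ("Lognormal", True), ("Logistic", False)]

data = [-1.5, 0.2, 3.1, 4.8]          # made-up values with one negative
print(fittable(candidates, data))     # ['Normal', 'Logistic']
print(is_all_integers([1, 2, 3.0]))   # True
```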

  6. Specify how the distributions should be ranked.

    In ranking the distributions, you can use any one of three standard goodness-of-fit tests:

    • Anderson-Darling. This method closely resembles the Kolmogorov-Smirnov method, except that it weights the differences between the two distributions more heavily at their tails than at their mid-ranges. This weighting of the tails helps to correct the Kolmogorov-Smirnov method’s tendency to over-emphasize discrepancies in the central region.

    • Kolmogorov-Smirnov. The result of this test is essentially the largest vertical distance between the two cumulative distributions.

    • Chi-Square. This test is the oldest and most common of the goodness-of-fit tests. It gauges the general accuracy of the fit. The test breaks down the distribution into areas of equal probability and compares the data points within each area to the number of expected data points. The chi-square test in Crystal Ball does not use the associated p-value the way other statistical tests (e.g., t or F) do.

    The first setting, AutoSelect, selects the ranking statistic automatically based on several factors. If all data values are integers, Chi-Square is selected.
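To make the three statistics concrete, the sketch below computes textbook versions of each for a normal distribution fitted to a small sample, using only the Python standard library. It is not Crystal Ball's implementation: the sample values are made up, fitting by sample mean and standard deviation is an assumption, and the number of equal-probability bins (5) is chosen arbitrarily for the example.

```python
import math
from statistics import NormalDist

def ks_statistic(data, dist):
    """Largest vertical distance between the empirical and fitted CDFs."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # The empirical CDF steps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ad_statistic(data, dist):
    """Anderson-Darling A^2; the log terms give extra weight to the tails."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i in range(1, n + 1):
        f_low = dist.cdf(xs[i - 1])   # i-th smallest value
        f_high = dist.cdf(xs[n - i])  # i-th largest value
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n

def chi_square_statistic(data, dist, bins=5):
    """Chi-square over bins that have equal probability under dist."""
    n = len(data)
    edges = [dist.inv_cdf(k / bins) for k in range(1, bins)]
    counts = [0] * bins
    for x in data:
        counts[sum(1 for e in edges if x > e)] += 1
    expected = n / bins  # equal-probability bins share one expected count
    return sum((c - expected) ** 2 / expected for c in counts)

# Hypothetical historical data, fitted by sample mean and standard deviation.
sample = [9.8, 10.4, 10.1, 9.5, 10.9, 10.0, 9.7, 10.3, 10.6, 9.9]
fitted = NormalDist.from_samples(sample)

print("Kolmogorov-Smirnov:", ks_statistic(sample, fitted))
print("Anderson-Darling:  ", ad_statistic(sample, fitted))
print("Chi-Square:        ", chi_square_statistic(sample, fitted))
```

For all three statistics, a smaller value indicates a closer fit, so candidate distributions can be ranked by sorting on the chosen statistic in ascending order.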

  7. Optional: If you know the data corresponds to certain shape, location, or other special parameter values for some distributions, select Lock parameters and enter appropriate values in the Lock Parameters dialog (see Locking Parameters When Fitting Distributions).

  8. Optional: By default, only values for the selected ranking statistic are displayed in the Comparison Chart dialog. To show values for all three statistics, select Show All Goodness-of-fit Statistics at the bottom of the Fit Distribution dialog.

  9. Optional: To filter data for fitting by excluding or including certain value ranges, select Filter data (see Filtering Values When Fitting Distributions).

  10. Click OK.

    The Comparison Chart opens (see Confirming the Fitted Distribution).