furness_pandas_wrapper

caf.distribute.furness.furness_pandas_wrapper(seed_values, row_targets, col_targets, *, max_iters=2000, seed_infill=0.001, normalise_seeds=True, tol=1e-09, idx_col='model_zone_id', unique_col='trips', round_dp=8, unique_zones=None, unique_zones_join_fn=<built-in function and_>)

Create wrapper around doubly_constrained_furness() to handle pandas in/out.

Internally checks the pandas inputs and converts them into numpy in order to run doubly_constrained_furness(), then converts the output back into pandas at the end.

Parameters:
  • seed_values (DataFrame) – The seed values to use for the furness. The index and columns must match the idx_col of row_targets and col_targets.

  • row_targets (DataFrame) – The target values for the sum of each row. In production/attraction furnessing, this would be the productions. The idx_col must match the idx_col of col_targets.

  • col_targets (DataFrame) – The target values for the sum of each column. In production/attraction furnessing, this would be the attractions. The idx_col must match the idx_col of row_targets.

  • max_iters (int) – The maximum number of iterations to complete before exiting.

  • tol (float) – The maximum difference between the achieved and the target values to tolerate before exiting early. The difference is measured as the root mean squared error (RMSE), which is returned as achieved_rmse.

  • seed_infill (float) – The value to infill any seed values that are 0.

  • normalise_seeds (bool) – Whether to normalise the seeds so they total to one before sending them to the furness.

  • idx_col (str) – Name of the column in row_targets and col_targets that contains the index data matching the index/columns of seed_values.

  • unique_col (str) – Name of the column in row_targets and col_targets that contains the values to target during the furness.

  • round_dp (int) – The number of decimal places to round the output values of the furness to. Uses 8 by default.

  • unique_zones (list[int] | None) – A list of unique zones to keep in the seed matrix when starting the furness. The given productions and attractions are also limited to these zones.

  • unique_zones_join_fn (Callable) – The function to call on the column and index masks to join them for the seed matrices. By default, a bitwise and (operator.and_) is used. See Python's built-in operator module for more options.
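The effect of unique_zones_join_fn can be illustrated with a small standalone sketch. It does not use caf.distribute; the helper name build_seed_mask and the way the masks are built (zone membership in unique_zones, with the same zone list on the index and the columns) are assumptions made for illustration only. It shows how joining the index and column masks with operator.and_ keeps only cells where both zones are listed, while operator.or_ keeps cells where either zone is listed.

```python
import operator


def build_seed_mask(zones, unique_zones, join_fn=operator.and_):
    """Hypothetical helper: join row and column membership masks cell by cell."""
    # True where the zone is one of the zones to keep.
    keep = [zone in unique_zones for zone in zones]
    # Apply the join function to every (row mask, column mask) pair.
    return [[join_fn(row_keep, col_keep) for col_keep in keep] for row_keep in keep]


zones = [1, 2, 3]
and_mask = build_seed_mask(zones, unique_zones=[1, 2])
or_mask = build_seed_mask(zones, unique_zones=[1, 2], join_fn=operator.or_)
```

With the "and" join, the cell for zones (1, 3) is dropped because zone 3 is not listed; with the "or" join the same cell is kept because zone 1 is listed.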

Returns:

  • furnessed_matrix – The final furnessed matrix, in the same format as seed_values.

  • completed_iters – The number of completed iterations before exiting.

  • achieved_rmse – The Root Mean Squared Error difference achieved before exiting.

Return type:

tuple[DataFrame, int, float]
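
To make the roles of max_iters, tol, seed_infill and the three return values more concrete, the following is a minimal pure-Python sketch of a doubly constrained furness (iterative proportional fitting). It is not the caf.distribute implementation: the function name furness_sketch, the order of the scaling steps, and the exact RMSE definition (over all row and column residuals) are assumptions, and seed normalisation, rounding and the pandas conversion are left out.

```python
import math


def furness_sketch(seed, row_targets, col_targets,
                   max_iters=2000, tol=1e-9, seed_infill=0.001):
    """Illustrative furness on nested lists; returns (matrix, iters, rmse)."""
    # Infill zero seeds so those cells can still be scaled towards the targets.
    mat = [[v if v != 0 else seed_infill for v in row] for row in seed]
    n_rows, n_cols = len(mat), len(mat[0])
    completed_iters, rmse = 0, math.inf

    for completed_iters in range(1, max_iters + 1):
        # Scale each column so that it sums to its column target.
        for j in range(n_cols):
            col_sum = sum(mat[i][j] for i in range(n_rows))
            factor = col_targets[j] / col_sum if col_sum > 0 else 0.0
            for i in range(n_rows):
                mat[i][j] *= factor

        # Scale each row so that it sums to its row target.
        for i in range(n_rows):
            row_sum = sum(mat[i])
            factor = row_targets[i] / row_sum if row_sum > 0 else 0.0
            mat[i] = [v * factor for v in mat[i]]

        # Assumed error measure: RMSE over all row and column residuals.
        residuals = [sum(mat[i]) - row_targets[i] for i in range(n_rows)]
        residuals += [sum(mat[i][j] for i in range(n_rows)) - col_targets[j]
                      for j in range(n_cols)]
        rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
        if rmse < tol:
            break  # exit early once the targets are matched closely enough

    return mat, completed_iters, rmse


matrix, iters, rmse = furness_sketch(
    [[1.0, 2.0], [3.0, 4.0]], row_targets=[30.0, 70.0], col_targets=[40.0, 60.0]
)
```

The row and column targets in this example both total 100; a furness can only match both sets of targets when their totals agree.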