furness_pandas_wrapper
- caf.distribute.furness.furness_pandas_wrapper(seed_values, row_targets, col_targets, *, max_iters=2000, seed_infill=0.001, normalise_seeds=True, tol=1e-09, idx_col='model_zone_id', unique_col='trips', round_dp=8, unique_zones=None, unique_zones_join_fn=<built-in function and_>)
Create wrapper around doubly_constrained_furness() to handle pandas in/out.
Internally checks the pandas inputs and converts them into numpy arrays in order to run doubly_constrained_furness(), then converts the output back into pandas at the end.
- Parameters:
seed_values (DataFrame) – The seed values to use for the furness. The index and columns must match the idx_col of row_targets and col_targets.
row_targets (DataFrame) – The target values for the sum of each row. In production/attraction furnessing, this would be the productions. The idx_col must match the idx_col of col_targets.
col_targets (DataFrame) – The target values for the sum of each column. In production/attraction furnessing, this would be the attractions. The idx_col must match the idx_col of row_targets.
max_iters (int) – The maximum number of iterations to complete before exiting.
tol (float) – The maximum difference between the achieved and the target values to tolerate before exiting early. The difference is measured as a root mean squared error, reported as achieved_rmse in the return values.
seed_infill (float) – The value used to replace any seed values that are 0.
normalise_seeds (bool) – Whether to normalise the seeds so they total to one before sending them to the furness.
idx_col (str) – Name of the column in row_targets and col_targets that contains the index data matching the index/columns of seed_values.
unique_col (str) – Name of the column in row_targets and col_targets that contains the values to target during the furness.
round_dp (int) – The number of decimal places to round the output values of the furness to. Uses 8 by default.
unique_zones (list[int] | None) – A list of unique zones to keep in the seed matrix when starting the furness. The given productions and attractions will be limited to these zones as well.
unique_zones_join_fn (Callable) – The function to call on the column and index masks to join them for the seed matrices. By default, a bitwise and (operator.and_) is used. See Python's built-in operator module for more options.
- Returns:
furnessed_matrix – The final furnessed matrix, in the same format as seed_values
completed_iters – The number of completed iterations before exiting
achieved_rmse – The root mean squared error between the achieved and target values when the furness exited
- Return type:
tuple[DataFrame, int, float]
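The pandas-in, numpy-inside, pandas-out flow described above can be illustrated with a small stand-alone sketch. This is not the caf.distribute implementation: furness_sketch is a hypothetical helper written for illustration, the iteration is a plain iterative proportional fit, and the way the RMSE is computed here is an assumption. It also assumes strictly positive targets whose row and column totals agree. Only the parameter names and defaults are taken from the signature above.

```python
import numpy as np
import pandas as pd


def furness_sketch(seed_values, row_targets, col_targets, *, max_iters=2000,
                   seed_infill=0.001, tol=1e-9, idx_col="model_zone_id",
                   unique_col="trips", round_dp=8):
    """Hypothetical stand-in showing the pandas -> numpy -> pandas flow."""
    # Align the targets to the seed's index/columns using idx_col.
    rows = (row_targets.set_index(idx_col)[unique_col]
            .reindex(seed_values.index).to_numpy(dtype=float))
    cols = (col_targets.set_index(idx_col)[unique_col]
            .reindex(seed_values.columns).to_numpy(dtype=float))

    # Convert the seed to numpy, infill zeros, and normalise to total one.
    mat = seed_values.to_numpy(dtype=float)
    mat = np.where(mat == 0, seed_infill, mat)
    mat = mat / mat.sum()

    rmse = np.inf
    completed = 0
    for completed in range(1, max_iters + 1):
        # Scale rows to the row targets, then columns to the column targets.
        mat = mat * (rows / mat.sum(axis=1))[:, None]
        mat = mat * (cols / mat.sum(axis=0))[None, :]

        # Assumed convergence measure: RMSE over all row and column totals.
        row_diff = rows - mat.sum(axis=1)
        col_diff = cols - mat.sum(axis=0)
        n_vals = len(rows) + len(cols)
        rmse = float(np.sqrt((np.sum(row_diff ** 2) + np.sum(col_diff ** 2)) / n_vals))
        if rmse < tol:
            break

    # Convert back to pandas in the same layout as seed_values.
    out = pd.DataFrame(mat, index=seed_values.index, columns=seed_values.columns)
    return out.round(round_dp), completed, rmse


zones = [1, 2, 3]
seed = pd.DataFrame([[0, 1, 2], [1, 1, 3], [2, 3, 1]], index=zones, columns=zones)
row_targets = pd.DataFrame({"model_zone_id": zones, "trips": [10.0, 20.0, 30.0]})
col_targets = pd.DataFrame({"model_zone_id": zones, "trips": [15.0, 25.0, 20.0]})

matrix, iters, rmse = furness_sketch(seed, row_targets, col_targets)
print(matrix)
print(iters, rmse)
```

With caf.distribute installed, the same three inputs would be passed as furness_pandas_wrapper(seed_values=seed, row_targets=row_targets, col_targets=col_targets), which per the signature above returns the same (DataFrame, int, float) tuple.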