ipf
- caf.distribute.iterative_proportional_fitting.ipf(seed_mat: ndarray, target_marginals: list[ndarray], target_dimensions: list[list[int]], convergence_fn: Callable[[Collection[ndarray], Collection[ndarray]], float] | None = None, max_iterations: int = 5000, tol: float = 1e-09, min_tol_rate: float = 1e-09, use_sparse: bool = False, show_pbar: bool = False, pbar_kwargs: dict[str, Any] | None = None) → tuple[ndarray, int, float]
- caf.distribute.iterative_proportional_fitting.ipf(seed_mat: COO, target_marginals: list[ndarray | COO], target_dimensions: list[list[int]], convergence_fn: Callable[[Collection[ndarray], Collection[ndarray]], float] | None = None, max_iterations: int = 5000, tol: float = 1e-09, min_tol_rate: float = 1e-09, use_sparse: bool = False, show_pbar: bool = False, pbar_kwargs: dict[str, Any] | None = None) → tuple[COO, int, float]
Adjust a matrix iteratively towards targets until convergence is met.
https://en.wikipedia.org/wiki/Iterative_proportional_fitting
- Parameters:
seed_mat – The starting matrix that should be adjusted.
target_marginals – A list of the aggregates to adjust the matrix towards. Aggregates are the target values to aim for when aggregating across one or several other axes. Each entry corresponds directly to the entry at the same position in target_dimensions.
target_dimensions – A list of target dimensions, one for each aggregate. Each target dimension lists the axes that should be preserved when calculating the achieved aggregates for the corresponding target_marginals. Put another way, it lists the numpy axes which should NOT be summed over when calculating the achieved marginals from the matrix.
convergence_fn – The function that is called to calculate the convergence of the matrix after all target_marginals adjustments have been made. If a callable is given, it must take the form fn(targets: list[np.ndarray], achieved: list[np.ndarray]) and return a float.
max_iterations – The maximum number of iterations to complete before exiting.
tol – The target convergence to achieve before exiting early. This is one condition which allows exiting before max_iterations is reached. The convergence is calculated via convergence_fn.
min_tol_rate – The minimum value that the convergence can change by between iterations before exiting early. This is one condition which allows exiting before max_iterations is reached. The convergence is calculated via convergence_fn.
use_sparse – Whether to use sparse matrices when doing the ipf calculations. This is useful when the given numpy array is very sparse. If a sparse.COO matrix is given, this argument is ignored.
show_pbar – Whether to show a progress bar of the current ipf iterations or not. If pbar_kwargs is not None, then this argument is ignored. If pbar_kwargs is set and you want to disable the progress bar, add {"disable": True} as a kwarg.
pbar_kwargs – A dictionary of keyword arguments to pass into a progress bar. This dictionary is passed into tqdm.tqdm(**pbar_kwargs) when building the progress bar.
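The relationship between target_dimensions and the achieved marginals can be illustrated with plain numpy. The sketch below does not call ipf: achieved_marginal and rmse_convergence are hypothetical helper names, and the root-mean-square error is only one possible choice for a function with the fn(targets, achieved) form described above, not necessarily the library's default. It also assumes each target dimension lists its axes in ascending order.

```python
import numpy as np


def achieved_marginal(mat: np.ndarray, keep_axes: list[int]) -> np.ndarray:
    """Sum `mat` over every axis that is NOT listed in `keep_axes`."""
    sum_axes = tuple(ax for ax in range(mat.ndim) if ax not in keep_axes)
    return mat.sum(axis=sum_axes)


def rmse_convergence(targets, achieved) -> float:
    """Hypothetical convergence function: RMSE over all marginal cells."""
    squared_diffs = [
        np.square(np.asarray(t, dtype=float) - np.asarray(a, dtype=float)).ravel()
        for t, a in zip(targets, achieved)
    ]
    return float(np.sqrt(np.concatenate(squared_diffs).mean()))


# A 3-dimensional matrix with shape (2, 3, 4)
mat = np.arange(24, dtype=float).reshape(2, 3, 4)

# Two aggregates: one preserving axis 0, one preserving axes 1 and 2
target_dimensions = [[0], [1, 2]]
achieved = [achieved_marginal(mat, dims) for dims in target_dimensions]

# The first aggregate has one value per index of axis 0; the second is a
# (3, 4) array with axis 0 summed away.
print([a.shape for a in achieved])
```

Each array in target_marginals would need the same shape as the matching entry of achieved here, since the two lists are compared element by element.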
- Returns:
fit_matrix – The final fit matrix.
completed_iterations – The number of completed iterations before exiting.
achieved_convergence – The final convergence value achieved by fit_matrix.
- Raises:
ValueError – If any of the marginals or dimensions are not valid when passed in.
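For readers unfamiliar with the procedure, the following is a minimal dense numpy sketch of iterative proportional fitting that mirrors the documented arguments and return tuple. It is not the caf.distribute implementation: ipf_sketch is a hypothetical name, the convergence measure is assumed to be a root-mean-square error, the early-exit rules are one reading of tol and min_tol_rate, and each target dimension is assumed to list its axes in ascending order.

```python
import numpy as np


def ipf_sketch(seed_mat, target_marginals, target_dimensions,
               max_iterations=5000, tol=1e-9, min_tol_rate=1e-9):
    """Illustrative IPF loop; returns (fit_matrix, iterations, convergence)."""
    mat = np.asarray(seed_mat, dtype=float).copy()
    targets = [np.asarray(t, dtype=float) for t in target_marginals]
    n_dims = mat.ndim

    def marginal(m, keep):
        # Sum over every axis that is not preserved
        drop = tuple(ax for ax in range(n_dims) if ax not in keep)
        return m.sum(axis=drop)

    def convergence(m):
        # Assumed measure: RMSE between target and achieved marginals
        sq = [np.square(t - marginal(m, keep)).ravel()
              for t, keep in zip(targets, target_dimensions)]
        return float(np.sqrt(np.concatenate(sq).mean()))

    prev_conv = np.inf
    conv = convergence(mat)
    completed = 0
    for completed in range(1, max_iterations + 1):
        for target, keep in zip(targets, target_dimensions):
            achieved = marginal(mat, keep)
            # Scaling factor per marginal cell; cells with zero achieved stay zero
            factor = np.divide(target, achieved,
                               out=np.zeros_like(achieved),
                               where=achieved != 0)
            # Give the factor size-1 axes where the matrix was summed,
            # so it broadcasts back over the full matrix
            shape = [mat.shape[ax] if ax in keep else 1 for ax in range(n_dims)]
            mat = mat * factor.reshape(shape)
        conv = convergence(mat)
        # Exit early once converged, or once convergence has stopped changing
        if conv < tol or abs(prev_conv - conv) < min_tol_rate:
            break
        prev_conv = conv
    return mat, completed, conv


# Fit a 2x3 seed of ones to row totals and column totals
seed = np.ones((2, 3))
row_targets = np.array([10.0, 20.0])      # preserves axis 0
col_targets = np.array([5.0, 10.0, 15.0])  # preserves axis 1
fit, iters, conv = ipf_sketch(seed, [row_targets, col_targets], [[0], [1]])
print(fit.sum(axis=1), fit.sum(axis=0))
```

The returned tuple follows the same order as the Returns section above: the fitted matrix, the number of completed iterations, and the final convergence value.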