choicedesign.design
Classes for constructing efficient experimental designs.
The main user-facing class is EffDesign, which combines initial
design generation, constrained random-swap optimisation, and optional blocking.
Typical workflow:
design = EffDesign(X=[alt1_A, alt2_A], ncs=18)
init = design.gen_initdesign(cond=['alt1_A > alt2_A'], seed=42)
result = design.optimise(init, V={1: V1, 2: V2}, time_lim=1)
# result = (optimal_design, init_derr, final_derr, n_iter, utility_balance)
Classes
EffDesign | Efficient design for a discrete choice experiment.
FullFactDesign | Full-factorial design covering all combinations of attribute levels.
- class choicedesign.design.EffDesign(X: dict, ncs: int)
Efficient design for a discrete choice experiment.
Combines initial design generation, random-swap optimisation, and optional blocking in a single object.
- Parameters:
Examples
>>> from choicedesign.design import EffDesign
>>> from choicedesign.expressions import Attribute, Parameter
>>> alt1_A = Attribute('alt1_A', [1, 2, 3])
>>> alt2_A = Attribute('alt2_A', [1, 2, 3])
>>> beta_A = Parameter('beta_A', -0.1)
>>> V = {1: beta_A * alt1_A, 2: beta_A * alt2_A}
>>> design = EffDesign(X=[alt1_A, alt2_A], ncs=18)
>>> init = design.gen_initdesign(seed=42)
>>> result = design.optimise(init, V=V, iter_lim=500)
- evaluate(design: DataFrame, V: dict, model: str = 'mnl', criterion: str = 'd', cost_param=None, wtp_params=None, bayes_draws: int = None, seed: int = None)
Evaluate a design.
Evaluates a design stored in a pandas DataFrame and returns its criterion value together with the utility balance ratio.
- Parameters:
design (pd.DataFrame) – Design to evaluate
V (dict) – A dictionary with the utility functions, keyed by alternative index, e.g. {1: V1, 2: V2}
model (str) – The base model for the efficient design, by default ‘mnl’
criterion (str) – Optimality criterion: 'd' (D-error, default), 'a' (A-error), or 'c' (C-error / WTP variance). When 'c', cost_param and wtp_params are required.
cost_param (Parameter, optional) – The cost (denominator) parameter. Required when criterion='c'.
wtp_params (list[Parameter], optional) – Parameters whose WTP variances are evaluated. Required when criterion='c'.
bayes_draws (int, optional) – Number of Monte Carlo draws for Db-error evaluation. Only valid with criterion='d'.
seed (int, optional) – Random seed for Bayesian draws, by default None
- Returns:
perf (float) – The criterion value of the design
ubalance_ratio (float) – Utility balance ratio
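As a rough illustration of what a D-error measures, the sketch below computes it for an MNL model with linear utilities using only the standard library. The function names (mnl_d_error, mnl_probabilities, determinant) and the nested-list design layout are hypothetical and not part of choicedesign; the library may scale or normalise the value differently.

```python
import math


def mnl_probabilities(alternatives, beta):
    """Choice probabilities of one choice situation under MNL."""
    utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def mnl_d_error(design, beta):
    """D-error = det(I)^(-1/K), where I is the MNL Fisher information.

    design: list of choice situations; each is a list of alternatives;
    each alternative is a list of K attribute values (hypothetical layout).
    """
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for alternatives in design:
        probs = mnl_probabilities(alternatives, beta)
        mean = [sum(p * alt[i] for p, alt in zip(probs, alternatives))
                for i in range(k)]
        for p, alt in zip(probs, alternatives):
            dev = [alt[i] - mean[i] for i in range(k)]
            for i in range(k):
                for j in range(k):
                    info[i][j] += p * dev[i] * dev[j]
    det_info = determinant(info)
    if det_info <= 0:
        return math.inf  # parameters are not identified by this design
    return det_info ** (-1.0 / k)
```

Lower values indicate a more efficient design; a design whose alternatives never differ carries no information and yields an infinite D-error in this sketch.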
- export_design(design: DataFrame, attr_names: dict, filepath: str, opt_out: bool = False, alt_names: list = None)
Export design to Excel in respondent-facing choice situation format.
Writes one sheet per block (or a single sheet when the design has no Block column). Inside each sheet, choice situations are stacked downward; each row is an attribute and each column is an alternative.
- Parameters:
design (pd.DataFrame) – Design from optimise() or gen_blocks().
attr_names (dict) – Mapping from internal column names to display row labels. Columns that share the same display label appear in the same row (one per alternative column). Example: {'alt1_time': 'Travel time', 'alt2_time': 'Travel time', 'alt1_cost': 'Cost', 'alt2_cost': 'Cost'}
filepath (str) – Destination path, e.g. 'design.xlsx'.
opt_out (bool, optional) – Add an opt-out column with no attribute levels, by default False.
alt_names (list[str], optional) – Custom headers for the alternative columns. Defaults to ['Alt 1', 'Alt 2', …].
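To make the attr_names mapping concrete, the following sketch lays out a single choice situation as a grid with one row per display label and one column per alternative. The helper choice_card and the 'Opt-out' header text are hypothetical; the sketch does not write Excel files and is not the library's implementation.

```python
def choice_card(row, attr_names, alt_names=None, opt_out=False):
    """Arrange one design row as a respondent-facing grid (list of lists).

    row: mapping from internal column name to attribute level.
    attr_names: mapping from internal column name to display label;
    columns sharing a label end up in the same grid row.
    """
    grouped = {}
    for column, label in attr_names.items():
        grouped.setdefault(label, []).append(row[column])
    n_alts = max(len(values) for values in grouped.values())
    if alt_names is None:
        alt_names = [f"Alt {i + 1}" for i in range(n_alts)]
    header = [""] + list(alt_names)
    body = [[label] + values for label, values in grouped.items()]
    if opt_out:
        # Assumed header text; the opt-out column carries no levels.
        header.append("Opt-out")
        body = [line + [""] for line in body]
    return [header] + body
```

Because the grouping relies on dictionary insertion order, the order of keys in attr_names determines which value lands under which alternative in this sketch.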
- export_output(filepath: str)
Save a plain-text optimisation summary to a file.
The summary mirrors the information printed by optimise() when verbose=True, plus design configuration and stopping-criteria details. optimise() must have been called at least once before invoking this method.
- Parameters:
filepath (str) – Destination path, e.g. 'optimisation_summary.txt'.
- Raises:
RuntimeError – If optimise() has not yet been called on this object.
- gen_blocks(design: DataFrame, n_blocks: int, n_iter: int = 1000)
Assign choice situations to blocks.
Minimises the correlation between the block assignment and all attribute columns by evaluating n_iter random permutations and keeping the best one.
- Parameters:
design (pandas.DataFrame) – Optimised design from optimise() (must include a CS column).
n_blocks (int) – Number of blocks.
n_iter (int, optional) – Number of random permutations evaluated by the search, by default 1000.
- Returns:
design (pandas.DataFrame) – Design with an additional Block column.
corr_list (list[float]) – History of best total absolute correlation found at each improvement.
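The permutation search described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: assign_blocks and pearson are hypothetical names, blocks are assumed to be of equal size, and the design is passed as a dictionary of attribute columns rather than a DataFrame.

```python
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def assign_blocks(columns, n_blocks, n_iter=1000, seed=None):
    """Try n_iter random block assignments and keep the least correlated.

    columns: mapping from attribute name to a list of levels (one per
    choice situation). Returns the best assignment and the history of
    best total absolute correlations, one entry per improvement.
    """
    rng = random.Random(seed)
    n_rows = len(next(iter(columns.values())))
    if n_rows % n_blocks != 0:
        raise ValueError("this sketch assumes equally sized blocks")
    blocks = [b + 1 for b in range(n_blocks)
              for _ in range(n_rows // n_blocks)]
    best_blocks, best_total, history = None, float("inf"), []
    for _ in range(n_iter):
        rng.shuffle(blocks)
        total = sum(abs(pearson(blocks, col)) for col in columns.values())
        if total < best_total:
            best_blocks, best_total = blocks[:], total
            history.append(total)
    return best_blocks, history
```

The returned history plays the role of corr_list: it shrinks monotonically because an entry is recorded only when a permutation beats the best total found so far.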
- gen_initdesign(cond: list = None, seed: bool = None)
Generate the initial design matrix.
Generates the initial design matrix, optionally subject to a set of user-defined conditions that the design must satisfy.
- Parameters:
cond (list[str], optional) – List of conditions that the final design must satisfy. Each element is a string containing a single condition. Supported forms:
Binary relation: 'X > Y' (attribute vs attribute or value)
Conditional: 'if X > a then Y < b'
Compound (AND): 'X > a & Y < b'
Arithmetic expressions on either side: '(X + Y + Z) > 0', 'if (X + Y) > 0 then P >= 0'
Arithmetic expressions support +, -, *, / and parentheses with any mix of attribute names and numeric constants. By default None.
seed (bool, None) – Random seed, by default None
- Returns:
init_design – A Pandas DataFrame with the initial design matrix.
- Return type:
pandas.DataFrame
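The meaning of the condition forms can be illustrated with a small rejection-sampling sketch. Here the conditions are written as Python predicates instead of strings, so the parsing step is skipped entirely; random_initial_design is a hypothetical helper, and unlike a production generator it makes no attempt to keep attribute levels balanced.

```python
import random


def random_initial_design(levels, ncs, conditions=(), seed=None,
                          max_tries=10000):
    """Draw ncs rows at random, rejecting rows that violate a condition.

    levels: mapping from attribute name to its list of levels.
    conditions: callables that take a row (dict) and return True or False.
    """
    rng = random.Random(seed)
    rows, tries = [], 0
    while len(rows) < ncs:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("conditions appear too restrictive")
        row = {name: rng.choice(values) for name, values in levels.items()}
        if all(condition(row) for condition in conditions):
            rows.append(row)
    return rows


levels = {'alt1_A': [1, 2, 3], 'alt2_A': [1, 2, 3]}

# Binary relation, the predicate form of 'alt1_A > alt2_A'.
binary = [lambda r: r['alt1_A'] > r['alt2_A']]

# Conditional, the predicate form of 'if alt1_A > 2 then alt2_A < 2';
# an implication "if P then Q" is the same as "(not P) or Q".
conditional = [lambda r: not (r['alt1_A'] > 2) or r['alt2_A'] < 2]

design_binary = random_initial_design(levels, 18, binary, seed=42)
design_conditional = random_initial_design(levels, 18, conditional, seed=42)
```

A compound condition such as 'X > a & Y < b' would correspond to a predicate that joins both comparisons with `and`.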
- optimise(init_design: DataFrame, V: dict, model: str = 'mnl', algorithm: str = 'swap', criterion: str = 'd', cost_param=None, wtp_params=None, bayes_draws: int = None, iter_lim: int = None, noimprov_lim: int = None, time_lim: int = None, seed: int = None, verbose: bool = False)
Optimise the design using a random-search algorithm.
Starts from an initial design and iteratively improves it according to the selected criterion and stopping rules. At least one stopping criterion (iter_lim, noimprov_lim, or time_lim) must be supplied.
- Parameters:
init_design (pandas.DataFrame) – The initial design matrix, typically from gen_initdesign().
V (dict) – A dictionary with the utility functions, keyed by alternative index, e.g. {1: V1, 2: V2}
model (str) – The base model for the efficient design, by default ‘mnl’
algorithm (str) – Optimisation algorithm to use. Options: 'swap' (random swapping, default), 'rsc' (random Relabelling, Swapping, Cycling), or 'federov' (Modified Fedorov: tries all full-factorial candidates per row per iteration; slower per iteration but more systematic).
criterion (str) – Optimality criterion: 'd' (D-error, default), 'a' (A-error), or 'c' (C-error / WTP variance). When 'c', cost_param and wtp_params are required.
cost_param (Parameter, optional) – The cost (denominator) parameter used to compute WTP ratios. Required when criterion='c'.
wtp_params (list[Parameter], optional) – Parameters whose WTP ratios are minimised. Required when criterion='c'.
bayes_draws (int, optional) – Number of Monte Carlo draws for Db-efficient (Bayesian) design. Only valid with criterion='d'. When set, the optimiser minimises the expected D-error averaged over draws from the prior distributions of parameters that have prior_std defined.
iter_lim (int, optional) – Number of iterations before the algorithm stops, by default None
noimprov_lim (int, optional) – Number of iterations without improvement before the algorithm stops, by default None
time_lim (int, optional) – Time (in minutes) before the algorithm stops, by default None
seed (int, optional) – Random seed, by default None
verbose (bool, optional) – Whether status messages and progress are shown, by default False
- Returns:
optimal_design (pandas.DataFrame) – The final (optimal) design
init_perf (float) – Criterion value of the initial design
final_perf (float) – Criterion value of the final design
final_iter (int) – Total number of iterations
ubalance_ratio (float) – Utility balance ratio
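The interplay between random swapping and the three stopping rules can be sketched generically as below. This is a minimal illustration, not the library's optimiser: random_swap_search is a hypothetical function, the design is a list of rows, and the criterion is any callable where lower is better (for example a D-error function).

```python
import random
import time


def random_swap_search(design, criterion, iter_lim=None, noimprov_lim=None,
                       time_lim=None, seed=None):
    """Swap two levels within a random column; keep the swap if it helps.

    time_lim is in minutes, matching the parameter description above.
    Returns (best_design, init_perf, final_perf, n_iter).
    """
    if iter_lim is None and noimprov_lim is None and time_lim is None:
        raise ValueError("supply at least one stopping criterion")
    rng = random.Random(seed)
    best = [row[:] for row in design]
    init_perf = best_perf = criterion(best)
    n_iter = stale = 0
    start = time.monotonic()
    while True:
        if iter_lim is not None and n_iter >= iter_lim:
            break
        if noimprov_lim is not None and stale >= noimprov_lim:
            break
        if time_lim is not None and time.monotonic() - start >= time_lim * 60:
            break
        n_iter += 1
        candidate = [row[:] for row in best]
        col = rng.randrange(len(candidate[0]))
        r1, r2 = rng.sample(range(len(candidate)), 2)
        # Swapping within a column leaves each attribute's level counts intact.
        candidate[r1][col], candidate[r2][col] = (candidate[r2][col],
                                                  candidate[r1][col])
        perf = criterion(candidate)
        if perf < best_perf:
            best, best_perf, stale = candidate, perf, 0
        else:
            stale += 1
    return best, init_perf, best_perf, n_iter
```

Since a candidate is accepted only when it strictly improves the criterion, the final value can never exceed the initial one, which is why reporting both gives a quick check of how much the search gained.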
- class choicedesign.design.FullFactDesign(X: list)
Full-factorial design covering all combinations of attribute levels.
Notes
This class is provided for completeness. Its dependency pyDOE2 is incompatible with Python 3.12+ and is not installed by default.
- gen_design()
Generate full-factorial design matrix
- Returns:
design – A Pandas DataFrame with all combinations of attribute levels.
- Return type:
pandas.DataFrame
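Where pyDOE2 is unavailable, the same enumeration of all level combinations can be approximated with the standard library. The sketch below is an assumption-labelled stand-in, not a drop-in replacement: full_factorial is a hypothetical helper that takes a plain dictionary of levels and returns a list of row dictionaries rather than a DataFrame.

```python
from itertools import product


def full_factorial(levels):
    """Enumerate every combination of attribute levels.

    levels: mapping from attribute name to its list of levels.
    Returns one dict per combination, in itertools.product order.
    """
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]


rows = full_factorial({'alt1_A': [1, 2, 3], 'alt2_A': [1, 2, 3]})
```

The number of rows is the product of the level counts, so two three-level attributes give nine combinations; this grows quickly as attributes are added.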