Example of a Db-efficient RUM design with ChoiceDesign

This notebook illustrates how to use ChoiceDesign to generate a Db-efficient (Bayesian) experimental design for a Random Utility Maximisation (RUM) model.

What is a Db-efficient design?

A standard Dp-efficient design minimises the D-error at fixed prior parameter values. The problem is that priors are always uncertain: if the assumed values are wrong, the resulting design can be suboptimal for estimation.

A Db-efficient design (Sandor & Wedel, 2001) addresses this by treating the researcher’s priors as random variables following a probability distribution. Instead of minimising a point D-error, it minimises the Db-error — the expected D-error integrated over the entire prior distribution:

\[D_b\text{-error} = \mathbb{E}_{\beta \sim \pi}\left[\det\left(I(\beta)^{-1}\right)^{1/K}\right]\]

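Here \(I(\beta)\) is the Fisher information matrix of the design evaluated at \(\beta\), \(K\) is the number of parameters, and \(\pi\) is the prior distribution. The expectation is estimated by Monte Carlo: draw \(R\) samples from the prior distribution, evaluate the D-error at each draw, and take the mean. The result is a design that performs well on average across the range of plausible true parameter values.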
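
The Monte Carlo estimate described above can be sketched in plain NumPy. This is an illustration only, not ChoiceDesign's internal code: the function names `mnl_d_error` and `db_error`, the array layout and the randomly generated design are all assumptions made for the example.

```python
import numpy as np

def mnl_d_error(design, beta):
    """Point D-error of an MNL design at the parameter vector beta.

    design has shape (n_situations, n_alternatives, K).
    """
    n_cs, n_alts, k = design.shape
    v = design @ beta                                # utilities, (n_cs, n_alts)
    v = v - v.max(axis=1, keepdims=True)             # stabilise the softmax
    p = np.exp(v)
    p /= p.sum(axis=1, keepdims=True)                # MNL choice probabilities
    info = np.zeros((k, k))
    for s in range(n_cs):
        centred = design[s] - p[s] @ design[s]       # levels minus their probability-weighted mean
        info += centred.T @ (p[s][:, None] * centred)
    sign, logdet = np.linalg.slogdet(info)
    if sign <= 0:
        return np.inf                                # singular information matrix
    return float(np.exp(-logdet / k))                # det(I^{-1})^(1/K)

def db_error(design, prior_mean, prior_std, n_draws=500, seed=0):
    """Mean D-error over normal prior draws (a std of 0 keeps a parameter fixed)."""
    rng = np.random.default_rng(seed)
    draws = rng.normal(prior_mean, prior_std, size=(n_draws, len(prior_mean)))
    return float(np.mean([mnl_d_error(design, b) for b in draws]))

# Hypothetical random design: 18 situations, 2 alternatives, 4 attributes
rng = np.random.default_rng(42)
levels = [[1, 2, 3], [10, 15, 15.5], [0, 3, 5], [0, 1, 2]]
design = np.stack([rng.choice(lv, size=(18, 2)) for lv in levels], axis=-1)

prior_mean = np.array([-0.1, -0.2, 0.1, 0.15])
prior_std = np.array([0.01, 0.03, 0.0, 0.0])

dp = mnl_d_error(design, prior_mean)
db = db_error(design, prior_mean, prior_std)
print(f'point D-error: {dp:.6f}, Monte Carlo Db-error: {db:.6f}')
```

With all standard deviations set to zero, every draw equals the prior mean, so the Db-error collapses to the point D-error.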

Note: This is different from a Random Parameters (Mixed Logit) design. In a Db design the uncertainty is the researcher’s uncertainty about fixed true parameters. The underlying estimation model is still plain MNL.

Step 1: Load modules, define design parameters and set attributes

The following lines load:

  • EffDesign: the class of efficient designs,

  • Attribute and Parameter: the classes of attributes and parameters, respectively.

[1]:
from choicedesign.design import EffDesign
from choicedesign.expressions import Attribute, Parameter

Each attribute is defined by the Attribute class. The arguments of this class are:

  • name: a string with the attribute name,

  • levels: a list of levels of the attribute.

Each attribute is alternative-specific. The following lines define two alternatives, alt1 and alt2, each with four attributes named \(A\) to \(D\):

[2]:
alt1_A = Attribute('alt1_A', [1, 2, 3])
alt1_B = Attribute('alt1_B', [10, 15, 15.5])
alt1_C = Attribute('alt1_C', [0, 3, 5])
alt1_D = Attribute('alt1_D', [0, 1, 2])

alt2_A = Attribute('alt2_A', [1, 2, 3])
alt2_B = Attribute('alt2_B', [10, 15, 15.5])
alt2_C = Attribute('alt2_C', [0, 3, 5])
alt2_D = Attribute('alt2_D', [0, 1, 2])

Step 2: Construct efficient design object and generate initial design matrix

The second step consists of constructing the experimental design object, which requires the following parameters:

  • X: A list of Attribute class elements,

  • ncs: The number of choice situations.

[3]:
design = EffDesign(
    X=[alt1_A, alt1_B, alt1_C, alt1_D,
       alt2_A, alt2_B, alt2_C, alt2_D],
    ncs=18)
[4]:
init_design = design.gen_initdesign(seed=42)
init_design
[4]:
alt1_A alt1_B alt1_C alt1_D alt2_A alt2_B alt2_C alt2_D
0 1.0 10.0 5.0 1.0 2.0 15.5 5.0 2.0
1 2.0 15.0 3.0 2.0 3.0 15.0 5.0 0.0
2 3.0 10.0 5.0 1.0 2.0 15.5 5.0 1.0
3 3.0 15.5 0.0 1.0 1.0 10.0 3.0 2.0
4 1.0 15.5 3.0 0.0 1.0 15.5 5.0 0.0
5 2.0 15.0 3.0 2.0 3.0 15.0 3.0 0.0
6 2.0 10.0 3.0 2.0 2.0 10.0 0.0 1.0
7 1.0 10.0 0.0 0.0 3.0 15.5 0.0 0.0
8 3.0 15.5 5.0 0.0 1.0 10.0 5.0 2.0
9 3.0 10.0 0.0 2.0 2.0 15.0 0.0 0.0
10 1.0 10.0 0.0 0.0 1.0 15.5 0.0 0.0
11 3.0 15.0 5.0 1.0 3.0 10.0 3.0 1.0
12 2.0 15.0 5.0 2.0 1.0 15.0 3.0 1.0
13 1.0 15.5 3.0 2.0 3.0 15.0 0.0 2.0
14 2.0 15.5 3.0 0.0 3.0 15.5 0.0 1.0
15 2.0 15.0 5.0 1.0 2.0 15.0 5.0 2.0
16 3.0 15.5 0.0 1.0 2.0 10.0 3.0 2.0
17 1.0 15.0 0.0 0.0 1.0 10.0 3.0 1.0

Step 3: Set the utility functions with uncertain priors

For a Db-efficient design, each parameter can carry a prior_std argument that specifies the standard deviation of its prior distribution. The prior value is interpreted as the mean.

  • Parameters with prior_std set are treated as uncertain: during Db-error computation, draws are taken from a normal distribution with mean prior and standard deviation prior_std, i.e. \(\mathcal{N}(\text{prior}, \text{prior\_std}^2)\).

  • Parameters without prior_std (or with prior_std=None) are fixed — they contribute no uncertainty.

Here, attributes \(A\) and \(B\) have uncertain priors; \(C\) and \(D\) are treated as known:

[5]:
beta_A = Parameter('beta_A', -0.1,  prior_std=0.01)   # uncertain: normal, mean -0.1, std 0.01
beta_B = Parameter('beta_B', -0.2,  prior_std=0.03)   # uncertain: normal, mean -0.2, std 0.03
beta_C = Parameter('beta_C',  0.1)                    # fixed prior
beta_D = Parameter('beta_D',  0.15)                   # fixed prior
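
To make the distinction between uncertain and fixed priors concrete, the sketch below draws values for the four priors defined in this step. It uses plain NumPy; the dictionary layout and the helper `draw_priors` are hypothetical and are not part of ChoiceDesign.

```python
import numpy as np

# (prior mean, prior std); None marks a fixed prior
priors = {
    'beta_A': (-0.1, 0.01),
    'beta_B': (-0.2, 0.03),
    'beta_C': (0.1, None),
    'beta_D': (0.15, None),
}

def draw_priors(priors, n_draws, seed=0):
    """Return a dict of arrays holding one value per draw for each parameter."""
    rng = np.random.default_rng(seed)
    draws = {}
    for name, (mean, std) in priors.items():
        if std is None:
            draws[name] = np.full(n_draws, mean)              # fixed: same value in every draw
        else:
            draws[name] = rng.normal(mean, std, size=n_draws)  # uncertain: normal draws
    return draws

draws = draw_priors(priors, n_draws=5000)
for name, values in draws.items():
    print(f'{name}: mean={values.mean():.4f}, std={values.std():.4f}')
```

Fixed parameters show a standard deviation of zero, while the sample standard deviations of beta_A and beta_B should sit close to 0.01 and 0.03.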

The utility functions use the same linear MNL structure as any other example. There are no random draws inside the utility expression — the Bayesian averaging happens at the criterion level, not inside the utility tree:

[6]:
V1 = beta_A * alt1_A + beta_B * alt1_B + beta_C * alt1_C + beta_D * alt1_D
V2 = beta_A * alt2_A + beta_B * alt2_B + beta_C * alt2_C + beta_D * alt2_D

V = {1: V1, 2: V2}
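
As a worked example of what these utilities imply, the sketch below evaluates them by hand for the first row of the initial design (alt1: 1, 10, 5, 1; alt2: 2, 15.5, 5, 2) at the prior means, and converts them to MNL choice probabilities. It uses NumPy only and does not call ChoiceDesign.

```python
import numpy as np

beta = np.array([-0.1, -0.2, 0.1, 0.15])      # prior means of beta_A..beta_D
x_alt1 = np.array([1.0, 10.0, 5.0, 1.0])      # attributes A..D of alternative 1
x_alt2 = np.array([2.0, 15.5, 5.0, 2.0])      # attributes A..D of alternative 2

# Linear-in-parameters utilities: V1 = -1.45, V2 = -2.5
v = np.array([beta @ x_alt1, beta @ x_alt2])

# MNL (softmax) choice probabilities
p = np.exp(v - v.max())
p /= p.sum()

print(f'V1={v[0]:.2f}, V2={v[1]:.2f}, P1={p[0]:.3f}, P2={p[1]:.3f}')
```

Alternative 1 is chosen with probability of about 0.74 in this situation. In a Db evaluation, the same calculation is repeated for each prior draw of beta_A and beta_B.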

Step 4: Optimise the design minimising the Db-error

The optimise() method accepts a bayes_draws argument. When set, the swapping algorithm minimises the Db-error (expected D-error over prior draws) instead of the point D-error.

  • bayes_draws: number of Monte Carlo draws from the prior distributions per D-error evaluation. Higher values reduce Monte Carlo noise at the cost of longer run times. Values between 200 and 1000 are typically sufficient.
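
The trade-off behind bayes_draws follows from the standard error of a Monte Carlo mean. Assuming independent draws \(\beta^{(1)}, \dots, \beta^{(R)}\) from the prior, and writing \(D(\beta)\) for the point D-error and \(s_D\) for the sample standard deviation of the \(D(\beta^{(r)})\) (notation introduced here for illustration):

\[\widehat{D}_b = \frac{1}{R}\sum_{r=1}^{R} D\left(\beta^{(r)}\right), \qquad \mathrm{SE}\left(\widehat{D}_b\right) \approx \frac{s_D}{\sqrt{R}}\]

The noise therefore shrinks with \(\sqrt{R}\): going from 250 to 1000 draws roughly halves it, while the cost of each evaluation grows about fourfold.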

All other stopping criteria work as usual:

[7]:
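optimal_design, init_perf, final_perf, final_iter, ubalance_ratio = design.optimise(
    init_design=init_design,
    V=V,
    bayes_draws=500,   # Monte Carlo draws per Db-error evaluation
    time_lim=1,        # time limit; the log below reports about one minute elapsed
    verbose=True
)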
Evaluating initial design
Optimization complete 0:00:59 / D-error: 0.042138
Elapsed time: 0:01:00
D-error of initial design:  0.099028
D-error of last stored design:  0.042138
Utility Balance ratio:  82.87 %
Algorithm iterations:  471
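
For intuition, a generic swapping search can be sketched as follows. This is an assumption-laden illustration rather than the ChoiceDesign implementation: the functions `db_error` and `swap_optimise`, the two-alternative shortcut for the information matrix, the fixed set of prior draws and the random starting design are all choices made for this example.

```python
import numpy as np

def db_error(design, draws):
    """Mean D-error over prior draws for a two-alternative MNL design.

    design: (n_cs, 2, K) attribute levels; draws: (R, K) parameter draws.
    """
    diff = design[:, 0, :] - design[:, 1, :]               # attribute differences
    k = diff.shape[1]
    errors = []
    for beta in draws:
        p = 1.0 / (1.0 + np.exp(-(diff @ beta)))            # P(alternative 1)
        info = (diff * (p * (1 - p))[:, None]).T @ diff     # Fisher information
        sign, logdet = np.linalg.slogdet(info)
        errors.append(np.exp(-logdet / k) if sign > 0 else np.inf)
    return float(np.mean(errors))

def swap_optimise(design, draws, n_iter=300, seed=0):
    """Swap two levels within a random column; keep the swap if Db-error drops."""
    rng = np.random.default_rng(seed)
    best = design.copy()
    best_err = db_error(best, draws)
    n_cs, n_alts, k = best.shape
    for _ in range(n_iter):
        cand = best.copy()
        j, a = rng.integers(n_alts), rng.integers(k)        # pick one attribute column
        r1, r2 = rng.choice(n_cs, size=2, replace=False)    # pick two choice situations
        cand[[r1, r2], j, a] = cand[[r2, r1], j, a]         # swap their levels
        err = db_error(cand, draws)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err

rng = np.random.default_rng(42)
levels = [[1, 2, 3], [10, 15, 15.5], [0, 3, 5], [0, 1, 2]]
design = np.stack([rng.choice(lv, size=(18, 2)) for lv in levels], axis=-1)
draws = rng.normal([-0.1, -0.2, 0.1, 0.15], [0.01, 0.03, 0.0, 0.0], size=(50, 4))

start_err = db_error(design, draws)
optimised, final_err = swap_optimise(design, draws)
print(f'Db-error before: {start_err:.6f}, after: {final_err:.6f}')
```

Because swaps happen within a column, the number of times each level appears in that column stays unchanged. Reusing the same draws for every candidate keeps comparisons between designs free of extra Monte Carlo noise.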

Step 5: Compare Db-error vs point D-error

The evaluate() method also accepts bayes_draws. Calling it with and without this argument shows the difference between the Db-error (what the design was optimised for) and the point D-error (what a Dp design would report at the prior means).

A Db design will generally have a point D-error at least as high as that of a Dp design optimised at the same prior means: that is the cost of robustness. The benefit is a design that degrades more gracefully when the true parameters differ from the assumed priors.

[8]:
db_error, ubalance_db = design.evaluate(optimal_design, V, bayes_draws=500)
dp_error, ubalance_dp = design.evaluate(optimal_design, V)

print(f'Db-error (robust, averaged over prior draws): {db_error:.6f}')
print(f'Dp-error (point, evaluated at prior means):   {dp_error:.6f}')
print(f'Utility balance: {ubalance_dp:.2f} %')
Db-error (robust, averaged over prior draws): 0.042240
Dp-error (point, evaluated at prior means):   0.042150
Utility balance: 82.87 %

(optional) Block the design

The optimal design can be blocked using gen_blocks(). This method randomly creates candidate blocks and keeps the one with the minimum correlation between the blocking column and all attributes:

[9]:
optimal_design_blocked, corr_history = design.gen_blocks(optimal_design, n_blocks=3)
optimal_design_blocked
[9]:
CS alt1_A alt1_B alt1_C alt1_D alt2_A alt2_B alt2_C alt2_D Block
0 1.0 1.0 10.0 5.0 1.0 2.0 15.0 0.0 2.0 2
1 2.0 2.0 15.5 0.0 2.0 1.0 10.0 5.0 0.0 1
2 3.0 3.0 10.0 0.0 0.0 1.0 15.0 5.0 2.0 3
3 4.0 2.0 15.0 0.0 1.0 2.0 10.0 5.0 1.0 3
4 5.0 3.0 15.0 5.0 0.0 1.0 15.5 0.0 2.0 2
5 6.0 2.0 15.0 5.0 2.0 2.0 15.5 0.0 0.0 3
6 7.0 3.0 15.5 3.0 1.0 2.0 10.0 0.0 1.0 2
7 8.0 1.0 10.0 5.0 1.0 3.0 15.0 3.0 1.0 1
8 9.0 3.0 15.0 0.0 1.0 1.0 15.0 3.0 1.0 1
9 10.0 1.0 10.0 0.0 2.0 3.0 15.5 5.0 0.0 3
10 11.0 1.0 15.0 0.0 0.0 3.0 15.5 5.0 2.0 1
11 12.0 2.0 15.5 5.0 2.0 3.0 10.0 0.0 0.0 3
12 13.0 3.0 10.0 3.0 2.0 1.0 15.5 5.0 0.0 2
13 14.0 1.0 15.0 3.0 0.0 3.0 15.0 3.0 1.0 2
14 15.0 1.0 15.5 3.0 0.0 3.0 10.0 0.0 2.0 2
15 16.0 2.0 10.0 3.0 1.0 2.0 15.5 3.0 1.0 1
16 17.0 2.0 15.5 3.0 2.0 2.0 10.0 3.0 0.0 1
17 18.0 3.0 15.5 5.0 0.0 1.0 15.0 3.0 2.0 3
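
The random-search idea behind blocking can be sketched in NumPy as follows. This is not the gen_blocks() source: the helper `gen_blocks_sketch`, the equal block sizes, the use of the mean absolute correlation as the score and the random example design are assumptions made for illustration.

```python
import numpy as np

def gen_blocks_sketch(design, n_blocks, n_candidates=500, seed=0):
    """Try random balanced block columns and keep the least correlated one."""
    n_rows, n_cols = design.shape
    if n_rows % n_blocks != 0:
        raise ValueError('rows must divide evenly into blocks')
    rng = np.random.default_rng(seed)
    base = np.repeat(np.arange(1, n_blocks + 1), n_rows // n_blocks)
    best_blocks, best_score = None, np.inf
    for _ in range(n_candidates):
        blocks = rng.permutation(base)                       # candidate block column
        corrs = [abs(np.corrcoef(blocks, design[:, c])[0, 1])
                 for c in range(n_cols)]
        score = float(np.mean(corrs))                        # mean |correlation| with attributes
        if score < best_score:
            best_blocks, best_score = blocks, score
    return best_blocks, best_score

# Hypothetical 18-row design with 8 attribute columns
rng = np.random.default_rng(42)
levels = [[1, 2, 3], [10, 15, 15.5], [0, 3, 5], [0, 1, 2]]
flat_design = np.column_stack([rng.choice(lv, size=18) for lv in levels * 2])

blocks, score = gen_blocks_sketch(flat_design, n_blocks=3)
print(blocks, f'mean |corr| = {score:.4f}')
```

A low score means that block membership carries little information about attribute levels, so respondents assigned to different blocks see comparable ranges of each attribute.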

References

[1] Sandor, Z., & Wedel, M. (2001). Designing conjoint choice experiments using managers’ prior beliefs. Journal of Marketing Research, 38(4), 430–444.

[2] Quan, W., Rose, J. M., Collins, A. T., & Bliemer, M. C. (2011). A comparison of algorithms for generating efficient choice experiments.