Example of a D-efficient RUM design with ChoiceDesign

This notebook illustrates how to use ChoiceDesign to generate a simple D-efficient experimental design for a Random Utility Maximisation (RUM) model. Given a set of attributes and prior parameters, ChoiceDesign uses a variation of the random swapping algorithm [1] to minimise the D-error of the information matrix of a Multinomial Logit (MNL) model.
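For reference, the D-error of an MNL design is commonly written as follows. This is the standard textbook form, not a formula taken from the ChoiceDesign source, and the symbols are introduced here for illustration: \(x_{js}\) is the attribute vector of alternative \(j\) in choice situation \(s\), \(\beta\) is the vector of \(K\) prior parameters, and \(P_{js}\) is the logit choice probability.

\[
P_{js} = \frac{\exp(\beta' x_{js})}{\sum_{i=1}^{J} \exp(\beta' x_{is})}, \qquad
\bar{x}_s = \sum_{j=1}^{J} P_{js}\, x_{js}, \qquad
I(\beta) = \sum_{s=1}^{S} \sum_{j=1}^{J} P_{js}\,(x_{js} - \bar{x}_s)(x_{js} - \bar{x}_s)'
\]

\[
\text{D-error} = \det\!\left(I(\beta)^{-1}\right)^{1/K}
\]

A lower D-error corresponds to a smaller (generalised) variance of the parameter estimates, given the priors.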

Step 1: Load modules, define design parameters and set attributes

The following lines load:

  • EffDesign: the class of efficient designs,

  • Attribute and Parameter: the classes of attributes and parameters, respectively.

[ ]:
from choicedesign.design import EffDesign
from choicedesign.expressions import Attribute, Parameter

Each attribute is defined by the Attribute class. The arguments of this class are:

  • name: a string with the attribute name,

  • levels: a list of levels of the attribute.

Each attribute is alternative-specific. Hence, attributes must be defined for each alternative that contains them.

The following lines define 2 alternatives, named alt1 and alt2, and 4 attributes named from \(A\) to \(D\):

[ ]:
alt1_A = Attribute('alt1_A',[1,2,3])
alt1_B = Attribute('alt1_B',[10,15,15.5])
alt1_C = Attribute('alt1_C',[0,3,5])
alt1_D = Attribute('alt1_D',[0,1,2])

alt2_A = Attribute('alt2_A',[1,2,3])
alt2_B = Attribute('alt2_B',[10,15,15.5])
alt2_C = Attribute('alt2_C',[0,3,5])
alt2_D = Attribute('alt2_D',[0,1,2])

Step 2: Construct efficient design object and generate initial design matrix

The second step consists of constructing the experimental design object, which requires the following parameters:

  • X: A list of Attribute class elements,

  • ncs: The number of choice situations.

The following lines use EffDesign to define an object named design with 18 choice situations:

[ ]:
design = EffDesign(
    X = [alt1_A,alt1_B,alt1_C,alt1_D,
         alt2_A,alt2_B,alt2_C,alt2_D],
    ncs=18)

After the design object is defined, the method gen_initdesign() generates the initial design matrix. This method accepts the following optional parameters:

  • cond: List of conditions that the final design must satisfy. Each element is a string that contains a single condition. A condition can be a binary relation (e.g., X > Y, where X and Y are attributes of a specific alternative) or a conditional relation (e.g., if X > a then Y < b, where a and b are values). Within an if condition, several conditions can be combined with the & operator.

  • seed: Random seed
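To make the two kinds of condition concrete, the sketch below checks them on two hypothetical choice situations using plain Python. It does not use ChoiceDesign and does not parse the cond strings; the function names and the example rows are assumptions made for illustration only. The point is that a conditional relation is an implication: it is violated only when the "if" part holds and the "then" part fails.

```python
# Two hypothetical choice situations (attribute name -> level)
rows = [
    {"alt1_A": 2.0, "alt2_A": 1.0, "alt1_C": 3.0, "alt1_D": 0.0},
    {"alt1_A": 3.0, "alt2_A": 1.0, "alt1_C": 5.0, "alt1_D": 2.0},
]


def binary_condition(row):
    # Binary relation: alt1_A > alt2_A
    return row["alt1_A"] > row["alt2_A"]


def conditional_condition(row):
    # Conditional relation: if alt1_C > 3 then alt1_D < 2
    # Equivalent to: (not premise) or conclusion
    return (not row["alt1_C"] > 3) or (row["alt1_D"] < 2)


checks = [(binary_condition(r), conditional_condition(r)) for r in rows]
print(checks)
```

The first row satisfies both conditions; the second satisfies the binary relation but violates the conditional one, because alt1_C exceeds 3 while alt1_D is not below 2.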

For this example, neither of the arguments above will be used:

[ ]:
init_design = design.gen_initdesign()
init_design
    alt1_A  alt1_B  alt1_C  alt1_D  alt2_A  alt2_B  alt2_C  alt2_D
0      2.0    15.5     3.0     0.0     1.0    15.0     0.0     2.0
1      3.0    15.5     5.0     2.0     1.0    15.5     0.0     2.0
2      1.0    15.0     3.0     2.0     1.0    15.0     3.0     0.0
3      2.0    10.0     3.0     2.0     3.0    10.0     0.0     1.0
4      3.0    10.0     0.0     0.0     3.0    10.0     5.0     2.0
5      1.0    15.5     3.0     2.0     1.0    15.5     5.0     2.0
6      2.0    15.5     5.0     2.0     2.0    10.0     3.0     0.0
7      3.0    15.0     3.0     0.0     2.0    15.0     5.0     2.0
8      2.0    10.0     0.0     1.0     3.0    10.0     0.0     2.0
9      3.0    15.0     0.0     0.0     2.0    10.0     3.0     1.0
10     1.0    15.5     0.0     1.0     1.0    15.5     0.0     0.0
11     3.0    15.5     0.0     1.0     3.0    15.5     3.0     0.0
12     3.0    15.0     5.0     0.0     2.0    15.5     5.0     1.0
13     2.0    10.0     5.0     2.0     2.0    15.5     5.0     1.0
14     2.0    15.0     0.0     1.0     3.0    15.0     3.0     0.0
15     1.0    15.0     5.0     1.0     3.0    10.0     3.0     1.0
16     1.0    10.0     3.0     1.0     2.0    15.0     0.0     1.0
17     1.0    10.0     5.0     0.0     1.0    15.0     5.0     0.0
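In the output above, each level of alt1_A and alt1_B appears exactly six times across the 18 choice situations. Whether gen_initdesign() enforces such level balance in general is an assumption; the sketch below only shows one simple way to build a level-balanced random column with the standard library. The helper name balanced_column is hypothetical.

```python
import random


def balanced_column(levels, ncs, rng):
    """Return ncs values in which every level appears equally often, in random order."""
    if ncs % len(levels) != 0:
        raise ValueError("ncs must be a multiple of the number of levels")
    column = list(levels) * (ncs // len(levels))
    rng.shuffle(column)
    return column


rng = random.Random(0)  # fixed seed so the result is reproducible
levels = {"alt1_A": [1, 2, 3], "alt1_B": [10, 15, 15.5]}
init = {name: balanced_column(lv, 18, rng) for name, lv in levels.items()}

for name, column in init.items():
    print(name, column)
```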

Step 3: Set the utility functions

ChoiceDesign uses a native expression system to define utility functions. Parameters and attributes are combined using standard arithmetic operators. For this, we use the Parameter class, which requires the following arguments:

  • name: The parameter name

  • prior: The prior value

The following lines define four parameters:

[ ]:
beta_A = Parameter('beta_A',-0.1)
beta_B = Parameter('beta_B',-0.02)
beta_C = Parameter('beta_C',0.1)
beta_D = Parameter('beta_D',0.15)

Then, the utility functions are defined using standard arithmetic operators. We will assume a linear utility function for each alternative.

[ ]:
V1 = beta_A * alt1_A + beta_B * alt1_B + beta_C * alt1_C + beta_D * alt1_D
V2 = beta_A * alt2_A + beta_B * alt2_B + beta_C * alt2_C + beta_D * alt2_D

The utility functions must be stored in a dictionary. Each key is a consecutive integer from 1 to the number of alternatives, and each value is the corresponding utility function:

[ ]:
V = {1: V1, 2: V2}
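To see what these utility functions imply under the priors, the following sketch evaluates them by hand for choice situation 0 of the initial design printed earlier and converts the utilities into MNL choice probabilities. It uses only the standard library rather than the ChoiceDesign expression system; the dictionary layout and the helper name utility are assumptions made for illustration.

```python
import math

# Priors from the Parameter definitions above
priors = {"A": -0.1, "B": -0.02, "C": 0.1, "D": 0.15}

# Attribute levels of choice situation 0 in the initial design
alt1 = {"A": 2.0, "B": 15.5, "C": 3.0, "D": 0.0}
alt2 = {"A": 1.0, "B": 15.0, "C": 0.0, "D": 2.0}


def utility(x):
    # Linear-in-parameters utility: sum of prior * attribute level
    return sum(priors[k] * x[k] for k in priors)


v = [utility(alt1), utility(alt2)]
exp_v = [math.exp(u) for u in v]
probs = [e / sum(exp_v) for e in exp_v]  # MNL (softmax) probabilities

print(v)
print(probs)
```

The utilities are -0.21 and -0.10, which give choice probabilities of roughly 0.47 and 0.53, so this choice situation is close to balanced.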

Step 4: Optimise the initial design, given the utility functions and priors

The method optimise() starts the D-error minimisation routine, given the initial design matrix and the utility functions. This method requires the following parameters:

  • init_design: The initial design matrix to optimise

  • V: The dictionary object with utility functions

  • model: The base model of the efficient design. Defaults to 'mnl' (Multinomial Logit).

In addition, optimise() admits the following optional parameters:

  • iter_lim: Number of iterations before the algorithm stops

  • noimprov_lim: Number of iterations without improvement before the algorithm stops

  • time_lim: Time (in minutes) before the algorithm stops

  • seed: Random seed

  • verbose: Whether status messages and progress are shown

The outputs of optimise are:

  • optimal_design: The optimised design matrix

  • init_perf: The initial D-Error

  • final_perf: The D-error of the last stored design

  • final_iter: The last iteration number

  • ubalance_ratio: The utility balance ratio. A 0% value indicates strict dominance of an alternative, whereas 100% indicates equal market shares.
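The utility balance ratio can be illustrated with one common definition: for each choice situation, multiply the ratios of each choice probability to the equal-share probability 1/J, then average over choice situations. Whether ChoiceDesign uses exactly this formula is an assumption; the sketch only reproduces the two boundary cases described above. With two alternatives the per-situation value reduces to 4 * P1 * P2.

```python
def utility_balance(prob_rows):
    """Average over choice situations of prod_j (P_j / (1/J)), as a percentage."""
    total = 0.0
    for probs in prob_rows:
        n_alts = len(probs)
        balance = 1.0
        for p in probs:
            balance *= p * n_alts  # same as p / (1 / n_alts)
        total += balance
    return 100.0 * total / len(prob_rows)


equal_shares = utility_balance([[0.5, 0.5]])
strict_dominance = utility_balance([[1.0, 0.0]])
print(equal_shares, strict_dominance)
```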

The following line runs the optimisation routine with a time limit of 1 minute:

[ ]:
optimal_design, init_perf, final_perf, final_iter, ubalance_ratio = design.optimise(
    init_design=init_design,
    V=V,
    model='mnl',
    time_lim=1,
    verbose=True)
Evaluating initial design
Optimization complete 0:00:59 / D-error: 0.034223
Elapsed time: 0:01:00
D-error of initial design:  0.080275
D-error of last stored design:  0.034223
Utility Balance ratio:  95.08 %
Algorithm iterations:  50290
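The introduction describes the optimiser as a variation of the random swapping algorithm [1]. The sketch below is a minimal NumPy version of the basic idea, not the ChoiceDesign implementation: pick a random column, swap the levels of two random rows, and keep the swap only if the D-error decreases. The function names, the default limits and the generated starting design are all assumptions made for illustration. Columns are assumed to be ordered alternative by alternative, with the same generic parameters for every alternative.

```python
import numpy as np


def d_error(design, beta, n_alts):
    """D-error of an MNL design with generic parameters."""
    k = beta.size
    x = design.reshape(len(design), n_alts, k)             # (situations, alts, params)
    v = x @ beta                                           # utilities
    p = np.exp(v - v.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)                      # MNL probabilities
    centred = x - (p[:, :, None] * x).sum(axis=1, keepdims=True)
    info = np.einsum("sj,sjk,sjl->kl", p, centred, centred)  # information matrix
    sign, logdet = np.linalg.slogdet(info)
    if sign <= 0:
        return np.inf                                      # singular information matrix
    return float(np.exp(-logdet / k))                      # det(inverse)^(1/K)


def random_swap(design, beta, n_alts, iter_lim=2000, noimprov_lim=500, seed=0):
    rng = np.random.default_rng(seed)
    best = design.copy()
    best_err = d_error(best, beta, n_alts)
    stale = 0
    for _ in range(iter_lim):
        if stale >= noimprov_lim:
            break
        col = rng.integers(best.shape[1])
        r1, r2 = rng.choice(best.shape[0], size=2, replace=False)
        cand = best.copy()
        cand[[r1, r2], col] = cand[[r2, r1], col]          # swap two levels in one column
        err = d_error(cand, beta, n_alts)
        if err < best_err:
            best, best_err, stale = cand, err, 0
        else:
            stale += 1
    return best, best_err


# Hypothetical level-balanced starting design with the attribute levels used above
rng = np.random.default_rng(1)
levels = [[1, 2, 3], [10, 15, 15.5], [0, 3, 5], [0, 1, 2]] * 2
start = np.column_stack(
    [rng.permutation(np.repeat(lv, 6)) for lv in levels]
).astype(float)
beta = np.array([-0.1, -0.02, 0.1, 0.15])

init_err = d_error(start, beta, 2)
final, final_err = random_swap(start, beta, 2)
print(init_err, final_err)
```

Because a swap only exchanges two values within one column, the level frequencies of every attribute are preserved throughout the search.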

Blocking the design

The optimal design can be blocked using the method gen_blocks(). This method randomly creates candidate blocks and keeps the one with the minimum correlation between the blocking column and all the attributes. The method allows for the following arguments:

  • optimal_design: the experimental design

  • n_blocks: number of blocks.

  • n_iter (optional): number of iterations of the blocking algorithm
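The blocking procedure described above can be sketched in NumPy as follows. This is not the gen_blocks() implementation: the function name random_blocks, the equal block sizes and the use of the largest absolute correlation as the score are assumptions, since the text does not say how the correlations with the individual attributes are aggregated. The example design is generated at random for illustration.

```python
import numpy as np


def random_blocks(design, n_blocks, n_iter=1000, seed=0):
    """Try n_iter random balanced block columns; keep the least correlated one."""
    ncs = design.shape[0]
    if ncs % n_blocks != 0:
        raise ValueError("number of choice situations must be a multiple of n_blocks")
    rng = np.random.default_rng(seed)
    base = np.repeat(np.arange(1, n_blocks + 1), ncs // n_blocks)
    best, best_score = None, np.inf
    for _ in range(n_iter):
        cand = rng.permutation(base)
        # Absolute correlation between the block column and each attribute column
        corr = [
            abs(np.corrcoef(cand, design[:, c])[0, 1])
            for c in range(design.shape[1])
        ]
        score = max(corr)
        if score < best_score:
            best, best_score = cand, score
    return best, best_score


# Hypothetical 18-row design with three attributes
rng = np.random.default_rng(42)
levels = [[1, 2, 3], [10, 15, 15.5], [0, 3, 5]]
example = np.column_stack(
    [rng.permutation(np.repeat(lv, 6)) for lv in levels]
).astype(float)

blocks, score = random_blocks(example, n_blocks=3)
print(blocks)
print(score)
```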

The following line creates 3 blocks in the optimal design:

[ ]:
optimal_design_blocked = design.gen_blocks(optimal_design,n_blocks=3)

Lastly, the optimal design can be printed:

[ ]:
optimal_design_blocked
        CS  alt1_A  alt1_B  alt1_C  alt1_D  alt2_A  alt2_B  alt2_C  alt2_D   Block
0      1.0     2.0    15.5     5.0     0.0     2.0    10.0     0.0     2.0       2
1      2.0     3.0    10.0     5.0     1.0     1.0    15.5     0.0     1.0       2
2      3.0     2.0    15.5     5.0     1.0     2.0    10.0     0.0     1.0       3
3      4.0     1.0    10.0     3.0     0.0     3.0    15.5     3.0     2.0       1
4      5.0     1.0    15.0     0.0     0.0     3.0    15.0     5.0     2.0       1
5      6.0     3.0    15.5     5.0     1.0     1.0    10.0     0.0     1.0       2
6      7.0     3.0    10.0     5.0     2.0     1.0    15.5     0.0     0.0       3
7      8.0     3.0    10.0     3.0     0.0     1.0    15.5     3.0     2.0       3
8      9.0     2.0    15.0     0.0     0.0     2.0    15.0     5.0     2.0       2
9     10.0     2.0    15.0     0.0     2.0     2.0    15.0     5.0     0.0       1
10    11.0     3.0    15.0     0.0     2.0     1.0    15.0     5.0     0.0       1
11    12.0     1.0    15.5     0.0     2.0     3.0    10.0     5.0     0.0       2
12    13.0     3.0    15.5     3.0     1.0     1.0    10.0     3.0     1.0       3
13    14.0     1.0    10.0     3.0     2.0     3.0    15.5     3.0     0.0       2
14    15.0     2.0    15.0     0.0     1.0     2.0    15.0     5.0     1.0       1
15    16.0     1.0    15.5     5.0     1.0     3.0    10.0     0.0     1.0       3
16    17.0     2.0    15.0     3.0     0.0     2.0    15.0     3.0     2.0       1
17    18.0     1.0    10.0     3.0     2.0     3.0    15.5     3.0     0.0       3

(optional) Evaluate the design

The method evaluate() evaluates a design stored in a data frame, under the specification provided when EffDesign was initialised. evaluate() requires the following parameters:

  • optimal_design: The design matrix to evaluate

  • V: The dictionary object with utility functions

  • model: The base model of the efficient design. Defaults to 'mnl' (Multinomial Logit).

[ ]:
perf, ubalance = design.evaluate(optimal_design,V,model='mnl')

print(perf, ubalance)
0.03422292145964665 95.07533532170814

Export the design

The method export_design() writes the optimised design to an Excel file in which each row is an attribute and each column is an alternative. If the blocked design is passed, one sheet per block is created automatically; here the unblocked version is exported for simplicity.

[ ]:
attr_names = {
    'alt1_A': 'Attribute A', 'alt2_A': 'Attribute A',
    'alt1_B': 'Attribute B', 'alt2_B': 'Attribute B',
    'alt1_C': 'Attribute C', 'alt2_C': 'Attribute C',
    'alt1_D': 'Attribute D', 'alt2_D': 'Attribute D',
}
design.export_design(optimal_design, attr_names, 'rum_simple_design.xlsx')
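The layout described above (one row per attribute, one column per alternative) can be pictured for a single choice situation with plain Python. The sketch below does not reproduce the file written by export_design(); it only regroups the first row of the optimised design shown earlier using the attr_names mapping. The variable names row and card, and the convention that the text before the first underscore identifies the alternative, are assumptions made for illustration.

```python
attr_names = {
    'alt1_A': 'Attribute A', 'alt2_A': 'Attribute A',
    'alt1_B': 'Attribute B', 'alt2_B': 'Attribute B',
    'alt1_C': 'Attribute C', 'alt2_C': 'Attribute C',
    'alt1_D': 'Attribute D', 'alt2_D': 'Attribute D',
}

# First choice situation of the optimised design printed earlier
row = {
    'alt1_A': 2.0, 'alt1_B': 15.5, 'alt1_C': 5.0, 'alt1_D': 0.0,
    'alt2_A': 2.0, 'alt2_B': 10.0, 'alt2_C': 0.0, 'alt2_D': 2.0,
}

# Group values by attribute label (rows) and alternative (columns)
card = {}
for column, value in row.items():
    alt = column.split('_')[0]
    card.setdefault(attr_names[column], {})[alt] = value

print(f"{'':<12} {'alt1':>6} {'alt2':>6}")
for label, values in card.items():
    print(f"{label:<12} {values['alt1']:>6} {values['alt2']:>6}")
```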

Save the optimisation summary

After calling optimise(), the method export_output() writes a plain-text summary of the optimisation run to a file. The summary covers the design configuration, stopping criteria, criterion values, utility balance, elapsed time, and iteration count.

[ ]:
design.export_output('rum_simple_output.txt')

References

[1] Quan, W., Rose, J. M., Collins, A. T., & Bliemer, M. C. (2011). A comparison of algorithms for generating efficient choice experiments.