Example of a D-efficient RUM design with ChoiceDesign
This notebook illustrates how to use ChoiceDesign to generate a simple D-efficient experimental design for a Random Utility Maximisation (RUM) model. Given a set of attributes and prior parameters, ChoiceDesign uses a variation of the random swapping algorithm [1] to minimise the D-error of the information matrix of a Multinomial Logit (MNL) model.
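To make the optimisation target concrete, the following is a minimal sketch of the textbook D-error of an MNL model with generic parameters: the determinant of the inverse information matrix, raised to the power 1/K for K parameters. It is written in plain Python under the assumption that no extra scaling is applied; ChoiceDesign's internal computation may differ in normalisation or other details. All function names and the toy design are hypothetical.

```python
import math

def mnl_probabilities(utilities):
    """MNL choice probabilities for the alternatives of one choice situation."""
    shift = max(utilities)  # subtract the maximum for numerical stability
    exp_v = [math.exp(v - shift) for v in utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def information_matrix(design, priors):
    """Fisher information of an MNL model with generic parameters.

    design: list of choice situations; each situation is a list of
            alternatives; each alternative is a list of K attribute values.
    priors: list of K prior parameter values.
    """
    k = len(priors)
    info = [[0.0] * k for _ in range(k)]
    for situation in design:
        utilities = [sum(b * x for b, x in zip(priors, alt)) for alt in situation]
        probs = mnl_probabilities(utilities)
        # Probability-weighted mean of each attribute within the situation
        x_bar = [sum(p * alt[i] for p, alt in zip(probs, situation)) for i in range(k)]
        for p, alt in zip(probs, situation):
            dev = [alt[i] - x_bar[i] for i in range(k)]
            for r in range(k):
                for c in range(k):
                    info[r][c] += p * dev[r] * dev[c]
    return info

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0  # treated as singular
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return det

def d_error(design, priors):
    """det(I^-1)^(1/K), computed as det(I)^(-1/K); infinite if I is singular."""
    det = determinant(information_matrix(design, priors))
    return math.inf if det <= 0 else det ** (-1.0 / len(priors))

# Hypothetical toy design: 2 choice situations, 2 alternatives, 2 attributes
toy_design = [
    [[1, 0], [0, 0]],
    [[0, 1], [0, 0]],
]
toy_d_error = d_error(toy_design, priors=[0.0, 0.0])
print(toy_d_error)
```

A lower D-error means the parameters can be estimated more precisely from the same number of choice situations, which is why the design search tries to minimise it.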
Step 1: Load modules, define design parameters and set attributes
The following lines load:
- EffDesign: the class of efficient designs,
- Attribute and Parameter: the classes of attributes and parameters, respectively.
[ ]:
from choicedesign.design import EffDesign
from choicedesign.expressions import Attribute, Parameter
Each attribute is defined by the Attribute class. The arguments of this class are:
- name: a string with the attribute name,
- levels: a list of levels of the attribute.
Each attribute is alternative-specific. Hence, attributes must be defined for each alternative that contains them.
The following lines define 2 alternatives, named alt1 and alt2, and 4 attributes named from \(A\) to \(D\):
[ ]:
alt1_A = Attribute('alt1_A',[1,2,3])
alt1_B = Attribute('alt1_B',[10,15,15.5])
alt1_C = Attribute('alt1_C',[0,3,5])
alt1_D = Attribute('alt1_D',[0,1,2])
alt2_A = Attribute('alt2_A',[1,2,3])
alt2_B = Attribute('alt2_B',[10,15,15.5])
alt2_C = Attribute('alt2_C',[0,3,5])
alt2_D = Attribute('alt2_D',[0,1,2])
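As a rough illustration of why an efficient design is useful here, the sketch below enumerates the full factorial of the eight attributes with the standard library. It does not use ChoiceDesign; the dictionary simply restates the levels defined above.

```python
from itertools import product

# Levels of the eight alternative-specific attributes defined above
levels = {
    'alt1_A': [1, 2, 3], 'alt1_B': [10, 15, 15.5],
    'alt1_C': [0, 3, 5], 'alt1_D': [0, 1, 2],
    'alt2_A': [1, 2, 3], 'alt2_B': [10, 15, 15.5],
    'alt2_C': [0, 3, 5], 'alt2_D': [0, 1, 2],
}

# Every combination of levels is one candidate choice situation
full_factorial = list(product(*levels.values()))
n_candidates = len(full_factorial)
print(n_candidates)  # 3**8 = 6561
```

With 6561 candidate choice situations, presenting all of them to respondents is impractical, so a small subset has to be selected.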
Step 2: Construct efficient design object and generate initial design matrix
The second step consists of constructing the experimental design object, which requires the following parameters:
- X: a list of Attribute class elements,
- ncs: the number of choice situations.
The following lines define an object named design using EffDesign with 18 choice situations:
[ ]:
design = EffDesign(
    X=[alt1_A, alt1_B, alt1_C, alt1_D,
       alt2_A, alt2_B, alt2_C, alt2_D],
    ncs=18)
After the design object is defined, the method gen_initdesign() generates the initial design matrix. This method accepts the following optional parameters:
- cond: list of conditions that the final design must satisfy. Each element is a string that contains a single condition. Conditions can be binary relations (e.g., X > Y, where X and Y are attributes of a specific alternative) or conditional relations (e.g., if X > a then Y < b, where a and b are values). Within a conditional relation, multiple conditions can be combined with the operator &.
- seed: random seed.
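To clarify what the two kinds of conditions mean, here is a small plain-Python sketch that checks them on a list of choice situations. It does not parse ChoiceDesign's condition strings; the functions, the chosen thresholds and the example rows are hypothetical and only paraphrase the logic.

```python
def binary_relation(row):
    # Plain-Python reading of a binary relation between two attributes,
    # here: alt1_B must be greater than alt2_B
    return row['alt1_B'] > row['alt2_B']

def conditional_relation(row):
    # Plain-Python reading of "if alt1_A > 2 then alt1_D < 2":
    # the requirement only applies when the premise holds
    return (not row['alt1_A'] > 2) or row['alt1_D'] < 2

def design_holds(rows, conditions):
    """True if every choice situation satisfies every condition."""
    return all(condition(row) for row in rows for condition in conditions)

# Hypothetical choice situations built from the attribute levels
rows = [
    {'alt1_A': 3, 'alt1_B': 15, 'alt1_D': 1, 'alt2_B': 10},
    {'alt1_A': 1, 'alt1_B': 15.5, 'alt1_D': 2, 'alt2_B': 15},
]
all_hold = design_holds(rows, [binary_relation, conditional_relation])
print(all_hold)
```

Note that a conditional relation is automatically satisfied in any choice situation where its premise is false.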
For this example, neither of the arguments above will be used:
[ ]:
init_design = design.gen_initdesign()
init_design
| | alt1_A | alt1_B | alt1_C | alt1_D | alt2_A | alt2_B | alt2_C | alt2_D |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.0 | 15.5 | 3.0 | 0.0 | 1.0 | 15.0 | 0.0 | 2.0 |
| 1 | 3.0 | 15.5 | 5.0 | 2.0 | 1.0 | 15.5 | 0.0 | 2.0 |
| 2 | 1.0 | 15.0 | 3.0 | 2.0 | 1.0 | 15.0 | 3.0 | 0.0 |
| 3 | 2.0 | 10.0 | 3.0 | 2.0 | 3.0 | 10.0 | 0.0 | 1.0 |
| 4 | 3.0 | 10.0 | 0.0 | 0.0 | 3.0 | 10.0 | 5.0 | 2.0 |
| 5 | 1.0 | 15.5 | 3.0 | 2.0 | 1.0 | 15.5 | 5.0 | 2.0 |
| 6 | 2.0 | 15.5 | 5.0 | 2.0 | 2.0 | 10.0 | 3.0 | 0.0 |
| 7 | 3.0 | 15.0 | 3.0 | 0.0 | 2.0 | 15.0 | 5.0 | 2.0 |
| 8 | 2.0 | 10.0 | 0.0 | 1.0 | 3.0 | 10.0 | 0.0 | 2.0 |
| 9 | 3.0 | 15.0 | 0.0 | 0.0 | 2.0 | 10.0 | 3.0 | 1.0 |
| 10 | 1.0 | 15.5 | 0.0 | 1.0 | 1.0 | 15.5 | 0.0 | 0.0 |
| 11 | 3.0 | 15.5 | 0.0 | 1.0 | 3.0 | 15.5 | 3.0 | 0.0 |
| 12 | 3.0 | 15.0 | 5.0 | 0.0 | 2.0 | 15.5 | 5.0 | 1.0 |
| 13 | 2.0 | 10.0 | 5.0 | 2.0 | 2.0 | 15.5 | 5.0 | 1.0 |
| 14 | 2.0 | 15.0 | 0.0 | 1.0 | 3.0 | 15.0 | 3.0 | 0.0 |
| 15 | 1.0 | 15.0 | 5.0 | 1.0 | 3.0 | 10.0 | 3.0 | 1.0 |
| 16 | 1.0 | 10.0 | 3.0 | 1.0 | 2.0 | 15.0 | 0.0 | 1.0 |
| 17 | 1.0 | 10.0 | 5.0 | 0.0 | 1.0 | 15.0 | 5.0 | 0.0 |
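In the matrix above, every level appears six times in each column (18 choice situations divided by 3 levels). One simple way to obtain a matrix with this property is to repeat the levels of each attribute and shuffle each column independently. The sketch below does exactly that; it is an assumption about how such a starting point could be built, not a description of what gen_initdesign() does internally, and the function name is hypothetical.

```python
import random

def balanced_random_design(levels_by_attribute, ncs, seed=None):
    """Random design in which each level appears equally often per column."""
    rng = random.Random(seed)
    columns = {}
    for name, levels in levels_by_attribute.items():
        if ncs % len(levels) != 0:
            raise ValueError(f"ncs must be a multiple of the levels of {name}")
        column = list(levels) * (ncs // len(levels))
        rng.shuffle(column)  # independent shuffle per attribute
        columns[name] = column
    # One dictionary per choice situation
    return [{name: columns[name][s] for name in columns} for s in range(ncs)]

levels = {
    'alt1_A': [1, 2, 3], 'alt1_B': [10, 15, 15.5],
    'alt1_C': [0, 3, 5], 'alt1_D': [0, 1, 2],
    'alt2_A': [1, 2, 3], 'alt2_B': [10, 15, 15.5],
    'alt2_C': [0, 3, 5], 'alt2_D': [0, 1, 2],
}
sketch_design = balanced_random_design(levels, ncs=18, seed=1)
print(sketch_design[0])
```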
Step 3: Set the utility functions
ChoiceDesign uses a native expression system to define utility functions. Parameters and attributes are combined using standard arithmetic operators. For this, we use the Parameter class, which requires the following arguments:
- name: the parameter name,
- prior: the prior value.
The following lines define four parameters:
[ ]:
beta_A = Parameter('beta_A',-0.1)
beta_B = Parameter('beta_B',-0.02)
beta_C = Parameter('beta_C',0.1)
beta_D = Parameter('beta_D',0.15)
Then, the utility functions are defined using standard arithmetic operators. We will assume a linear utility function for each alternative.
[ ]:
V1 = beta_A * alt1_A + beta_B * alt1_B + beta_C * alt1_C + beta_D * alt1_D
V2 = beta_A * alt2_A + beta_B * alt2_B + beta_C * alt2_C + beta_D * alt2_D
The utility functions must be stored in a dictionary object. In this dictionary, each key is a consecutive number from 1 to the number of alternatives. The value of each key is the corresponding utility function:
[ ]:
V = {1: V1, 2: V2}
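As a worked example of what these utility functions imply, the sketch below evaluates V1 and V2 by hand for the first row of the initial design shown in Step 2 and converts them into MNL choice probabilities. It uses plain Python numbers rather than ChoiceDesign expressions; the helper name is hypothetical, and a different random initial design would give different values.

```python
import math

# Priors defined above, keyed by attribute letter
priors = {'A': -0.1, 'B': -0.02, 'C': 0.1, 'D': 0.15}

# First choice situation of the initial design shown in Step 2
alt1 = {'A': 2.0, 'B': 15.5, 'C': 3.0, 'D': 0.0}
alt2 = {'A': 1.0, 'B': 15.0, 'C': 0.0, 'D': 2.0}

def linear_utility(attributes, betas):
    """Sum of prior times attribute level over all attributes."""
    return sum(betas[key] * attributes[key] for key in betas)

v1 = linear_utility(alt1, priors)  # about -0.21
v2 = linear_utility(alt2, priors)  # about -0.10

# MNL probability of choosing alternative 1
p1 = math.exp(v1) / (math.exp(v1) + math.exp(v2))
print(round(v1, 3), round(v2, 3), round(p1, 3))
```

Under these priors the two alternatives are chosen with similar probability in this choice situation (roughly 47% against 53%).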
Step 4: Optimise the initial design, given the utility functions and priors
The method optimise() starts the D-error minimisation routine, given the initial design matrix and the utility functions. This method requires the following parameters:
- init_design: the design matrix to optimise,
- V: the dictionary object with utility functions,
- model: the base model of the efficient design. Defaults to 'mnl' for a Multinomial Logit model.
In addition, optimise() admits the following optional parameters:
- iter_lim: number of iterations before the algorithm stops,
- noimprov_lim: number of iterations without improvement before the algorithm stops,
- time_lim: time (in minutes) before the algorithm stops,
- seed: random seed,
- verbose: whether status messages and progress are shown.
The outputs of optimise are:
- optimal_design: the optimised design matrix,
- init_perf: the initial D-error,
- final_perf: the D-error of the last stored design,
- final_iter: the last iteration number,
- ubalance_ratio: the utility balance ratio. A 0% value indicates strict dominance of an alternative, whereas 100% indicates equal market shares.
The following line runs the optimisation routine for 1 minute:
[ ]:
optimal_design, init_perf, final_perf, final_iter, ubalance_ratio = design.optimise(
    init_design=init_design,
    V=V,
    model='mnl',
    time_lim=1,
    verbose=True)
Evaluating initial design
Optimization complete 0:00:59 / D-error: 0.034223
Elapsed time: 0:01:00
D-error of initial design: 0.080275
D-error of last stored design: 0.034223
Utility Balance ratio: 95.08 %
Algorithm iterations: 50290
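The general shape of a random swapping search, with the three stopping rules described above, can be sketched as follows. This is a generic illustration and not ChoiceDesign's implementation: the function name, the in-column swap rule and the toy criterion are assumptions, and in practice the criterion would be the D-error of the design.

```python
import random
import time

def random_swap(design, criterion, iter_lim=None, noimprov_lim=None,
                time_lim=None, seed=None):
    """Minimise criterion(design) by swapping two values within a column.

    design: list of rows (lists of numbers); it is copied, not modified.
    time_lim is given in minutes. At least one limit must be set.
    Returns (best_design, best_value, iterations).
    """
    if iter_lim is None and noimprov_lim is None and time_lim is None:
        raise ValueError("set at least one stopping rule")
    rng = random.Random(seed)
    best = [row[:] for row in design]
    best_value = criterion(best)
    n_rows, n_cols = len(best), len(best[0])
    start = time.monotonic()
    iteration = 0
    stalled = 0
    while True:
        if iter_lim is not None and iteration >= iter_lim:
            break
        if noimprov_lim is not None and stalled >= noimprov_lim:
            break
        if time_lim is not None and time.monotonic() - start >= time_lim * 60:
            break
        iteration += 1
        col = rng.randrange(n_cols)
        r1, r2 = rng.sample(range(n_rows), 2)
        if best[r1][col] == best[r2][col]:
            stalled += 1  # swapping equal values changes nothing
            continue
        best[r1][col], best[r2][col] = best[r2][col], best[r1][col]
        value = criterion(best)
        if value < best_value:
            best_value = value  # keep the improving swap
            stalled = 0
        else:
            # Undo the swap and count a non-improving iteration
            best[r1][col], best[r2][col] = best[r2][col], best[r1][col]
            stalled += 1
    return best, best_value, iteration

# Toy example: two identical columns and a stand-in criterion
start_design = [[-1, -1], [-1, -1], [0, 0], [0, 0], [1, 1], [1, 1]]

def toy_criterion(d):
    return abs(sum(row[0] * row[1] for row in d))

best_design, best_value, n_iter = random_swap(
    start_design, toy_criterion, iter_lim=500, seed=0)
print(best_value, n_iter)
```

Because values are only exchanged within a column, the number of times each level appears in a column is unchanged by the search.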
Blocking the design
The optimal design can be blocked using the method gen_blocks(). This method randomly creates candidate blocks and keeps the one with the minimum correlation between the blocking column and all the attributes. The method allows for the following arguments:
- optimal_design: the experimental design,
- n_blocks: number of blocks,
- n_iter (optional): number of iterations of the blocking algorithm.
The following line creates 3 blocks in the optimal design:
[ ]:
optimal_design_blocked = design.gen_blocks(optimal_design,n_blocks=3)
Lastly, the optimal design can be printed:
[ ]:
optimal_design_blocked
| | CS | alt1_A | alt1_B | alt1_C | alt1_D | alt2_A | alt2_B | alt2_C | alt2_D | Block |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 2.0 | 15.5 | 5.0 | 0.0 | 2.0 | 10.0 | 0.0 | 2.0 | 2 |
| 1 | 2.0 | 3.0 | 10.0 | 5.0 | 1.0 | 1.0 | 15.5 | 0.0 | 1.0 | 2 |
| 2 | 3.0 | 2.0 | 15.5 | 5.0 | 1.0 | 2.0 | 10.0 | 0.0 | 1.0 | 3 |
| 3 | 4.0 | 1.0 | 10.0 | 3.0 | 0.0 | 3.0 | 15.5 | 3.0 | 2.0 | 1 |
| 4 | 5.0 | 1.0 | 15.0 | 0.0 | 0.0 | 3.0 | 15.0 | 5.0 | 2.0 | 1 |
| 5 | 6.0 | 3.0 | 15.5 | 5.0 | 1.0 | 1.0 | 10.0 | 0.0 | 1.0 | 2 |
| 6 | 7.0 | 3.0 | 10.0 | 5.0 | 2.0 | 1.0 | 15.5 | 0.0 | 0.0 | 3 |
| 7 | 8.0 | 3.0 | 10.0 | 3.0 | 0.0 | 1.0 | 15.5 | 3.0 | 2.0 | 3 |
| 8 | 9.0 | 2.0 | 15.0 | 0.0 | 0.0 | 2.0 | 15.0 | 5.0 | 2.0 | 2 |
| 9 | 10.0 | 2.0 | 15.0 | 0.0 | 2.0 | 2.0 | 15.0 | 5.0 | 0.0 | 1 |
| 10 | 11.0 | 3.0 | 15.0 | 0.0 | 2.0 | 1.0 | 15.0 | 5.0 | 0.0 | 1 |
| 11 | 12.0 | 1.0 | 15.5 | 0.0 | 2.0 | 3.0 | 10.0 | 5.0 | 0.0 | 2 |
| 12 | 13.0 | 3.0 | 15.5 | 3.0 | 1.0 | 1.0 | 10.0 | 3.0 | 1.0 | 3 |
| 13 | 14.0 | 1.0 | 10.0 | 3.0 | 2.0 | 3.0 | 15.5 | 3.0 | 0.0 | 2 |
| 14 | 15.0 | 2.0 | 15.0 | 0.0 | 1.0 | 2.0 | 15.0 | 5.0 | 1.0 | 1 |
| 15 | 16.0 | 1.0 | 15.5 | 5.0 | 1.0 | 3.0 | 10.0 | 0.0 | 1.0 | 3 |
| 16 | 17.0 | 2.0 | 15.0 | 3.0 | 0.0 | 2.0 | 15.0 | 3.0 | 2.0 | 1 |
| 17 | 18.0 | 1.0 | 10.0 | 3.0 | 2.0 | 3.0 | 15.5 | 3.0 | 0.0 | 3 |
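The idea behind random blocking can be sketched in plain Python: draw many candidate block columns and keep the one that is least correlated with the attribute columns. The sketch assumes equally sized blocks and aggregates the correlations as a sum of absolute Pearson correlations; both choices, and the function names, are assumptions rather than a description of gen_blocks(). For brevity only the alt1_A and alt1_B columns of the table above are used.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either column is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def best_blocking(columns, n_blocks, n_iter=1000, seed=None):
    """Keep the random block column least correlated with the attributes."""
    rng = random.Random(seed)
    ncs = len(columns[0])
    base = [(i % n_blocks) + 1 for i in range(ncs)]  # equal block sizes
    best_blocks, best_score = None, math.inf
    for _ in range(n_iter):
        candidate = base[:]
        rng.shuffle(candidate)
        score = sum(abs(pearson(candidate, col)) for col in columns)
        if score < best_score:
            best_blocks, best_score = candidate, score
    return best_blocks, best_score

# alt1_A and alt1_B columns of the optimised design shown above
alt1_A = [2, 3, 2, 1, 1, 3, 3, 3, 2, 2, 3, 1, 3, 1, 2, 1, 2, 1]
alt1_B = [15.5, 10, 15.5, 10, 15, 15.5, 10, 10, 15,
          15, 15, 15.5, 15.5, 10, 15, 15.5, 15, 10]

blocks, score = best_blocking([alt1_A, alt1_B], n_blocks=3, n_iter=500, seed=0)
print(blocks, round(score, 4))
```

A block column that is close to uncorrelated with the attributes means that each respondent subgroup sees a similar mix of attribute levels.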
(optional) Evaluate the design
The method evaluate() evaluates a design stored in a data frame, under the specification provided when EffDesign was initialised. evaluate() requires the following parameters:
- optimal_design: the design matrix to evaluate,
- V: the dictionary object with utility functions,
- model: the base model of the efficient design. Defaults to 'mnl' for a Multinomial Logit model.
[ ]:
perf, ubalance = design.evaluate(optimal_design,V,model='mnl')
print(perf, ubalance)
0.03422292145964665 95.07533532170814
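The second value is the utility balance ratio. One common definition, which matches the interpretation given earlier (0% for strict dominance, 100% for equal market shares), averages over choice situations the product of each choice probability divided by 1/J, where J is the number of alternatives. The sketch below implements that definition in plain Python; whether ChoiceDesign uses exactly this formula is an assumption, and the utilities in the example are hypothetical.

```python
import math

def utility_balance(utilities_per_situation):
    """Average utility balance in percent over all choice situations."""
    ratios = []
    for utilities in utilities_per_situation:
        n_alts = len(utilities)
        shift = max(utilities)  # for numerical stability
        exp_v = [math.exp(v - shift) for v in utilities]
        total = sum(exp_v)
        ratio = 1.0
        for e in exp_v:
            # Probability relative to the equal-share probability 1/J
            ratio *= (e / total) * n_alts
        ratios.append(ratio)
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical utilities (V1, V2) for three choice situations
example = [(-0.21, -0.10), (0.05, 0.05), (-0.40, 0.30)]
balance = utility_balance(example)
print(round(balance, 2))
```

Equal utilities in a choice situation contribute 100%, and the contribution falls towards 0% as one alternative becomes dominant.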
Export the design
Export the optimised design to an Excel file. Each row is an attribute and each column is an alternative. Because the design has been blocked, one sheet per block will be created automatically if you pass the blocked design; here we export the unblocked version for simplicity.
[ ]:
attr_names = {
'alt1_A': 'Attribute A', 'alt2_A': 'Attribute A',
'alt1_B': 'Attribute B', 'alt2_B': 'Attribute B',
'alt1_C': 'Attribute C', 'alt2_C': 'Attribute C',
'alt1_D': 'Attribute D', 'alt2_D': 'Attribute D',
}
design.export_design(optimal_design, attr_names, 'rum_simple_design.xlsx')
Save the optimisation summary
After calling optimise(), the method export_output() writes a plain-text summary of the optimisation run to a file. The summary covers the design configuration, stopping criteria, criterion values, utility balance, elapsed time, and iteration count.
[ ]:
design.export_output('rum_simple_output.txt')
References
[1] Quan, W., Rose, J. M., Collins, A. T., & Bliemer, M. C. (2011). A comparison of algorithms for generating efficient choice experiments.