Example of a D-efficient RUM design with conditions in ChoiceDesign
This notebook illustrates how to use ChoiceDesign to generate a D-efficient experimental design with conditions for a Random Utility Maximisation (RUM) model. Given a set of attributes and prior parameters, ChoiceDesign uses a variation of the random swapping algorithm [1] to minimise the D-error of the information matrix of a Multinomial Logit (MNL) model.
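For reference, the D-error is usually defined as below. The notation (\(\Omega\), \(I\), \(K\), \(P_{js}\), \(x_{js}\)) is ours, and we assume that ChoiceDesign follows this standard definition for an MNL model with linear utilities and generic parameters:

\[
\text{D-error} = \det\left(\Omega(\beta, X)\right)^{1/K}, \qquad \Omega(\beta, X) = I(\beta, X)^{-1},
\]

\[
I(\beta, X) = \sum_{s=1}^{S} \sum_{j=1}^{J} P_{js}\,(x_{js} - \bar{x}_s)(x_{js} - \bar{x}_s)^{\top}, \qquad \bar{x}_s = \sum_{j=1}^{J} P_{js}\, x_{js},
\]

where \(S\) is the number of choice situations, \(J\) the number of alternatives, \(K\) the number of parameters, \(x_{js}\) the attribute vector of alternative \(j\) in choice situation \(s\), and \(P_{js}\) the MNL choice probability evaluated at the prior parameters \(\beta\).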
Step 1: Load modules, define design parameters and set attributes
The following lines load:
- EffDesign: the class of efficient designs,
- Attribute and Parameter: the classes of attributes and parameters, respectively.
[1]:
from choicedesign.design import EffDesign
from choicedesign.expressions import Attribute, Parameter
Each attribute is defined by the Attribute class. The arguments of this class are:
- name: a string with the attribute name,
- levels: a list of levels of the attribute.
Each attribute is alternative-specific. Hence, attributes must be defined for each alternative that contains them.
The following lines define 2 alternatives, named alt1 and alt2, and 4 attributes named from \(A\) to \(D\):
[2]:
alt1_A = Attribute('alt1_A',[1,2,3])
alt1_B = Attribute('alt1_B',[10,15,15.5])
alt1_C = Attribute('alt1_C',[0,3,5])
alt1_D = Attribute('alt1_D',[0,1,2])
alt2_A = Attribute('alt2_A',[1,2,3])
alt2_B = Attribute('alt2_B',[10,15,15.5])
alt2_C = Attribute('alt2_C',[0,3,5])
alt2_D = Attribute('alt2_D',[0,1,2])
Step 2: Construct efficient design object and generate initial design matrix
The second step consists of constructing the experimental design object, which requires the following parameters:
- X: A list of Attribute class elements,
- ncs: The number of choice situations.

The following lines define an object named design using EffDesign with 18 choice situations:
[3]:
design = EffDesign(
    X=[alt1_A, alt1_B, alt1_C, alt1_D,
       alt2_A, alt2_B, alt2_C, alt2_D],
    ncs=18)
After the design object is defined, the method gen_initdesign() generates the initial design matrix. This method accepts the following optional parameters:
- cond: List of conditions that the final design must satisfy. Each element is a string that contains a single condition. The supported forms are:
  - Binary relation: X > Y (attribute vs. attribute, or attribute vs. a numeric value).
  - Conditional: if X > a then Y < b (material implication; the consequent must hold whenever the antecedent does).
  - Compound (AND): X > a & Y < b (all sub-conditions separated by & must hold).
  - Arithmetic expressions: (X + Y + Z) > 0 (any mix of attribute names and numeric constants combined with +, -, *, / and parentheses is valid on either side of a comparison, including inside if/then clauses, e.g. if (X + Y) > 0 then P >= 0).
- seed: Random seed.
We will define four conditions for this design. The last one uses an arithmetic expression. The following lines define the conditions and integrate them into the initial design matrix:
[4]:
cond = ['alt1_A > alt2_A',
'if alt1_B > 10 then alt2_A < 3',
'if alt1_A > 1 then alt2_A < 3',
'if (alt1_C + alt1_D) > 0 then alt1_A > 1'
]
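To see how strongly these conditions restrict the design space, the sketch below enumerates every level combination of the attributes that appear in the conditions and counts those that pass. The conditions are translated by hand into plain Python; the helper names `implies` and `satisfies` are ours, and the sketch does not use ChoiceDesign's own condition parser.

```python
from itertools import product

# Levels of the attributes that appear in the four conditions.
levels = {
    'alt1_A': [1, 2, 3],
    'alt1_B': [10, 15, 15.5],
    'alt1_C': [0, 3, 5],
    'alt1_D': [0, 1, 2],
    'alt2_A': [1, 2, 3],
}

def implies(antecedent, consequent):
    """Material implication: only violated when antecedent holds and consequent fails."""
    return (not antecedent) or consequent

def satisfies(row):
    """Hand-written translation of the four condition strings."""
    return (
        row['alt1_A'] > row['alt2_A']
        and implies(row['alt1_B'] > 10, row['alt2_A'] < 3)
        and implies(row['alt1_A'] > 1, row['alt2_A'] < 3)
        and implies(row['alt1_C'] + row['alt1_D'] > 0, row['alt1_A'] > 1)
    )

names = list(levels)
combos = [dict(zip(names, values)) for values in product(*levels.values())]
feasible = [row for row in combos if satisfies(row)]
print(len(feasible), 'of', len(combos), 'combinations satisfy all four conditions')
```

In this particular example the first condition already implies the other three: it forces alt2_A to be at most 2 and alt1_A to be at least 2, so the consequents of the three conditionals always hold.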
[5]:
init_design = design.gen_initdesign(cond=cond)
init_design
[5]:
| | alt1_A | alt1_B | alt1_C | alt1_D | alt2_A | alt2_B | alt2_C | alt2_D |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.0 | 10.0 | 0.0 | 2.0 | 1.0 | 10.0 | 5.0 | 0.0 |
| 1 | 3.0 | 15.0 | 3.0 | 1.0 | 1.0 | 15.0 | 5.0 | 0.0 |
| 2 | 2.0 | 10.0 | 5.0 | 2.0 | 1.0 | 15.5 | 0.0 | 0.0 |
| 3 | 2.0 | 10.0 | 3.0 | 1.0 | 1.0 | 10.0 | 5.0 | 1.0 |
| 4 | 3.0 | 15.5 | 3.0 | 2.0 | 1.0 | 15.0 | 3.0 | 0.0 |
| 5 | 2.0 | 10.0 | 0.0 | 0.0 | 1.0 | 15.0 | 3.0 | 1.0 |
| 6 | 2.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 1.0 |
| 7 | 2.0 | 10.0 | 5.0 | 0.0 | 1.0 | 15.0 | 0.0 | 2.0 |
| 8 | 3.0 | 10.0 | 0.0 | 2.0 | 2.0 | 10.0 | 5.0 | 2.0 |
| 9 | 2.0 | 10.0 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 0.0 |
| 10 | 2.0 | 10.0 | 0.0 | 1.0 | 1.0 | 15.0 | 5.0 | 1.0 |
| 11 | 3.0 | 15.5 | 3.0 | 0.0 | 2.0 | 15.5 | 0.0 | 1.0 |
| 12 | 3.0 | 10.0 | 5.0 | 1.0 | 1.0 | 10.0 | 5.0 | 0.0 |
| 13 | 3.0 | 15.5 | 5.0 | 1.0 | 1.0 | 15.5 | 0.0 | 2.0 |
| 14 | 2.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 5.0 | 2.0 |
| 15 | 3.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 2.0 |
| 16 | 3.0 | 10.0 | 5.0 | 2.0 | 1.0 | 15.0 | 0.0 | 1.0 |
| 17 | 2.0 | 15.0 | 5.0 | 0.0 | 1.0 | 15.5 | 0.0 | 0.0 |
Step 3: Set the utility functions
ChoiceDesign uses a native expression system to define utility functions. Parameters and attributes are combined using standard arithmetic operators. For this, we use the Parameter class, which requires the following arguments:
- name: The parameter name,
- prior: The prior value.
The following lines define four parameters:
[6]:
beta_A = Parameter('beta_A',-0.1)
beta_B = Parameter('beta_B',-0.02)
beta_C = Parameter('beta_C',0.1)
beta_D = Parameter('beta_D',0.15)
Then, the utility functions are defined using standard arithmetic operators. We will assume a linear utility function for each alternative.
[7]:
V1 = beta_A * alt1_A + beta_B * alt1_B + beta_C * alt1_C + beta_D * alt1_D
V2 = beta_A * alt2_A + beta_B * alt2_B + beta_C * alt2_C + beta_D * alt2_D
The utility functions must be stored in a dictionary object. In this dictionary, each key is a consecutive number from 1 to the number of alternatives, and the value of each key is the corresponding utility function:
[8]:
V = {1: V1, 2: V2}
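As an illustration of what these utility functions imply, the sketch below evaluates both utilities and the resulting MNL choice probabilities for the first row of the initial design shown earlier, using the priors defined above. It uses plain dictionaries rather than ChoiceDesign expressions, and the helper `linear_utility` is a hypothetical name introduced only for this example.

```python
import math

# Prior values of the four generic parameters.
priors = {'A': -0.1, 'B': -0.02, 'C': 0.1, 'D': 0.15}

# Attribute levels of row 0 of the initial design, per alternative.
alt1 = {'A': 2.0, 'B': 10.0, 'C': 0.0, 'D': 2.0}
alt2 = {'A': 1.0, 'B': 10.0, 'C': 5.0, 'D': 0.0}

def linear_utility(levels, betas):
    """Linear-in-parameters utility: sum of beta_k * x_k."""
    return sum(betas[k] * levels[k] for k in betas)

v1 = linear_utility(alt1, priors)
v2 = linear_utility(alt2, priors)

# MNL probabilities: exponentiated utilities normalised to sum to one.
denom = math.exp(v1) + math.exp(v2)
p1 = math.exp(v1) / denom
p2 = math.exp(v2) / denom
print(round(v1, 3), round(v2, 3), round(p1, 3), round(p2, 3))
```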
Step 4: Optimise the initial design, given the utility functions and priors
The method optimise() starts the D-error minimisation routine, given the initial design matrix and the utility functions. This method requires the following parameters:
- init_design: The design matrix to optimise,
- V: The dictionary object with utility functions,
- model: The base model of the efficient design. Defaults to 'mnl' for a Multinomial Logit model.
In addition, optimise() admits the following optional parameters:
- iter_lim: Number of iterations before the algorithm stops,
- noimprov_lim: Number of iterations without improvement before the algorithm stops,
- time_lim: Time (in minutes) before the algorithm stops,
- seed: Random seed,
- verbose: Whether status messages and progress are shown.
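To give an idea of how a swapping search interacts with these stopping criteria, here is a generic sketch. It is not ChoiceDesign's implementation: the function `swap_search` and the toy objective `abs_covariance` are hypothetical, the objective stands in for the D-error, conditions are ignored, and the time limit is left out to keep the example deterministic.

```python
import random

def swap_search(design, objective, iter_lim=1000, noimprov_lim=200, seed=0):
    """Minimise objective(design) by swapping two cells within a random column."""
    rng = random.Random(seed)
    best = [row[:] for row in design]
    best_value = objective(best)
    n_rows, n_cols = len(best), len(best[0])
    since_improvement = 0
    iteration = 0
    while iteration < iter_lim and since_improvement < noimprov_lim:
        iteration += 1
        col = rng.randrange(n_cols)
        r1, r2 = rng.sample(range(n_rows), 2)
        # Swapping within a column keeps the level counts of that attribute.
        candidate = [row[:] for row in best]
        candidate[r1][col], candidate[r2][col] = candidate[r2][col], candidate[r1][col]
        value = objective(candidate)
        if value < best_value:
            best, best_value = candidate, value
            since_improvement = 0
        else:
            since_improvement += 1
    return best, best_value, iteration

def abs_covariance(design):
    """Toy objective: absolute (unnormalised) covariance of the two columns."""
    col_a = [row[0] for row in design]
    col_b = [row[1] for row in design]
    mean_a = sum(col_a) / len(col_a)
    mean_b = sum(col_b) / len(col_b)
    return abs(sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b)))

# Start from two perfectly correlated columns.
start = [[1, 1], [2, 2], [3, 3], [1, 1], [2, 2], [3, 3]]
best, best_value, n_iter = swap_search(start, abs_covariance)
print(best_value, n_iter)
```

Because a candidate is only accepted when it lowers the objective, the stored design never gets worse, which is why the output of optimise() reports the criterion of the last stored design.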
The outputs of optimise are:
- optimal_design: The optimised design matrix,
- init_perf: The initial D-error,
- final_perf: The D-error of the last stored design,
- final_iter: The last iteration number,
- ubalance_ratio: The utility balance ratio. A 0% value indicates strict dominance of an alternative, whereas 100% indicates equal market shares.

The following line runs the optimisation routine for 1 minute:
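The utility balance ratio can be made concrete with a short sketch. We assume one common definition here, the average over choice situations of \(\prod_j P_j / (1/J)\) expressed as a percentage; the exact formula used by ChoiceDesign may differ, and the function name `utility_balance` is ours.

```python
import math

def utility_balance(prob_rows):
    """Average over choice situations of prod_j (P_j * J), as a percentage.

    prob_rows holds one list of choice probabilities per choice situation.
    """
    total = 0.0
    for probs in prob_rows:
        n_alts = len(probs)
        total += math.prod(p * n_alts for p in probs)
    return 100.0 * total / len(prob_rows)

equal_shares = utility_balance([[0.5, 0.5], [0.5, 0.5]])
dominance = utility_balance([[1.0, 0.0], [0.0, 1.0]])
mixed = utility_balance([[0.5, 0.5], [1.0, 0.0]])
print(equal_shares, dominance, mixed)
```

Under this definition, equal probabilities in every choice situation give 100%, and a probability of one for a single alternative in every choice situation gives 0%, in line with the interpretation above.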
[9]:
optimal_design, init_perf, final_perf, final_iter, ubalance_ratio = design.optimise(init_design=init_design,V=V,model='mnl',time_lim = 1, verbose = True)
Evaluating initial design
Optimization complete 0:00:59 / D-error: 0.034146
Elapsed time: 0:01:00
D-error of initial design: 0.052387
D-error of last stored design: 0.034146
Utility Balance ratio: 95.14 %
Algorithm iterations: 46180
Blocking the design
The optimal design can be blocked using the method gen_blocks(). This method randomly creates candidate blocks and keeps the one with the minimum correlation between the blocking column and all the attributes. The method allows for the following arguments:
- optimal_design: The experimental design,
- n_blocks: Number of blocks,
- n_iter (optional): Number of iterations of the blocking algorithm.

The following line creates 3 blocks in the optimal design:
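The idea behind random blocking can be sketched as follows. This is an illustration under our own assumptions rather than the gen_blocks() implementation: blocks are assumed to be of equal size, the criterion is taken to be the sum of absolute Pearson correlations between the blocking column and each attribute column, and the names `pearson` and `random_blocks` are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def random_blocks(design, n_blocks, n_iter=500, seed=0):
    """Try n_iter random balanced block columns and keep the least correlated one."""
    rng = random.Random(seed)
    columns = list(zip(*design))
    # Balanced block labels 1..n_blocks, shuffled on every iteration.
    base = [(i % n_blocks) + 1 for i in range(len(design))]
    best, best_score = None, float('inf')
    for _ in range(n_iter):
        candidate = base[:]
        rng.shuffle(candidate)
        score = sum(abs(pearson(candidate, col)) for col in columns)
        if score < best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy design with two attribute columns and six choice situations.
toy_design = [[1, 10], [2, 15], [3, 15.5], [1, 15], [2, 15.5], [3, 10]]
blocks, score = random_blocks(toy_design, n_blocks=2, n_iter=200)
print(blocks, round(score, 4))
```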
[10]:
optimal_design_blocked = design.gen_blocks(optimal_design,n_blocks=3)
Lastly, the optimal design can be printed:
[11]:
optimal_design
[11]:
| | CS | alt1_A | alt1_B | alt1_C | alt1_D | alt2_A | alt2_B | alt2_C | alt2_D | Block |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 3.0 | 10.0 | 0.0 | 2.0 | 1.0 | 15.0 | 5.0 | 1.0 | 1 |
| 1 | 2.0 | 3.0 | 10.0 | 3.0 | 0.0 | 2.0 | 15.0 | 5.0 | 2.0 | 2 |
| 2 | 3.0 | 2.0 | 10.0 | 5.0 | 1.0 | 1.0 | 15.5 | 0.0 | 1.0 | 3 |
| 3 | 4.0 | 3.0 | 15.5 | 5.0 | 0.0 | 1.0 | 10.0 | 3.0 | 2.0 | 1 |
| 4 | 5.0 | 2.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 0.0 | 2 |
| 5 | 6.0 | 2.0 | 15.0 | 0.0 | 1.0 | 1.0 | 10.0 | 5.0 | 1.0 | 3 |
| 6 | 7.0 | 3.0 | 15.5 | 3.0 | 2.0 | 1.0 | 10.0 | 5.0 | 0.0 | 1 |
| 7 | 8.0 | 3.0 | 10.0 | 3.0 | 0.0 | 1.0 | 15.0 | 3.0 | 2.0 | 1 |
| 8 | 9.0 | 3.0 | 10.0 | 5.0 | 2.0 | 2.0 | 15.5 | 3.0 | 0.0 | 1 |
| 9 | 10.0 | 2.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 0.0 | 3 |
| 10 | 11.0 | 2.0 | 10.0 | 0.0 | 2.0 | 1.0 | 15.5 | 5.0 | 0.0 | 2 |
| 11 | 12.0 | 3.0 | 15.5 | 3.0 | 1.0 | 1.0 | 10.0 | 5.0 | 2.0 | 1 |
| 12 | 13.0 | 2.0 | 10.0 | 0.0 | 2.0 | 1.0 | 15.5 | 5.0 | 0.0 | 2 |
| 13 | 14.0 | 3.0 | 10.0 | 5.0 | 0.0 | 1.0 | 15.0 | 0.0 | 2.0 | 2 |
| 14 | 15.0 | 2.0 | 10.0 | 5.0 | 1.0 | 1.0 | 15.0 | 0.0 | 1.0 | 3 |
| 15 | 16.0 | 2.0 | 15.5 | 5.0 | 1.0 | 1.0 | 10.0 | 0.0 | 1.0 | 2 |
| 16 | 17.0 | 2.0 | 10.0 | 5.0 | 2.0 | 1.0 | 15.0 | 0.0 | 0.0 | 3 |
| 17 | 18.0 | 2.0 | 15.0 | 5.0 | 2.0 | 1.0 | 10.0 | 0.0 | 1.0 | 3 |
(optional) Evaluate the design
The method evaluate() evaluates a design stored in a data frame, under the specification provided when EffDesign was initialised. evaluate() requires the following parameters:
- optimal_design: The design matrix to evaluate,
- V: The dictionary object with utility functions,
- model: The base model of the efficient design. Defaults to 'mnl' for a Multinomial Logit model.
[12]:
perf, ubalance = design.evaluate(optimal_design,V,model='mnl')
print(perf, ubalance)
0.034145726914017696 95.1394849419413
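To illustrate what a D-error evaluation involves, the sketch below computes the D-error of a small MNL design in pure Python. It assumes linear utilities with generic parameters and the usual definition (the determinant of the inverse information matrix raised to the power 1/K); it is not the code behind evaluate(), and the names `mnl_probs`, `determinant` and `d_error` as well as the toy design are ours.

```python
import math

def mnl_probs(utilities):
    """MNL choice probabilities for one choice situation."""
    top = max(utilities)
    expu = [math.exp(u - top) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def d_error(design, betas):
    """design: list of choice situations, each a list of attribute vectors."""
    k = len(betas)
    info = [[0.0] * k for _ in range(k)]
    for situation in design:
        utilities = [sum(b * x for b, x in zip(betas, alt)) for alt in situation]
        probs = mnl_probs(utilities)
        # Probability-weighted mean attribute vector of the choice situation.
        mean = [sum(p * alt[i] for p, alt in zip(probs, situation)) for i in range(k)]
        for p, alt in zip(probs, situation):
            dev = [alt[i] - mean[i] for i in range(k)]
            for i in range(k):
                for j in range(k):
                    info[i][j] += p * dev[i] * dev[j]
    det_info = determinant(info)
    if det_info <= 0:
        return float('inf')  # singular information matrix
    # det(inverse) = 1 / det(info), so the D-error is det(info) ** (-1 / K).
    return det_info ** (-1.0 / k)

# Toy design: 4 choice situations, 2 alternatives, 2 attributes each.
toy = [
    [[2, 0], [1, 5]],
    [[3, 3], [1, 0]],
    [[2, 5], [1, 3]],
    [[3, 0], [2, 5]],
]
toy_d_error = d_error(toy, [-0.1, 0.1])
print(toy_d_error)
```

A design whose information matrix is singular cannot identify all parameters, which the sketch signals with an infinite D-error.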
Export the design
Export the optimised design to Excel. The exported table will reflect the constrained levels produced by the conditions.
[ ]:
attr_names = {
'alt1_A': 'Attribute A', 'alt2_A': 'Attribute A',
'alt1_B': 'Attribute B', 'alt2_B': 'Attribute B',
'alt1_C': 'Attribute C', 'alt2_C': 'Attribute C',
'alt1_D': 'Attribute D', 'alt2_D': 'Attribute D',
}
design.export_design(optimal_design, attr_names, 'rum_conds_design.xlsx')
Save the optimisation summary
After calling optimise(), the method export_output() writes a plain-text summary of the optimisation run — design configuration, stopping criteria, criterion values, utility balance, elapsed time, and iteration count — to a file.
[ ]:
design.export_output('rum_conds_output.txt')
References
[1] Quan, W., Rose, J. M., Collins, A. T., & Bliemer, M. C. (2011). A comparison of algorithms for generating efficient choice experiments.