Example of a D-efficient RUM design with conditions in ChoiceDesign
This notebook illustrates how to use ChoiceDesign to generate a D-efficient experimental design with conditions for a Random Utility Maximisation (RUM) model. Given a set of attributes and prior parameters, ChoiceDesign uses a variation of the random swapping algorithm [1] to minimise the D-error of the information matrix of a Multinomial Logit (MNL) model.
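For reference, a common textbook definition of the D-error for an MNL model with linear utilities is sketched below. The notation is introduced here for illustration only: \(S\) choice situations, \(J\) alternatives, \(K\) parameters, and attribute vectors \(x_{js}\). The source does not state the exact normalisation ChoiceDesign applies (for example, by number of respondents), so treat this as an assumption rather than the library's formula.

```latex
\begin{aligned}
P_{js} &= \frac{\exp(V_{js})}{\sum_{i=1}^{J} \exp(V_{is})},
\qquad V_{js} = \beta^{\top} x_{js}, \\
\bar{x}_{s} &= \sum_{j=1}^{J} P_{js}\, x_{js}, \\
I(\beta) &= \sum_{s=1}^{S} \sum_{j=1}^{J} P_{js}\,
            (x_{js} - \bar{x}_{s})(x_{js} - \bar{x}_{s})^{\top}, \\
\text{D-error} &= \det\!\left( I(\beta)^{-1} \right)^{1/K}
\end{aligned}
```

A lower D-error corresponds to smaller expected standard errors of the estimated parameters, given the priors.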
Step 1: Load modules, define design parameters and set attributes
The following lines load:
- EffDesign: the class of efficient designs,
- Attribute and Parameter: the classes of attributes and parameters, respectively.
[1]:
from choicedesign.design import EffDesign
from choicedesign.expressions import Attribute, Parameter
Each attribute is defined by the Attribute class. The arguments of this class are:
- name: a string with the attribute name,
- levels: a list of levels of the attribute.
Each attribute is alternative-specific. Hence, attributes must be defined for each alternative that contains them.
The following lines define 2 alternatives, named alt1 and alt2, and 4 attributes named from \(A\) to \(D\):
[2]:
alt1_A = Attribute('alt1_A',[1,2,3])
alt1_B = Attribute('alt1_B',[10,15,15.5])
alt1_C = Attribute('alt1_C',[0,3,5])
alt1_D = Attribute('alt1_D',[0,1,2])
alt2_A = Attribute('alt2_A',[1,2,3])
alt2_B = Attribute('alt2_B',[10,15,15.5])
alt2_C = Attribute('alt2_C',[0,3,5])
alt2_D = Attribute('alt2_D',[0,1,2])
Step 2: Construct efficient design object and generate initial design matrix
The second step consists of constructing the experimental design object, which requires the following parameters:
- X: A list of Attribute class elements,
- ncs: The number of choice situations.

The following lines define an object named design using EffDesign with 18 choice situations:
[3]:
design = EffDesign(
X = [alt1_A,alt1_B,alt1_C,alt1_D,
alt2_A,alt2_B,alt2_C,alt2_D],
ncs=18)
After the design object is defined, the method gen_initdesign() generates the initial design matrix. This method accepts the following optional parameters:
- cond: List of conditions that the final design must satisfy. Each element is a string that contains a single condition. Conditions can be binary relations (e.g., X > Y, where X and Y are attributes of a specific alternative) or conditional relations (e.g., if X > a then Y < b, where a and b are values). Within a conditional relation (one that uses the if operator), multiple conditions can be combined with the & operator.
- seed: Random seed.
We will define three conditions for this design. The following lines define the conditions and integrate them into the initial design matrix:
[4]:
cond = ['alt1_A > alt2_A',
'if alt1_B > 10 then alt2_A < 3',
'if alt1_A > 1 then alt2_A < 3'
]
[5]:
init_design = design.gen_initdesign(cond=cond)
init_design
[5]:
| | alt1_A | alt1_B | alt1_C | alt1_D | alt2_A | alt2_B | alt2_C | alt2_D |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.0 | 10.0 | 0.0 | 2.0 | 1.0 | 10.0 | 5.0 | 0.0 |
| 1 | 3.0 | 15.0 | 3.0 | 1.0 | 1.0 | 15.0 | 5.0 | 0.0 |
| 2 | 2.0 | 10.0 | 5.0 | 2.0 | 1.0 | 15.5 | 0.0 | 0.0 |
| 3 | 2.0 | 10.0 | 3.0 | 1.0 | 1.0 | 10.0 | 5.0 | 1.0 |
| 4 | 3.0 | 15.5 | 3.0 | 2.0 | 1.0 | 15.0 | 3.0 | 0.0 |
| 5 | 2.0 | 10.0 | 0.0 | 0.0 | 1.0 | 15.0 | 3.0 | 1.0 |
| 6 | 2.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 1.0 |
| 7 | 2.0 | 10.0 | 5.0 | 0.0 | 1.0 | 15.0 | 0.0 | 2.0 |
| 8 | 3.0 | 10.0 | 0.0 | 2.0 | 2.0 | 10.0 | 5.0 | 2.0 |
| 9 | 2.0 | 10.0 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 0.0 |
| 10 | 2.0 | 10.0 | 0.0 | 1.0 | 1.0 | 15.0 | 5.0 | 1.0 |
| 11 | 3.0 | 15.5 | 3.0 | 0.0 | 2.0 | 15.5 | 0.0 | 1.0 |
| 12 | 3.0 | 10.0 | 5.0 | 1.0 | 1.0 | 10.0 | 5.0 | 0.0 |
| 13 | 3.0 | 15.5 | 5.0 | 1.0 | 1.0 | 15.5 | 0.0 | 2.0 |
| 14 | 2.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 5.0 | 2.0 |
| 15 | 3.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 2.0 |
| 16 | 3.0 | 10.0 | 5.0 | 2.0 | 1.0 | 15.0 | 0.0 | 1.0 |
| 17 | 2.0 | 15.0 | 5.0 | 0.0 | 1.0 | 15.5 | 0.0 | 0.0 |
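As a quick sanity check, the three conditions can be verified by hand against rows of the initial design shown above. The sketch below is plain Python and is not ChoiceDesign's condition parser. The helper names (implies, satisfies_conditions) are hypothetical, and reading "if X then Y" as a logical implication is an assumption about the intended semantics.

```python
# Illustrative check only; this does not use or reproduce ChoiceDesign's parser.
# Values are copied from rows 0, 1, 2 and 8 of the initial design shown above
# (only the attributes that appear in the conditions are kept).
rows = [
    {"alt1_A": 2.0, "alt1_B": 10.0, "alt2_A": 1.0},
    {"alt1_A": 3.0, "alt1_B": 15.0, "alt2_A": 1.0},
    {"alt1_A": 2.0, "alt1_B": 10.0, "alt2_A": 1.0},
    {"alt1_A": 3.0, "alt1_B": 10.0, "alt2_A": 2.0},
]


def implies(premise, conclusion):
    """Assumed meaning of 'if premise then conclusion'."""
    return (not premise) or conclusion


def satisfies_conditions(row):
    """Return True when a row meets all three conditions of this example."""
    return (
        row["alt1_A"] > row["alt2_A"]
        and implies(row["alt1_B"] > 10, row["alt2_A"] < 3)
        and implies(row["alt1_A"] > 1, row["alt2_A"] < 3)
    )


all_ok = all(satisfies_conditions(r) for r in rows)
print(all_ok)
```

The same function can be applied to every row of a candidate design to confirm that no choice situation violates the conditions.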
Step 3: Set the utility functions
ChoiceDesign uses a native expression system to define utility functions. Parameters and attributes are combined using standard arithmetic operators. For this, we use the Parameter class, which requires the following arguments:
- name: The parameter name,
- prior: The prior value.
The following lines define four parameters:
[6]:
beta_A = Parameter('beta_A',-0.1)
beta_B = Parameter('beta_B',-0.02)
beta_C = Parameter('beta_C',0.1)
beta_D = Parameter('beta_D',0.15)
Then, the utility functions are defined using standard arithmetic operators. We will assume a linear utility function for each alternative.
[7]:
V1 = beta_A * alt1_A + beta_B * alt1_B + beta_C * alt1_C + beta_D * alt1_D
V2 = beta_A * alt2_A + beta_B * alt2_B + beta_C * alt2_C + beta_D * alt2_D
The utility functions must be stored in a dictionary object. In this dictionary, each key is a consecutive number from 1 to the number of alternatives, and each value is the corresponding utility function:
[8]:
V = {1: V1, 2: V2}
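To see what these utility functions imply under the priors, the following sketch evaluates them by hand for one choice situation and converts the utilities into MNL choice probabilities. It uses only the standard library and does not call ChoiceDesign. The helper names (linear_utility, mnl_probabilities) are hypothetical, and the attribute levels are copied from row 0 of the initial design shown earlier.

```python
import math

# Priors copied from the Parameter definitions above.
priors = {"A": -0.1, "B": -0.02, "C": 0.1, "D": 0.15}


def linear_utility(levels, betas):
    """Sum of prior * attribute level over all attributes."""
    return sum(betas[k] * levels[k] for k in betas)


def mnl_probabilities(utilities):
    """Logit probabilities; the maximum is subtracted for numerical stability."""
    m = max(utilities)
    exps = [math.exp(v - m) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Attribute levels of choice situation 0 in the initial design.
alt1 = {"A": 2.0, "B": 10.0, "C": 0.0, "D": 2.0}
alt2 = {"A": 1.0, "B": 10.0, "C": 5.0, "D": 0.0}

v1 = linear_utility(alt1, priors)
v2 = linear_utility(alt2, priors)
probs = mnl_probabilities([v1, v2])
print(v1, v2, probs)
```

Probabilities close to each other indicate a well-balanced choice situation, while a probability near 1 indicates that one alternative dominates.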
Step 4: Optimise the initial design, given the utility functions and priors
The method optimise() starts the D-error minimisation routine, given the initial design matrix and the utility functions. This method requires the following parameters:
- init_design: The design matrix to optimise,
- V: The dictionary object with utility functions,
- model: The base model of the efficient design. Defaults to 'mnl' for a Multinomial Logit model.
In addition, optimise() admits the following optional parameters:
- iter_lim: Number of iterations before the algorithm stops,
- noimprov_lim: Number of iterations without improvement before the algorithm stops,
- time_lim: Time (in minutes) before the algorithm stops,
- seed: Random seed,
- verbose: Whether status messages and progress are shown.
The outputs of optimise are:
- optimal_design: The optimised design matrix,
- init_perf: The initial D-error,
- final_perf: The D-error of the last stored design,
- final_iter: The last iteration number,
- ubalance_ratio: The utility balance ratio. A 0% value indicates strict dominance of an alternative, whereas 100% indicates equal market shares.
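The utility balance ratio can be illustrated with one common formulation: for each choice situation, multiply the ratios of each choice probability to the equal-share probability 1/J, then average across choice situations. Whether ChoiceDesign uses exactly this formula is an assumption; the sketch only reproduces the two boundary cases described above. The function name utility_balance is hypothetical.

```python
import math


def utility_balance(prob_rows):
    """Mean over choice situations of prod_j (P_j * J), expressed in percent.

    prob_rows: list of per-situation probability lists (each sums to 1).
    This is an assumed formulation, not taken from ChoiceDesign's source.
    """
    ratios = []
    for probs in prob_rows:
        n_alts = len(probs)
        ratios.append(math.prod(p * n_alts for p in probs))
    return 100.0 * sum(ratios) / len(ratios)


balanced = utility_balance([[0.5, 0.5], [0.5, 0.5]])  # equal market shares
dominated = utility_balance([[1.0, 0.0]])             # strict dominance
print(balanced, dominated)
```

Under this formulation, any design with unequal but non-degenerate probabilities falls strictly between the two extremes.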
The following line runs the optimisation routine for 1 minute:
[9]:
optimal_design, init_perf, final_perf, final_iter, ubalance_ratio = design.optimise(init_design=init_design,V=V,model='mnl',time_lim = 1, verbose = True)
Evaluating initial design
Optimization complete 0:00:59 / D-error: 0.034146
Elapsed time: 0:01:00
D-error of initial design: 0.052387
D-error of last stored design: 0.034146
Utility Balance ratio: 95.14 %
Algorithm iterations: 46180
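To make the routine above more concrete, here is a minimal NumPy sketch of a swapping heuristic that minimises an MNL D-error. It is not ChoiceDesign's implementation, which the source describes only as a variation of random swapping. The function names, the (S, J, K) array layout, the absence of conditions, and the lack of any scaling by sample size are all assumptions made for illustration.

```python
import numpy as np


def mnl_d_error(design, beta):
    """D-error of an MNL design.

    design: (S, J, K) array of attribute levels; beta: (K,) prior values.
    """
    v = design @ beta                               # (S, J) utilities
    p = np.exp(v - v.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)               # (S, J) choice probabilities
    xbar = np.einsum("sj,sjk->sk", p, design)       # probability-weighted means
    z = design - xbar[:, None, :]                   # centred attributes
    info = np.einsum("sj,sjk,sjl->kl", p, z, z)     # (K, K) Fisher information
    sign, logdet = np.linalg.slogdet(info)
    if sign <= 0:
        return np.inf                               # singular information matrix
    return float(np.exp(-logdet / len(beta)))       # det(info^-1)^(1/K)


def random_swap(design, beta, n_iter=2000, seed=0):
    """Swap two levels within one attribute column; keep the swap if it helps."""
    rng = np.random.default_rng(seed)
    best = design.copy()
    best_err = mnl_d_error(best, beta)
    n_cs, n_alts, n_attrs = best.shape
    for _ in range(n_iter):
        j = rng.integers(n_alts)
        k = rng.integers(n_attrs)
        s1, s2 = rng.choice(n_cs, size=2, replace=False)
        cand = best.copy()
        cand[[s1, s2], j, k] = cand[[s2, s1], j, k]  # swap preserves level counts
        err = mnl_d_error(cand, beta)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err


# Hypothetical starting design: 18 situations, 2 alternatives, 4 attributes.
levels = [[1, 2, 3], [10, 15, 15.5], [0, 3, 5], [0, 1, 2]]
beta = np.array([-0.1, -0.02, 0.1, 0.15])
rng = np.random.default_rng(1)
start = np.stack(
    [rng.choice(np.array(lv, dtype=float), size=(18, 2)) for lv in levels],
    axis=-1,
)

start_err = mnl_d_error(start, beta)
optimised, final_err = random_swap(start, beta)
print(start_err, final_err)
```

Because each swap only exchanges two values within the same column, the number of times each level appears per attribute is unchanged, and the stored D-error never increases from one accepted design to the next.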
Blocking the design
The optimal design can be blocked using the method gen_blocks(). This method randomly creates candidate blocks and keeps the one with the minimum correlation between the blocking column and all the attributes. The method allows for the following arguments:
- optimal_design: The experimental design,
- n_blocks: Number of blocks,
- n_iter (optional): Number of iterations of the blocking algorithm.
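The blocking idea described above can be sketched as a simple random search. The code below is not the implementation of gen_blocks(): the function name random_blocks, the equal block sizes, and the use of the largest absolute correlation as the score are assumptions, since the source does not state how correlations across attributes are aggregated.

```python
import numpy as np


def random_blocks(design, n_blocks, n_iter=500, seed=0):
    """Random search for a blocking column with low correlation to attributes.

    design: (S, M) array with one column per attribute.
    Returns block labels in 1..n_blocks and the best score found.
    """
    rng = np.random.default_rng(seed)
    n_cs, n_cols = design.shape
    base = np.arange(n_cs) % n_blocks + 1           # equal-sized blocks (assumed)
    best_blocks, best_score = None, np.inf
    for _ in range(n_iter):
        blocks = rng.permutation(base)              # candidate blocking column
        corr = [
            abs(np.corrcoef(blocks, design[:, m])[0, 1]) for m in range(n_cols)
        ]
        score = max(corr)                           # worst-case correlation
        if score < best_score:
            best_blocks, best_score = blocks, score
    return best_blocks, best_score


# Hypothetical 18 x 8 design built from the attribute levels of this example.
levels = [[1, 2, 3], [10, 15, 15.5], [0, 3, 5], [0, 1, 2]] * 2
rng = np.random.default_rng(1)
design = np.column_stack(
    [rng.permutation(np.tile(np.array(lv, dtype=float), 6)) for lv in levels]
)

blocks, score = random_blocks(design, n_blocks=3)
print(blocks, score)
```

A low score means that block membership is close to uncorrelated with every attribute, so each respondent's subset of choice situations covers the attribute levels in a roughly balanced way.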
The following line creates 3 blocks in the optimal design:
[10]:
optimal_design_blocked = design.gen_blocks(optimal_design,n_blocks=3)
Lastly, the optimal design can be printed:
[11]:
optimal_design
[11]:
| | CS | alt1_A | alt1_B | alt1_C | alt1_D | alt2_A | alt2_B | alt2_C | alt2_D | Block |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 3.0 | 10.0 | 0.0 | 2.0 | 1.0 | 15.0 | 5.0 | 1.0 | 1 |
| 1 | 2.0 | 3.0 | 10.0 | 3.0 | 0.0 | 2.0 | 15.0 | 5.0 | 2.0 | 2 |
| 2 | 3.0 | 2.0 | 10.0 | 5.0 | 1.0 | 1.0 | 15.5 | 0.0 | 1.0 | 3 |
| 3 | 4.0 | 3.0 | 15.5 | 5.0 | 0.0 | 1.0 | 10.0 | 3.0 | 2.0 | 1 |
| 4 | 5.0 | 2.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 0.0 | 2 |
| 5 | 6.0 | 2.0 | 15.0 | 0.0 | 1.0 | 1.0 | 10.0 | 5.0 | 1.0 | 3 |
| 6 | 7.0 | 3.0 | 15.5 | 3.0 | 2.0 | 1.0 | 10.0 | 5.0 | 0.0 | 1 |
| 7 | 8.0 | 3.0 | 10.0 | 3.0 | 0.0 | 1.0 | 15.0 | 3.0 | 2.0 | 1 |
| 8 | 9.0 | 3.0 | 10.0 | 5.0 | 2.0 | 2.0 | 15.5 | 3.0 | 0.0 | 1 |
| 9 | 10.0 | 2.0 | 15.5 | 5.0 | 2.0 | 1.0 | 10.0 | 3.0 | 0.0 | 3 |
| 10 | 11.0 | 2.0 | 10.0 | 0.0 | 2.0 | 1.0 | 15.5 | 5.0 | 0.0 | 2 |
| 11 | 12.0 | 3.0 | 15.5 | 3.0 | 1.0 | 1.0 | 10.0 | 5.0 | 2.0 | 1 |
| 12 | 13.0 | 2.0 | 10.0 | 0.0 | 2.0 | 1.0 | 15.5 | 5.0 | 0.0 | 2 |
| 13 | 14.0 | 3.0 | 10.0 | 5.0 | 0.0 | 1.0 | 15.0 | 0.0 | 2.0 | 2 |
| 14 | 15.0 | 2.0 | 10.0 | 5.0 | 1.0 | 1.0 | 15.0 | 0.0 | 1.0 | 3 |
| 15 | 16.0 | 2.0 | 15.5 | 5.0 | 1.0 | 1.0 | 10.0 | 0.0 | 1.0 | 2 |
| 16 | 17.0 | 2.0 | 10.0 | 5.0 | 2.0 | 1.0 | 15.0 | 0.0 | 0.0 | 3 |
| 17 | 18.0 | 2.0 | 15.0 | 5.0 | 2.0 | 1.0 | 10.0 | 0.0 | 1.0 | 3 |
(optional) Evaluate the design
The method evaluate() evaluates a design stored in a data frame, under the specification provided when EffDesign was initialised. evaluate() requires the following parameters:
- optimal_design: The design matrix to evaluate,
- V: The dictionary object with utility functions,
- model: The base model of the efficient design. Defaults to 'mnl' for a Multinomial Logit model.
[12]:
perf, ubalance = design.evaluate(optimal_design,V,model='mnl')
print(perf, ubalance)
0.034145726914017696 95.1394849419413
References
[1] Quan, W., Rose, J. M., Collins, A. T., & Bliemer, M. C. (2011). A comparison of algorithms for generating efficient choice experiments.