Example of a D-efficient RUM design with conditions in ChoiceDesign

This notebook illustrates how to use ChoiceDesign to generate a D-efficient experimental design with conditions for a Random Utility Maximisation (RUM) model. Given a set of attributes and prior parameters, ChoiceDesign uses a variation of the random swapping algorithm [1] to minimise the D-error of a Multinomial Logit (MNL) model, which is computed from the model's information matrix evaluated at the priors.

Step 1: Load modules, define design parameters and set attributes

The following lines load:

EffDesign: the class of efficient designs,
Attribute and Parameter: the classes of attributes and parameters, respectively.

[1]:

from choicedesign.design import EffDesign
from choicedesign.expressions import Attribute, Parameter

Each attribute is defined by the Attribute class. The arguments of this class are:

name: a string with the attribute name,
levels: a list with the levels of the attribute.

Each attribute is alternative-specific. Hence, attributes must be defined for each alternative that contains them.

The following lines define 2 alternatives, named alt1 and alt2, and 4 attributes named from \(A\) to \(D\):

[2]:

alt1_A = Attribute('alt1_A',[1,2,3])
alt1_B = Attribute('alt1_B',[10,15,15.5])
alt1_C = Attribute('alt1_C',[0,3,5])
alt1_D = Attribute('alt1_D',[0,1,2])

alt2_A = Attribute('alt2_A',[1,2,3])
alt2_B = Attribute('alt2_B',[10,15,15.5])
alt2_C = Attribute('alt2_C',[0,3,5])
alt2_D = Attribute('alt2_D',[0,1,2])

Step 2: Construct efficient design object and generate initial design matrix

The second step consists of constructing the experimental design object, which requires the following parameters:

X: A list of Attribute class elements,
ncs: The number of choice situations.

The following lines use EffDesign to define an object named design with 18 choice situations:

[3]:

design = EffDesign(
    X = [alt1_A,alt1_B,alt1_C,alt1_D,
         alt2_A,alt2_B,alt2_C,alt2_D],
    ncs=18)

After the design object is defined, the method gen_initdesign() generates the initial design matrix. This method accepts the following optional parameters:

cond: List of conditions that the final design must satisfy. Each element is a string that contains a single condition. The supported forms are:
- Binary relation: X > Y, comparing an attribute with another attribute or with a numeric value.
- Conditional: if X > a then Y < b, read as material implication: the consequent must hold whenever the antecedent does.
- Compound (AND): X > a & Y < b, where all sub-conditions separated by & must hold.
- Arithmetic expressions: (X + Y + Z) > 0, where any mix of attribute names and numeric constants combined with +, -, *, / and parentheses is valid on either side of a comparison, including inside if/then clauses (e.g., if (X + Y) > 0 then P >= 0).
seed: Random seed
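The conditional form follows material implication, so a row violates it only when the antecedent is true and the consequent is false. The following is a minimal plain-Python sketch of that logic and of the compound (AND) form. It does not use ChoiceDesign's own parser; the helper `implies` and the example rows are hypothetical.

```python
def implies(antecedent: bool, consequent: bool) -> bool:
    """Material implication: false only if the antecedent holds and the consequent fails."""
    return (not antecedent) or consequent

# Hypothetical rows; the keys stand in for attribute names.
rows = [
    {"X": 2, "Y": 1},  # antecedent true, consequent true
    {"X": 2, "Y": 5},  # antecedent true, consequent false
    {"X": 0, "Y": 5},  # antecedent false, so the conditional holds vacuously
]

# 'if X > 1 then Y < 3'
conditional = [implies(r["X"] > 1, r["Y"] < 3) for r in rows]

# 'X > 1 & Y < 3': every sub-condition must hold
compound = [(r["X"] > 1) and (r["Y"] < 3) for r in rows]

print(conditional)  # [True, False, True]
print(compound)     # [True, False, False]
```

The third row shows the difference between the two forms: it satisfies the conditional but not the compound condition.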

We will define four conditions for this design. The last one uses an arithmetic expression. The following lines define the conditions and integrate them into the initial design matrix:

[4]:

cond = ['alt1_A > alt2_A',
        'if alt1_B > 10 then alt2_A < 3',
        'if alt1_A > 1 then alt2_A < 3',
        'if (alt1_C + alt1_D) > 0 then alt1_A > 1'
        ]
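To get a feel for how restrictive these conditions are, the sketch below enumerates the full factorial of the eight attributes and counts the choice situations that satisfy all four conditions. It is plain Python and independent of ChoiceDesign; the conditions are re-written by hand as a hypothetical `feasible` predicate rather than parsed from the strings above.

```python
from itertools import product

# Attribute levels as defined in Step 1.
levels = {
    "alt1_A": [1, 2, 3], "alt1_B": [10, 15, 15.5],
    "alt1_C": [0, 3, 5], "alt1_D": [0, 1, 2],
    "alt2_A": [1, 2, 3], "alt2_B": [10, 15, 15.5],
    "alt2_C": [0, 3, 5], "alt2_D": [0, 1, 2],
}

def feasible(r):
    """Hand-written equivalent of the four conditions ('if p then q' is 'not p or q')."""
    return (
        r["alt1_A"] > r["alt2_A"]
        and (not (r["alt1_B"] > 10) or r["alt2_A"] < 3)
        and (not (r["alt1_A"] > 1) or r["alt2_A"] < 3)
        and (not ((r["alt1_C"] + r["alt1_D"]) > 0) or r["alt1_A"] > 1)
    )

names = list(levels)
profiles = [dict(zip(names, combo)) for combo in product(*levels.values())]
n_feasible = sum(feasible(r) for r in profiles)

print(len(profiles), n_feasible)  # 6561 2187
```

In this particular example the first condition already implies the other three: alt1_A > alt2_A forces alt1_A to be at least 2 and alt2_A to be at most 2, so one third of the full factorial remains feasible.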

[5]:

init_design = design.gen_initdesign(cond=cond)
init_design

[5]:

	alt1_A	alt1_B	alt1_C	alt1_D	alt2_A	alt2_B	alt2_C	alt2_D
0	2.0	10.0	0.0	2.0	1.0	10.0	5.0	0.0
1	3.0	15.0	3.0	1.0	1.0	15.0	5.0	0.0
2	2.0	10.0	5.0	2.0	1.0	15.5	0.0	0.0
3	2.0	10.0	3.0	1.0	1.0	10.0	5.0	1.0
4	3.0	15.5	3.0	2.0	1.0	15.0	3.0	0.0
5	2.0	10.0	0.0	0.0	1.0	15.0	3.0	1.0
6	2.0	15.5	5.0	2.0	1.0	10.0	3.0	1.0
7	2.0	10.0	5.0	0.0	1.0	15.0	0.0	2.0
8	3.0	10.0	0.0	2.0	2.0	10.0	5.0	2.0
9	2.0	10.0	5.0	2.0	1.0	10.0	3.0	0.0
10	2.0	10.0	0.0	1.0	1.0	15.0	5.0	1.0
11	3.0	15.5	3.0	0.0	2.0	15.5	0.0	1.0
12	3.0	10.0	5.0	1.0	1.0	10.0	5.0	0.0
13	3.0	15.5	5.0	1.0	1.0	15.5	0.0	2.0
14	2.0	15.5	5.0	2.0	1.0	10.0	5.0	2.0
15	3.0	15.5	5.0	2.0	1.0	10.0	3.0	2.0
16	3.0	10.0	5.0	2.0	1.0	15.0	0.0	1.0
17	2.0	15.0	5.0	0.0	1.0	15.5	0.0	0.0

Step 3: Set the utility functions

ChoiceDesign uses a native expression system to define utility functions. Parameters and attributes are combined using standard arithmetic operators. For this, we use the Parameter class, which requires the following arguments:

name: The parameter name
prior: The prior value

The following lines define four parameters:

[6]:

beta_A = Parameter('beta_A',-0.1)
beta_B = Parameter('beta_B',-0.02)
beta_C = Parameter('beta_C',0.1)
beta_D = Parameter('beta_D',0.15)

Then, the utility functions are defined using standard arithmetic operators. We will assume a linear utility function for each alternative.

[7]:

V1 = beta_A * alt1_A + beta_B * alt1_B + beta_C * alt1_C + beta_D * alt1_D
V2 = beta_A * alt2_A + beta_B * alt2_B + beta_C * alt2_C + beta_D * alt2_D

The utility functions must be stored in a dictionary object. In this dictionary, each key is a consecutive number from 1 to the number of alternatives, and each value is the corresponding utility function:

[8]:

V = {1: V1, 2: V2}
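To make the role of the priors concrete, the following plain-Python sketch evaluates the two linear utilities at the prior values and converts them into MNL choice probabilities for one choice situation (row 0 of the initial design shown above). It does not use ChoiceDesign's expression system; the helpers `utility` and `mnl_probs` are hypothetical names.

```python
import math

# Prior values of beta_A to beta_D, as defined above.
priors = {"A": -0.1, "B": -0.02, "C": 0.1, "D": 0.15}

def utility(x):
    """Linear-in-parameters utility for one alternative."""
    return sum(priors[k] * x[k] for k in priors)

def mnl_probs(utilities):
    """MNL (softmax) probabilities; subtracting the maximum avoids overflow."""
    m = max(utilities)
    exps = [math.exp(v - m) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Row 0 of the initial design.
alt1 = {"A": 2.0, "B": 10.0, "C": 0.0, "D": 2.0}
alt2 = {"A": 1.0, "B": 10.0, "C": 5.0, "D": 0.0}

v = [utility(alt1), utility(alt2)]
p = mnl_probs(v)

print([round(u, 3) for u in v])  # [-0.1, 0.2]
print([round(q, 4) for q in p])  # [0.4256, 0.5744]
```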

Step 4: Optimise the initial design, given the utility functions and priors

The method optimise() starts the D-error minimisation routine, given the initial design matrix and the utility functions. This method requires the following parameters:

init_design: The initial design matrix to optimise
V: The dictionary object with utility functions
model: The base model of the efficient design. Defaults to 'mnl' for a Multinomial Logit model.

In addition, optimise() admits the following optional parameters:

iter_lim: Number of iterations before the algorithm stops,
noimprov_lim: Number of iterations without improvement before the algorithm stops,
time_lim: Time (in minutes) before the algorithm stops,
seed: Random seed,
verbose: Whether status messages and progress are shown.
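The three stopping criteria can be illustrated with a generic swapping heuristic. This is only a sketch under stated assumptions: ChoiceDesign uses its own variation of the algorithm, while the function `random_swap`, the toy objective `abs_covariance` and the small starting matrix below are hypothetical and stand in for the D-error and a real design.

```python
import random
import time

def random_swap(design, objective, feasible,
                iter_lim=5000, noimprov_lim=500, time_lim=1.0, seed=0):
    """Minimise objective(design) by swapping two values within one column.

    time_lim is expressed in minutes. Swapping within a column keeps the
    frequency of each level in that column unchanged.
    """
    rng = random.Random(seed)
    best = [row[:] for row in design]
    best_val = objective(best)
    n_rows, n_cols = len(best), len(best[0])
    start, stale, it = time.monotonic(), 0, 0
    while it < iter_lim and stale < noimprov_lim:
        if time.monotonic() - start > time_lim * 60:
            break
        it += 1
        col = rng.randrange(n_cols)
        i, j = rng.sample(range(n_rows), 2)
        cand = [row[:] for row in best]
        cand[i][col], cand[j][col] = cand[j][col], cand[i][col]
        # Infeasible candidates are never accepted.
        ok = all(feasible(r) for r in cand)
        val = objective(cand) if ok else float("inf")
        if val < best_val:
            best, best_val, stale = cand, val, 0
        else:
            stale += 1
    return best, best_val, it

def abs_covariance(d):
    """Toy objective (an assumption): absolute covariance of columns 0 and 1."""
    n = len(d)
    m0 = sum(r[0] for r in d) / n
    m1 = sum(r[1] for r in d) / n
    return abs(sum((r[0] - m0) * (r[1] - m1) for r in d) / n)

start_design = [[1, 10], [2, 15], [3, 15.5], [1, 10], [2, 15], [3, 15.5]]
final, value, n_iter = random_swap(start_design, abs_covariance, lambda r: True)
print(abs_covariance(start_design), value, n_iter)
```

The loop ends as soon as any of the three limits is reached, which mirrors how iter_lim, noimprov_lim and time_lim are described above.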

The outputs of optimise are:

optimal_design: The optimised design matrix
init_perf: The D-error of the initial design
final_perf: The D-error of the last stored design
final_iter: The last iteration number
ubalance_ratio: The utility balance ratio. A 0% value indicates strict dominance of an alternative, whereas 100% indicates equal market shares.
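One common definition of utility balance in the stated-choice literature, consistent with the 0% and 100% endpoints described above, takes the product of P_j / (1/J) over the J alternatives of each choice situation and averages it over choice situations. Whether ChoiceDesign computes the ratio exactly this way is an assumption; the function `utility_balance` below is a hypothetical sketch.

```python
import math

def utility_balance(prob_rows):
    """Mean over choice situations of prod_j P_j / (1/J), as a percentage."""
    ratios = []
    for probs in prob_rows:
        j = len(probs)
        ratios.append(math.prod(p * j for p in probs))
    return 100 * sum(ratios) / len(ratios)

print(utility_balance([[0.5, 0.5]]))                        # 100.0
print(utility_balance([[1.0, 0.0]]))                        # 0.0
print(round(utility_balance([[0.4, 0.6], [0.5, 0.5]]), 1))  # 98.0
```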

The following line runs the optimisation routine for 1 minute:

[9]:

optimal_design, init_perf, final_perf, final_iter, ubalance_ratio = design.optimise(init_design=init_design,V=V,model='mnl',time_lim = 1, verbose = True)

Evaluating initial design
Optimization complete 0:00:59 / D-error: 0.034146
Elapsed time: 0:01:00
D-error of initial design:  0.052387
D-error of last stored design:  0.034146
Utility Balance ratio:  95.14 %
Algorithm iterations:  46180
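For reference, the D-error of an MNL design with generic parameters can be sketched in plain Python as the determinant of the information matrix raised to the power -1/K, where K is the number of parameters. This is a textbook formulation, not ChoiceDesign's code: the function names are hypothetical, and the value reported by the library may follow different conventions. The example uses only the first six choice situations of the optimised design table shown later in this notebook, so it is not expected to reproduce the D-error printed above.

```python
import math

def mnl_information(design, beta):
    """Information matrix: sum over situations of sum_j P_j (x_j - xbar)(x_j - xbar)'."""
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for alts in design:
        v = [sum(b * x for b, x in zip(beta, alt)) for alt in alts]
        m = max(v)
        e = [math.exp(u - m) for u in v]
        total = sum(e)
        p = [ei / total for ei in e]
        xbar = [sum(pj * alt[a] for pj, alt in zip(p, alts)) for a in range(k)]
        for pj, alt in zip(p, alts):
            d = [alt[a] - xbar[a] for a in range(k)]
            for r in range(k):
                for c in range(k):
                    info[r][c] += pj * d[r] * d[c]
    return info

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return det

def d_error(design, beta):
    """det(information)^(-1/K), i.e. det(covariance)^(1/K)."""
    return determinant(mnl_information(design, beta)) ** (-1.0 / len(beta))

beta = [-0.1, -0.02, 0.1, 0.15]  # priors for beta_A to beta_D
# Each choice situation: [alt1 attributes A-D, alt2 attributes A-D].
design_rows = [
    [[3, 10, 0, 2], [1, 15, 5, 1]],
    [[3, 10, 3, 0], [2, 15, 5, 2]],
    [[2, 10, 5, 1], [1, 15.5, 0, 1]],
    [[3, 15.5, 5, 0], [1, 10, 3, 2]],
    [[2, 15.5, 5, 2], [1, 10, 3, 0]],
    [[2, 15, 0, 1], [1, 10, 5, 1]],
]
d6 = d_error(design_rows, beta)
d12 = d_error(design_rows + design_rows, beta)
print(round(d12 / d6, 6))  # 0.5
```

Repeating every choice situation doubles the information matrix, so the D-error halves. This is why D-errors are only comparable between designs with the same number of choice situations.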

Blocking the design

The optimal design can be blocked using the method gen_blocks(). This method randomly creates candidate blocks and keeps the one with the minimum correlation between the blocking column and all the attributes. The method allows for the following arguments:

optimal_design: the experimental design
n_blocks: number of blocks.
n_iter (optional): number of iterations of the blocking algorithm
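The blocking idea described above can be sketched as a random search over balanced block assignments. This is an illustration under stated assumptions rather than the code behind gen_blocks(): the functions `pearson` and `best_blocks` are hypothetical, the score is assumed to be the sum of absolute Pearson correlations between the blocking column and each attribute column, and the two example columns are made up.

```python
import random

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def best_blocks(columns, n_blocks, n_iter=1000, seed=0):
    """Shuffle a balanced blocking column n_iter times and keep the best one."""
    rng = random.Random(seed)
    n_rows = len(columns[0])
    base = [(i % n_blocks) + 1 for i in range(n_rows)]
    best, best_score = None, float("inf")
    for _ in range(n_iter):
        cand = base[:]
        rng.shuffle(cand)
        score = sum(abs(pearson(cand, col)) for col in columns)
        if score < best_score:
            best, best_score = cand, score
    return best, best_score

# Two made-up attribute columns with six choice situations.
columns = [[1, 2, 3, 1, 2, 3], [10, 15, 15.5, 15.5, 10, 15]]
blocks, score = best_blocks(columns, n_blocks=3)
print(blocks, round(score, 4))
```

Starting from a balanced column and only shuffling it keeps the same number of choice situations in every block.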

The following line creates 3 blocks in the optimal design:

[10]:

optimal_design_blocked = design.gen_blocks(optimal_design,n_blocks=3)

Lastly, the optimal design can be printed:

[11]:

optimal_design

[11]:

	CS	alt1_A	alt1_B	alt1_C	alt1_D	alt2_A	alt2_B	alt2_C	alt2_D	Block
0	1.0	3.0	10.0	0.0	2.0	1.0	15.0	5.0	1.0	1
1	2.0	3.0	10.0	3.0	0.0	2.0	15.0	5.0	2.0	2
2	3.0	2.0	10.0	5.0	1.0	1.0	15.5	0.0	1.0	3
3	4.0	3.0	15.5	5.0	0.0	1.0	10.0	3.0	2.0	1
4	5.0	2.0	15.5	5.0	2.0	1.0	10.0	3.0	0.0	2
5	6.0	2.0	15.0	0.0	1.0	1.0	10.0	5.0	1.0	3
6	7.0	3.0	15.5	3.0	2.0	1.0	10.0	5.0	0.0	1
7	8.0	3.0	10.0	3.0	0.0	1.0	15.0	3.0	2.0	1
8	9.0	3.0	10.0	5.0	2.0	2.0	15.5	3.0	0.0	1
9	10.0	2.0	15.5	5.0	2.0	1.0	10.0	3.0	0.0	3
10	11.0	2.0	10.0	0.0	2.0	1.0	15.5	5.0	0.0	2
11	12.0	3.0	15.5	3.0	1.0	1.0	10.0	5.0	2.0	1
12	13.0	2.0	10.0	0.0	2.0	1.0	15.5	5.0	0.0	2
13	14.0	3.0	10.0	5.0	0.0	1.0	15.0	0.0	2.0	2
14	15.0	2.0	10.0	5.0	1.0	1.0	15.0	0.0	1.0	3
15	16.0	2.0	15.5	5.0	1.0	1.0	10.0	0.0	1.0	2
16	17.0	2.0	10.0	5.0	2.0	1.0	15.0	0.0	0.0	3
17	18.0	2.0	15.0	5.0	2.0	1.0	10.0	0.0	1.0	3

(optional) Evaluate the design

The method evaluate() evaluates a design stored in a data frame, under the specification provided when EffDesign was initialised. It requires the following parameters:

optimal_design: The design matrix to evaluate
V: The dictionary object with utility functions
model: The base model of the efficient design. Defaults to 'mnl' for a Multinomial Logit model.

[12]:

perf, ubalance = design.evaluate(optimal_design,V,model='mnl')

print(perf, ubalance)

0.034145726914017696 95.1394849419413

Export the design

Export the optimised design to Excel. The exported table will reflect the constrained levels produced by the conditions.

[ ]:

attr_names = {
    'alt1_A': 'Attribute A', 'alt2_A': 'Attribute A',
    'alt1_B': 'Attribute B', 'alt2_B': 'Attribute B',
    'alt1_C': 'Attribute C', 'alt2_C': 'Attribute C',
    'alt1_D': 'Attribute D', 'alt2_D': 'Attribute D',
}
design.export_design(optimal_design, attr_names, 'rum_conds_design.xlsx')

Save the optimisation summary

After calling optimise(), the method export_output() writes a plain-text summary of the optimisation run to a file. The summary covers the design configuration, stopping criteria, criterion values, utility balance, elapsed time, and iteration count.

[ ]:

design.export_output('rum_conds_output.txt')

References

[1] Quan, W., Rose, J. M., Collins, A. T., & Bliemer, M. C. (2011). A comparison of algorithms for generating efficient choice experiments.