{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Example of a D-efficient RUM design with conditions in ChoiceDesign\n", "\n", "This notebook illustrates how to use **ChoiceDesign** to generate a D-efficient experimental design with conditions for a Random Utility Maximisation (RUM) model. Given a set of attributes and prior parameters, ChoiceDesign uses a variation of the random swapping algorithm [1] to minimise the D-error of the information matrix of a Multinomial Logit (MNL) model.\n", "\n", "## Step 1: Load modules, define design parameters and set attributes\n", "\n", "The following lines load:\n", "- `EffDesign`: the class of efficient designs,\n", "- `Attribute` and `Parameter`: the classes of attributes and parameters, respectively." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from choicedesign.design import EffDesign\n", "from choicedesign.expressions import Attribute, Parameter" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each attribute is defined by the `Attribute` class. The arguments of this class are:\n", "\n", "* `name`: a string with the attribute name,\n", "* `levels`: a list of levels of the attribute,\n", "\n", "Each attribute is alternative-specific. 
Hence, attributes must be defined for each alternative that contains them.\n", "\n", "The following lines define 2 alternatives, named `alt1` and `alt2`, and 4 attributes named from $A$ to $D$:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "alt1_A = Attribute('alt1_A',[1,2,3])\n", "alt1_B = Attribute('alt1_B',[10,15,15.5])\n", "alt1_C = Attribute('alt1_C',[0,3,5])\n", "alt1_D = Attribute('alt1_D',[0,1,2])\n", "\n", "alt2_A = Attribute('alt2_A',[1,2,3])\n", "alt2_B = Attribute('alt2_B',[10,15,15.5])\n", "alt2_C = Attribute('alt2_C',[0,3,5])\n", "alt2_D = Attribute('alt2_D',[0,1,2])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: Construct efficient design object and generate initial design matrix\n", "\n", "The second step consists of constructing the experimental design object, which requires the following parameters:\n", "\n", "- `X`: A list of `Attribute` objects,\n", "- `ncs`: The number of choice situations.\n", "\n", "The following lines use `EffDesign` to define an object named `design` with 18 choice situations:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "design = EffDesign(\n", "    X=[alt1_A, alt1_B, alt1_C, alt1_D,\n", "       alt2_A, alt2_B, alt2_C, alt2_D],\n", "    ncs=18)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After the design object is defined, the method `gen_initdesign()` generates the initial design matrix. This method accepts the following optional parameters:\n", "\n", "* `cond`: List of conditions that the final design must satisfy. Each element is a string that contains a single condition. 
The supported forms are:\n", " * **Binary relation**: `X > Y` — attribute vs attribute or attribute vs a numeric value.\n", " * **Conditional**: `if X > a then Y < b` — material implication; the consequent must hold whenever the antecedent does.\n", " * **Compound (AND)**: `X > a & Y < b` — all sub-conditions separated by `&` must hold.\n", " * **Arithmetic expressions**: `(X + Y + Z) > 0` — any mix of attribute names and numeric constants combined with `+`, `-`, `*`, `/` and parentheses is valid on either side of a comparison, including inside `if/then` clauses (e.g., `if (X + Y) > 0 then P >= 0`).\n", "\n", "* `seed`: Random seed\n", "\n", "We will define four conditions for this design. The last one uses an arithmetic expression. The following lines define the conditions and integrate them into the initial design matrix:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "cond = ['alt1_A > alt2_A',\n", " 'if alt1_B > 10 then alt2_A < 3',\n", " 'if alt1_A > 1 then alt2_A < 3',\n", " 'if (alt1_C + alt1_D) > 0 then alt1_A > 1'\n", " ]" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr style=\"text-align: right;\"><th></th><th>alt1_A</th><th>alt1_B</th><th>alt1_C</th><th>alt1_D</th><th>alt2_A</th><th>alt2_B</th><th>alt2_C</th><th>alt2_D</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>2.0</td><td>10.0</td><td>0.0</td><td>2.0</td><td>1.0</td><td>10.0</td><td>5.0</td><td>0.0</td></tr>\n",
"<tr><th>1</th><td>3.0</td><td>15.0</td><td>3.0</td><td>1.0</td><td>1.0</td><td>15.0</td><td>5.0</td><td>0.0</td></tr>\n",
"<tr><th>2</th><td>2.0</td><td>10.0</td><td>5.0</td><td>2.0</td><td>1.0</td><td>15.5</td><td>0.0</td><td>0.0</td></tr>\n",
"<tr><th>3</th><td>2.0</td><td>10.0</td><td>3.0</td><td>1.0</td><td>1.0</td><td>10.0</td><td>5.0</td><td>1.0</td></tr>\n",
"<tr><th>4</th><td>3.0</td><td>15.5</td><td>3.0</td><td>2.0</td><td>1.0</td><td>15.0</td><td>3.0</td><td>0.0</td></tr>\n",
"<tr><th>5</th><td>2.0</td><td>10.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>15.0</td><td>3.0</td><td>1.0</td></tr>\n",
"<tr><th>6</th><td>2.0</td><td>15.5</td><td>5.0</td><td>2.0</td><td>1.0</td><td>10.0</td><td>3.0</td><td>1.0</td></tr>\n",
"<tr><th>7</th><td>2.0</td><td>10.0</td><td>5.0</td><td>0.0</td><td>1.0</td><td>15.0</td><td>0.0</td><td>2.0</td></tr>\n",
"<tr><th>8</th><td>3.0</td><td>10.0</td><td>0.0</td><td>2.0</td><td>2.0</td><td>10.0</td><td>5.0</td><td>2.0</td></tr>\n",
"<tr><th>9</th><td>2.0</td><td>10.0</td><td>5.0</td><td>2.0</td><td>1.0</td><td>10.0</td><td>3.0</td><td>0.0</td></tr>\n",
"<tr><th>10</th><td>2.0</td><td>10.0</td><td>0.0</td><td>1.0</td><td>1.0</td><td>15.0</td><td>5.0</td><td>1.0</td></tr>\n",
"<tr><th>11</th><td>3.0</td><td>15.5</td><td>3.0</td><td>0.0</td><td>2.0</td><td>15.5</td><td>0.0</td><td>1.0</td></tr>\n",
"<tr><th>12</th><td>3.0</td><td>10.0</td><td>5.0</td><td>1.0</td><td>1.0</td><td>10.0</td><td>5.0</td><td>0.0</td></tr>\n",
"<tr><th>13</th><td>3.0</td><td>15.5</td><td>5.0</td><td>1.0</td><td>1.0</td><td>15.5</td><td>0.0</td><td>2.0</td></tr>\n",
"<tr><th>14</th><td>2.0</td><td>15.5</td><td>5.0</td><td>2.0</td><td>1.0</td><td>10.0</td><td>5.0</td><td>2.0</td></tr>\n",
"<tr><th>15</th><td>3.0</td><td>15.5</td><td>5.0</td><td>2.0</td><td>1.0</td><td>10.0</td><td>3.0</td><td>2.0</td></tr>\n",
"<tr><th>16</th><td>3.0</td><td>10.0</td><td>5.0</td><td>2.0</td><td>1.0</td><td>15.0</td><td>0.0</td><td>1.0</td></tr>\n",
"<tr><th>17</th><td>2.0</td><td>15.0</td><td>5.0</td><td>0.0</td><td>1.0</td><td>15.5</td><td>0.0</td><td>0.0</td></tr>\n",
"</tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " alt1_A alt1_B alt1_C alt1_D alt2_A alt2_B alt2_C alt2_D\n", "0 2.0 10.0 0.0 2.0 1.0 10.0 5.0 0.0\n", "1 3.0 15.0 3.0 1.0 1.0 15.0 5.0 0.0\n", "2 2.0 10.0 5.0 2.0 1.0 15.5 0.0 0.0\n", "3 2.0 10.0 3.0 1.0 1.0 10.0 5.0 1.0\n", "4 3.0 15.5 3.0 2.0 1.0 15.0 3.0 0.0\n", "5 2.0 10.0 0.0 0.0 1.0 15.0 3.0 1.0\n", "6 2.0 15.5 5.0 2.0 1.0 10.0 3.0 1.0\n", "7 2.0 10.0 5.0 0.0 1.0 15.0 0.0 2.0\n", "8 3.0 10.0 0.0 2.0 2.0 10.0 5.0 2.0\n", "9 2.0 10.0 5.0 2.0 1.0 10.0 3.0 0.0\n", "10 2.0 10.0 0.0 1.0 1.0 15.0 5.0 1.0\n", "11 3.0 15.5 3.0 0.0 2.0 15.5 0.0 1.0\n", "12 3.0 10.0 5.0 1.0 1.0 10.0 5.0 0.0\n", "13 3.0 15.5 5.0 1.0 1.0 15.5 0.0 2.0\n", "14 2.0 15.5 5.0 2.0 1.0 10.0 5.0 2.0\n", "15 3.0 15.5 5.0 2.0 1.0 10.0 3.0 2.0\n", "16 3.0 10.0 5.0 2.0 1.0 15.0 0.0 1.0\n", "17 2.0 15.0 5.0 0.0 1.0 15.5 0.0 0.0" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "init_design = design.gen_initdesign(cond=cond)\n", "init_design" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3: Set the utility functions\n", "\n", "`ChoiceDesign` uses a native expression system to define utility functions.\n", "Parameters and attributes are combined using standard arithmetic operators.\n", "For this, we use the `Parameter` class, which requires the following arguments:\n", "\n", "* `name`: The parameter name\n", "* `prior`: The prior value\n", "\n", "The following lines define four parameters:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "beta_A = Parameter('beta_A',-0.1)\n", "beta_B = Parameter('beta_B',-0.02)\n", "beta_C = Parameter('beta_C',0.1)\n", "beta_D = Parameter('beta_D',0.15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, the utility functions are defined using standard arithmetic operators. We will assume a linear utility function for each alternative." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "V1 = beta_A * alt1_A + beta_B * alt1_B + beta_C * alt1_C + beta_D * alt1_D\n", "V2 = beta_A * alt2_A + beta_B * alt2_B + beta_C * alt2_C + beta_D * alt2_D" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The utility functions must be stored in a dictionary object. In this dictionary, each key is a consecutive number from 1 to the number of alternatives. The value of each key is the corresponding utility function:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "V = {1: V1, 2: V2}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4: Optimise the initial design, given the utility functions and priors\n", "\n", "The method `optimise()` starts the D-error minimisation routine, given the initial design matrix and the utility functions. This method requires the following parameters:\n", "\n", "* `init_design`: The initial design matrix to optimise\n", "* `V`: The dictionary object with utility functions\n", "* `model`: The base model of the efficient design. Defaults to `'mnl'` for a Multinomial Logit model.\n", "\n", "In addition, `optimise()` accepts the following optional parameters:\n", "\n", "* `iter_lim`: Number of iterations before the algorithm stops\n", "* `noimprov_lim`: Number of iterations without improvement before the algorithm stops\n", "* `time_lim`: Time (in minutes) before the algorithm stops\n", "* `seed`: Random seed\n", "* `verbose`: Whether status messages and progress are shown\n", "\n", "The outputs of `optimise()` are:\n", "\n", "* `optimal_design`: The optimised design matrix\n", "* `init_perf`: The D-error of the initial design\n", "* `final_perf`: The D-error of the last stored design\n", "* `final_iter`: The last iteration number\n", "* `ubalance_ratio`: The utility balance ratio. 
A 0% value indicates strict dominance of an alternative, whereas 100% indicates equal market shares.\n", "\n", "The following line runs the optimisation routine for 1 minute:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Evaluating initial design\n", "Optimization complete 0:00:59 / D-error: 0.034146\n", "Elapsed time: 0:01:00\n", "D-error of initial design: 0.052387\n", "D-error of last stored design: 0.034146\n", "Utility Balance ratio: 95.14 %\n", "Algorithm iterations: 46180\n", "\n" ] } ], "source": [ "optimal_design, init_perf, final_perf, final_iter, ubalance_ratio = design.optimise(\n", "    init_design=init_design, V=V, model='mnl', time_lim=1, verbose=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Blocking the design\n", "\n", "The optimal design can be blocked using the method `gen_blocks()`. This method randomly creates candidate blocks and keeps the one with the minimum correlation between the blocking column and all the attributes. The method accepts the following arguments:\n", "\n", "- `optimal_design`: the experimental design\n", "- `n_blocks`: number of blocks\n", "- `n_iter` (optional): number of iterations of the blocking algorithm\n", "\n", "The following line creates 3 blocks in the optimal design:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "optimal_design_blocked = design.gen_blocks(optimal_design,n_blocks=3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lastly, the optimal design can be printed:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr style=\"text-align: right;\"><th></th><th>CS</th><th>alt1_A</th><th>alt1_B</th><th>alt1_C</th><th>alt1_D</th><th>alt2_A</th><th>alt2_B</th><th>alt2_C</th><th>alt2_D</th><th>Block</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>1.0</td><td>3.0</td><td>10.0</td><td>0.0</td><td>2.0</td><td>1.0</td><td>15.0</td><td>5.0</td><td>1.0</td><td>1</td></tr>\n",
"<tr><th>1</th><td>2.0</td><td>3.0</td><td>10.0</td><td>3.0</td><td>0.0</td><td>2.0</td><td>15.0</td><td>5.0</td><td>2.0</td><td>2</td></tr>\n",
"<tr><th>2</th><td>3.0</td><td>2.0</td><td>10.0</td><td>5.0</td><td>1.0</td><td>1.0</td><td>15.5</td><td>0.0</td><td>1.0</td><td>3</td></tr>\n",
"<tr><th>3</th><td>4.0</td><td>3.0</td><td>15.5</td><td>5.0</td><td>0.0</td><td>1.0</td><td>10.0</td><td>3.0</td><td>2.0</td><td>1</td></tr>\n",
"<tr><th>4</th><td>5.0</td><td>2.0</td><td>15.5</td><td>5.0</td><td>2.0</td><td>1.0</td><td>10.0</td><td>3.0</td><td>0.0</td><td>2</td></tr>\n",
"<tr><th>5</th><td>6.0</td><td>2.0</td><td>15.0</td><td>0.0</td><td>1.0</td><td>1.0</td><td>10.0</td><td>5.0</td><td>1.0</td><td>3</td></tr>\n",
"<tr><th>6</th><td>7.0</td><td>3.0</td><td>15.5</td><td>3.0</td><td>2.0</td><td>1.0</td><td>10.0</td><td>5.0</td><td>0.0</td><td>1</td></tr>\n",
"<tr><th>7</th><td>8.0</td><td>3.0</td><td>10.0</td><td>3.0</td><td>0.0</td><td>1.0</td><td>15.0</td><td>3.0</td><td>2.0</td><td>1</td></tr>\n",
"<tr><th>8</th><td>9.0</td><td>3.0</td><td>10.0</td><td>5.0</td><td>2.0</td><td>2.0</td><td>15.5</td><td>3.0</td><td>0.0</td><td>1</td></tr>\n",
"<tr><th>9</th><td>10.0</td><td>2.0</td><td>15.5</td><td>5.0</td><td>2.0</td><td>1.0</td><td>10.0</td><td>3.0</td><td>0.0</td><td>3</td></tr>\n",
"<tr><th>10</th><td>11.0</td><td>2.0</td><td>10.0</td><td>0.0</td><td>2.0</td><td>1.0</td><td>15.5</td><td>5.0</td><td>0.0</td><td>2</td></tr>\n",
"<tr><th>11</th><td>12.0</td><td>3.0</td><td>15.5</td><td>3.0</td><td>1.0</td><td>1.0</td><td>10.0</td><td>5.0</td><td>2.0</td><td>1</td></tr>\n",
"<tr><th>12</th><td>13.0</td><td>2.0</td><td>10.0</td><td>0.0</td><td>2.0</td><td>1.0</td><td>15.5</td><td>5.0</td><td>0.0</td><td>2</td></tr>\n",
"<tr><th>13</th><td>14.0</td><td>3.0</td><td>10.0</td><td>5.0</td><td>0.0</td><td>1.0</td><td>15.0</td><td>0.0</td><td>2.0</td><td>2</td></tr>\n",
"<tr><th>14</th><td>15.0</td><td>2.0</td><td>10.0</td><td>5.0</td><td>1.0</td><td>1.0</td><td>15.0</td><td>0.0</td><td>1.0</td><td>3</td></tr>\n",
"<tr><th>15</th><td>16.0</td><td>2.0</td><td>15.5</td><td>5.0</td><td>1.0</td><td>1.0</td><td>10.0</td><td>0.0</td><td>1.0</td><td>2</td></tr>\n",
"<tr><th>16</th><td>17.0</td><td>2.0</td><td>10.0</td><td>5.0</td><td>2.0</td><td>1.0</td><td>15.0</td><td>0.0</td><td>0.0</td><td>3</td></tr>\n",
"<tr><th>17</th><td>18.0</td><td>2.0</td><td>15.0</td><td>5.0</td><td>2.0</td><td>1.0</td><td>10.0</td><td>0.0</td><td>1.0</td><td>3</td></tr>\n",
"</tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " CS alt1_A alt1_B alt1_C alt1_D alt2_A alt2_B alt2_C alt2_D \\\n", "0 1.0 3.0 10.0 0.0 2.0 1.0 15.0 5.0 1.0 \n", "1 2.0 3.0 10.0 3.0 0.0 2.0 15.0 5.0 2.0 \n", "2 3.0 2.0 10.0 5.0 1.0 1.0 15.5 0.0 1.0 \n", "3 4.0 3.0 15.5 5.0 0.0 1.0 10.0 3.0 2.0 \n", "4 5.0 2.0 15.5 5.0 2.0 1.0 10.0 3.0 0.0 \n", "5 6.0 2.0 15.0 0.0 1.0 1.0 10.0 5.0 1.0 \n", "6 7.0 3.0 15.5 3.0 2.0 1.0 10.0 5.0 0.0 \n", "7 8.0 3.0 10.0 3.0 0.0 1.0 15.0 3.0 2.0 \n", "8 9.0 3.0 10.0 5.0 2.0 2.0 15.5 3.0 0.0 \n", "9 10.0 2.0 15.5 5.0 2.0 1.0 10.0 3.0 0.0 \n", "10 11.0 2.0 10.0 0.0 2.0 1.0 15.5 5.0 0.0 \n", "11 12.0 3.0 15.5 3.0 1.0 1.0 10.0 5.0 2.0 \n", "12 13.0 2.0 10.0 0.0 2.0 1.0 15.5 5.0 0.0 \n", "13 14.0 3.0 10.0 5.0 0.0 1.0 15.0 0.0 2.0 \n", "14 15.0 2.0 10.0 5.0 1.0 1.0 15.0 0.0 1.0 \n", "15 16.0 2.0 15.5 5.0 1.0 1.0 10.0 0.0 1.0 \n", "16 17.0 2.0 10.0 5.0 2.0 1.0 15.0 0.0 0.0 \n", "17 18.0 2.0 15.0 5.0 2.0 1.0 10.0 0.0 1.0 \n", "\n", " Block \n", "0 1 \n", "1 2 \n", "2 3 \n", "3 1 \n", "4 2 \n", "5 3 \n", "6 1 \n", "7 1 \n", "8 1 \n", "9 3 \n", "10 2 \n", "11 1 \n", "12 2 \n", "13 2 \n", "14 3 \n", "15 2 \n", "16 3 \n", "17 3 " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "optimal_design" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## (optional) Evaluate the design\n", "The method `evaluate()` allows to evaluate a design stored in a data frame, under the specification provided when `EffDesign` was initialised. `evaluate()` requires the following parameters:\n", "\n", "* `optimal_design`: The objective design matrix to evaluate\n", "* `V`: The dictionary object with utility functions\n", "* `model`: The base model of the efficient design. By default is `mnl` for a Multinomial Logit model." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.034145726914017696 95.1394849419413\n" ] } ], "source": [ "perf, ubalance = design.evaluate(optimal_design,V,model='mnl')\n", "\n", "print(perf, ubalance)" ] }, { "cell_type": "markdown", "id": "c69ae1cd", "metadata": {}, "source": [ "## Export the design\n", "\n", "Export the optimised design to Excel. The exported table will reflect the constrained levels produced by the conditions." ] }, { "cell_type": "code", "execution_count": null, "id": "856aa186", "metadata": {}, "outputs": [], "source": [ "attr_names = {\n", " 'alt1_A': 'Attribute A', 'alt2_A': 'Attribute A',\n", " 'alt1_B': 'Attribute B', 'alt2_B': 'Attribute B',\n", " 'alt1_C': 'Attribute C', 'alt2_C': 'Attribute C',\n", " 'alt1_D': 'Attribute D', 'alt2_D': 'Attribute D',\n", "}\n", "design.export_design(optimal_design, attr_names, 'rum_conds_design.xlsx')" ] }, { "cell_type": "markdown", "id": "b38f8077", "metadata": {}, "source": [ "## Save the optimisation summary\n", "\n", "After calling `optimise()`, the method `export_output()` writes a plain-text summary of the optimisation run — design configuration, stopping criteria, criterion values, utility balance, elapsed time, and iteration count — to a file." ] }, { "cell_type": "code", "execution_count": null, "id": "1783f277", "metadata": {}, "outputs": [], "source": [ "design.export_output('rum_conds_output.txt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "[1] Quan, W., Rose, J. M., Collins, A. T., & Bliemer, M. C. (2011). 
A comparison of algorithms for generating efficient choice experiments.\n", "\n" ] } ], "metadata": { "kernelspec": { "display_name": "choicedesign-oSBhddzi-py3.13", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.5" } }, "nbformat": 4, "nbformat_minor": 2 }