install.packages("mlogit")
library("mlogit")
getwd()        # check the current working directory
# setwd("...") # set it to the folder containing your data file, if needed
cbc.df <- read.csv("C:\\Users\\armop\\Dropbox\\PHD\\Teaching\\AGBU505\\2019\\Homework 5\\conjoint_yogurt.csv", colClasses = c(seat="factor", cargo="factor", eng="factor", price="factor"))
head(cbc.df)
The first three rows of cbc.df describe the first question asked of respondent 1, which is the question shown in the figure in the lecture notes. The choice column shows that this respondent chose the third alternative: a 6-passenger gas-engine minivan with 3 ft of cargo capacity at a price of $30,000$ (recorded in $1,000$s as “30”). The resp.id column indicates which respondent answered the question.
summary(cbc.df)
However, a more informative way to summarize choice data is to compute choice counts, which are cross tabs on the number of times respondents chose an alternative at each feature level. We can do this easily using xtabs()
xtabs(choice ~ price, data=cbc.df)
xtabs(choice ~ cargo, data=cbc.df)
You should compute choice counts for each attribute before estimating a choice model. If you find that your model’s estimates or predicted shares are not consistent with the raw counts, consider whether there could be a mistake in the data formatting.
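As a quick illustration of what xtabs() is doing, here is a minimal sketch on made-up data (hypothetical values, not the actual survey): choice counts are just sums of the 0/1 choice column within each attribute level.

```r
# Hypothetical mini example: four alternatives with a 0/1 choice indicator.
toy <- data.frame(price  = c("30", "40", "30", "40"),
                  choice = c(1,    0,    0,    1))
# Choice counts: how many times each price level was chosen.
xtabs(choice ~ price, data = toy)
```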
mlogit requires the choice data to be in a special data format created using the mlogit.data() function. You pass your choice data to mlogit.data, along with a few parameters telling it how the data is organized. mlogit.data accepts data in either a “long” or a “wide” format and you tell it which you have using the shape parameter. The choice, varying and id.var parameters indicate which columns contain the response data, the attributes and the respondent ids, respectively.
cbc.mlogit <- mlogit.data(data=cbc.df, choice="choice", shape="long", varying=3:6, alt.levels=paste("pos", 1:3), id.var="resp.id")
head(cbc.mlogit)
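To see what the “long” shape means, here is a hypothetical fragment (made-up values): each survey question occupies one row per alternative, grouped in sets of three, with a 0/1 indicator marking which alternative was chosen.

```r
# Hypothetical long-format fragment: one question from one respondent.
long.example <- data.frame(resp.id = c(1, 1, 1),           # same respondent
                           ques    = c(1, 1, 1),           # same question
                           alt     = c(1, 2, 3),           # three alternatives shown
                           price   = c("30", "35", "40"),
                           choice  = c(0, 0, 1))           # chose the third alternative
long.example
```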
m1 <- mlogit(choice ~ 0 + seat + cargo + eng + price, data = cbc.mlogit)
summary(m1)
The Estimate column lists the estimated coefficient for each level; these must be interpreted relative to the base level of each attribute. For example, the estimate for seat7 measures the attractiveness of 7-passenger minivans relative to 6-passenger minivans. The negative sign tells us that, on average, our customers preferred 6-seat minivans to 7-seat minivans. Estimates that are larger in magnitude indicate stronger preferences, so we can see that customers strongly disliked electric engines (relative to the base level, which is gas) and disliked the $40K$ price (relative to the base price of $30K$). These parameter estimates are on the logit scale and typically range between $-2$ and $2$.
To convert these estimates into odds ratios, we take the exponential of the coefficients:
round(exp(coef(m1)),3)
Based on the odds-ratio interpretation, the exponentiated coefficient for cargo3ft, $1.612$, means that customers are 1.612 times more likely to choose a minivan with 3 ft of cargo space than one with 2 ft. Another way to think about this is that 3 ft of cargo space increases the odds of choice by $61.2\%$.
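The arithmetic behind this interpretation can be sketched with a stand-alone coefficient value (hypothetical, chosen to be close to the estimate discussed above):

```r
# Hypothetical logit coefficient, close to the cargo3ft estimate.
b <- 0.4777
odds.ratio <- exp(b)
round(odds.ratio, 3)              # 1.612: the odds of choice multiply by 1.612
round(100 * (odds.ratio - 1), 1)  # 61.2: a 61.2% increase in the odds
```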
The Std. Error column gives a sense of how precise the estimate is, given the data, along with a statistical test of whether the coefficient is different from zero. A non-significant test result indicates that there is no detectable difference in preference for that level relative to the base level. Just as with any statistical model, the more data you have in your conjoint study (for a given set of attributes), the smaller the standard errors will be.
A good question is why we included $0 +$ in the formula for m1. It indicates that we did not want an intercept included in our model. For comparison, we can estimate a model with an intercept:
m2 <- mlogit(choice ~ seat + cargo + eng + price, data = cbc.mlogit)
summary(m2)
When we include the intercept, mlogit adds two additional parameters that indicate preference for the different positions in the question (left, middle, or right in the survey figure from the lecture notes): pos2:(intercept) indicates the relative preference for the second position in the question (versus the first) and pos3:(intercept) indicates the preference for the third position (versus the first). These are sometimes called alternative specific constants, or ASCs, to differentiate them from the single intercept in a linear model.
In a typical conjoint analysis study, we don’t expect that people will choose a minivan because it is on the left or the right in a survey question! For that reason, we would not expect the estimated alternative specific constants to differ from zero. If we found one of these parameters to be significant, that might indicate that some respondents are simply choosing the first or the last option without considering the question.
In this model, the intercept parameter estimates are non-significant and close to zero. This suggests that it was reasonable to leave them out of our first model, but we can test this formally using lrtest():
lrtest(m1, m2)
This function performs a statistical test called a likelihood ratio test, which can be used to compare two choice models where one model has a subset of the parameters of the other. Comparing m1 to m2 gives a p-value (Pr(>Chisq)) of $0.7122$. Since the p-value is much greater than $0.05$, we can conclude that m1 and m2 fit the data equally well. This suggests that we don't need the alternative specific constants to fit the present data.
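Under the hood, the likelihood ratio statistic is just twice the difference in log-likelihoods, compared to a chi-square distribution whose degrees of freedom equal the number of extra parameters (here 2, for the two ASCs). A sketch with hypothetical log-likelihood values:

```r
# Hypothetical log-likelihoods for a restricted and a full model.
ll.restricted <- -2581.6   # model without ASCs (like m1)
ll.full       <- -2581.3   # model with 2 extra ASC parameters (like m2)
lr.stat <- 2 * (ll.full - ll.restricted)
pchisq(lr.stat, df = 2, lower.tail = FALSE)  # large p-value: extra parameters not needed
```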
We don't have to treat every attribute in a conjoint study as a factor. As with linear models, some predictors may be factors while others are numeric. For example, we can include price as a numeric predictor with a simple change to the model formula: because price is stored as a factor, we first convert it to a character vector using as.character and then to a number using as.numeric.
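The as.character step matters: applying as.numeric() directly to a factor returns the internal level codes rather than the printed labels. A quick check:

```r
price.factor <- factor(c("30", "35", "40", "30"))
as.numeric(price.factor)                # 1 2 3 1  -- level codes, not prices
as.numeric(as.character(price.factor))  # 30 35 40 30 -- the actual prices
```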
m3 <- mlogit(choice ~ 0 + seat + cargo + eng + as.numeric(as.character(price)), data = cbc.mlogit)
The output now shows a single parameter for price. The estimate is negative indicating that people prefer lower prices to higher prices. A quick likelihood ratio test suggests that the model with a single price parameter fits just as well as our first model.
lrtest(m1, m3)
Given this finding, we choose $m3$ as our preferred model because it has fewer parameters.
Because the coefficients measure preference for levels relative to base levels, they can be difficult to understand and interpret directly. So instead of presenting the coefficients, most choice modelers prefer to use the model to make choice share predictions or to compute the willingness-to-pay for each attribute.
We can compute the average willingness-to-pay for a particular level of an attribute by dividing the coefficient for that level by the price coefficient.
coef(m3)["cargo3ft"]/(-coef(m3)["as.numeric(as.character(price))"]/1000)
The result is a number measured in dollars, $\$2750.60$ in this case. (We divide by $1000$ because our prices were recorded in $1,000$s of dollars.)
Willingness-to-pay is a bit of a misnomer; the proper interpretation of this number is that, on average, customers would be equally divided between a minivan with 2 ft of cargo space and
a minivan with 3 ft of cargo space that costs $\$2750.60$ more. Another way to think
of it is that $\$2750.60$ is the price at which customers become indifferent between the
two cargo capacity options.
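The WTP arithmetic can be sketched with stand-alone coefficient values (hypothetical, chosen to be close to those reported above); prices are coded in $1,000$s, hence the division by 1000:

```r
b.cargo3ft <- 0.4777    # hypothetical utility of 3 ft vs. 2 ft of cargo
b.price    <- -0.1737   # hypothetical utility per $1,000 of price
wtp <- b.cargo3ft / (-b.price / 1000)
round(wtp, 2)           # roughly $2,750
```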
You can compute a willingness-to-pay value for every attribute in the study and report these values to decision makers to help them understand how much customers value various features.
The idea is to use the above model to make share predictions. A share simulator allows you to define a number of different alternatives and then use the model to predict how customers would choose among those new alternatives. For example, you could use the model to predict choice share for the company’s new minivan design against a set of key competitors. By varying the attributes of the planned minivan design, you can see how changes in the design affect the choice share.
predict.mnl <- function(model, data) {
  # Function for predicting shares from a multinomial logit model
  # model: mlogit object returned by mlogit()
  # data: a data frame containing the set of designs for which you want to
  #       predict shares. Same format as the data used to estimate model.
  data.model <- model.matrix(update(model$formula, 0 ~ .), data = data)[, -1]
  utility <- data.model %*% model$coef
  share <- exp(utility) / sum(exp(utility))
  cbind(share, data)
}
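The share calculation inside predict.mnl() is the standard multinomial logit (softmax) formula: each alternative's share is its exponentiated utility divided by the sum over all alternatives. A toy check with made-up utilities:

```r
utility <- c(1.0, 0.5, 0.0)                # hypothetical utilities for 3 designs
share <- exp(utility) / sum(exp(utility))  # logit share formula
round(share, 3)                            # 0.506 0.307 0.186
sum(share)                                 # shares always sum to 1
```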
Now we need to create new data for prediction.
attrib <- list(seat  = c("6", "7", "8"),
               cargo = c("2ft", "3ft"),
               eng   = c("gas", "hyb", "elec"),
               price = c("30", "35", "40"))
new.data <- expand.grid(attrib)[c(8, 1, 3, 41, 49, 26), ]
predict.mnl(m3, new.data)
The model-predicted shares are shown in the column labeled share
and we can see
that among this set of products, we would expect respondents to choose the 7-seat
hybrid engine minivan with 2 ft of cargo space at $\$30K$ a little more than $11\%$ of
the time. If a company were planning to launch a minivan like this, it could use
the model to see how changing the attributes of this product would affect the choice
shares. Note that these share predictions are always made relative to a particular set
of competitors; the share for the first minivan would change if the competitive set
were different.