Improving Customer Satisfaction and Loyalty with Time-Series Cross-Sectional Models
By John Colias, Ph.D.; Beth Horn, Ph.D.; and Ellen Wilkshire
Customer satisfaction and loyalty surveys typically track brand perceptions
both overall and with respect to specific performance areas. For example, a
survey might ask customers to rate brands based on overall satisfaction, likelihood
to purchase again, likelihood to recommend, customer service, product performance,
and brand image.
Since brand ratings are usually tracked over time, two types of data variability
are available for analysis:
- Across units (e.g., stores) within a time period
- Across time periods within a unit.
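
The two sources of variability listed above can be separated numerically. The sketch below uses a small hypothetical panel (the store names and ratings are invented for illustration, not taken from the study) and splits the total squared deviation of the ratings into an across-unit part and an across-time, within-unit part.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical panel: (unit, time period, average overall-satisfaction rating).
panel = [
    ("store_A", 1, 6.0), ("store_A", 2, 6.5), ("store_A", 3, 7.0),
    ("store_B", 1, 8.0), ("store_B", 2, 8.0), ("store_B", 3, 8.5),
]

def split_variation(rows):
    """Split total squared deviation into across-unit and across-time parts."""
    grand_mean = mean(rating for _, _, rating in rows)
    ratings_by_unit = defaultdict(list)
    for unit, _, rating in rows:
        ratings_by_unit[unit].append(rating)
    unit_means = {unit: mean(vals) for unit, vals in ratings_by_unit.items()}
    # Across units: distance of each unit's own mean from the grand mean.
    between = sum((unit_means[unit] - grand_mean) ** 2 for unit, _, _ in rows)
    # Across time within a unit: distance of each rating from its unit's mean.
    within = sum((rating - unit_means[unit]) ** 2 for unit, _, rating in rows)
    total = sum((rating - grand_mean) ** 2 for _, _, rating in rows)
    return between, within, total

between, within, total = split_variation(panel)
print(f"across units: {between:.3f}, across time: {within:.3f}, total: {total:.3f}")
```

A model fitted to a single time period can draw only on the across-unit part; the across-time part is the additional information that a time-series component brings in.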

In our experience, most regression modeling approaches in customer satisfaction
and loyalty research use only data variation across stores, providers, or other
units. That is, most lack the time-series component or variation in unit performance
over time.
Predictive accuracy could be strengthened if variability due to ratings of
units over time (e.g., ratings of a store from one month to the next) were included
in the modeling. Since customer satisfaction research typically collects data
over time, including the time-series component in models is a natural extension
to current practice.
In this article, we report an application of time-series cross-sectional (TSCS)
modeling, which incorporates both across-unit and across-time variation in
the data. The results from this application illustrate the value of adding
the time-series component to the analysis.
Before presenting our application, we review the most common statistical modeling
approaches. Then, we report our application with actual data. Finally, we discuss
the implications of our findings for marketing research.
Common Statistical Modeling Approaches
Four modeling techniques can be used in modeling customer satisfaction or
loyalty: structural equation modeling, limited dependent variable regression,
latent-class regression, and Hierarchical Bayes (HB) regression. These
techniques can increase reliability and accuracy, but do not necessarily use the
information contained in across-time variation.
Structural equation modeling is a statistical tool that describes complex
relationships by combining multivariate regression and confirmatory factor analysis.
This technique delivers more reliable measurement by using multiple measures
of each factor. It delivers more accurate predictions when brand attribute ratings
that relate to product and service are highly collinear.
Regression with a limited dependent variable determines the probability of
each scale point. By modeling probability mass at each scale point, we avoid
the assumption of a normal distribution of the dependent variable. In fact,
scale ratings do not typically have a normal distribution; they are usually skewed
toward the high end of the rating scale. By using a more flexible distributional
assumption as the basis for the model, greater prediction accuracy can be achieved.
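
As an illustration of modeling probability mass at each scale point, the sketch below computes ordered-logit probabilities for a 10-point scale. The ordered logit is one common limited dependent variable model; the text does not say which form the authors used, and the cutpoints and the linear score here are hypothetical values chosen for the example.

```python
import math

def logistic(z):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))

def scale_point_probabilities(linear_score, cutpoints):
    """Ordered logit: P(rating <= k) = logistic(cutpoint_k - linear_score).

    Returns one probability per scale point (len(cutpoints) + 1 of them).
    """
    cumulative = [logistic(c - linear_score) for c in cutpoints] + [1.0]
    probabilities, previous = [], 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities

# Hypothetical: nine increasing cutpoints separate the ten scale points.
cutpoints = [-4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
# Hypothetical linear score, e.g. an intercept plus weighted attribute factors.
probs = scale_point_probabilities(linear_score=2.5, cutpoints=cutpoints)
expected_rating = sum((k + 1) * p for k, p in enumerate(probs))

for point, p in enumerate(probs, start=1):
    print(f"P(rating = {point:2d}) = {p:.3f}")
```

Because each scale point receives its own probability, the predicted distribution can be skewed toward the top of the scale rather than forced into a symmetric bell shape.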
Latent-class and Hierarchical Bayes regression can also model a limited dependent
variable, but take a further step toward realism. Latent-class regression
allows separate regression parameters for latent, or unobserved,
segments. Hierarchical Bayes allows separate regression
parameters for each individual customer.
Time-series cross-sectional (TSCS) modeling can use any of the aforementioned
techniques. For example, the application reported in this paper uses a
latent-class regression model.
What distinguishes TSCS is that it uses more than one time period to estimate
the model parameters. By incorporating multiple time periods of data, TSCS can
deliver more accurate predictions.
Application of Time-Series Cross-Sectional Vs. Cross-Sectional Only Modeling
The data used in this research were collected via the Internet. Four
time periods of customer satisfaction data for six brands were obtained. The
respondents were members of American Consumer Opinion® Online, an online
panel of consumers who have agreed to participate in Internet surveys. For
this study, all panel members were based in a large U.S. metropolitan area.
Respondents rated brands on 21 attributes. The attributes addressed product
quality, product appeal, customer service, and overall satisfaction. Ratings
were based on 10-point scales that ranged from 1 (Very Poor) to 10 (Excellent),
with an option of “Don’t Know.” Respondents rated only brands with which they
were familiar. For confidentiality reasons, we cannot reveal the category, the
attributes, or the brand names.
Data were aggregated by zip code into separate market areas for each brand.
The original intent of the survey data was to answer questions about specific
store brands’ image and performance within the category. Thus, data were
collected by store brand in general, not by store location.
We initially wanted to define a cross-section as a specific unit location
(such as store #24 in the New York area). However, sample size was limited at
the store level. So, to form an approximation for store locations, the data
were aggregated by brand within zip code. The 102 zip codes collected in the
four studies were aggregated into 17 groups based on a geomapping technique.
The six brands had adequate representation in each of the 17 zip code groups.
Respondents who lived within the same cluster of zip codes were assumed to be
rating stores within the same market area.
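
The aggregation step can be sketched as follows. The zip codes, area labels, and ratings are hypothetical, and treating "Don't Know" as a missing value that is dropped before averaging is an assumption made for the example, not a detail stated above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical lookup from zip code to market area (zip code group).
zip_to_area = {"75001": "area_01", "75002": "area_01", "76010": "area_02"}

# Hypothetical respondent records: (brand, zip code, time period, rating).
# None stands for a "Don't Know" answer (assumed to be excluded).
responses = [
    ("Brand 1", "75001", 1, 7), ("Brand 1", "75002", 1, 9),
    ("Brand 1", "76010", 1, 6), ("Brand 1", "75001", 2, 8),
    ("Brand 2", "75002", 1, None), ("Brand 2", "75001", 1, 5),
]

def aggregate(records, lookup):
    """Average ratings within each (brand, market area, time period) cell."""
    cells = defaultdict(list)
    for brand, zip_code, period, rating in records:
        if rating is None:  # skip "Don't Know"
            continue
        cells[(brand, lookup[zip_code], period)].append(rating)
    return {cell: mean(ratings) for cell, ratings in cells.items()}

observations = aggregate(responses, zip_to_area)
for cell, average in sorted(observations.items()):
    print(cell, average)
```

Each resulting cell plays the role of one observation: a brand within a zip code group in a given time period.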

Overall satisfaction varied substantially by store and zip code area. This
variability is depicted in the following histogram where, out of 296 observations
(an observation is a set of stores within a zip code cluster), a substantial
share has an average rating below 6.0 or above 8.0.

Even more important, however, is the variation of brand ratings across time,
as this across-time variation is what we would like to use to better understand
the drivers of customer satisfaction. The next chart shows that Brands 1 and
6 showed significant improvement over time in overall satisfaction. The TSCS
model attempts to explain which store attributes caused this improvement over
time.

Two models were developed:
- A cross-sectional model (from Time Period 3 only), which consisted of seven
attribute factors as predictors of overall satisfaction by store.
- A time-series cross-sectional model, which consisted of the seven factors
as predictors of overall satisfaction by store from Time Periods 1, 2, and
3.
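
The difference between the two models lies in which rows of the data are used for estimation. The sketch below builds both training sets from a small hypothetical panel and, as a stand-in for the study's seven-factor model, fits an ordinary least-squares line with a single made-up factor score. The data, the single predictor, and the use of ordinary least squares are all assumptions for illustration.

```python
from statistics import mean

# Hypothetical rows: (unit, time period, factor score, overall satisfaction).
rows = [
    ("u1", 1, 0.2, 6.4), ("u1", 2, 0.5, 6.9), ("u1", 3, 0.9, 7.6), ("u1", 4, 1.1, 8.0),
    ("u2", 1, 1.0, 7.9), ("u2", 2, 1.0, 8.0), ("u2", 3, 1.1, 8.1), ("u2", 4, 1.2, 8.3),
    ("u3", 1, -0.5, 5.2), ("u3", 2, -0.3, 5.4), ("u3", 3, -0.4, 5.5), ("u3", 4, 0.0, 6.1),
]

def fit_line(data):
    """Ordinary least squares for satisfaction = intercept + slope * factor."""
    xs = [x for _, _, x, _ in data]
    ys = [y for _, _, _, y in data]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Cross-sectional only: Time Period 3 alone.
cso_train = [r for r in rows if r[1] == 3]
# Time-series cross-sectional: Time Periods 1, 2, and 3 stacked together.
tscs_train = [r for r in rows if r[1] in (1, 2, 3)]
# Time Period 4 is held out for checking predictions.
holdout = [r for r in rows if r[1] == 4]

cso_model = fit_line(cso_train)
tscs_model = fit_line(tscs_train)
print("CSO  intercept, slope:", cso_model)
print("TSCS intercept, slope:", tscs_model)
```

The stacked training set has three times as many rows, and its coefficient reflects how satisfaction moved within a unit over time as well as how it differed across units.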
Latent-class regression was used to develop the two models in order to avoid
the erroneous assumption that one single regression model describes all members
of a population. Latent-class regression, in contrast, yields regression coefficients
that are different across unobserved (latent) groups.
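
To make the idea of class-specific coefficients concrete, the sketch below fits a two-class mixture of simple linear regressions with the EM algorithm. It is a minimal illustration under stated assumptions, not the authors' software or model specification: it assumes one predictor, two classes, normally distributed errors, and hypothetical data and starting values.

```python
import math

def weighted_line(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b, error variance)."""
    w_sum = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    var = sum(w * (y - a - b * x) ** 2 for w, x, y in zip(ws, xs, ys)) / w_sum
    return a, b, max(var, 1e-6)  # floor keeps the density finite

def normal_pdf(residual, var):
    return math.exp(-residual ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_class_regression(xs, ys, init_resp, iterations=50):
    """EM for a two-class regression mixture; resp[i] = P(class 1 | obs i)."""
    resp = list(init_resp)
    for _ in range(iterations):
        # M-step: class share and one weighted regression per class.
        share = sum(resp) / len(resp)
        class1 = weighted_line(xs, ys, resp)
        class2 = weighted_line(xs, ys, [1 - r for r in resp])
        # E-step: update each observation's class-membership probability.
        new_resp = []
        for x, y in zip(xs, ys):
            d1 = share * normal_pdf(y - class1[0] - class1[1] * x, class1[2])
            d2 = (1 - share) * normal_pdf(y - class2[0] - class2[1] * x, class2[2])
            r = d1 / (d1 + d2) if d1 + d2 > 0 else 0.5
            new_resp.append(min(max(r, 1e-6), 1 - 1e-6))  # avoid zero weights
        resp = new_resp
    return share, class1, class2, resp

# Hypothetical data: five observations from each of two latent segments.
base_x = [-1.0, -0.5, 0.0, 0.5, 1.0]
noise = [0.05, -0.05, 0.02, -0.03, 0.04]
xs = base_x + base_x
ys = ([6 + 1.5 * x + e for x, e in zip(base_x, noise)]      # segment A
      + [8 + 0.2 * x - e for x, e in zip(base_x, noise)])   # segment B
init_resp = [0.8] * 5 + [0.2] * 5  # hypothetical starting guess

share, class1, class2, resp = fit_two_class_regression(xs, ys, init_resp)
print("class 1 (intercept, slope, variance):", class1)
print("class 2 (intercept, slope, variance):", class2)
```

The point of the example is that the fitted slope differs by class, so a single pooled coefficient would misstate the driver's importance for both segments. EM results depend on starting values, so in practice several starts would be compared.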
Each model was used to predict the change in overall satisfaction between
Time Periods 3 and 4. The CSO (cross-sectional only) model predicted Time Period
4 with an MAE (mean absolute error of predicted vs. actual) of 0.22 rating points
across the six brands.
The TSCS (time-series cross-sectional) model predicted more accurately, with
an MAE of 0.11 rating points.
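
For reference, the mean absolute error is the average of the absolute differences between predicted and actual values. The sketch below shows the calculation for six brands; the numbers are hypothetical and are not the study's brand-level results, which are not reported here.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical change in overall satisfaction (Period 3 to 4) for six brands.
actual_change = [0.30, -0.10, 0.05, 0.00, 0.10, 0.40]
cso_predicted = [0.10, 0.10, 0.25, -0.20, 0.30, 0.10]
tscs_predicted = [0.20, 0.00, 0.10, -0.10, 0.20, 0.30]

cso_mae = mean_absolute_error(cso_predicted, actual_change)
tscs_mae = mean_absolute_error(tscs_predicted, actual_change)
print(f"CSO MAE: {cso_mae:.3f}  TSCS MAE: {tscs_mae:.3f}")
```

Because the MAE is expressed in rating points, it can be read directly against the 10-point scale.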

The evidence from these data suggests that predictive accuracy in customer satisfaction
and loyalty research is enhanced when time-series variation is used. We believe
that we might have achieved a greater increase in predictive accuracy for TSCS
vs. CSO if (a) we had been able to use exact store locations instead of zip
code clusters as the cross-sections within the analysis and (b) more time periods
had been available to develop the model.
While the improvement in prediction accuracy is valuable, a more important
finding is that the TSCS model tells a different story about what factors drive
overall satisfaction.
As indicated in the bar chart, both the CSO model (using only cross-sectional
variation) and the TSCS model (using both time-series and cross-sectional variation)
suggest that Factor 7 is the strongest driver of overall satisfaction. However,
the CSO model points to Factor 3 as the second strongest driver, while the TSCS
model points to Factor 1.
Given that the two approaches delivered different answers about what drives
overall satisfaction, we must decide which approach to rely on
when applying the model to business decision making. We believe that the stronger
methodological approach should be chosen. The time-series cross-sectional model
uses more information (across-time variation) to score and rank the drivers
of overall satisfaction. For this reason, we believe that the time-series cross-sectional
model provides the better result.

Because the attribute ratings that potentially predict overall satisfaction
are multicollinear, the 20 attributes other than overall satisfaction were
reduced to seven factors via factor analysis.
Implications for Marketing Research
If your business currently tracks brand ratings on overall satisfaction or
customer loyalty over time, we recommend that you consider developing time-series
cross-sectional models to explain customer satisfaction or customer loyalty.
If your business desires to set priorities for brand, product, and customer
service improvements, time-series cross-sectional models provide an answer that
(a) better predicts future market outcomes and (b) delivers a better assessment
of the true drivers of customer satisfaction or customer loyalty.
Copyright © 2007 by Decision Analyst, Inc.
This article may not be copied, published, or used in any way without written
permission of Decision Analyst.
About the Authors
John Colias (jcolias@decisionanalyst.com)
is Senior Vice President at Decision Analyst. Beth Horn (ehorn@decisionanalyst.com)
is Vice President at Decision Analyst. Ellen Wilkshire (ewilksh@decisionanalyst.com)
is Senior Statistical Analyst at Decision Analyst. They may be reached at 1-800-262-5974
or 1-817-640-6166.