Tracking the Age Wave: Parsimonious Estimation in Cohort Analysis

Roland T. Rust, Vanderbilt University
Kary Wan-Yu Yeung, Vanderbilt University
ABSTRACT - The American population is experiencing major changes in its age distribution, and the burgeoning elderly market is capturing the attention of consumer researchers. The needs and wants of this segment change when different cohorts (generations) fill it at different time periods, and thus a knowledge of the impact of cohort membership, age, and period-specific factors on consumption behavior can help to identify potential marketing opportunities. Cohort analysis refers to research methodologies designed to estimate these effects. A new approach to cohort analysis, parsimonious cohort estimation (PACE), is proposed in this paper. Unlike the predominant existing approach, constrained multiple regression (CMR), PACE is based on a single, well-defined, theoretical criterion, and thus does not require ad hoc constraints. Instead of relying upon ad hoc constraints, PACE finds its solution based on a well accepted theoretical principle, the principle of parsimony, or Occam's razor. Besides its apparent theoretical advantage, PACE also performs better empirically, based on a simulation.
[ to cite ]:
Roland T. Rust and Kary Wan-Yu Yeung (1995), "Tracking the Age Wave: Parsimonious Estimation in Cohort Analysis", in NA - Advances in Consumer Research Volume 22, eds. Frank R. Kardes and Mita Sujan, Provo, UT: Association for Consumer Research, Pages: 680-685.

Advances in Consumer Research Volume 22, 1995      Pages 680-685

INTRODUCTION

The American population is undergoing widespread demographic changes as we approach the 21st century. Business strategies designed to achieve long run viability should take these projected demographic shifts in the market into consideration. Demographic shifts affect the definition, size, and characteristics of existing market segments. The changes occurring in its existing and potential target markets affect a firm's marketing strategies and profitability. One of the most important current demographic developments is the changing age distribution. Due to the graying of the baby boom generation, the declining fertility rate, and the rising life expectancy, the median age of the population in the year 2040 will be ten years older than the median age today (Leventhal 1990). The size of the 45-54 age group is projected to have an increase of 46% during the last decade of the 20th century (MRI 1990).

Not only is the relative size of the older population changing, but the characteristics of the members of this segment are also changing. The baby boomers, those born from about 1946 through 1964, will comprise the majority of the older population in the following decades. On average, members of the baby boom generation are better educated and more affluent than their parents. Hence their consumption behavior and response to marketing variables are very different from those of their parents, who typically experienced both the Great Depression and the Second World War. Therefore companies which are targeting the older population may find that they need to reevaluate their existing strategies. A study of soft drink consumption (Rentz, Reynolds and Stout 1983) showed that the soft drink industry was unlikely to suffer from the changing age distribution even though most of the consumers were younger people. The authors found that the consumption of soft drinks remained fairly stable over a person's life-course. Hence, the total amount of soft drink consumption should increase as the present older population is replaced by the baby boom cohort. The predictions of those researchers have so far been borne out. According to Beverage Industry Annual Manual (Edgell 1991), soft drink consumption has gone up 4% since 1983.

Since the wants and needs of the members of an age group may change as different cohorts fill it, a firm that is targeting an age group is at the same time also targeting a specific cohort. Therefore, to profit from the expanding elderly market, it is important for manufacturers and marketers of products and services to have a good knowledge of the changing needs and consumption behavior of this segment due to the cohort effect (factors that are peculiar to a particular cohort). Besides cohort- and age-based factors, a person's consumption pattern at a specific point in time is also influenced by the events happening during that time period. Hence time-specific factors should also be taken into consideration in formulating marketing strategy.

Estimating the age, cohort and period effects is problematic because the effects cannot be unambiguously determined. For instance, the change in the behavior of the baby boomers from their teenage stage to their mature stage is a function of both the age effect and the period effect. Only the combined effects of the two factors can be estimated unambiguously. In other words, there is no unique solution for each individual effect. Research methodology that formulates criteria for choosing the solution of the effect estimates is called cohort analysis. Cohort analysis has been applied in the areas of sociology, medical research, demography, and marketing research (Adams et al. 1990; Akers 1965; Clogg 1982; Frost 1939; Rentz et al. 1983; Rentz and Reynolds 1991; Sacher 1977; Sasaki and Suzuki 1987). This paper proposes to use the principle of parsimony or Occam's razor (Cohen and Nagel 1934; Hempel 1966), a criterion which is widely applied in science, for choosing the solution of the cohort, period, and age effects. According to this principle, the simplest explanation among all competing explanations of the phenomenon of interest is preferred. The parsimonious solution is preferred because it is more readily falsified and therefore more testable.

Section 2 describes the essence of cohort analysis and Section 3 presents the parsimonious cohort estimation (PACE), its formulation and estimation procedure. In Section 4, the performance of PACE is compared to the performance of the predominant existing approach. Conclusions are given in Section 5.

COHORT ANALYSIS

The Conceptual Basis of Cohort Analysis

A cohort is an aggregation of people who enter a social system within the same time period. An alternative and more commonly used definition of cohort is a group of people who share an experience within the same time period (Glenn 1977). The experience used to delineate the population varies across studies. It can be marriage, education status, geographical location, or birth. Since the focus of this study is on birth cohorts, the term "cohort" will be used hereafter to mean "birth cohort". The boundary of the time period categorizing each cohort is arbitrary, with the interval length usually ranging from one to ten years. Homogeneity is not assumed within each cohort, but members of each cohort share the same macro-political, -cultural, -social, and -economic environment. Due to the differences in historical experience and in the peer group socialization process, each cohort has its unique characteristics. As a result, members of one cohort exhibit behavioral patterns distinct from those of members of other cohorts. The study of the differences in a phenomenon due to cohort effects (effects which are associated with cohort membership) is the domain of cohort analysis. Specifically, cohort analysis is a group of research techniques which examine and estimate the relationships between cohort identification and the phenomenon of interest.

Demographic researchers have long been studying the extent to which cohort membership affects the fertility rate and mortality rate of certain age categories (Mason and Fienberg 1985). Medical researchers have found that changes in the tuberculosis mortality rate are cohort-based (Frost 1939; Mason and Fienberg 1985). In social science, there are studies of the impact of cohort membership on the changes in political sentiment (Adams et al. 1990; Hout and Knoke 1975; Mason and Fienberg 1985). In marketing, Rentz, Reynolds and Stout (1983) and Rentz and Reynolds (1991) have studied the impact of cohort membership on soft drink consumption.

Cross-sectional comparison of cohorts without taking the age factor into consideration can be misleading. For instance, the difference in fertility rate between women in their 20's and women in their 40's is mainly age-related. Due to the differences in physiological condition caused by aging, younger women have a higher fertility rate than older women. The fact that the younger cohort is more educated, more mobile, more career oriented, and has access to advanced birth control devices contributes less to the difference in fertility rate. In general, an older person differs from a younger person in several dimensions, such as physical condition and the extent of exposure to social influence. Therefore, when two cohorts are compared at one point in time, their age difference has to be accounted for. Without including age effects (factors associated with aging), spurious findings may be obtained.

Investigating the aging effect is important not only for a better understanding of the cohort effect; it is itself the motivation of many cohort studies. The effects of aging are examined in many attitudinal and behavioral studies of topics such as party identification, voter turnout, ideological conformity, church attendance, and alcoholic beverage consumption (Adams et al. 1990; Hout and Knoke 1974; Knoke and Hout 1975; Mason and Fienberg 1985). In these studies, the cohort effect is a nuisance variable that masks the aging effect.

In most cohort analyses, intra-cohort comparison is also used to measure the period effect. Period effect is the impact of factors that are associated with the time period under survey. The factors are events which have significant influence on individuals during that particular period. Period effect influences all cohorts at the same point in time and therefore at different lifestages of each cohort. Researchers measure characteristics of two or more cohorts at two or more points in time to examine the impact of the period-specific stimuli such as economic recession and political instability. In this paper, we are looking at both inter- and intra-cohort comparison.

The Cohort Analysis Model

Cohort analysis, due to its cross-sectional as well as longitudinal nature, is applied mostly to data in the form of age by period. A typical cohort data structure may be illustrated by a table in which age groups are rows and time periods are columns (see Table 1).

Each entry in the table is the measurement of a response variable of interest (Y), such as mortality rate, fertility rate, or consumption rate. The ij entry is the measurement of Y for the sample segment that belongs to age category i at time j. The columns are equally spaced time periods. The rows are age categories with the same spacing as the time periods. Each of the I+J-1 diagonals corresponds to one cohort. The I by J cohort table contains information for both the inter- and intra-cohort comparisons. Information on inter- and intra-cohort comparisons can be traced by reading down the columns and the diagonals, respectively. Time trends at each age level can be traced by reading across the corresponding row.
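The diagonal structure of the table can be made concrete with a small indexing sketch. The 0-based indices, the ordering of rows from youngest to oldest age group, the convention that cohort 0 is the oldest birth cohort, and the function names are all assumptions made for illustration; they are not part of the original notation.

```python
def cohort_index(i, j, n_ages):
    """Cohort (diagonal) index for age row i and period column j, all 0-based.

    Assumes rows run from youngest to oldest, so the oldest age group in the
    first period belongs to cohort 0.
    """
    return j - i + (n_ages - 1)


def cohort_table_layout(n_ages, n_periods):
    """Return an n_ages x n_periods grid holding the cohort index of each cell."""
    return [[cohort_index(i, j, n_ages) for j in range(n_periods)]
            for i in range(n_ages)]


layout = cohort_table_layout(4, 4)
for row in layout:
    print(row)

# Number of distinct cohorts in an I x J table is I + J - 1.
n_cohorts = len({k for row in layout for k in row})
print(n_cohorts)  # 7
```

Moving one step down and one step to the right (one age category older, one period later) leaves the index unchanged, which is why a cohort is followed by reading along a diagonal.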

The cohort model assumes that the response variable Y is a linear function of the age effects, the cohort effects and the period effects. The basic model is formulated as:

$$Y_{ijk} = m + A_i + P_j + C_k + e_{ijk} \qquad (1)$$

where m is the overall mean, Ai's are the age effects, Pj's are the period effects, Ck's are the cohort effects, and eijk is the random error, which is assumed to be normally distributed with mean 0 and variance $\sigma_e^2$. The expected value of Yijk is:

$$E(Y_{ijk}) = m + A_i + P_j + C_k \qquad (2)$$

Ai, Ck, and Pj are the deviations from the average due to aging, cohort membership, and events that happened at specific time periods, respectively. They are the parameters to be estimated. The predominant method of estimating a cohort model is constrained multiple regression (CMR) (Mason et al. 1973).

There are a number of applications of CMR in cohort studies (e.g., Hout and Knoke 1974; Knoke and Hout 1975; Rentz et al. 1983; Rentz and Reynolds 1991). Rentz, Reynolds, and Stout (1983) and Rentz and Reynolds (1991) applied this method in their study of soft drink consumption in the United States. In the first paper, the variables which were of the least interest were chosen for the equality constraints. In the period models, the effects of two ages and the effects of two cohorts were set to be equal. The effects of two periods and the effects of two cohorts were set to be equal in the age models, and the effects of two ages and the effects of two periods were set to be equal in the cohort models. The cohort models had the best fit. In the second paper, the use of prior knowledge and all available side information was emphasized. Specifically, the period effects of 1950 and 1960 were set to be equal because there was a substantial decline in coffee consumption due to concern about caffeine in the 1960's and 1970's. Since there was no specific prior knowledge about the age effects and cohort effects, minimum inter-class difference within each category was used as the criterion for choosing the equality constraints.
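Why some constraint is needed at all follows from the structure of the cohort table: a cohort's position is fixed by age and period, so a linear trend can be moved among the three sets of effects without changing any expected value. The sketch below illustrates this, assuming 0-based indices with rows ordered from youngest to oldest; the function names and the numbers are hypothetical.

```python
def fitted_values(m, A, P, C):
    """Expected cell values m + A[i] + P[j] + C[k], with k = j - i + I - 1."""
    I, J = len(A), len(P)
    return [[m + A[i] + P[j] + C[j - i + I - 1] for j in range(J)]
            for i in range(I)]


def shift_linear_trend(A, P, C, d):
    """Add d*i to age effects, -d*j to period effects, d*(j - i) to cohort effects.

    The three shifts cancel in every cell, so the fitted values do not change.
    """
    I = len(A)
    A2 = [a + d * i for i, a in enumerate(A)]
    P2 = [p - d * j for j, p in enumerate(P)]
    C2 = [c + d * (k - (I - 1)) for k, c in enumerate(C)]
    return A2, P2, C2


# Hypothetical effects for a 4 x 4 table (7 cohorts).
m = 5.0
A = [-3.0, -1.0, 1.0, 3.0]
P = [0.5, -0.5, 1.0, -1.0]
C = [2.0, 0.0, -1.0, 1.0, -2.0, 0.5, -0.5]

A2, P2, C2 = shift_linear_trend(A, P, C, d=2.0)
original = fitted_values(m, A, P, C)
shifted = fitted_values(m, A2, P2, C2)
max_gap = max(abs(x - y)
              for r1, r2 in zip(original, shifted)
              for x, y in zip(r1, r2))
print(max_gap)  # no difference beyond floating-point rounding
```

Because both parameter sets reproduce the table equally well, least squares alone cannot choose between them. CMR breaks the tie by setting two effect levels equal, which is why the choice of constraint matters.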

A NEW APPROACH TO COHORT ANALYSIS

Parsimonious Cohort Estimation (PACE)

The principle of parsimony (simplicity or Occam's razor) is one of the criteria used by scientists to determine the acceptability of a hypothesis or a theory, compared with that of alternative theories which would account for the same phenomenon. It is often illustrated by reference to the Copernican heliocentric theory, which was considerably simpler than Ptolemy's geocentric theory (Cohen and Nagel 1934; Hempel 1966). Both theories can explain the apparent motions of the sun, moon, and planets, and in the sixteenth century, both produced the same predictions (except for the phases of Venus). The heliocentric theory was found to be simpler by Copernicus and his contemporaries and therefore was preferred.

A parsimonious theory does not have to be a familiar one. If familiarity were used as a criterion of parsimony, no new theory would be accepted and the choice would vary from person to person. Though the concept of parsimony is not well-defined, two criteria are suggested as indicators of complexity (Cohen and Nagel 1934; Hempel 1966): the number of independent concepts used and the number of assumptions used.

TABLE 1

HYPOTHETICAL SET OF DATA FOR A COHORT ANALYSIS

For two competing theories, the one that uses fewer concepts or variables is simpler and therefore is preferred.

A theory is also simpler if it requires fewer assumptions. Heliocentric theory is simpler because it can account for astronomical phenomena in terms of its fundamental ideas without introducing assumptions ad hoc. Special assumptions have to be made, ad hoc, if the geocentric theory is used for explaining the same phenomena (Cohen and Nagel 1934).

There are several justifications for the preference given to simpler theories. Here the view advanced by Popper (1959) is presented. Popper argues that a simpler theory is more readily falsified, if indeed it should be false, and therefore it is more testable. For instance, suppose that the following two functions are proposed to account for the variation in U:

U=2X + W

U=2X

If W is the true factor affecting U, but not X, the second function is more readily falsified by empirical evidence. In some cases, the simpler theory is stronger because it logically implies the more complex one. For example, suppose that a variable X is hypothesized as a linear function of another variable Y and two functions are proposed to account for the relationship:

X=aY + b

X=c(aY + b)

The first function is simpler and logically implies the second function. If the first one is falsified, the second function is automatically falsified.

The following section will present the technique used to achieve the parsimonious solution to the cohort problem according to the criteria mentioned above.

Cohort Analysis as a Nonlinear Programming Problem

The age-period-cohort problem is formulated in such a way that the solution obtained is maximally parsimonious. For a solution to be parsimonious, it has to have:

(a) the fewest variables, and/or

(b) the fewest assumptions.

In order to achieve the first criterion, a three-step approach is proposed:

1. Minimize the error sum of squares.

2. Require the fewest independent variables to explain the dependent variable.

3. Have the smallest absolute values of the estimates.

To facilitate the understanding of the rationale behind the approach, the cohort model presented in Section 2 is reproduced:

$$Y_{ijk} = m + A_i + P_j + C_k + e_{ijk} \qquad (3)$$

Minimizing the error sum of squares is an attempt to eliminate eijk from the equation. It is given the highest priority, not only because it directly corresponds to goodness of fit, but also because there is a unique error term associated with every response value Yijk. By eliminating eijk, we reduce the number of explanatory variables for the variation in Y by I*J. Step two attempts to further reduce the number of variables required for explaining the variation in Y. The fewer effects are included in the equation, the more parsimonious the explanation is. For two competing models with the same number of explanatory variables, the one with the smallest absolute values of the effect estimates is preferred according to step three. The rationale is that effect estimates equal to zero are maximally parsimonious (they meet the conditions of step two), so step three seeks to make the model as parsimonious as possible by moving the effect estimates as close to zero as possible.

This three-step approach is lexicographic in nature. The three properties correspond to three ranked minimization objectives. To facilitate estimation, a compensatory model is specified to approximate the lexicographic minimum. The objectives are combined as if they are commensurable and weights are assigned to them so that they are expressed in a single performance measure. The weights are assigned in such a way that the preemptive priority structure is preserved. Accordingly, the following nonlinear programming problem is formulated to obtain the parsimonious solution which will have the three minimization objectives.

EQUATION (4)

M1, M2, M3, and M4 are weights, with 0<<M1<<M2<<M3<<M4 (<< symbolizes "much less than"). The e's are residuals. The function h is minimized with respect to the Ai's, Pj's, Ck's, and the e's. Step 1 is satisfied if (IV) is minimized. Step 2 is satisfied if (II) and (III) are minimized. (III) is minimized if all the effect parameters of one of the three variables (age effect, period effect, and cohort effect) are equal to zero. (II) is minimized if the effect parameters of two of the variables are equal to zero. Hence by minimizing (II) and (III), we are minimizing the number of variables required to explain the dependent variable (step 2). Step 3 is satisfied if (I) is minimized. The weights are assigned in such a way that the difference between two successive weights, for instance M3 and M4, is big enough to guarantee that (IV) is minimized first, followed by (III), (II) and (I) successively. Hence, the priority specified in the three-step process is achieved.
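The exact functional form of terms (I) through (IV) is not legible in this text, so the following is only one possible form of h that is consistent with the verbal description, not necessarily the authors' formula. It assumes that (I) is the sum of the absolute effect estimates, (II) is the sum of pairwise products of the three per-factor totals, (III) is their triple product, and (IV) is the error sum of squares. The function name and the example data are hypothetical; the weights are the values reported in the Model Estimation section.

```python
# Weights reported by the authors for M1, M2, M3, M4.
WEIGHTS = (0.001, 2.5, 99.0, 99999.0)


def pace_objective(Y, m, A, P, C, weights=WEIGHTS):
    """Assumed form of h: M1*(I) + M2*(II) + M3*(III) + M4*(IV)."""
    I, J = len(A), len(P)
    m1, m2, m3, m4 = weights
    # (IV): error sum of squares, with residuals implied by the cohort model.
    sse = sum((Y[i][j] - m - A[i] - P[j] - C[j - i + I - 1]) ** 2
              for i in range(I) for j in range(J))
    # Per-factor totals of absolute effect estimates (an assumed size measure).
    s_a = sum(abs(a) for a in A)
    s_p = sum(abs(p) for p in P)
    s_c = sum(abs(c) for c in C)
    term1 = s_a + s_p + s_c                    # (I): zero only if all effects vanish
    term2 = s_a * s_p + s_a * s_c + s_p * s_c  # (II): zero if two factors vanish
    term3 = s_a * s_p * s_c                    # (III): zero if one factor vanishes
    return m1 * term1 + m2 * term2 + m3 * term3 + m4 * sse


# Hypothetical data generated by a pure (linear) age effect, 4 ages x 4 periods.
age_effects = [-3.0, -1.0, 1.0, 3.0]
Y = [[age_effects[i] for _ in range(4)] for i in range(4)]

# Explanation 1: age effects only.
h_age = pace_objective(Y, 0.0, age_effects, [0.0] * 4, [0.0] * 7)

# Explanation 2: the same table reproduced with period and cohort effects only.
P_alt = [-3.0, -1.0, 1.0, 3.0]
C_alt = [6.0, 4.0, 2.0, 0.0, -2.0, -4.0, -6.0]
h_alt = pace_objective(Y, 0.0, [0.0] * 4, P_alt, C_alt)

print(h_age < h_alt)  # True
```

Both parameter sets fit the table exactly, so (IV) is zero for each; the age-only explanation attains the smaller value of h because it uses one factor instead of two.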

To achieve the second criterion of parsimony, no assumption external to the basic structure of the cohort model would be imposed in the estimation process.

Model Estimation

The nonlinear function h is minimized using the subroutine NCONG, a Fortran code developed by Schittkowski (1986). It uses a successive quadratic programming method. No constraints are imposed other than that the estimates have to satisfy the conditions in (3). No information external to the data is used in the estimation of the effect parameters. To operationalize (4), we use 0.001, 2.5, 99, and 99999 as the values of M1, M2, M3, and M4, respectively. We have found these values to work well on trial data.

Starting values for the effect parameters and the error terms are required for searching the optimal solution. Since only local optima are found by NCONG, different sets of starting values should be used to generate solutions. There are many ways in which the starting values might be selected. For our analysis we choose starting values in such a way that each one of them is a desirable potential solution to the minimization problem. For example, one set of starting values would have only cohort effects and another set would have only time effects.
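One way such starting values might be built is sketched below. It assumes one start per subset of the three factors (eight in all), with the included effects initialized at centered marginal means of the table and the excluded effects at zero. This construction, and the names `marginal_start` and `starting_points`, are illustrative assumptions rather than the authors' documented procedure.

```python
from itertools import product


def mean(values):
    return sum(values) / len(values)


def marginal_start(Y, use_age, use_period, use_cohort):
    """Starting values for one subset of factors from centered marginal means."""
    I, J = len(Y), len(Y[0])
    grand = mean([y for row in Y for y in row])
    A = [mean(Y[i]) - grand if use_age else 0.0 for i in range(I)]
    P = [mean([Y[i][j] for i in range(I)]) - grand if use_period else 0.0
         for j in range(J)]
    C = []
    for k in range(I + J - 1):
        # Cells on diagonal k belong to the same cohort.
        cells = [Y[i][j] for i in range(I) for j in range(J)
                 if j - i + I - 1 == k]
        C.append(mean(cells) - grand if use_cohort else 0.0)
    return {"m": grand, "A": A, "P": P, "C": C}


def starting_points(Y):
    """One start per subset of {age, period, cohort}: 2**3 = 8 starts."""
    return [marginal_start(Y, a, p, c)
            for a, p, c in product([False, True], repeat=3)]


# Hypothetical 4 x 4 table driven by age only.
Y = [[float(i) for _ in range(4)] for i in range(4)]
starts = starting_points(Y)
print(len(starts))     # 8
print(starts[4]["A"])  # age-only start: [-1.5, -0.5, 0.5, 1.5]
```

Each start would then be handed to a local optimizer, and the local solution with the smallest value of h would be retained.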

COMPARATIVE PERFORMANCE TESTING

The relative ability of PACE and CMR to represent the true process that generates the data is studied in this section. The testing consists of a simulation which compares the abilities of the two methods to recover true model parameters.

A Simulation of the Comparative Performance of PACE and CMR

A simulation is performed to compare the estimation accuracy of PACE and CMR. The design and results are described next.

Design. 240 random data sets are generated, reflecting 10 replications each for all combinations of three error variances (0, 10, 20) and eight true models. Thus we have 10 replications in a 3x8 full factorial design. The eight true models are:

(1) Y=m+A+e

(2) Y=m+P+e

(3) Y=m+C+e

(4) Y=m+A+P+e

(5) Y=m+A+C+e

(6) Y=m+P+C+e

(7) Y=m+A+P+C+e

(8) Y=m+e

All effect parameters are assumed to be normally distributed with mean 0 and variance 10. Thus we first sample the effect parameters, which we then assume to be constant across replications. We then sample an error term for each replication. For simple interpretation of results, we set m=0. (This has no effect on the results.) Every data set is in the form of a cohort table with four age groups, four time periods, and seven cohort groups. In total 3840 data points are employed in the simulation.
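The design can be sketched as a short data generator. The sketch assumes that effect parameters are drawn once per design cell (true model by error variance) and held fixed across that cell's ten replications, and that the drawn effects are not re-centered; the text does not settle either point. The seed, the function names, and the dictionary layout are hypothetical.

```python
import random
from itertools import product

N_AGES, N_PERIODS, N_REPS = 4, 4, 10
N_COHORTS = N_AGES + N_PERIODS - 1
ERROR_VARIANCES = (0.0, 10.0, 20.0)
EFFECT_VARIANCE = 10.0
# (age, period, cohort) inclusion flags for true models (1) through (8).
TRUE_MODELS = [
    (True, False, False), (False, True, False), (False, False, True),
    (True, True, False), (True, False, True), (False, True, True),
    (True, True, True), (False, False, False),
]


def draw_effects(n, included, rng):
    """Draw n effects from N(0, 10), or zeros if the factor is excluded."""
    sd = EFFECT_VARIANCE ** 0.5
    return [rng.gauss(0.0, sd) if included else 0.0 for _ in range(n)]


def simulate(seed=0):
    rng = random.Random(seed)
    data_sets = []
    for model, error_var in product(TRUE_MODELS, ERROR_VARIANCES):
        has_age, has_period, has_cohort = model
        A = draw_effects(N_AGES, has_age, rng)
        P = draw_effects(N_PERIODS, has_period, rng)
        C = draw_effects(N_COHORTS, has_cohort, rng)
        for _ in range(N_REPS):
            # m = 0, so each cell is the sum of the effects plus a fresh error.
            Y = [[A[i] + P[j] + C[j - i + N_AGES - 1]
                  + rng.gauss(0.0, error_var ** 0.5)
                  for j in range(N_PERIODS)] for i in range(N_AGES)]
            data_sets.append({"model": model, "error_var": error_var,
                              "A": A, "P": P, "C": C, "Y": Y})
    return data_sets


data_sets = simulate()
n_points = sum(len(row) for d in data_sets for row in d["Y"])
print(len(data_sets), n_points)  # 240 3840
```

The counts match the design: 8 models x 3 error variances x 10 replications gives 240 data sets, and 16 cells per table gives 3840 data points.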

To operationalize PACE, eight different starting points are constructed for each data set. We choose as our solution the one that generates the minimum value of h specified in Equation (4). As for CMR, equality constraints have to be set up for each data set. According to Rentz and Reynolds (1991), prior knowledge and all available side information should be used to determine the constraints. Since there is no prior knowledge, we assign a constraint in the commonly accepted manner: the two effect levels with the most similar means are set to be equal.

Results. The mean square errors of the effect estimates (calculated as deviations from the true parameter values) are used to compare the estimation accuracy of PACE and CMR. The degrees of freedom used for calculating the mean square error of an effect are:

(number of categories of the effect - 1) x 240

Therefore, the degrees of freedom for both the time and age effects are (4-1) x 240, and the degrees of freedom for the cohort effects are (7-1) x 240. The results are summarized in Table 2. In all three cases, PACE greatly outperforms CMR. The results should not be surprising. CMR imposes arbitrary assumptions on the data; therefore it does not perform well. In consumer research, researchers and practitioners mostly deal with data which do not have known and well-established prior knowledge. Using CMR for cohort analysis may result in erroneous findings.
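The accuracy measure can be written as a short function. This is a sketch of the calculation as described, with hypothetical function and argument names: squared deviations of the estimates from the true values are pooled over all data sets and divided by (number of categories - 1) x (number of data sets).

```python
def effect_mse(true_effects, estimated_effects):
    """Pooled mean square error of one factor's effect estimates.

    Both arguments are lists with one entry per data set; each entry is the
    list of that factor's effect levels (e.g. 4 age effects or 7 cohort effects).
    """
    n_sets = len(true_effects)
    n_categories = len(true_effects[0])
    squared_error = sum((t - e) ** 2
                        for t_levels, e_levels in zip(true_effects, estimated_effects)
                        for t, e in zip(t_levels, e_levels))
    degrees_of_freedom = (n_categories - 1) * n_sets
    return squared_error / degrees_of_freedom


# Hypothetical check: every one of 4 age effects is off by 1 in all 240 data sets.
true_age = [[0.0, 0.0, 0.0, 0.0] for _ in range(240)]
est_age = [[1.0, 1.0, 1.0, 1.0] for _ in range(240)]
mse_age = effect_mse(true_age, est_age)  # 960 squared units over (4-1) x 240 df
```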

CONCLUSIONS

The ability of a company to make accurate business forecasts is essential for the long term viability of the company. Knowledge of the changes taking place in the micro- and macro-environment is an important input in forecasting. One of the most important current shifts in the macro-environment is the changing age distribution. The American population is aging. The 45-54 bracket is expected to grow 46% by the year 2000. By contrast, the 25-34 bracket is expected to have a 15.4% decline (MRI 1990). Product categories targeting an older audience will benefit as their universe of customers expands. However, the correlation between age and consumer behavior is complicated by the cohort phenomenon. The baby boomers, who will comprise the majority of the older population in the following decades, will carry some of their consumption patterns with them as they age. This is good news for marketers who have been targeting age groups that are shrinking. As for marketers who have been targeting the older audience, their traditional strategies may not be effective because the consumption behavior and response to marketing variables of the baby boom generation are very different from those of their parents. To identify opportunities in the growing elderly market, a knowledge of the cohort, age, and period effects on consumer behavior is crucial.

TABLE 2

MSE OF EFFECT ESTIMATES OBTAINED FROM PACE AND CMR

Estimating the effects is problematic because the effects cannot be unambiguously determined. There exists no unique solution for each individual effect. Cohort analysis is a group of methods designed to separate cohort, age, and period effects. This paper proposes a new approach, parsimonious cohort estimation (PACE), to the cohort estimation problem. A non-linear programming problem is formulated to operationalize the model. A simulation was performed to test the performance of PACE in comparison to the performance of CMR, the predominant existing approach in cohort analysis. The results suggest that PACE performs much better. Since PACE is an objective procedure (unlike CMR), has a clear theoretical basis (unlike CMR), and performed much better in a simulation test, we recommend the adoption of PACE (over CMR) as the standard method of conducting cohort analysis.

In conclusion, a new approach to cohort analysis is proposed in this paper. PACE has a theoretical advantage over the predominant existing approach and is demonstrated to be better empirically. The solutions obtained from PACE should aid researchers in understanding the cohort, age, and period effects on consumer behavior, and in formulating effective life-stage marketing oriented strategies.

REFERENCES

Adams, Wendy L., Philip J. Garry, Robert Rhyne, and James S. Goodwin (1990), "Alcohol Intake in the Healthy Elderly: Changes with Age in a Cross-Sectional and Longitudinal Study," Journal of the American Geriatrics Society, 38 (March), 211-216.

Akers, D.S. (1965), "Cohort Fertility Versus Parity Progression as Method of Projecting Births," Demography, 2 (1965), 414-428.

Clogg, Clifford C. (1982), "Cohort Analysis of Recent Trends in Labor Force Participation," Demography, 19 (November), 459-479.

Cohen, Morris R. and Ernest Nagel (1934), An Introduction to Logic and Scientific Method, New York, NY: Harcourt, Brace and Company.

Edgell Communications, Inc. (1991), Beverage Industry Annual Manual, Cleveland, OH: Magazines for Industry.

Frost, W.H. (1939), "The Age Selection of Mortality from Tuberculosis in Successive Decades," American Journal of Hygiene, 30 (1939), 91-96.

Glenn, Norval D. (1977), Cohort Analysis, Beverly Hills, CA: Sage.

Hempel, Carl Gustav (1966), Philosophy of Natural Science, Englewood Cliffs, NJ: Prentice-Hall.

Hout, Michael and David Knoke (1974), "Social and Demographic Factors in American Political Party Affiliations, 1952-1972," American Sociological Review, 39 (1974), 700-713.

Hout, Michael and David Knoke (1975), "Changes in Voting Turnout, 1952-1972," Public Opinion Quarterly, 39 (1975), 52-68.

Leventhal, Richard C. (1990), "The Aging Consumer: What's All the Fuss About Anyway?" Journal of Services Marketing, 4 (Summer), 39-44.

Mason, Karen Oppenheim, H.H. Winsborough, William M. Mason, and W. Kenneth Poole (1973), "Some Methodological Issues in Cohort Analysis of Archival Data," American Sociological Review, 38 (April), 242-258.

Mason, William and Stephen E. Fienberg (1985), Cohort Analysis in Social Research, New York, NY: Springer-Verlag New York Inc.

Mediamark Research Inc. (1990), Targeting Consumers at the Crossroads of Their Lives, New York, NY: Mediamark Research Inc.

Popper, Karl Raimund (1959), The Logic of Scientific Discovery, New York, NY: Basic Books.

Rentz, Joseph D., Fred D. Reynolds, and Roy G. Stout (1983), "Analyzing Changing Consumption Patterns with Cohort Analysis," Journal of Marketing Research, 20 (February), 12-20.

Rentz, Joseph D. and Fred D. Reynolds (1991), "Forecasting the Effects of an Aging Population on Product Consumption: An Age-Period-Cohort Framework," Journal of Marketing Research, 28 (August), 355-360.

Sacher, G.A. (1977), "Life Table Modification and Life Prolongation," in Handbook of the Biology of Aging, eds. C.E. Finch and L. Hayflick, New York, NY: Van Nostrand Reinhold, 582-683.

Sasaki, Masamichi and Tatsuzo Suzuki (1987), "Changes in Religious Commitment in the United States, Holland, and Japan," American Journal of Sociology, 92 (March), 1055-1076.

Schittkowski, K. (1986), "NLPQL: A FORTRAN Subroutine Solving Constrained Nonlinear Programming Problems," (edited by Clyde L. Monma), Annals of Operations Research, 5, 485-500.
