Composite Population Descriptors: the Socio-Economic/Life Cycle Grid

Richard B. Ellis, American Telephone & Telegraph Co.
[ to cite ]:
Richard B. Ellis (1975), "Composite Population Descriptors: the Socio-Economic/Life Cycle Grid", in NA - Advances in Consumer Research Volume 02, eds. Mary Jane Schlinger, Ann Arbor, MI : Association for Consumer Research, Pages: 481-494.

Advances in Consumer Research Volume 2, 1975      Pages 481-494

[This research was sponsored by the Market Research Section of the American Telephone & Telegraph Company. The author is indebted to Mr. G. Williams for substantial systems and data processing support in this project.]

[Richard B. Ellis is Marketing Manager - Reports of the Market Research Information System at AT&T.]

Greater sophistication in marketing data and analysis techniques has brought increased and more precise insight, but has also caused problems in maintaining consistency between studies and communicating results to management. Two composite population descriptors - Socio-Economic Status and Family Life Cycle - have been found valuable both as surrogates for some more complex factors and as stabilizing parameters in presentations and comparisons of study findings.

Both measures have been adopted and adapted to the specific needs of the communications market, and their validity and efficiency in this work are described. Combined in a matrix format, the "SES/LC Grid", they have proven useful in defining strata for survey probes, in correlating and extrapolating results from different studies, and in normalizing different populations for comparison purposes.

The past several years have seen an almost exponential increase in the diversity and sophistication of the data and analytic techniques available to the student of consumer behavior. The benefits of this inventive growth are apparent to anyone who has followed their exposition in the literature, but it has also produced a few impediments for the researcher whose goals include the development of a body of consistent analytic findings over time for management applications. Specifically, three problems have tended to accrue in this environment:

- While increasing sophistication has improved our insight and understanding, its complexity has also increased the problems of making these findings clear and credible to the client of research - the manager without professional analytic skills.

- In observing consumer behavior over time, it is obviously of benefit to be able to link and measure its changes, but this becomes quite difficult as the definitions and measurements of its dimensions shift and alter as new developments appear.

- Finally, when studying different manifestations of market effects on a geographic basis, there is a natural desire to anchor these in some way for comparative purposes, so that differences caused by location, product perception and sales effort may be identified and evaluated.

None of these difficulties preclude the use of advances in the art, but they do indicate to the prudent practitioner the need for some research yardstick against which can be measured the wealth of new data we are acquiring. The criteria for such a yardstick are practically self-evident. It should be:

- Simple. Easily comprehended and grasped.

- Logical. Possessing prima facie validity.

- Consistent. Durable enough to withstand both the winds of change and the fires of fad.

- Comparable. Capable of being linked not only to other internal information, but to external data as well.

The most satisfactory tool we have found to forestall these difficulties, given these criteria, is the use of two composite demographic measures in combination - Socio-Economic Status and Family Life Cycle measurement.

These measures have been so familiar to research practitioners for a number of years that they will bear only the briefest review here. However, a few words in defense of these sometimes maligned measures may be in order. It is undoubtedly true that the actual coalescence of vague intentions into an act of purchase is an exceedingly complex process, fraught with stimuli and responses pyramided on values and attitudes nurtured deep in the consumer's bosom. But it is also true that many of these relatively ephemeral factors are at least partially mirrored and defined in overt descriptive facts about the buyer, and that empirically strong correlations can be found between buyers' characteristics and their consumption patterns.

At a time when research tends to center on the shifting values of social class and the consumption patterns of the post-nuclear family, this may seem a strange position. But the fact remains that our experience in research done on the residence telephone market strongly supports this contention. The composite variables themselves have been used extensively as inputs in a variety of marketing problems and analytic methodologies. The consistency and strength with which they emerge as significant descriptive and differentiating factors has confirmed their validity for this type of work in our market.

As one example, in a study using factor analysis, designed to identify the significant elements differentiating customer groups in terms of geography and purchase behavior, only four significant factors were found. Two of these primary descriptors were Socio-Economic Status and Family Life Cycle which, when coupled with family-housing and mobility factors, accounted for 88% of the variability found among 86 population groups tested in the country.

It is also quite relevant that in several analyses, one or more of the components of these composite variables has been found to be significant on its own, and has entered the calculations in addition to the combined values. This has tended to verify the basic hypothesis that the combination of the individual components in the composite form apparently describes customer attributes different from those delineated by the individual variables alone.

Based on this experience, we believe that there is substantial evidence that these descriptors reflect certain relatively intangible life style and social factors which affect buyer behavior. Naturally, they should not be understood to be such factors themselves, nor are they directly substitutable for them. However, they have been found to correlate highly with these more complex measures, and they are considerably easier and less expensive to determine, analyze and understand. In addition, they have the advantage of being factual descriptors of our customers which can be used to estimate total population characteristics.

One final observation on these problems is germane, but more specific to the communications product than some others. Both of these descriptors obviously tend to measure the joint characteristics of families or households, in most cases these being synonymous. They have proven to be unusually efficient in our work since, unlike a number of other consumer products, the household, rather than a particular member, is the actual "consumer" of our services.

Socio-Economic Status (SES) is a numerical expression, developed by the U.S. Bureau of the Census, of the combined effects of three characteristics of the household and its nominal representative, the head of household. The three factors are income, education and occupational status. Each of these characteristics has been fitted to a numerical scale running 0-100, based on the household population of the country. The SES score is computed by determining the numerical value for each individual attribute and then averaging these three figures. For those interested in a more detailed description of the history and development of these factors, the Bureau of the Census has documented its methodology and findings quite thoroughly.
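The averaging step described above can be sketched in a few lines. This is a minimal illustration, not the Bureau of the Census procedure: the function name `ses_score` and the component values in the example are hypothetical, and the mapping from raw income, education and occupation to the 0-100 scales (Tables 1 to 3) is assumed to have been done already.

```python
def ses_score(income_score: float, education_score: float,
              occupation_score: float) -> float:
    """Return the composite SES score: the mean of three 0-100 component scores."""
    components = (income_score, education_score, occupation_score)
    for value in components:
        # Each component is defined on a 0-100 scale; reject anything else.
        if not 0 <= value <= 100:
            raise ValueError(f"component score out of range 0-100: {value}")
    return sum(components) / len(components)


# Hypothetical component scores, for illustration only (not taken from Tables 1-3).
print(ses_score(income_score=60, education_score=75, occupation_score=45))  # 60.0
```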

FIGURE 1

TYPICAL DISTRIBUTIONS OF SES VARIABLES VS. CONSUMPTION

In the original formulation, both the individual scores and the composite SES scores were selected so that the total U.S. population fell approximately on a normal distribution curve. When applied to market analysis of consumption patterns, of course, this assumption of normality does not hold strictly true. In the case of purchase behavior toward discretionary telephone products and services, the curve is truncated in the lower portion by an economic and cultural threshold below which customers do not perceive anything other than basic telephone service to be of value. It is also flattened in the upper portion where a simple extrapolation of ability and proclivity to purchase a product far outstrip the capacity to consume it. However, in the middle range of the distribution where the mass of the market lies, the basic assumptions have been found to pertain, and clear correlations between customer characteristics and buying behavior exist.

These concepts are illustrated in Figure 1, which generally reflects in prototype form the distribution of the SES component variables and the composite values as well. The assumed normal distribution in cumulative form is shown as a solid line, and a typical consumption pattern for communications products as a dotted line.

Since each component of the SES score has a discrete and limited number of possible values in the real world, the selection of the intervals for each individual variable and the composite score was done by a simple empirical fitting process. That is, purchase patterns for each product and service were determined for every possible value of each variable, and the break points established where the significant differences in behavior could be found. While this technique might seem peculiarly simple and naive, it proved quite effective, and a high degree of consistency was found among the individual variables and the composite scores related to the specific purchase patterns being studied.

Tables 1 and 2 show the original Census values assigned to family income and head of household education, and the specific scores selected for use in telephone market work. The differences are due to two basic reasons:

- The Bureau of the Census scores were computed and validated based on the population enumeration of 1960. Since that time, both average educational attainment and income have risen substantially due to various social and economic factors. This has tended to make the very fine gradations in the lower reaches of these tables less meaningful in terms of differentiating behavioral categories.

- A number of studies of the Bell System residence market have shown that there are certain relatively broad categories of income and education classification within which no significant differences among customers can be found.

For example, from a telephone usage viewpoint it is impossible to differentiate between households with incomes of $6,000 and $7,000, or household heads who left college after one year or two years. Therefore, no valid purpose is served by maintaining the more complex scoring system.

TABLE 1

SES INCOME SCORES

TABLE 2

SES EDUCATION SCORES

TABLE 3

SES OCCUPATION SCORES

In the case of occupational codes, the Bureau of the Census offers a very detailed set of scores reflecting close to 450 different occupations, plus a set of seven summary categories customarily used in most census reports. Similar to the cases of income and education, the very detailed classifications were found not to be particularly meaningful in differentiating telephone customers, so the summary categories were used, supplemented by some additional classifications to take into account "unemployed" but consuming households. These are shown in Table 3.

As a result of these analyses, two categorizations of SES were defined for use in our research projects, as shown in Table 4, together with the original Census classifications for comparison.

TABLE 4

SOCIO-ECONOMIC STATUS CATEGORIES

The ten interval "detail" category is designed to be used as an independent variable in a variety of technical methodologies, most often in conjunction with the specific variables which make up the composite SES score. It should be noted again that the classification intervals are not distributed evenly over the assumed normal curve, but rather are defined in specific terms of communication market patterns, generally concentrated in the middle ranges where the market for these services exists.

The four "summary" categories are derived from these classifications, and are used as interpretive criteria in presentations, sample stratification and population normalization, as described later in this paper.

The definitions and classifications of Family Life Cycle (FLC) status are less clear in terms of number of segments and significant characteristics. The demographer's viewpoint tends to center on key factors affecting population size and make-up. Probably the best-known expositions of this facet are the articles by Glick and Parke, while Norton offers an updated view of the significant shifts in family life patterns. These have overall value in forecasting gross markets or economies, but they tend to be less useful to the market analyst faced with the problems of specific products in a relatively short-term environment.

Certain marketing applications are patently self-evident - more babies will eat more baby foods, marriages will produce housing starts, and an increasing older population will create demand for geriatric products. But for the vast majority of consumer products, the relationships tend to be more complex. General studies of FLC status related to broad patterns of income and consumption have produced some consistency on the significant components, age and make-up of family being the two most common. However, no consensus on the number and definition of the prototype groupings has emerged.

Wells and Gubar offer a comprehensive summary of these problems and describe the findings of various life cycle studies in the decade preceding 1966. They also propose a prototype classification, without specific values, which served as the basis of our work.

Faced with this uncertainty and definitional vagueness, the analyst is constrained to explore the subject anew, once again with specific products, services and consumption patterns in mind. Such a study was conducted for all the discretionary purchase activity in the residence telephone market utilizing all the variables suggested in the literature as potential components of an FLC measure. These included number, relationship, age and sex of family members, and an employment variable for older families as suggested by the 1964 Michigan Survey of Consumer Finances.

Sonquist and Morgan's Automatic Interaction Detector technique was selected for these analyses because of the variety and intercorrelation of the variables. The results were analyzed to produce a set of classifications significant both for the individual products and services, and the total patterns of expenditure.

TABLE 5

FAMILY LIFE CYCLE STATUS CATEGORIES

Table 5 shows the general prototype classifications and the specific categories, with their definitions developed for telephone market analysis. The intended applications of the "detail" and "summary" classifications are identical to those defined for the SES categories.

Several interesting points emerged in the analysis which developed these classifications.

- The head of household age which discriminated most successfully between "younger" and "older" households came out to be 55. This is quite different from the traditional 65 year retirement, and breaks at the 40-45 year level developed in some previous studies, most notably the 1957 Life Study and the 1964 Michigan Survey already cited.

- The significant age which discriminated between "young" and "older" children was also different from figures generally used in previous studies. The breaking point for our purposes came at 12 years, rather than at 6 as suggested by a number of other analyses. The 1965 Life Study did propose 12 as a discriminating age, but only in a secondary sense. This older age is undoubtedly due to the consumption patterns of telephone services as contrasted to a number of other consumer goods.

- In the case of "older" households (head of household over 55) the size and makeup of the household became irrelevant, and the employment status was the sole discriminating variable found.
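The three break points above can be read as a small decision rule. The sketch below is an illustration only: the group labels are hypothetical and are not the Table 5 category names, and two boundary choices are assumptions (a head aged exactly 55 is treated as "younger", and a household counts as having "young" children if its youngest child is under 12).

```python
from typing import Sequence

OLDER_HEAD_AGE = 55   # heads over 55 are treated as "older" households
OLDER_CHILD_AGE = 12  # assumed: children under 12 count as "young"


def flc_group(head_age: int, child_ages: Sequence[int],
              head_employed: bool) -> str:
    """Assign a household to an illustrative life-cycle group (labels hypothetical)."""
    if head_age > OLDER_HEAD_AGE:
        # For older households only employment status discriminates;
        # household size and make-up are ignored.
        return "older, head employed" if head_employed else "older, head not employed"
    if not child_ages:
        return "younger, no children"
    if min(child_ages) < OLDER_CHILD_AGE:
        return "younger, young children"
    return "younger, older children only"


print(flc_group(head_age=40, child_ages=[5, 14], head_employed=True))
```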

The final synthesis of these composite descriptors for application purposes takes the form of the simplest analytic tool of all, the matrix. The format of this "Socio-Economic/Life Cycle Grid" is shown in Figure 2. As a matter of interest, the percent of U.S. households served by the Bell System, as estimated from our sample, is shown for each cell.

Several features should be noted in this arrangement:

- The four SES categories appear in the columns of the grid, and the four FLC categories as rows. Each cell, therefore, represents a specific and discrete class of people completely defined by the two factors. This cross classification ensures that the significant differences between the cell members - within cell variability - are minimized.

- For each SES and FLC category, the composite value of the characteristic displayed is aggregated for the entire class of customers in the marginal boxes, and the total value for the entire group is shown in the lower right hand corner.

- Certain key composite classes of customers are computed in the boxes at the bottom of the page: "Middle Class" (Lower Middle plus Upper Middle SES), "Family Households" (Younger Households with any children), "Younger Households", and "Middle Class Family Households" (the center four cells in the matrix).
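The grid layout described in these points can be sketched as a small cross-tabulation. Several things here are assumptions: the outer SES labels ("Lower", "Upper"), all four FLC row labels, the function names, and the sample households are hypothetical. The characteristic displayed is taken to be the percent of households in each cell, which can be aggregated by simple summation; a characteristic such as an average would need weighting instead.

```python
from collections import Counter

# Column and row labels: "Lower Middle" and "Upper Middle" come from the text;
# the remaining labels are assumed for illustration.
SES_COLUMNS = ["Lower", "Lower Middle", "Upper Middle", "Upper"]
FLC_ROWS = [
    "Younger, no children",
    "Younger, young children",
    "Younger, older children",
    "Older",
]


def build_grid(households):
    """households: iterable of (flc_row, ses_column) pairs -> percent of total per cell."""
    counts = Counter(households)
    unknown = [c for c in counts if c[0] not in FLC_ROWS or c[1] not in SES_COLUMNS]
    if unknown:
        raise ValueError(f"unrecognized cells: {unknown}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no households supplied")
    return {(r, c): 100.0 * counts[(r, c)] / total
            for r in FLC_ROWS for c in SES_COLUMNS}


def row_marginals(grid):
    """Aggregate each FLC row across all SES columns."""
    return {r: sum(grid[(r, c)] for c in SES_COLUMNS) for r in FLC_ROWS}


def column_marginals(grid):
    """Aggregate each SES column across all FLC rows."""
    return {c: sum(grid[(r, c)] for r in FLC_ROWS) for c in SES_COLUMNS}


def composite_classes(grid):
    """Composite customer classes named in the text, built from grid cells."""
    middle = ("Lower Middle", "Upper Middle")
    family = FLC_ROWS[1:3]   # younger households with any children
    younger = FLC_ROWS[:3]
    return {
        "Middle Class": sum(grid[(r, c)] for r in FLC_ROWS for c in middle),
        "Family Households": sum(grid[(r, c)] for r in family for c in SES_COLUMNS),
        "Younger Households": sum(grid[(r, c)] for r in younger for c in SES_COLUMNS),
        "Middle Class Family Households": sum(grid[(r, c)] for r in family for c in middle),
    }


# Hypothetical sample of four households.
sample = (
    [("Younger, young children", "Lower Middle")] * 2
    + [("Older", "Upper")]
    + [("Younger, no children", "Lower")]
)
grid = build_grid(sample)
print(composite_classes(grid))
```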

In research applications, the SES/LC Grid is used in different ways depending on the complexity of the characteristics being studied, and the size of the sample being distributed through the 16 cells. For large sample populations, individual cell analysis of average characteristics may be valid for the aggregate or composite totals. When smaller populations are involved, the aggregate cells may be used for quantitative analysis, and judgements made on individual cell behavior dependent on the consistency of the data and the stability of relationships between the individual cells and their corresponding aggregates.

In practice, the grid has been used in three major applications, all somewhat interrelated - stratified sample selection, inter-project study correlation, and population normalization. The first application is relatively simple, but is quite valuable in terms of time and cost savings for certain types of surveys. Typically, the situation involves preliminary probes on new concepts, products, services, pricing plans, or the like. The researcher is interested in refining and quantifying his hypotheses on a regional or national basis, but the selection of a probability or scientific sample would involve very large numbers of respondents, high cost and a longer time to complete and analyze the survey. This condition is substantially aggravated when visual materials or product demonstration are required to elicit a suitable respondent reaction, and in-home interview costs rise rapidly.

FIGURE 2

The normal answer to this dilemma is a stratified sample allowing for smaller numbers of respondents and more limited interviewing areas. However, lacking more definitive data on the segments of the new market configuration being studied, the strata are difficult to select and the size of the sample problematical. We have had considerable success in establishing the distribution of the study population in terms of the SES/LC Grid and using the 16 cells as the de facto strata. This allows much greater freedom in selection of interview techniques and locations; for example, a small number of fixed walk-in facilities strategically located in various regions of the country. A few basic demographic questions added to the study-specific material categorize the respondents with regard to the cells in the grid. Interviews are continued until adequate representation is obtained in the cells, and the results analyzed. For quantified estimates of total population response, these results can be weighted by the known cell sizes and expanded mathematically.

While obviously not as accurate and precise as a carefully selected representative sample, it has proven quite valuable and less expensive in studies during the early stages of product or concept development.
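The weighting step mentioned above, in which cell results are weighted by known cell sizes, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the cell keys and all figures are hypothetical, and the response in each cell is summarized by a single mean (for example, the share of respondents expressing interest).

```python
def weighted_estimate(cell_means, cell_sizes):
    """Estimate a total-population response from per-cell survey results.

    cell_means: cell -> mean response among respondents interviewed in that cell
    cell_sizes: cell -> known size (or share) of that cell in the population
    """
    missing = [cell for cell in cell_sizes if cell not in cell_means]
    if missing:
        raise ValueError(f"no survey result for cells: {missing}")
    total = sum(cell_sizes.values())
    if total <= 0:
        raise ValueError("cell sizes must sum to a positive number")
    # Each cell contributes in proportion to its known size in the population,
    # not in proportion to the number of interviews completed in it.
    return sum(cell_means[cell] * size for cell, size in cell_sizes.items()) / total


# Hypothetical two-cell example: interest rates of 50% and 25%,
# in cells holding 25% and 75% of households respectively.
means = {"cell A": 0.5, "cell B": 0.25}
sizes = {"cell A": 25, "cell B": 75}
print(weighted_estimate(means, sizes))  # 0.3125
```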

A corollary of this technique is the ability to compare the results of studies conducted for different purposes, under different conditions, at different times and places. This is of particular concern in the communications market where a high degree of interactivity has been found between customer perceptions and purchases of different products and services. This same relationship appears to carry forward, as expected, into new products as well. The inclusion on each survey vehicle of the few responses necessary to classify the respondents allows certain basic comparisons to be made between various studies in terms of the estimated customer groups and their characteristics.

For example, in one set of preliminary studies on three new products, it was found during post-survey comparisons that the supposed markets for the different products overlapped by almost two-thirds. The desirability and cost of a composite of the three products should obviously be determined before proceeding with final design specifications.

Normalization of data is a standard and frequently used statistical technique. However, when the data being normalized are people, not only do methodological problems crop up, but a faint odor of manipulation may tend to taint the findings in the manager's mind. The objective is normally quite clear - the equalization of external factors so that a valid judgement may be made of the significant differences between two or more populations with regard to an identified characteristic or activity. The difficulty encountered with consumer groups involves the large number of exogenous factors which may pertain, and the complexity of trying to account for all of them.

The SES/LC Grid offers a simple visual method of accomplishing this, with the further advantage of the client's being able to see precisely what is being done and what the effects are. In actual practice, two techniques are used. In the case of large sample populations (over 600), cell-by-cell analysis is generally possible and will yield a relatively precise picture of the legitimate differences between the two populations. Lacking this number of cases to study, the simplest approach is to proportionalize one of the populations to the structure of the other, and conduct the analysis on the resulting marginals. This is normally most easily accomplished by recomputing the marginal values for one population using the actual cases by cell of the other, thus eliminating the distortion caused by the disparities in distribution between the two.
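The second technique, recomputing one population's marginal values on the cell structure of the other, can be sketched as below. The function names, cell keys and figures are hypothetical, and the characteristic is assumed to be a per-cell rate; the optional `cells` argument restricts the calculation to one marginal (a single row or column of the grid).

```python
def crude_value(rates, counts, cells=None):
    """Marginal value of a population computed on its own cell distribution."""
    cells = list(counts) if cells is None else list(cells)
    total = sum(counts[c] for c in cells)
    return sum(rates[c] * counts[c] for c in cells) / total


def standardized_value(rates_a, counts_b, cells=None):
    """Marginal value of population A recomputed on population B's cases by cell."""
    # Same arithmetic as crude_value, but the weights come from the other population.
    return crude_value(rates_a, counts_b, cells)


# Hypothetical figures: A is concentrated in the high-rate cell, B in the low-rate cell.
rates_a = {"cell 1": 0.5, "cell 2": 0.25}
counts_a = {"cell 1": 80, "cell 2": 20}
counts_b = {"cell 1": 20, "cell 2": 80}

print(crude_value(rates_a, counts_a))         # 0.45
print(standardized_value(rates_a, counts_b))  # 0.3
```

In this made-up case the gap between 0.45 and 0.3 comes entirely from the difference in cell distribution; what remains after standardization can then be compared with population B's own marginal value.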

As a byproduct of such comparisons, a general estimate of market potential is also practicable. Given two market groups broken into the SES/LC cells, a comparison of market development on a cell by cell basis will give clear indications of those population segments where purchase is distinctly less in one of the markets. The difference between the two will normally yield a rough estimate of the unrealized potential in the underdeveloped market.
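A rough version of this cell-by-cell comparison is sketched below. The paper states only that the difference between the two markets yields an estimate of unrealized potential; expressing market development as a penetration rate, multiplying the gap by the number of households in the underdeveloped market, and ignoring cells where that market is already ahead are all assumptions made for illustration, as are the names and figures.

```python
def unrealized_potential(penetration_developed, penetration_under, households_under):
    """Per-cell estimate of additional purchasing households in the weaker market.

    penetration_developed: cell -> purchase rate in the better-developed market
    penetration_under:     cell -> purchase rate in the underdeveloped market
    households_under:      cell -> number of households in the underdeveloped market
    """
    potential = {}
    for cell, households in households_under.items():
        gap = penetration_developed[cell] - penetration_under[cell]
        # Assumed: cells where the "underdeveloped" market already leads
        # contribute no potential rather than a negative amount.
        potential[cell] = max(gap, 0.0) * households
    return potential


# Hypothetical two-cell example.
developed = {"cell 1": 0.5, "cell 2": 0.25}
under = {"cell 1": 0.25, "cell 2": 0.5}
households = {"cell 1": 1000, "cell 2": 400}

by_cell = unrealized_potential(developed, under, households)
print(by_cell)                # {'cell 1': 250.0, 'cell 2': 0.0}
print(sum(by_cell.values()))  # 250.0
```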

The use of life style and social class data is generally accepted in consumer research today, and they offer unusual and valuable insights into the market and consumer behavior. However, it is also difficult and expensive to gather valid data and apply them to the total population. Use of Socio-Economic Status and Family Life Cycle factors, individually or in combination, offers a much less expensive and more stable alternative for many purposes. As a supplement to simple demographic characteristics, composite or index variables hold great potential for understanding and predicting buying behavior, extending our view without shortening our communications lines to our clients.

REFERENCES

Barton, S.G. The life cycle and buying patterns. In Lincoln H. Clark (Ed.), Consumer behavior. Volume II. New York: New York University Press, 1955.

Glick, P.C. The life cycle of the family. Marriage and Family Living, February, 1955.

Glick, P.C. & Parke, R., Jr. New approaches in studying the life cycle of the family. Demography, 1965, 2.

Lansing, J.B. & Kish, L. Family life cycle as an independent variable. American Sociological Review, October, 1957.

Lansing, J.B. & Morgan, J.N. Consumer finances over the life cycle. In Lincoln H. Clark (Ed.), Consumer behavior. Volume II. New York: New York University Press, 1955.

Norton, A.J. The family life cycle updated: Components and uses. Paper presented at the annual meeting of the Population Association of America, New Orleans, La., April, 1973.

Sonquist, J.A. & Morgan, J.N. The detection of interaction effects. University of Michigan, Institute for Social Research, Survey Research Center. Monograph No. 35, 1964.

U.S. Bureau of the Census. Methodology and scores of socioeconomic status. Working Paper No. 15, 1963.

U.S. Bureau of the Census. Socioeconomic characteristics of the population: 1960. Current Population Reports. Series P-23. No. 12. 1964.

Wells, W.D. & Gubar, G. Life cycle concept in marketing research. Journal of Marketing Research, 1966, 3.
