Graphic and Verbal Presentation of Stimuli: A Probabilistic MDS Analysis

David B. MacKay, Indiana University
Mark Ellis, Indiana University
Joseph L. Zinnes, University of Illinois
ABSTRACT - A probabilistic unfolding model is used to test a variety of hypotheses about the parameters underlying consumers' evaluations of graphic and verbal stimuli. Specific hypotheses concern the dimensionality of the space, the uncertainty with which stimuli are perceived by consumers, and the similarity of configurations derived from graphic and from verbal stimuli. Differences between configurations derived from graphic and verbal stimuli are explored.
David B. MacKay, Mark Ellis, and Joseph L. Zinnes (1986), "Graphic and Verbal Presentation of Stimuli: A Probabilistic MDS Analysis," in NA - Advances in Consumer Research Volume 13, ed. Richard J. Lutz, Provo, UT: Association for Consumer Research, Pages: 529-533.

Advances in Consumer Research Volume 13, 1986      Pages 529-533

INTRODUCTION

Applications of multidimensional scaling (MDS) implicitly or explicitly involve the assessment of a variety of hypotheses. Hypotheses concerning the dimensionality of the space that underlies the stimuli are perhaps most common. Other hypotheses might involve the equality of two or more configurations, order relations among stimuli on particular dimensions, and the nature of the distance function. Confirmation of such hypotheses requires the user to go beyond the calculation of a general loss function, such as Kruskal's stress, and to assess the "influence and balance of systematic and random effects in the data" (Heiser and Meulman 1983).

Statistical and quasi-statistical methods can be used for confirming MDS hypotheses. In the latter category, for example, would be the efficacy coefficient proposed by Lingoes and Borg (1983) for testing hypotheses with their constrained/confirmatory monotone distance analysis (CMDA) procedure. CMDA (Borg and Lingoes 1980) is a nonmetric MDS method which allows the user to impose order constraints on the distances among points in the estimated configuration. To determine if a constrained CMDA solution is equivalent to an unconstrained solution, the efficacy coefficient - a partial correlation of the ordered distances in the constrained and unconstrained configurations with the order of the original data partialed out - is computed. Unfortunately, the absence of an error model denies the authors the use of inferential statistical theory and requires them to develop a heuristic method for determining which values of the efficacy coefficient lead to the rejection of a hypothesis and which values do not.

To test hypotheses using the efficacy coefficient, a two-stage decision model is used. In the first stage, the user compares the efficacy coefficient to the coefficient of alienation - the square root of one minus the squared correlation of the order of distances in the constrained MDS solution and the order of distances in the unconstrained MDS solution. The hypothesis of equivalence between the constrained and unconstrained MDS solutions is rejected if the efficacy coefficient is less than the coefficient of alienation. If the efficacy coefficient is more than three times the size of the coefficient of alienation, the hypothesis of equivalence is accepted. If the efficacy coefficient is between one and three times the size of the coefficient of alienation, the user proceeds to stage two. Stage two reduces the value of the criterion required to accept the hypothesis of equivalence by computing instead an average of subjectively scaled factors which include items such as the sample size, configuration size, size of the matrix, etc.
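The first stage of this decision rule can be sketched in a few lines. This is only an illustration of the thresholds described above, not the Lingoes and Borg procedure itself: the function names are hypothetical, the efficacy coefficient and the correlation r are taken as given inputs, and the handling of values that fall exactly on a threshold is an assumption, since the text does not address it.

```python
import math


def alienation(r: float) -> float:
    """Coefficient of alienation: sqrt(1 - r^2), where r is the correlation
    between the order of distances in the constrained and unconstrained
    solutions."""
    return math.sqrt(1.0 - r * r)


def stage_one(efficacy: float, r: float) -> str:
    """Stage-one decision on the hypothesis that the constrained and
    unconstrained solutions are equivalent.

    Assumption: values exactly equal to a threshold are sent to stage two.
    """
    k = alienation(r)
    if efficacy < k:
        return "reject"
    if efficacy > 3.0 * k:
        return "accept"
    return "stage two"
```

Stage two is not sketched here because it depends on subjectively scaled factors that the text only lists in outline.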

Probabilistic models provide a different approach to confirmatory MDS. Development of probabilistic models has been motivated by the desire for an error model that will allow the use of test statistics for a wide variety of hypotheses. A number of probabilistic MDS models have been proposed, viz. (DeSoete and Carroll 1983), (DeSoete, Carroll and DeSarbo 1985), (Ramsay 1977), (Takane 1981), (Zinnes and MacKay 1983). These models differ with respect to the types of judgments they handle, the measurement properties they assume, and the error models they posit. A review of several probabilistic models has been provided by Young (1984).

Instead of implicitly assuming a deterministic judgment process, as CMDA does, probabilistic models explicitly assume a probabilistic judgment process and make these assumptions intrinsic components of the model. A primary rationale for probabilistic models is the observed inconsistency of subjects' judgment processes. By assuming a probabilistic process, it is also often possible to make use of very powerful statistical procedures in the estimation of model parameters and the testing of a wide range of hypotheses. In addition, individual models may possess other attractive properties, such as the ability to account for Weber properties, asymmetric judgments, and nonisotropic spaces (Zinnes and MacKay 1981).

Probabilistic MDS models are, though, quite new and users are still faced with a number of open issues which are in need of further investigation. These issues include the sensitivity of the models to departures from their specific error assumptions and the effect of sequential hypothesis testing in accepting and rejecting hypotheses.

In this paper, a probabilistic model for consumer choice data is first defined. Then, an experiment is described which was designed to gather information on consumers' preferences for residential alternatives. Half of the subjects in the experiment received information on the stimuli in a graphic form and half of the subjects received information on the stimuli in a verbal (written) form. Finally, the probabilistic model is used to evaluate hypotheses concerning the dimensionality of the space, the homogeneity with which the stimuli are perceived by the subjects, and the equality of the configurations derived from graphic and verbal data.

A PROBABILISTIC MDS MODEL

PROSCAL (MacKay and Zinnes 1982) is a probabilistic multidimensional scaling program for incomplete, complete or replicated data. The data are distance judgments between pairs of stimulus objects. PROSCAL represents stimuli as points in a multidimensional space and provides maximum likelihood estimates (MLE) of each point's location. In addition, dispersion parameters are estimated for the stimuli.

Originally defined only for proximity data, PROSCAL has recently been expanded to accommodate preference data as well (MacKay and Zinnes 1985). Preferences are evaluated by a probabilistic unfolding model in which the location coordinates of the stimuli and subjects are assumed to be normally and independently distributed in an isotropic r-dimensional space. Standard deviations of the stimulus points are interpreted as measures of the heterogeneity of the subjects' evaluations of the stimuli. Standard deviations of the ideal points are interpreted as the amount of uncertainty in a subject's judgments.

Data for the PROSCAL unfolding model consist of preference ratio judgments. In collecting preference ratio judgments, subjects will typically be asked two questions for each pair of stimuli. First the subject will be asked to identify the stimulus of the pair which is more preferred and then to identify the degree to which that stimulus is preferred over the less preferred stimulus.

To use MLE methods, the density function (pdf) of the judgments must be computed. Given the probabilistic assumptions of the model, the squared standardized distance from an ideal point to a stimulus point can be shown to be distributed according to a noncentral chi-square distribution. If stimulus j is preferred to stimulus k by subject i, the preference ratio is represented by a ratio of distances d_ki/d_ji. Conversely, one may consider the ratio d_ji/d_ki as a measure of subject i's disutility for stimulus j relative to stimulus k. Since the ratio of two noncentral chi-square distributions is a doubly noncentral F distribution, F'' (Kendall and Stuart 1979), the pdf of the ratio q = d_ki/d_ji can be defined (MacKay and Zinnes 1985).
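The chi-square claim can be checked numerically with a small simulation. The sketch below is not PROSCAL code. It assumes that "standardized" means dividing the squared distance by the sum of the ideal-point and stimulus variances, and it compares the simulated mean with the known mean of a noncentral chi-square variate, r + lambda, where lambda is the squared distance between the two mean locations divided by the same variance sum. All names and the example coordinates are hypothetical.

```python
import random


def simulated_mean_sq_distance(mu_ideal, mu_stim, sd_ideal, sd_stim,
                               n_draws=20000, seed=0):
    """Mean of the standardized squared distance between a normally
    distributed ideal point and a normally distributed stimulus point
    in an isotropic space (same sd on every dimension)."""
    rng = random.Random(seed)
    var_sum = sd_ideal ** 2 + sd_stim ** 2
    total = 0.0
    for _ in range(n_draws):
        sq = 0.0
        for a, b in zip(mu_ideal, mu_stim):
            x = rng.gauss(a, sd_ideal)   # one coordinate of the ideal point
            y = rng.gauss(b, sd_stim)    # same coordinate of the stimulus
            sq += (x - y) ** 2
        total += sq / var_sum            # standardize by the summed variance
    return total / n_draws


def noncentral_chi2_mean(mu_ideal, mu_stim, sd_ideal, sd_stim):
    """Theoretical mean r + lambda of the noncentral chi-square variate."""
    r = len(mu_ideal)
    var_sum = sd_ideal ** 2 + sd_stim ** 2
    lam = sum((a - b) ** 2 for a, b in zip(mu_ideal, mu_stim)) / var_sum
    return r + lam
```

For a three-dimensional example with mean locations (0, 0, 0) and (1, 2, 2) and both standard deviations equal to 0.5, the theoretical mean is 3 + 9/0.5 = 21, and the simulated mean should fall close to that value.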

Given the pdf, PROSCAL then finds the MLE of the parameters through numerical estimation. Initial estimates of the configuration are provided by a deterministic metric unfolding of I scales (Coombs 1950) estimated from the preference ratio data. The I scale for subject i's evaluation of stimulus j is simply defined as the geometric mean of subject i's preference ratio judgments involving stimulus j. Initial estimates of the variances are defined as a function of the squared differences between the preference ratio data and the initial configuration's estimates of the preference ratios.
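The I-scale computation for one subject might look like the following sketch. It is not the PROSCAL implementation. It assumes the judgments are stored as a square table in which ratios[j][k] is the subject's judged disutility ratio for stimulus j relative to stimulus k (the d_ji/d_ki orientation), with ratios[k][j] = 1/ratios[j][k] and the diagonal ignored; the function name and data layout are hypothetical.

```python
import math


def i_scale(ratios):
    """Geometric mean, for each stimulus j, of the subject's ratio
    judgments involving j (all k != j).

    Assumption: ratios[j][k] > 0 is oriented as the disutility of j
    relative to k, so larger scale values mean less preferred stimuli.
    """
    n = len(ratios)
    scale = []
    for j in range(n):
        logs = [math.log(ratios[j][k]) for k in range(n) if k != j]
        scale.append(math.exp(sum(logs) / len(logs)))
    return scale
```

With perfectly consistent judgments generated from distances 1, 2 and 4, the resulting scale values preserve the order of the distances, which is what the initial metric unfolding needs as a starting point.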

AN EXPERIMENTAL EVALUATION OF GRAPHIC AND VERBAL PRESENTATIONS

In a prior study (MacKay and Zinnes 1985), probabilistic MDS was used to evaluate subjects' residential preferences for residential alternatives that were defined on the basis of two dimensions, environmental level and time to work. Results from this prior study were the source of most of the hypotheses that were tested in the study that is described here.

Data

At the beginning of the experiment, subjects were told that they would be evaluating a new system designed to help realtors provide out-of-town clients with a list of residences for their consideration. After being introduced to the experiment, subjects were given a series of warmup tasks to assist them in making the required types of judgments.

To custom tailor the residential preference ratio questions to the prior interests of a subject, a graphic computer interactive data collection program was written. All information was conveyed to the subject by means of two color CRTs. One CRT was used for displaying instructions and receiving subjects' responses. A second high resolution CRT was used for displaying the stimuli.

Each subject expressed preference ratios for all pairs of twelve stimuli. A balanced incomplete block design was used to define the twelve stimuli on the basis of three variables commonly cited in the residential preference literature: environmental quality (three levels), price-quality (four levels) and distance to work (four levels).

The opening scenario told subjects that they were moving to a new town and that they would be working in the central business district (CBD) of a medium sized American city. In the warmup phase of the experiment, subjects were asked whether they were interested in renting or purchasing a house or in renting or purchasing an apartment (condominium). Specific values of the variables defining the stimuli depended upon the type of residence preferred by the subject. If the subject chose to rent or purchase a house, the four levels of time-to-work were 10, 20, 30 and 40 minutes. If the subject chose to rent or purchase an apartment (condominium), the travel times were 5, 10, 15 and 20 minutes.

Environmental level was conceptualized as a compound variable consisting of two parts - population density and level of local services (schools, parks, local retail outlets, etc.). For subjects stating a preference for a house, the density levels were given as one, two and three thousand persons per square mile and the service levels were defined as low, medium and high. For subjects stating a preference for an apartment/condominium, density levels were stated to be three, four and five thousand persons per square mile and the service levels were again defined as low, medium and high. Density levels were described for the community in which the subjects resided to provide a benchmark.

The four price levels, utility costs plus mortgage payments or rent, were the same for all stimuli - $300, $450, $600 and $750 per month. Subjects were told that their desired level of quality should also be considered when evaluating the price level of a residence.

While the experiment constructed residential stimuli from only three variables, it was obvious from pretests and the literature that more than three variables were involved in actual residential decision making. To make the task more realistic, subjects were also asked to specify the number of bedrooms they required. All of the residences subjects evaluated were said to be drawn from a list of available properties that met their size requirements.

Thirty-eight subjects took part in the experiment. All of the subjects were graduate students in an MBA program. MBA students were selected because they were about to go into the housing market at salary levels which were high enough for many to contemplate purchasing a house or condominium. In addition, the scenario of working in a CBD and having to communicate with a realtor in a distant town was a scenario which many of the students would soon, if not already, experience.

In the main phase of the experiment, subjects were shown pairs of residential alternatives and asked to first indicate which alternative they preferred and then indicate the degree to which they preferred that stimulus over the other. Half of the subjects had the stimuli displayed graphically on a map of the city. The city was divided into different zones which contained different environmental levels. Time to work was proportional to the distance of the residence to the CBD. Price levels for the two residences in each pair were indicated in the legend. Subjects touched a light pen to the residence on the map they preferred more for the first part of the preference ratio judgment.

The other half of the subjects were presented with a verbal (written) description of the residences. For each judgment, two residences, A and B, were defined by the values of their three descriptive variables. Subjects indicated their preference by touching a small box underneath the description of the preferred residence with a light pen.

After designating their most preferred residential alternative, subjects indicated their degree of preference for the preferred alternative by touching the light pen to any position on a seven inch bar at the bottom of the screen. The bar was labeled on the left with a one and on the right with a five. The label one meant that they preferred the two residences equally and the five meant that they preferred one residence five or more times as much as the other residence. Each subject made these judgments for all sixty-six pairs of the twelve stimuli. Response times for all judgments were recorded.

To minimize anchor point problems, subjects rated extreme pairs of stimuli (very similar and very dissimilar residences) in the warmup phase of the experiment. Temporal bias was controlled by randomly ordering the pairs of stimuli and by balancing the order of the stimuli within each pair so that each stimulus appeared as the first stimulus to the subject approximately half the time. At the close of the session, subjects were asked to evaluate the realism of the experiment.
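The ordering controls described above can be sketched as follows. The paper does not say how the within-pair balancing was implemented, so the greedy rule used here (show first whichever stimulus has so far appeared first less often, breaking ties at random) is an assumption, as are the function name and the seed parameter.

```python
import itertools
import random


def build_trials(n_stimuli=12, seed=0):
    """Return a randomly ordered list of (first, second) stimulus pairs
    covering every unordered pair once, plus the number of times each
    stimulus was presented first."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n_stimuli), 2))
    rng.shuffle(pairs)                     # random order of the pairs
    first_counts = [0] * n_stimuli
    trials = []
    for a, b in pairs:
        # Assumed greedy balancing: swap so that the stimulus shown first
        # is the one that has been first less often; ties broken randomly.
        tie = first_counts[a] == first_counts[b]
        if first_counts[a] > first_counts[b] or (tie and rng.random() < 0.5):
            a, b = b, a
        first_counts[a] += 1
        trials.append((a, b))
    return trials, first_counts
```

With twelve stimuli this yields the sixty-six pairs mentioned in the text, and each stimulus, which takes part in eleven pairs, should appear first roughly half the time.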

ANALYSIS AND RESULTS

The experiment that has been briefly described above and the one that preceded it were designed to investigate both the nature of subjects' residential preferences and the impact of pictorial and verbal cue presentations. A large number of analyses were done using standard metric and nonmetric MDS procedures as well as PROSCAL for a variety of segments. Comparisons of multidimensional scaling analyses were also made to analyses using other multivariate methods, such as logit analysis.

A rather small subset of the analyses undertaken in the study are reported in this paper. Those that are used are chosen for their ability to illustrate the confirmatory use of probabilistic multidimensional scaling.

Tests of Dimensionality

With MLE programs, such as PROSCAL, tests of dimensionality are done quite easily. To compare an r and an r+k dimensional solution, for example, solutions are derived in both r and r+k dimensions. Log likelihoods of the two solutions are then compared in a likelihood ratio test. Since the quantity

Q = 2(Lg - Ls), (1)

where:

Lg = the log likelihood of the general r+k dimensional solution

Ls = the log likelihood of the specific r dimensional solution

is asymptotically chi-squared distributed with the degrees of freedom equal to the difference in the number of free parameters estimated by the two models, a statistical test of the null hypothesis of an r dimensional solution is available. Thus, unlike most hypothesis tests in CMDA and confirmatory covariance structure analysis, the specific model is assumed to be true unless the data are strong enough to confirm that the general model is worth the cost of the added degrees of freedom it uses. In CMDA and confirmatory covariance structure analysis, the research interest is usually not in providing evidence for a more general model, but in showing that the simple constrained model with fewer free parameters is as good as the general model.
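A minimal sketch of this likelihood ratio test, using only the Python standard library, is given below. The chi-square tail probability is computed with the standard recurrence for integer degrees of freedom. The function names are hypothetical, the log likelihoods and degrees of freedom are supplied by the user, and nothing here reproduces PROSCAL output.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper tail probability P(X > x) for a chi-square variate with
    integer df, via Q(a + 1, y) = Q(a, y) + y^a e^(-y) / Gamma(a + 1)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y)
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(loglik_general, loglik_specific, df, alpha=0.01):
    """Compute Q = 2(Lg - Ls), its asymptotic p-value, and whether the
    specific (null) model is rejected at level alpha."""
    q = 2.0 * (loglik_general - loglik_specific)
    p = chi2_sf(q, df)
    return q, p, p < alpha
```

As a purely made-up illustration, log likelihoods of -100 for the general model and -110 for the specific model with three additional free parameters give Q = 20, which would lead to rejecting the specific model at the 0.01 level.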

In this study, the data of the nineteen subjects who said they preferred apartments/condominiums were evaluated in two, three and four dimensions. The resulting values of Q from equation (1) are given in Table 1.

TABLE 1

LOG LIKELIHOOD DIFFERENCES FOR TESTS OF DIMENSIONALITY

From Table 1, it is clear that there is ample evidence at the 0.01 level that the data are strong enough to reject a two dimensional model for a three dimensional model. However, there is almost no evidence for a four dimensional model and the three dimensional model is thus confirmed. Since the data were defined as three dimensional, one would not expect any more than a three dimensional solution, though a solution of lower dimensionality might be appropriate if subjects were combining dimensions in their evaluations of the stimuli.

Tests of Homogeneity

PROSCAL assumes that each stimulus and each ideal point has a variance that is the same on all dimensions but which may differ from point to point. (This restrictive assumption of an isotropic space that requires equal variances on all dimensions and independence of observations has been relaxed in the most recent version of PROSCAL. This generalization, which requires going to a distribution other than the doubly noncentral F distribution described earlier, was not available when the analysis for this study was conducted.) In the preceding tests of dimensionality, it was assumed (though not stated) that the variances of all the stimuli were the same. Thus, only one variance was estimated. Other conditions could, of course, have been assumed.

The PROSCAL model allows great flexibility in modeling the variances of the stimuli and ideal points. Three general submodels for allocating variances exist; these are referred to as set, partition and distance submodels. More complex models can also be estimated (MacKay and Zinnes 1985).

In this study, the use of the set model for allocating variances is illustrated. The set submodel allocates variances to individual stimuli or ideal points. Each point may have its own unique variance estimated or the analyst may have a variance estimated for a set of points.

The set submodel was used here in two different ways. The first way was by appropriating a finding from the earlier study (MacKay and Zinnes 1985) on two dimensional residential stimuli that subjects attributed significantly different variances to stimuli according to their level of time from the CBD. This finding was evaluated in the present study by estimating five variances instead of one for the subjects choosing apartments/condominiums. Four unique variances were estimated for the stimuli, three stimuli in each set, and one variance was estimated for the ideal points. A likelihood ratio test then compared the log likelihoods of the five and one variance models.

The second application was to see if the variance for subjects exposed to graphic stimuli differed from the variance for subjects exposed to verbal stimuli. This was accomplished by simply dividing the ideal points into two sets and estimating one additional variance. A likelihood ratio test was again used to compare the likelihoods of the general six and specific five variance models.

Results of the likelihood ratio tests are given in Table 2. It is obvious that the increase in the number of variances from one to five was highly significant but that the increase from five to six was not. In the test of one versus five variances, it will be noticed that only three degrees of freedom are used instead of four. This is because only four of the five estimated variances are free parameters. The uniqueness properties of the unfolding model used by PROSCAL are such that ideal point variances cannot be uniquely distinguished from stimulus variances. The program accommodates this problem by scaling the minimum variances for stimuli and ideal points equally.

TABLE 2

LOG LIKELIHOOD DIFFERENCES FOR TESTS OF HOMOGENEITY

These results differ from those obtained in the earlier study of two dimensional stimuli where it was shown that the variance for graphic stimulus displays was significantly higher than that of verbal stimulus displays. A tentative explanation for this finding is that as the complexity of the stimuli increases, the relative benefits of the graphic display also increase, thus lowering the relative perceived heterogeneity of the stimuli. Estimated variances for the graphic and verbal stimuli were 0.42 and 0.41 respectively.

Tests of Configural Similarity

An analyst will often want to determine if two or more stimuli or ideal points could indeed have the same estimated location. In product evaluation, for example, one would like to know if two or more brands are perceived to be the same. In this study, we were interested in determining if graphically displayed stimuli were perceived the same as verbally displayed stimuli.

To test the similarity of the configurations for verbal and graphic displays, separate PROSCAL analyses of subjects exposed to graphic and verbal displays were conducted for the apartment/condominium subjects. In both analyses, the configuration was constrained to be identical to the configuration obtained in the previous section when all apartment/condominium subjects were evaluated at once with a model involving five variances. Five variances were again estimated in both analyses. (These constrained analyses are the same as what is usually termed an external analysis. However, PROSCAL is able to keep any number of parameters free or fixed.)

Results of the hypothesis tests are given in Table 3. Here it is seen that the general unconstrained configuration of the verbally displayed stimuli differs significantly from the specific model in which the configuration of the stimuli was constrained to be the one obtained when subjects chose apartments or condominiums for all, graphic and verbal, stimuli.

TABLE 3

LOG LIKELIHOOD DIFFERENCES FOR TESTS OF CONFIGURAL SIMILARITY

Stimulus Display and Attribute Covariation

Figure 1 shows three configurations. In panel (A), the configuration of the objectively defined stimuli is portrayed. Panel (B) shows the configuration estimated from the judgments of the subjects exposed to the graphic displays and panel (C) shows the configuration of stimuli for subjects exposed to the verbal displays. Estimated ideal points are not shown.

Evaluation of the differences in the two display modes could proceed in different ways. Perhaps the simplest would be to select a single criterion, such as the correlation of interpoint distances between an estimated configuration and the known "true" configuration of physical coordinates in the first panel of Figure 1. If this is done, the resulting correlations are about equal, 0.56 for the graphic display and 0.62 for the verbal display, with a slight edge for the verbal display.
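The interpoint-distance criterion can be sketched as below. This is an illustration of the general idea rather than the computation used in the study: the function names are hypothetical, configurations are assumed to be lists of coordinate tuples listed in the same stimulus order, and an ordinary product moment correlation is assumed.

```python
import itertools
import math


def interpoint_distances(config):
    """Euclidean distances for all unordered pairs of points, in a fixed
    pair order determined by the order of the points."""
    return [math.dist(p, q) for p, q in itertools.combinations(config, 2)]


def pearson(x, y):
    """Product moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def interpoint_distance_correlation(estimated, physical):
    """Correlate the interpoint distances of an estimated configuration
    with those of the known physical configuration."""
    return pearson(interpoint_distances(estimated),
                   interpoint_distances(physical))
```

Because only distances enter the calculation, the criterion is unaffected by translation, rotation, reflection or uniform rescaling of the estimated configuration, so no prior alignment of the two configurations is needed.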

FIGURE 1

CONFIGURATIONS FOR (A) THE PHYSICAL STIMULI, (B) THE GRAPHIC STIMULI, AND (C) THE VERBAL STIMULI ALONG THE DIMENSIONS OF PRICE, ENVIRONMENT AND TIME

Examination of the configurations for the two sets of apartment/condominium subjects showed an interesting difference. For the subjects exposed to the graphic stimuli, the correlation between the attribute values of the stimuli on the environment and time axes was of a greater absolute magnitude than that for the subjects exposed to the verbal stimuli. For graphic and verbal displays, the respective correlations were -0.91 and -0.24. The corresponding covariances were -0.41 and -0.24 (p < 0.05).

Attribute covariation has been studied both in marketing and psychology, viz. (Chapman and Chapman 1971, Huber and McCann 1982, Jennings, Amabile and Ross 1982). A consistent finding is that even when presented with data which exhibit no covariation, subjects will report their preconceptions about covariation of the stimuli. The same phenomenon appears to be happening in this study, so the presence of covariation itself should not be taken as an indication that the graphic method is less satisfactory than the verbal method. It might even be argued that the abstractness of the verbal display dilutes the subjects' actual preconceived covariation and that the greater covariation of the graphic display is a better representation. (How preconceptions overcome immediately available data is not known. Jennings, Amabile and Ross (1982) speculate on a number of possibilities - subjects may "see" the relations they report, subjects may weight subjective impressions and expectations, etc.)

Since the covariation affects the criterion of correlation among interpoint distances, other criteria, which are less affected by the covariation, should be considered. One criterion would be to simply correlate the projections of the stimuli on the axes with the corresponding physical values, after rotating the MDS configurations to maximum congruence with the physical configuration.
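The rotate-then-correlate idea can be illustrated with a two-dimensional sketch. This is a deliberate simplification and not the procedure used in the study: the study's configurations are three dimensional, which would call for a general orthogonal Procrustes solution, and the paper does not say whether reflections or rescaling were allowed. Here only a rotation is fitted, using the closed-form least squares angle for the plane. All function names are hypothetical.

```python
import math


def center(points):
    """Translate a 2-D configuration so its centroid is at the origin."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return [(p[0] - cx, p[1] - cy) for p in points]


def pearson(x, y):
    """Product moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rotate_to_congruence(estimated, physical):
    """Rotate the centered estimated configuration by the angle that
    minimizes the summed squared distances to the centered physical
    configuration. Returns the rotated points and the angle in radians."""
    est, phy = center(estimated), center(physical)
    a = sum(px * x + py * y for (x, y), (px, py) in zip(est, phy))
    b = sum(py * x - px * y for (x, y), (px, py) in zip(est, phy))
    theta = math.atan2(b, a)
    c, s = math.cos(theta), math.sin(theta)
    return [(x * c - y * s, x * s + y * c) for x, y in est], theta


def axis_correlations(rotated, physical):
    """Correlate the projections on each axis with the physical values."""
    phy = center(physical)
    return [pearson([p[d] for p in rotated], [q[d] for q in phy])
            for d in range(2)]
```

If the estimated configuration is simply a rotated copy of the physical one, the fitted angle undoes the rotation and both axis correlations equal one; departures from one on a particular axis then indicate which attribute is recovered less well.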

Results from correlating projections on individual axes, shown in Table 4, indicate that a primary difference in the graphic and verbal displays was that the verbal displays were much less successful in capturing the time dimension than the graphic displays. Correlations of projections on the other two axes are not significantly different from each other.

TABLE 4

PRODUCT MOMENT CORRELATIONS OF PROJECTIONS ON ESTIMATED AND PHYSICAL AXES

From Table 4, one might conclude that graphic displays enable subjects to retain a greater amount of information in their decision making. Such a conclusion, though, would be premature. Additional study of the effects of graphic and verbal displays is needed. To test hypotheses about the differences in graphic and verbal displays, the methods of confirmatory MDS are well suited.

REFERENCES

Borg, Ingwer and James C. Lingoes (1980), "A Model and Algorithm for Multidimensional Scaling with External Constraints on the Distances," Psychometrika, 45 (March), 25-38.

Chapman, Loren J. and Jean Chapman (1971), "Test Results are What You Think They Are," Psychology Today, (November), 18-22, 106-110.

Coombs, Clyde H. (1950), "Psychological Scaling Without A Unit of Measurement," Psychological Review, 57 (May), 145-158.

DeSoete, Geert and J. Douglas Carroll (1983), "A Maximum Likelihood Method for Fitting the Wandering Vector Model," Psychometrika, 48 (December), 553-566.

DeSoete, Geert, J. Douglas Carroll and Wayne S. DeSarbo, "The Wandering Ideal Point Model: A Probabilistic Unfolding Model for Paired Comparisons Data," unpublished manuscript, (no date).

Heiser, Willem J. and Jacqueline Meulman (1983), "Constrained Multidimensional Scaling, Including Confirmation," Applied Psychological Measurement, 7 (Fall),

Huber, Joel and John McCann (1982), "The Impact of Inferential Beliefs on Product Evaluations," Journal of Marketing Research, (August), 324-333.

Jennings, Dennis L., Teresa M. Amabile and Lee Ross (1982), "Informal Covariation Assessment: Data-Based Versus Theory-Based Judgments," in Judgment Under Uncertainty: Heuristics and Biases, (Eds.) Daniel Kahneman, Paul Slovic and Amos Tversky, Cambridge: Cambridge University Press.

Kendall, Maurice and Alan Stuart (1979), The Advanced Theory of Statistics, II, London: Charles Griffin.

Lingoes, James C. and Ingwer Borg (1983), "A Quasi-Statistical Model for Choosing Between Alternative Configurations Derived from Ordinally Constrained Data," British Journal of Mathematical and Statistical Psychology, 36 (May), 36-53.

MacKay, David B. and Joseph L. Zinnes (1982), "PROSCAL: A Program for Probabilistic Scaling," Discussion Paper 218, Graduate School of Business, Indiana University, Bloomington, IN.

MacKay, David B. and Joseph L. Zinnes (1985), "A Probabilistic Model for the Multidimensional Scaling of Proximity and Preference Data." (submitted for publication).

Takane, Yoshio (1981), "Multidimensional Successive Categories Scaling: A Maximum Likelihood Method," Psychometrika, 46 (March), 9-28.

Young, Forrest W. (1984), "Scaling," Annual Review of Psychology, 35, 55-81.

Zinnes, Joseph L. and David B. MacKay (1981), "Multidimensional Scaling Models: The Other Side," in Multidimensional Data Representations: When and Why, (Ed.) Ingwer Borg, Ann Arbor: Mathesis Press, 517-542.

Zinnes, Joseph L. and David B. MacKay (1983), "Probabilistic Multidimensional Scaling: Complete and Incomplete Data," Psychometrika, 48 (September), 27-48.
