The Application of Venture Analysis to Consumer Research Problems

William M. Campion, University of Wisconsin-Whitewater
Anthony F. McGann, University of Wyoming
ABSTRACT - This paper proceeds from the premise that a criterion variable has already been selected for a consumer research problem. Using a form of Bayesian decision theory called Venture Analysis, an algorithm is presented that can be used to rationally solve three important tactical problems in consumer research: how much to spend on research, which method of data collection is best for the problem at hand, and how large a sample to use. Beyond conventional statistical concepts, Venture Analysis requires only that the decision maker estimate the "shape" of the relevant population distribution. An example application of the method is also presented.

Advances in Consumer Research Volume 10, 1983      Pages 238-242




Choosing the best research procedure in any decisional context depends on knowing or accurately estimating the error probabilities conditional on the states of nature. The purpose of this paper is to provide a method for choosing the best survey research procedure from among a specified set of alternatives. In addition, the optimum budget for the chosen procedure is determined along with the optimum sample size. This is accomplished by calculating the probabilities and the potential costs of sampling errors as part of the normal preposterior analysis.

Prior to actually undertaking a study, the survey researcher and the user of the research are concerned with three key issues: what resources should be devoted to information gathering, what survey method should be chosen, and of what quality the information obtained will be. Bayesian decision theory has provided the formal logic for evaluating such questions. The usual decision analysis methods involve establishing the conditional payoff for each state-alternative combination and performing a prior analysis. Since additional information is not without cost, a preposterior analysis is conducted to determine whether data should be gathered as an input for decision making. For each survey design under consideration, the probability of obtaining each potential test outcome conditional on the appropriate state of nature is assessed subjectively. The researcher then computes the expected value of imperfect information and, knowing the costs associated with each competing design, chooses the design providing the highest positive expected net gain from additional information (3,4).

There are two operational problems that are difficult to resolve in actual situations. First, the sample size is often determined subjectively. Since sampling errors (and hence decisional errors) are a function of sample size, the usual rule is to obtain as many sample observations as possible. At present, it is extremely difficult to assess the trade-off between the cost of a larger sample and the reduction in potential error it may be expected to produce in a particular problem setting. Clearly the largest possible sample is not necessarily optimum.

An even greater problem exists with the choice of the survey procedure itself (e.g., mail, telephone, personal interview, etc.). Different survey methods produce different results in varying problem situations. The choice among these alternatives is often conditional on the personal preferences and perhaps past experiences of the investigator, rather than on the problem situation. Each procedure has a different cost structure. Since the start-up cost and cost per observation are different for each design, the sample size obtainable is usually different. The relationship between the budget and the procedure to be employed may be considered a function of the fixed and variable costs unique to each procedure. The optimum amount to be spent using any procedure is a function of the marginal cost of a potential error. Practically, it is difficult to quantitatively evaluate each alternative procedure prior to its actual use in a particular problem situation (5).


For expository purposes, the method will be presented for three states, which are arbitrarily defined as pessimistic, most likely, and optimistic. Two alternatives will be considered, a "go" alternative and a "no-go" alternative. This is commonly referred to as the venture analysis problem (7). The contemplated venture could be the evaluation of a new product, new public policy, new consumer hazard, or any other of the many decisional situations in which the choice of alternatives finally rests between the "go" or "no-go" actions. This problem is summarized in a conditional payoff table.

TABLE 1

For this formulation, the expected value of the venture without additional information is:

EV0 = P(S1)V11 + P(S2)V12 + P(S3)V13   (1)

where S and V are states and payoffs, respectively.
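Equation (1) is a simple probability-weighted sum. As a minimal sketch, the computation below uses the state probabilities and "go" payoffs from the advertising example presented later in the paper (optimistic, most likely, pessimistic):

```python
# Prior expected value of the venture (Equation 1).
# Probabilities and payoffs are taken from the paper's later example.
priors = [0.25, 0.50, 0.25]          # P(S1), P(S2), P(S3)
payoffs = [28_000, 4_000, -20_000]   # V11, V12, V13 for the "go" action

ev0 = sum(p * v for p, v in zip(priors, payoffs))
print(ev0)  # expected value of "go" without additional information -> 4000.0
```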

The expected monetary value of imperfect information (EMVII) is equal to the expected value of perfect information less the expected cost of any errors, or:

EMVII = P(I3|S3)P(S3)|V13| - P(I3|S1)P(S1)V11 - P(I3|S2)P(S2)V12    (2)

where P(I3|Si) is the probability that the survey indicates the "no-go" state S3 when Si is the true state of nature.
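Equation (2) can be sketched as a small function. The probabilities substituted below are the ones the paper later obtains for the telephone survey at a Type I error of .01 (so P(I3|S3) = 1 - .01 = .99):

```python
# Expected monetary value of imperfect information (Equation 2).
# p_i3 = [P(I3|S1), P(I3|S2), P(I3|S3)]: probabilities that the survey
# indicates the "no-go" state S3 under each true state of nature.
def emvii(p_i3, priors, payoffs):
    b1, b2, correct = p_i3            # two Type II errors and 1 - alpha
    p1, p2, p3 = priors
    v11, v12, v13 = payoffs
    return correct * p3 * abs(v13) - b1 * p1 * v11 - b2 * p2 * v12

# Figures from the paper's telephone-survey iteration (alpha = .01):
print(emvii([0.001, 0.121, 0.99],
            [0.25, 0.50, 0.25],
            [28_000, 4_000, -20_000]))  # -> 4701.0
```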

The costs of the survey must be determined for each procedure under consideration. The total cost of information using a given procedure will be expressed here as a function of its fixed and variable components. The fixed portion (FC) includes any costs that may be incurred in either the preliminary stages, such as interviewer training, or the analysis stages, including preparation and presentation of a final report. The variable cost component is assumed to be a function of the cost of each observation (VC/observation) and the number of observations (n). Therefore, the cost of additional information for this example is

CAI = FC + n(VC/observation)   (3)

The expression for CAI can be appropriately modified depending on the needs of the researcher and/or the actual cost relationships being faced.

The expected net gain from additional information (ENGAI) must be positive, or no additional information will be assembled using that particular design. The ENGAI is equal to the expected value of imperfect information less the cost of additional information, or

ENGAI = EMVII - CAI   (4)
The design that produces the highest ENGAI will be chosen to gather the survey data.
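Equations (3) and (4) together drive the design choice. In the sketch below, only the telephone design's cost figures ($4,000 fixed, $8 per observation) come from the paper's example; the "mail" and "personal" cost figures are hypothetical, included purely to illustrate comparing designs:

```python
# Choosing among designs by expected net gain (Equations 3 and 4).
# Telephone costs match the paper's example; the other two are assumed.
designs = {
    "telephone": {"fc": 4_000, "vc": 8},
    "mail":      {"fc": 1_500, "vc": 3},   # hypothetical
    "personal":  {"fc": 2_000, "vc": 40},  # hypothetical
}

def cai(fc, vc, n):
    """Cost of additional information: CAI = FC + n * VC (Equation 3)."""
    return fc + n * vc

def engai(emvii_value, fc, vc, n):
    """Expected net gain from additional information (Equation 4)."""
    return emvii_value - cai(fc, vc, n)

# e.g. the telephone design with 123 observations and EMVII = $4,701:
print(cai(4_000, 8, 123))            # -> 4984
print(engai(4_701, 4_000, 8, 123))   # -> -283
```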


In order to make effective use of the methods described, it is necessary to assess the appropriate levels of the Type I and Type II errors such that ENGAI is optimized for each design under consideration. Current practice involves setting these levels of potential error either subjectively or by some ad hoc means in order that ENGAI may be determined. The intent here is to provide a method of calculating the potential errors as part of the normal preposterior analysis.

The relationship between the population of interest and its underlying theoretical sampling distribution for a given sample size is well known, as is the statistical procedure for the calculation of sample size based on a desired level of precision. These concepts are incorporated into the algorithm and modified such that they may be used to calculate the error probabilities.

The specific problem situation is described by two pieces of information: the conditional payoffs, such as those appearing in Table 1, and the variability of the population from which the assessments were developed. The determination of the states of nature, their probabilities, and the payoffs may be made using any of the commonly accepted methods (1,6,8). The variability of the population may be described by its coefficient of variation, the ratio of the population standard deviation to its mean, C = σ/μ. To estimate the population coefficient of variation, one need know only the "shape" of the population distribution. For example, a normally distributed population may be adequately represented by a coefficient of variation of .33, and a uniformly distributed population by a coefficient of variation of .58, regardless of their actual numerical mean and standard deviation (9, p. 760). Here, the "constructively" ignorant researcher (and the truly ignorant, as well) will begin with an estimate of a uniform distribution. Any additional information, even in the form of an educated guess, will increase the peakedness of the distribution.
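The shape-based C values can be checked with a short calculation. The parameterization below is an assumption, not stated in the paper: the population is taken to lie on [0, 2μ], with the normal shape's ±3σ range spanning that interval and the uniform shape covering it exactly:

```python
import math

# Coefficient of variation C = sigma / mu from distribution "shape" alone.
# Assumed parameterization: population supported on [0, 2*mu];
# the normal case takes mu +/- 3*sigma to span that interval.
mu = 1.0                                  # C is scale-free, so any mean works
sigma_normal = mu / 3                     # 3*sigma = mu  ->  C = 1/3
sigma_uniform = (2 * mu) / math.sqrt(12)  # uniform on [0, 2*mu]

print(round(sigma_normal / mu, 2))   # -> 0.33
print(round(sigma_uniform / mu, 2))  # -> 0.58
```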

The calculation of sample size depends on the desired precision of the estimate, in addition to information about the variability of the population. The precision of any estimate (d) to be derived from a sample may be represented as the ratio of the absolute error to the population mean

d = E/μ   (5)
The sample size (n) is defined by

n = (ZC/d)^2   (6)
where Z is the standard normal deviate appropriate to a specified level of Type I error, and C is the population coefficient of variation. This procedure is often used by survey research practitioners to determine the sample size, but it is of limited usefulness since the tolerable error is still established somewhat subjectively. A more useful approach involves initially fixing the sample size (n) at its upper limit and calculating the error (d) that results, holding the Type I error constant. Since the problem situation sets the limit on the expected value of perfect information (EVPI), it also limits the possible number of sample observations. The cost of additional information cannot exceed EVPI. The maximum possible number of sample observations is found by setting CAI (Equation 3) equal to EVPI and solving for n

nmax = (EVPI - FC)/VC   (7)
The presence of Type I error lowers the obtainable sample size. Since α = 1 - P(I3|S3), the obtainable sample size given a level of α is

n = (1 - α)(EVPI - FC)/VC   (8)
Therefore, Equation (6) may be rearranged to express precision as a function of sample size.

d = ZC/√n   (9)
At this point, the problem situation has been defined, permitting the determination of EVPI. The maximum sample size is then expressed as a function of EVPI and a specified Type I error. As a result, the precision of the sample estimate is also expressed as a function of the problem situation.
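The chain from EVPI to sample size to precision (Equations 7 through 9) can be sketched directly, using the telephone-survey figures from the paper's example (EVPI = $5,000, FC = $4,000, VC = $8). The standard normal deviate is computed from the stdlib here, so the precision comes out at about .069 where the paper's table-based figure is .070:

```python
import math
from statistics import NormalDist

# Equations (7)-(9): sample size limited by EVPI, reduced by the
# Type I error, and the resulting precision of the estimate.
evpi, fc, vc = 5_000, 4_000, 8
alpha, c = 0.01, 0.33

n_max = (evpi - fc) / vc              # Equation (7): 125 observations
n = int((1 - alpha) * n_max)          # Equation (8): 123 observations
z = NormalDist().inv_cdf(1 - alpha)   # one-tailed deviate, ~2.326
d = z * c / math.sqrt(n)              # Equation (9): ~.069
print(n, round(d, 3))
```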

In order to calculate the level of Type II error, a critical region must next be defined.

FIGURE 1

A mean implicitly underlies each state of nature specified in the payoff table. It is reasonable to assume these means (Xi) can be estimated. Since S3 is treated as the null hypothesis, the mean underlying this state, X3, is used to locate the center of the theoretical sampling distribution. The critical region for hypothesis testing purposes can be expressed as a function of the mean X3 plus some proportion of this mean corresponding to the tolerable error as developed in Equation (9).

CR = X3 + dX3

      = X3 (1+d)    (10)

where, CR = Interval that defines the critical region, and

X3 = Estimated mean of the population under the null hypothesis. (This is also the mean of the underlying theoretical sampling distribution, since E(X̄) = μ under the null hypothesis.)

In summary, rather than placing a constraint on the tolerable error and calculating the appropriate sample size, the sample size is constrained by EVPI and the probability of a Type I error, which in turn specifies the amount of error that may be realized. This permits expressing the critical region as a function of the hypothesized mean plus some proportion of it. It is now possible to calculate the appropriate Type II errors in the usual manner relative to the means of the alternative states, X1 and X2. The EMVII is found by substituting the probabilities of the Type I error and the Type II errors into Equation (2). The cost of the additional information may be calculated from Equation (3). ENGAI is then found by Equation (4).

This procedure may be iterated for several levels of Type I error, and hence sample sizes. Since the Type I errors tend to interact with Type II errors as the sample size changes, the precision of the estimate, and ultimately the critical value defining the errors, must be recomputed each time. Also, EMVII, CAI, and ENGAI must be recalculated for each iteration.
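The full iteration can be sketched as a loop over candidate Type I error levels. The state means, priors, payoffs, coefficient of variation, and telephone-survey costs below come from the paper's example; the Type II errors here are computed from the normal sampling distribution via the stdlib, so the resulting figures may differ slightly from the paper's table-based calculations:

```python
import math
from statistics import NormalDist

# One pass of the iteration described above, for a grid of Type I errors.
# Inputs follow the paper's example: states at 50%, 45%, 40%,
# priors .25/.50/.25, payoffs 28000/4000/-20000, C = .33, and the
# telephone design's costs (FC = 4000, VC = 8).
x1, x2, x3 = 0.50, 0.45, 0.40
p1, p2, p3 = 0.25, 0.50, 0.25
v11, v12, v13 = 28_000, 4_000, -20_000
c, evpi, fc, vc = 0.33, 5_000, 4_000, 8

for alpha in (0.01, 0.05, 0.10, 0.20):
    n = int((1 - alpha) * (evpi - fc) / vc)                  # Equation (8)
    d = NormalDist().inv_cdf(1 - alpha) * c / math.sqrt(n)   # Equation (9)
    cr = x3 * (1 + d)                                        # Equation (10)
    se = c * x3 / math.sqrt(n)                               # sigma / sqrt(n)
    b1 = NormalDist(x1, se).cdf(cr)                          # P(I3|S1)
    b2 = NormalDist(x2, se).cdf(cr)                          # P(I3|S2)
    emvii = (1 - alpha) * p3 * abs(v13) - b1 * p1 * v11 - b2 * p2 * v12
    engai = emvii - (fc + n * vc)                            # Eqs. (3), (4)
    print(f"alpha={alpha:.2f}  n={n:3d}  ENGAI={engai:9.2f}")
```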

By plotting ENGAI for each design under consideration as a function of sample size, an optimum value of each may be determined. This is shown graphically in Figure 2 for three hypothetical designs being considered.

FIGURE 2

Design D2 produces a higher ENGAI than D1 or D3 and, therefore, would be the one used to gather additional information. The optimum number of observations for this design is no, the sample size at the point of maximum ENGAI. The optimum budget is determined by substituting no into Equation (3). Designs D1 and D3 do not produce a positive ENGAI; if they were the only designs under consideration, the investigator would choose not to gather information, since the cost of gathering information using either would exceed the value of the information.


Suppose a firm was considering whether to use an advertising copy platform based on retail price reductions. In this example the numbers are arbitrarily chosen, but the problem is one with clear applications to consumer research activities (cf. Nwokoye 1975; Stapel 1972; Highland and McGann 1977; Aaker 1982). To be effective, these price reductions would have to be offset by corresponding increases in sales volume. Over a specified campaign interval, the firm expects to earn an incremental $28,000 if there is a 50% increase in sales volume. Should the campaign produce only a 45% increase in sales volume, the earnings increment would be only $4,000. And if the campaign generates a 40% or lower increment in sales volume, the effect on earnings will be a loss of (at least) $20,000. Management attaches probabilities of .25, .5, and .25 to these optimistic, most likely, and pessimistic outcomes, respectively.

Rather than make an arbitrary decision about an ad campaign based on retail price reductions, management is considering three survey designs which have the potential for producing additional information. Table 2 portrays the problem formulation and pertinent design information.

TABLE 2

In this problem situation, EVPI = P(S3)|V13| = $5,000. The algorithm considers several possible levels of Type I error, which affect the number of observations that may be drawn. If the Type I error for a telephone survey is .01, the maximum obtainable sample size is found by Equation (8). Therefore, the tolerable error, expressed as a percentage of the mean underlying the hypothesized true state of nature, is

d = ZC/√n = (2.326)(.33)/√123 ≈ .070
Table 3 contains the obtainable sample size and tolerable error, reflecting the presence of Type I error as calculated by Equation (8). The tolerable error for each design and Type I error is determined by Equation (9), assuming the population is reflected by a coefficient of variation equal to .33. Several iterations are shown for selected levels of Type I error.

TABLE 3

In order to compute the Type II errors, it is necessary to define the value representing the critical region under the hypothesis that S3 = 40% is the true state of nature. The critical region is provided by Equation (10) and may be calculated for each Type I error and corresponding sample size for the three designs. For a Type I error of .01, the critical region for the telephone survey is

CR = X3 (1+d)

       = .40 (1 + .070)

       = .4280, or 42.80%

The Type II errors may now be calculated with respect to the estimated means of the alternative states, X1 = 50% and X2 = 45%. With the usual methods,

P(I3|S1) = P(x̄ ≤ CR | μ = X1)  and  P(I3|S2) = P(x̄ ≤ CR | μ = X2)
Note that the quantity X3 + Z(σ/√n) defines the critical region. Therefore, from Equation (10),

X3 + Z(σ/√n) = X3(1 + d)
The population standard deviation is found by substituting the hypothesized mean X3 = 40% into the equation C = σ/μ. Since the population is suspected to be normally distributed, its coefficient of variation is estimated as .33. Therefore,

σ = CX3 = (.33)(.40) = .132, or 13.2%
For the telephone survey with n = 123,

P(I3|S1) = .001  and  P(I3|S2) = .121
Substituting these probabilities into Equation (2),

EMVII = (1 - .01)(.25)(20,000) - (.001)(.25)(28,000) - (.121)(.50)(4,000) = $4,701

By subtracting the cost of the telephone survey from EMVII, the expected net gain resulting from additional information is determined. Since 123 observations are being considered, Equation (3) gives

CAI = $4,000 + 123($8) = $4,984

ENGAI = $4,701 - $4,984

             = -$283
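The chain of substitutions for this iteration can be verified end to end, taking the paper's error probabilities (α = .01, and Type II errors of .001 and .121) as given:

```python
# Verifying the telephone-survey iteration: Equations (2), (3), and (4)
# with the error probabilities reported in the text taken as given.
alpha, b1, b2 = 0.01, 0.001, 0.121
p1, p2, p3 = 0.25, 0.50, 0.25
v11, v12, v13 = 28_000, 4_000, -20_000
fc, vc, n = 4_000, 8, 123

emvii = (1 - alpha) * p3 * abs(v13) - b1 * p1 * v11 - b2 * p2 * v12
cai = fc + n * vc
engai = emvii - cai
print(emvii, cai, engai)  # -> 4701.0 4984 -283.0
```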

The optimum levels of Type I and Type II error are those that maximize ENGAI for each design. These are found by selecting alternative levels of Type I error and iterating.

Table 4 summarizes the final calculations for several iterations. Of the three designs under consideration, the best is the personal interview method, since it produces the highest and only positive ENGAI. For this design, the optimum sample size is 50 observations at the point of maximum ENGAI, which is equal to $29. The probability of a Type I error equals .20, and the Type II errors are .082 and .001, relative to S1 and S2 respectively. The optimum research budget is $4,000.

TABLE 4


This Venture Analysis algorithm extends the Decision Analysis approach of preposterior analysis. Based on existing statistical concepts, the only additional requirement of this procedure is that the investigator be able to estimate the shape of the population distribution.

This algorithm has several distinct advantages. First, the problem of subjectively estimating the probability of each potential test outcome is essentially eliminated. Instead, actual levels of Type I and Type II error and their associated costs may now be calculated. Second, considerable insight may be gained into the question of how to gather information. Alternative designs may be compared within the context of a particular problem. The choice is based on which design is expected to produce the highest net gain from additional information. Furthermore, the optimum sample size for the indicated design is determined. Third, the optimum research budget may be determined, subject to both the decisional context and the errors inherent in the chosen design.

If the number of research designs being considered is not extensive, hand calculation is feasible for venture analysis. However, if the problem is more complex (e.g. many states of nature or many competing research designs) the algorithm is easily programmed for computers or for the more sophisticated desk-top calculators.

REFERENCES
Aaker, D. A., Carman, J. M. and Jacobson, R. (1982), "Modeling Advertising-Sales Relationships Involving Feedback," Journal of Marketing Research, Vol. XIX (February), pp. 116-25.

Brown, Rex V., Kahr, Andrew S. and Peterson, Emerson (1974), Decision Analysis for the Manager, (New York: Holt, Rinehart & Winston), Ch. 13 and 14.

Deming, W. E. (1960), Sample Design in Business Research, (New York: John Wiley & Sons, Inc.), p. 260.

Enis, Ben M. and Broome, Charles L. (1971), Marketing Decisions: A Bayesian Approach, (Scranton, Pennsylvania: International Textbook Co.).

Green, Paul E. and Tull, Donald S. (1970), Research for Marketing Decisions, 2nd Ed., (Englewood Cliffs, New Jersey: Prentice-Hall, Inc.).

Highland, A. V. and McGann A. F. (1977), "Physical Distribution, Ultimate Consumers and Weber's Law," Advances in Consumer Research, Vol. IX, pp. 133-37.

Marquardt, Ray A. and Bryhn, Tor (1972), "Using a Loss Function to Evaluate Marketing Research Studies," Marquette Business Review, Vol. XVI (Summer), pp. 89-98.

Nwokoye, N. (1975), "An Experimental Study of the Relationship Between Responses to Price Changes and the Price Level for Shoes," Advances in Consumer Research, Vol. 2, pp. 693-703.

Schlaifer, Robert (1959), Probability and Statistics for Business Decisions, (New York: McGraw-Hill).

Stapel, J. (1972), "'Fair' or 'Psychological' Pricing," Journal of Marketing Research, Vol IX (February), pp. 109-10.

Tull, Donald S. (1974), "Assessing the Value of Additional Information" in Ferber, Robert, Editor, Handbook of Marketing Research, (New York: McGraw-Hill), pp. 2-35 to 2-37.

Winkler, Robert (1967), "The Assessment of Prior Distributions in Bayesian Analysis," Journal of the American Statistical Association, (September).

NOTE: A listing of the program may be obtained by writing to Prof. Campion, University of Wisconsin-Whitewater.