About Sample Balancing

 

Note: Run|Sample Balancing cannot be selected unless your job and data files are open.

The goal of the “Sample Balancing” module is to provide a weight for each respondent in the sample such that the weighted marginals on each of a set of characteristics match preset values of those marginals. This process is sometimes called “raking” or “rim weighting.” The most common procedure used to produce these weights is “iterative proportional fitting,” devised by W. Edwards Deming and Frederick F. Stephan. It was first published in their December 1940 paper, "On a Least Squares Adjustment of a Sampled Frequency Table when the Expected Marginal Totals are Known," in Volume 11 of The Annals of Mathematical Statistics, pages 427-444, and further explicated in Chapter 7 of Deming's book, Statistical Adjustment of Data (New York: John Wiley & Sons, 1943). Though “iterative proportional fitting” has the nice property of converging to a set of nonnegative weights, these weights do not have any optimal properties (such as the minimization of some measure of goodness of fit).
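To make the idea of raking concrete, here is a minimal Python sketch of classical iterative proportional fitting applied to respondent-level data. It illustrates the general Deming-Stephan procedure only and is not the algorithm WinCross uses. The function name rake, the variable names, the sample data, and the target proportions are all hypothetical, and the sketch assumes every target category contains at least one respondent.

```python
def rake(respondents, targets, max_iter=100, tol=1e-9):
    """Classical raking (iterative proportional fitting); illustrative only.

    respondents: list of dicts mapping variable name -> category
    targets:     dict mapping variable name -> {category: target proportion}
    Returns one weight per respondent; the weights sum to len(respondents).
    Assumes every target category has at least one respondent.
    """
    n = len(respondents)
    weights = [1.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for var, shares in targets.items():
            # Current weighted total in each category of this variable.
            totals = {cat: 0.0 for cat in shares}
            for w, r in zip(weights, respondents):
                totals[r[var]] += w
            # Rescale so this variable's weighted marginal hits its target.
            for i, r in enumerate(respondents):
                factor = shares[r[var]] * n / totals[r[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        # Stop once a full pass over all variables changes nothing noticeably.
        if max_change < tol:
            break
    return weights


# Hypothetical sample of 10 respondents, unbalanced on both variables.
respondents = (
    [{"sex": "M", "age": "young"}] * 4
    + [{"sex": "M", "age": "old"}] * 2
    + [{"sex": "F", "age": "young"}] * 1
    + [{"sex": "F", "age": "old"}] * 3
)
# Hypothetical preset marginals.
targets = {
    "sex": {"M": 0.5, "F": 0.5},
    "age": {"young": 0.4, "old": 0.6},
}

weights = rake(respondents, targets)
n = len(respondents)
male_share = sum(w for w, r in zip(weights, respondents) if r["sex"] == "M") / n
young_share = sum(w for w, r in zip(weights, respondents) if r["age"] == "young") / n
print(male_share, young_share)
```

Each pass rescales the weights one variable at a time, so fixing one marginal slightly disturbs the others; repeating the passes shrinks that disturbance until all weighted marginals agree with their targets to within the tolerance.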

 

WinCross's adaptation was developed in the 1960s by J. Stephens Stock, a colleague of Deming, with the express goal that the weights it produces optimize a measure of goodness of fit. Unfortunately, Stock and his Market-Math, Inc. partner Jerry Green never published their algorithm, though they did make it available to the market research community. The Analytical Group, Inc. has utilized this algorithm since its incorporation in 1970. (In Public Opinion of Criminal Justice in California, a 1974 report for the Institute of Environmental Studies at the University of California, Berkeley by the Field Research Corporation, we find a use of this algorithm, with the note (page 118) “…the weighting correction is based on a design concept originated by the late J. Stephens Stock and Market-Math, Inc. It is currently used by Field Research Organization and several other leading research organizations.”)

 

See the Sample Balancing section of Help|Statistical Reference for more detailed information about Sample Balancing and the techniques that are being used.

 

Sample balancing allows you to use up to 10 different variables, with as many as 20 numeric values within each variable (each value can include a range of up to 99 codes). The total number of combinations, that is, the product of the number of values in each variable, cannot exceed 50,000,000.

 

For example:

                  Variable 1       Variable 2         Variable 3
                  Value 1 = 20%    Value 1 = 25%      Value 1-10 = 50%
                  Value 2 = 20%    Value 2 = 25%      Value 11-20 = 25%
                  Value 3 = 20%    Value 3 = 25%      Value 21-98 = 25%
                  Value 4 = 20%    Value 4-9 = 25%
                  Value 5 = 20%
Total Levels =    5                4                  3

 

5 x 4 x 3 = 60 combinations (much less than the 50,000,000 limit)
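The limits described above can be checked with a short Python sketch before a balancing scheme is set up. This is only an illustration of the arithmetic and not part of WinCross: the function name count_combinations, the constant names, and the representation of each value as a (low, high) code range are assumptions, as is the reading that "a range of up to 99 codes" means high - low + 1 must not exceed 99.

```python
from math import prod

# Limits as stated in this topic; the constant names are hypothetical.
MAX_VARIABLES = 10
MAX_VALUES_PER_VARIABLE = 20
MAX_CODES_PER_VALUE = 99
MAX_COMBINATIONS = 50_000_000


def count_combinations(variables):
    """Return the number of combinations, or raise ValueError if a limit is hit.

    variables: list of variables, each a list of (low, high) code ranges,
               one range per value (a single code is written as (c, c)).
    """
    if len(variables) > MAX_VARIABLES:
        raise ValueError(f"more than {MAX_VARIABLES} variables")
    for idx, values in enumerate(variables, start=1):
        if len(values) > MAX_VALUES_PER_VARIABLE:
            raise ValueError(f"variable {idx}: more than {MAX_VALUES_PER_VARIABLE} values")
        for low, high in values:
            # Assumed reading: a range low-high spans high - low + 1 codes.
            if high - low + 1 > MAX_CODES_PER_VALUE:
                raise ValueError(f"variable {idx}: range {low}-{high} exceeds {MAX_CODES_PER_VALUE} codes")
    # Total combinations = product of the number of values in each variable.
    combos = prod(len(values) for values in variables)
    if combos > MAX_COMBINATIONS:
        raise ValueError(f"{combos} combinations exceed {MAX_COMBINATIONS}")
    return combos


# The three-variable example from this topic.
example = [
    [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)],  # Variable 1: 5 values
    [(1, 1), (2, 2), (3, 3), (4, 9)],          # Variable 2: 4 values
    [(1, 10), (11, 20), (21, 98)],             # Variable 3: 3 values
]
combos = count_combinations(example)
print(combos)  # 60
```

Note that the per-variable limits alone do not guarantee the combination limit is met: 10 variables with 20 values each would give 20 to the 10th power combinations, far more than 50,000,000, so the product has to be checked separately.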

Related topics:

Sample Balancing ASCII data

Sample Balancing Variable data