Measurement Uncertainty -- Statistical Analysis of Measurements Subject to Random Errors



Mean and Median Values

The average value of a set of measurements of a constant quantity can be expressed as either the mean value or the median value. Historically, the median was easier for a computer to compute than the mean, because the median computation involves a series of logic operations whereas the mean computation requires addition and division. Many years ago, computers performed logic operations much faster than arithmetic operations, so there was a computational speed advantage in calculating average values as the median rather than the mean. However, computing power increased rapidly to the point where this advantage disappeared long ago.

As the number of measurements increases, the difference between the mean and median values becomes very small. However, for any finite set of measurements, the average calculated as the mean is always slightly closer to the correct value of the measured quantity than the average calculated as the median. Given the loss of any computational speed advantage on modern computers, there is now little argument for calculating average values in terms of the median.

For any set of n measurements x1, x2, ..., xn of a constant quantity, the most likely true value is the mean, given by

$$x_{\text{mean}} = \frac{x_1 + x_2 + \cdots + x_n}{n} \quad (4)$$

This is valid for all data sets where the measurement errors are distributed equally about the zero error value, that is, where positive errors are balanced in quantity and magnitude by negative errors.

The median is an approximation to the mean that can be written down without having to sum the measurements. The median is the middle value when the measurements in the data set are written down in ascending order of magnitude. For a set of n measurements x1, x2, ..., xn of a constant quantity, written down in ascending order of magnitude, the median value is given by

$$x_{\text{median}} = x_{(n+1)/2} \quad (5)$$

Thus, for a set of nine measurements x1, x2, ..., x9 arranged in order of magnitude, the median value is x5. For an even number of measurements, the median value is midway between the two center values; that is, for 10 measurements x1, ..., x10, the median value is given by

$$x_{\text{median}} = \frac{x_5 + x_6}{2}$$
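Both averages are one-liners in most languages. The following minimal Python sketch illustrates Equations (4) and (5); the measurement values are hypothetical, since the data sets used in this section are not reproduced here:

import statistics

# Hypothetical set of nine repeated measurements of a constant quantity (mm).
measurements = [406, 407, 405, 404, 407, 406, 408, 406, 405]

mean = statistics.mean(measurements)      # Equation (4): sum of values / n
median = statistics.median(measurements)  # middle value after sorting; for an
                                          # even n, the mean of the two center values
print(f"mean = {mean:.1f}, median = {median}")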

Suppose that the length of a steel bar is measured by a number of different observers and the following set of 11 measurements is recorded (units: millimeters). We will call this measurement set A. ...

Using Equations (4) and (5), mean = 409.0 and median = 408. Suppose now that the measurements are taken again using a better measuring rule and with the observers taking more care to produce the following measurement set B: ...

For these measurements, mean = 406.0 and median = 407. Which of the two measurement sets, A and B, and the corresponding mean and median values should we have the most confidence in? Intuitively, we can regard measurement set B as being more reliable because the measurements are much closer together. In set A, the spread between the smallest (396) and largest (430) value is 34, while in set B, the spread is only 6.

• Thus, the smaller the spread of the measurements, the more confidence we have in the mean or median value calculated.

Let us now see what happens if we increase the number of measurements by extending measurement set B to 23 measurements. We will call this measurement set C.

Now, mean = 406.5 and median = 406

• This confirms our earlier statement that the median value tends toward the mean value as the number of measurements increases.

Standard Deviation and Variance

Expressing the spread of measurements simply as a range between the largest and smallest values is not, in fact, a very good way of examining how the measurement values are distributed about the mean value. A much better way of expressing the distribution is to calculate the variance or standard deviation of the measurements. The starting point for calculating these parameters is to calculate the deviation (error) d_i of each measurement x_i from the mean value x_mean in a set of measurements x1, x2, ..., xn:

$$d_i = x_i - x_{\text{mean}} \quad (6)$$

The variance (V_s) of the set of measurements is defined formally as the mean of the squares of deviations:

$$V_s = \frac{d_1^2 + d_2^2 + \cdots + d_n^2}{n} \quad (7)$$

The standard deviation (σ_s) of the set of measurements is defined as the square root of the variance:

$$\sigma_s = \sqrt{V_s} \quad (8)$$

Unfortunately, these formal definitions for the variance and standard deviation of data are made with respect to an infinite population of data values whereas, in all practical situations, we can only have a finite set of measurements. We have observed previously that the mean value x_mean of a finite set of measurements will differ from the true mean m of the theoretical infinite population of measurements that the finite set is part of. This means that there is an error in the mean value x_mean used in the calculation of d_i in Equation (6). Because of this, Equations (7) and (8) give a biased estimate that tends to underestimate the variance and standard deviation of the infinite set of measurements. A better prediction of the variance of the infinite population can be obtained by applying the Bessel correction factor n/(n − 1) to the formula for V_s in Equation (7):

$$V = V_s \cdot \frac{n}{n-1} = \frac{d_1^2 + d_2^2 + \cdots + d_n^2}{n-1} \quad (9)$$

where V_s is the variance of the finite set of measurements and V is the variance of the infinite population of measurements.

This leads to a similarly better prediction of the standard deviation by taking the square root of the variance in Equation (9):

$$\sigma = \sqrt{V} \quad (10)$$

For the data in measurement set C, using Equations (9) and (10), V = 3.53 and σ = 1.88.

Note that the smaller values of V and s for measurement set B compared with A correspond with the respective size of the spread in the range between maximum and minimum values for the two sets.

• Thus, as V and σ decrease for a measurement set, we are able to express greater confidence that the calculated mean or median value is close to the true value, that is, that the averaging process has reduced the random error value close to zero.

• Comparing V and σ for measurement sets B and C, V and σ get smaller as the number of measurements increases, confirming that confidence in the mean value increases as the number of measurements increases.
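The distinction between the biased estimates of Equations (7) and (8) and the Bessel-corrected estimates of Equations (9) and (10) is easy to check numerically. The following Python sketch uses hypothetical data (sets A, B, and C are not reproduced here); note that Python's statistics module applies the n − 1 correction by default:

import statistics

# Hypothetical measurement set (mm).
x = [406, 407, 405, 404, 407, 406, 408, 406, 405, 404, 407]
n = len(x)
x_mean = sum(x) / n

# Biased estimates, Equations (7) and (8): divide by n.
Vs = sum((xi - x_mean) ** 2 for xi in x) / n
sigma_s = Vs ** 0.5

# Bessel-corrected estimates, Equations (9) and (10): divide by n - 1.
V = Vs * n / (n - 1)
sigma = V ** 0.5

# statistics.variance and statistics.stdev use the n - 1 correction.
assert abs(V - statistics.variance(x)) < 1e-9
assert abs(sigma - statistics.stdev(x)) < 1e-9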

We have observed so far that random errors can be reduced by taking the average (mean or median) of a number of measurements. However, although the mean or median value is close to the true value, it would only become exactly equal to the true value if we could average an infinite number of measurements. As we can only make a finite number of measurements in a practical situation, the average value will still have some error. This error can be quantified as the standard error of the mean, which is discussed in detail a little later. However, before that, the subject of graphical analysis of random measurement errors needs to be covered.

Graphical Data Analysis Techniques --- Frequency Distributions

Graphical techniques are a very useful way of analyzing how random measurement errors are distributed. The simplest way of doing this is to draw a histogram, in which bands of equal width across the range of measurement values are defined and the number of measurements within each band is counted. The bands are often given the name data bins. A useful rule for defining the number of bands (bins) is known as the Sturgis rule, which calculates the number of bands as

Number of bands = 1 + 3.3 log10(n)

where n is the number of measurement values.
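As a quick sketch of the rule in Python (the function name is ours):

import math

def sturgis_bands(n):
    """Recommended number of histogram bands (bins) for n measurements."""
    return round(1 + 3.3 * math.log10(n))

print(sturgis_bands(23))   # 1 + 3.3*log10(23) = 5.49, which rounds to 5
print(sturgis_bands(100))  # 1 + 3.3*log10(100) = 7.6, which rounds to 8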

Example 5

Draw a histogram for the 23 measurements in set C of length measurement data given in Section 5.1.

Solution---For 23 measurements, the recommended number of bands calculated according to the Sturgis rule is 1 + 3.3 log10(23) = 5.49. This rounds to five, as the number of bands must be an integer.

To cover the span of measurements in data set C with five bands, the data bands need to be 2 mm wide. The boundaries of these bands must be chosen carefully so that no measurements fall on the boundary between different bands and cause ambiguity about which band to put them in. Because the measurements are integer numbers, this can be accomplished easily by defining the range of the first band as 401.5 to 403.5 and so on. A histogram can now be drawn as in Fgr. 6 by counting the number of measurements in each band.

In the first band from 401.5 to 403.5, there is just one measurement, so the height of the histogram in this band is 1 unit.

In the next band from 403.5 to 405.5, there are five measurements, so the height of the histogram in this band is 5 units.


The rest of the histogram is completed in a similar fashion.

When a histogram is drawn using a sufficiently large number of measurements, it will have the characteristic shape shown by truly random data, with symmetry about the mean value of the measurements. However, for a relatively small number of measurements, only approximate symmetry in the histogram can be expected about the mean value. It’s a matter of judgment as to whether the shape of a histogram is close enough to symmetry to justify a conclusion that data on which it’s based are truly random. It should be noted that the 23 measurements used to draw the histogram in Fgr. 6 were chosen carefully to produce a symmetrical histogram but exact symmetry would not normally be expected for a measurement data set as small as 23.

As it’s the actual value of measurement error that is usually of most concern, it’s often more useful to draw a histogram of deviations of measurements from the mean value rather than to draw a histogram of the measurements themselves. The starting point for this is to calculate the deviation of each measurement away from the calculated mean value. Then a histogram of deviations can be drawn by defining deviation bands of equal width and counting the number of deviation values in each band. This histogram has exactly the same shape as the histogram of raw measurements except that scaling of the horizontal axis has to be redefined in terms of the deviation values (these units are shown in parentheses in Fgr. 6).

Fgr. 7

Let us now explore what happens to the histogram of deviations as the number of measurements increases. As the number of measurements increases, smaller bands can be defined for the histogram, which retains its basic shape but then consists of a larger number of smaller steps on each side of the peak. In the limit, as the number of measurements approaches infinity, the histogram becomes a smooth curve known as a frequency distribution curve, as shown in Fgr. 7. The ordinate of this curve is the frequency of occurrence of each deviation value, F(D), and the abscissa is the magnitude of deviation, D.

The symmetry of Figures 6 and 7 about the zero deviation value is very useful for showing graphically that the measurement data only have random errors. Although these figures cannot be used to quantify the magnitude and distribution of the errors easily, very similar graphical techniques do achieve this. If the height of the frequency distribution curve is normalized such that the area under it is unity, then the curve in this form is known as a probability curve, and the height F(D) at any particular deviation magnitude D is known as the probability density function (p.d.f.). The condition that the area under the curve is unity can be expressed mathematically as

$$\int_{-\infty}^{\infty} F(D)\,dD = 1$$

The probability that the error in any one particular measurement lies between two levels D1 and D2 can be calculated by measuring the area under the curve contained between two vertical lines drawn through D1 and D2, as shown by the right-hand hatched area in Fgr. 7. This can be expressed mathematically as

$$P(D_1 \le D \le D_2) = \int_{D_1}^{D_2} F(D)\,dD \quad (11)$$

Of particular importance for assessing the maximum error likely in any one measurement is the cumulative distribution function (c.d.f.). This is defined as the probability of observing a value less than or equal to D0 and is expressed mathematically as

$$P(D \le D_0) = \int_{-\infty}^{D_0} F(D)\,dD \quad (12)$$

Thus, the c.d.f. is the area under the curve to the left of a vertical line drawn through D0, as shown by the left-hand hatched area in Fgr. 7.

The deviation magnitude Dp corresponding to the peak of the frequency distribution curve (Fgr. 7) is the value of deviation that has the greatest probability. If the errors are entirely random in nature, the value of Dp will equal zero. Any nonzero value of Dp indicates systematic errors in the data, in the form of a bias that is often removable by recalibration.

Gaussian (Normal) Distribution

Measurement sets that only contain random errors usually conform to a distribution with a particular shape that is called Gaussian, although this conformance must always be tested (see the later section headed "Goodness of fit"). The shape of a Gaussian curve is such that the frequency of small deviations from the mean value is much greater than the frequency of large deviations. This coincides with the usual expectation in measurements subject to random errors that the number of measurements with a small error is much larger than the number of measurements with a large error. Alternative names for the Gaussian distribution are normal distribution or bell-shaped distribution. A Gaussian curve is defined formally as a normalized frequency distribution that is symmetrical about the line of zero error and in which the frequency and magnitude of quantities are related by the expression:

$$F(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[\frac{-(x-m)^2}{2\sigma^2}\right] \quad (13)$$

where m is the mean value of the data set x and the other quantities are as defined before.

Equation (13) is particularly useful for analyzing a Gaussian set of measurements and predicting how many measurements lie within some particular defined range. If measurement deviations D are calculated for all measurements such that D = x − m, then the curve of deviation frequency F(D) plotted against deviation magnitude D is a Gaussian curve known as the error frequency distribution curve. The mathematical relationship between F(D) and D can then be derived by modifying Equation (13) to give

$$F(D) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[\frac{-D^2}{2\sigma^2}\right] \quad (14)$$

The shape of a Gaussian curve is influenced strongly by the value of σ, with the width of the curve decreasing as σ becomes smaller. As a smaller σ corresponds to the typical deviations of the measurements from the mean value becoming smaller, this confirms the earlier observation that the mean value of a set of measurements gets closer to the true value as σ decreases.

If the standard deviation is used as a unit of error, the Gaussian curve can be used to determine the probability that the deviation in any particular measurement in a Gaussian data set is greater than a certain value. By substituting the expression for F(D) in Equation (14) into probability Equation (11), the probability that the error lies in a band between error levels D1 and D2 can be expressed as

$$P(D_1 \le D \le D_2) = \int_{D_1}^{D_2} \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[\frac{-D^2}{2\sigma^2}\right] dD \quad (15)$$

Solution of this expression is simplified by the substitution

$$z = \frac{D}{\sigma} = \frac{x - m}{\sigma} \quad (16)$$

Table 1 Error Function Table (Area under a Gaussian Curve or z Distribution)

The effect of this is to change the error distribution curve into a new Gaussian distribution that has a standard deviation of one (σ = 1) and a mean of zero. This new form, shown in Fgr. 8, is known as a standard Gaussian curve (or sometimes as a z distribution), and the dependent variable is now z instead of D. Equation (15) can now be re-expressed as

$$P(D_1 \le D \le D_2) = P(z_1 \le z \le z_2) = \int_{z_1}^{z_2} \frac{1}{\sqrt{2\pi}} \exp\!\left[\frac{-z^2}{2}\right] dz \quad (17)$$

Unfortunately, neither Equation (15) nor Equation (17) can be solved analytically using tables of standard integrals, and numerical integration provides the only method of solution.

However, in practice, the tedium of numerical integration can be avoided when analyzing data because the standard form of Equation (17), and its independence from the particular values of the mean and standard deviation of data, means that standard Gaussian tables that tabulate F(z) for various values of z can be used.

Standard Gaussian Tables (z Distribution)

A standard Gaussian table (sometimes called a z distribution table), such as that shown in Table 1, tabulates the area under the Gaussian curve F(z) for various values of z, where F(z) is given by

$$F(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp\!\left[\frac{-z^2}{2}\right] dz \quad (18)$$

Thus, F(z) gives the proportion of data values that are less than or equal to z. This proportion is the area under the curve of F(z) against z that is to the left of z. Therefore, the expression given in Equation (17) has to be evaluated as [F(z2) − F(z1)]. Study of Table 1 shows that F(z) = 0.5 for z = 0. This confirms that, as expected, the number of data values ≤ 0 is 50% of the total. This must be so if the data only have random errors. It will also be observed that Table 1, in common with most published standard Gaussian tables, only gives F(z) for positive values of z. For negative values of z, we can make use of the following relationship, because the frequency distribution curve is normalized:

$$F(-z) = 1 - F(z) \quad (19)$$

[F(−z) is the area under the curve to the left of −z, i.e., it represents the proportion of data values ≤ −z.]
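In practice, F(z) can also be computed directly rather than read from Table 1, because the integral in Equation (18) is expressible through the standard error function. A minimal Python sketch:

import math

def F(z):
    # Standard Gaussian c.d.f., Equation (18): F(z) = (1/2)[1 + erf(z / sqrt(2))].
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(F(0.0))      # 0.5, as noted above for z = 0
print(F(1.0))      # ~0.8413
print(1 - F(1.0))  # equals F(-1) by Equation (19), ~0.1587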

Example 6: How many measurements in a data set subject to random errors lie outside deviation boundaries of +σ and −σ? That is, how many measurements have a deviation greater than |σ|?

Solution:

The required number is represented by the sum of the two shaded areas in Fgr. 9.

This can be expressed mathematically as

$$P(D < -\sigma) + P(D > +\sigma) = F(-1) + [1 - F(1)] = 2[1 - F(1)] = 2(1 - 0.8413) = 0.3174$$

Thus, approximately 32% of the measurements lie outside the ±σ boundaries, and 68% of the measurements lie inside.

The analysis just given shows that, for Gaussian-distributed data values, 68% of the measurements have deviations that lie within the bounds of ±σ. Similar analysis shows that boundaries of ±2σ contain 95.4% of data points, and extending the boundaries to ±3σ encompasses 99.7% of data points. The probability of any data point lying outside particular deviation boundaries can therefore be expressed by the following table:

Deviation boundaries    % of points within boundaries    Probability of any point lying outside
±σ                      68.0%                            32.0%
±2σ                     95.4%                            4.6%
±3σ                     99.7%                            0.3%

Standard Error of the Mean

The foregoing analysis has examined the way in which measurements with random errors are distributed about the mean value. However, we have already observed that some error exists between the mean value of a finite set of measurements and the true value; that is, averaging a number of measurements will only yield the true value if the number of measurements is infinite. If several subsets are taken from an infinite data population with a Gaussian distribution, then, by the central limit theorem, the means of the subsets will form a Gaussian distribution about the mean of the infinite data set. The standard deviation of the mean values of a series of finite sets of measurements relative to the true mean (the mean of the infinite population that the finite set of measurements is drawn from) is defined as the standard error of the mean, α. This is calculated as

$$\alpha = \frac{\sigma}{\sqrt{n}} \quad (20)$$

Clearly, α tends toward zero as the number of measurements n in the data set expands toward infinity.

The next question is how do we use the standard error of the mean to predict the error between the calculated mean of a finite set of measurements and the mean of the infinite population? In other words, if we use the mean value of a finite set of measurements to predict the true value of the measured quantity, what is the likely error in this prediction? This likely error can only be expressed in probabilistic terms. All we know for certain is the standard deviation of the error, which is expressed as α in Equation (20). We also know that a range of ± one standard deviation (i.e., ±α) encompasses 68% of the deviations of sample means on either side of the true value. Thus we can say that the measurement value obtained by calculating the mean of a set of n measurements, x1, x2, ..., xn, can be expressed as

$$x = x_{\text{mean}} \pm \alpha$$

with 68% certainty that the magnitude of the error does not exceed |α|. For data set C of length measurements used earlier, n = 23, σ = 1.88, and α = 0.39. The length can therefore be expressed as 406.5 ± 0.4 (68% confidence limit).

The problem with expressing the error at 68% certainty is that there is a 32% chance that the error is greater than α. Such a high probability of the error being greater than α may not be acceptable in many situations. If this is the case, we can use the fact that a range of ± two standard deviations, that is, ±2α, encompasses 95.4% of the deviations of sample means on either side of the true value. Thus, we can express the measurement value as

$$x = x_{\text{mean}} \pm 2\alpha$$

with 95.4% certainty that the magnitude of the error does not exceed |2α|. This means that there is only a 4.6% chance that the error exceeds 2α. Referring again to set C of length measurements, 2σ = 3.76, 2α = 0.78, and the length can be expressed as 406.5 ± 0.8 (95.4% confidence limits).

If we wish to express the maximum error with even greater probability that the value is correct, we could use ±3α limits (99.7% confidence). In this case, for the length measurements again, 3σ = 5.64, 3α = 1.17, and the length should be expressed as 406.5 ± 1.2 (99.7% confidence limits). There is now only a 0.3% chance (3 in 1000) that the error exceeds this value of 1.2.
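These calculations are easily automated. A minimal Python sketch (the data values are hypothetical stand-ins, as set C is not reproduced here):

import math
import statistics

# Hypothetical sample of n = 23 length measurements (mm).
x = [405, 406, 407, 406, 405, 408, 406, 407, 404, 407, 406,
     405, 408, 406, 407, 405, 406, 409, 404, 406, 407, 406, 408]

n = len(x)
x_mean = statistics.mean(x)      # Equation (4)
sigma = statistics.stdev(x)      # Equation (10), Bessel corrected
alpha = sigma / math.sqrt(n)     # standard error of the mean, Equation (20)

for k, confidence in [(1, "68%"), (2, "95.4%"), (3, "99.7%")]:
    print(f"{x_mean:.1f} +/- {k * alpha:.2f} ({confidence} confidence limits)")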

Estimation of Random Error in a Single Measurement

In many situations where measurements are subject to random errors, it’s not practical to take repeated measurements and find the average value. Also, the averaging process becomes invalid if the measured quantity does not remain at a constant value, as is usually the case when process variables are being measured. Thus, if only one measurement can be made, some means of estimating the likely magnitude of error in it is required. The normal approach is to calculate the error within 95% confidence limits, that is, to calculate the value of the deviation D such that 95% of the area under the probability curve lies within limits of ±D. These limits correspond to a deviation of ±1.96σ. Thus, it’s necessary to maintain the measured quantity at a constant value while a number of measurements are taken, in order to create a reference measurement set from which σ can be calculated. Subsequently, the maximum likely deviation in a single measurement can be expressed as deviation = ±1.96σ. However, this only expresses the maximum likely deviation of the measurement from the calculated mean of the reference measurement set, which is not the true value as observed earlier. Thus the calculated value for the standard error of the mean has to be added to the likely maximum deviation value. To be consistent, this should be expressed to the same 95% confidence limits. Thus, the maximum likely error in a single measurement can be expressed as

$$\text{error} = \pm 1.96(\sigma + \alpha) \quad (21)$$

Before leaving this matter, it must be emphasized that the maximum error specified for a measurement is only specified for the confidence limits defined. Thus, if the maximum error is specified as ±1% with 95% confidence limits, this means that there is still 1 chance in 20 that the error will exceed ±1%.

Example 7

Suppose that a standard mass is measured 30 times with the same instrument to create a reference data set, and the calculated values of σ and α are σ = 0.46 and α = 0.08. If the instrument is then used to measure an unknown mass and the reading is 105.6 kg, how should the mass value be expressed?

Solution:

Using Equation (21), 1.96(σ + α) = 1.06. The mass value should therefore be expressed as 105.6 ± 1.1 kg.

Distribution of Manufacturing Tolerances

Many aspects of manufacturing processes are subject to random variations caused by factors similar to those that cause random errors in measurements. In most cases, these random variations in manufacturing, which are known as tolerances, fit a Gaussian distribution, and the previous analysis of random measurement errors can be applied to analyze the distribution of these variations in manufacturing parameters.

Example 8 An integrated circuit chip contains 10^5 transistors. The transistors have a mean current gain of 20 and a standard deviation of 2. Calculate the following:

(a) number of transistors with a current gain between 19.8 and 20.2 (b) number of transistors with a current gain greater than 17.

Solution (a) The proportion of transistors where 19.8 < gain < 20.2 is…
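The elided working can be reproduced with the standard Gaussian c.d.f. from the earlier sketch; the printed counts follow from F(0.1) = 0.5398 and F(1.5) = 0.9332:

import math

def F(z):  # standard Gaussian c.d.f. (see the earlier sketch)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

N, m, sigma = 10**5, 20.0, 2.0

# (a) Gain between 19.8 and 20.2, i.e., z between -0.1 and +0.1.
p_a = F((20.2 - m) / sigma) - F((19.8 - m) / sigma)
# (b) Gain greater than 17, i.e., z greater than -1.5.
p_b = 1.0 - F((17.0 - m) / sigma)

print(round(N * p_a))  # ~7966 transistors
print(round(N * p_b))  # ~93319 transistors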

Chi-Squared (χ²) Distribution

We have already observed the fact that, if we calculate the mean value of successive sets of samples of N measurements, the means of those samples form a Gaussian distribution about the true value of the measured quantity (the true value being the mean of the infinite data set that the set of samples are part of). The standard deviation of the distribution of the mean values was quantified as the standard error of the mean.

It’s also useful for many purposes to look at the distribution of the variance of successive sets of samples of N measurements that form part of a Gaussian distribution. This is expressed as the chi-squared distribution F(χ²), where χ² is given by

$$\chi^2 = \frac{k\,s_x^2}{\sigma^2}$$

where s_x² is the variance of a sample of N measurements, σ² is the variance of the infinite data set that the sets of N samples are part of, and k is a constant known as the number of degrees of freedom, equal to (N − 1).

The shape of the χ² distribution depends on the value of k, with typical shapes shown in Fgr. 10. The area under the χ² distribution curve is unity but, unlike the Gaussian distribution, the χ² distribution is not symmetrical. However, it tends toward the symmetrical shape of a Gaussian distribution as k becomes very large.

The χ² distribution expresses the expected variation, due to random chance, of the variance of a sample away from the variance of the infinite population that the sample is part of. The magnitude of this expected variation depends on what level of "random chance" we set. The level of random chance is normally expressed as a level of significance, usually denoted by the symbol α. Referring to the χ² distribution shown in Fgr. 11, the value χ²_α denotes the χ² value to the left of which lies 100(1 − α)% of the area under the χ² distribution curve. Thus, the area of the curve to the right of χ²_α is α, and that to the left is (1 − α).

Numerical values for χ² are obtained from tables that express the value of χ² for various degrees of freedom k and for various levels of significance α. Published tables differ in the number of degrees of freedom and the number of levels of significance covered. A typical table is shown as Table 2.

One major use of the χ² distribution is to predict the variance σ² of an infinite data set, given the measured variance s_x² of a sample of N measurements drawn from the infinite population.

The boundaries of the range of χ² values expected for a particular level of significance α can be expressed by the probability expression

$$P\!\left(\chi^2_{1-\alpha/2} \le \chi^2 \le \chi^2_{\alpha/2}\right) = 1 - \alpha$$

To put this in simpler terms, we are saying that there is a probability of 100(1 − α)% that χ² lies within the range bounded by χ²_{1−α/2} and χ²_{α/2} for a level of significance of α. For example, for a level of significance α = 0.05, there is a 95% probability (95% confidence level) that χ² lies between χ²_0.975 and χ²_0.025.
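Given a sample variance, the corresponding range for the population variance can be computed directly from the χ² percentage points. A minimal Python sketch, assuming SciPy is available (its chi2.ppf function returns the χ² value below which a given fraction of the area lies); the numbers correspond to Example 9 below:

from scipy.stats import chi2

def variance_bounds(sample_var, N, confidence):
    # (N-1) * sx^2 / sigma^2 follows a chi-squared distribution with k = N - 1,
    # so sigma^2 lies between k*sx^2/chi2_(alpha/2) and k*sx^2/chi2_(1-alpha/2).
    k = N - 1
    alpha = 1.0 - confidence
    lower = k * sample_var / chi2.ppf(1.0 - alpha / 2.0, k)
    upper = k * sample_var / chi2.ppf(alpha / 2.0, k)
    return lower, upper

# Example 9: 10 rods, sample variance 16.3 mm^2, 95% confidence level.
print(variance_bounds(16.3, 10, 0.95))  # roughly (7.7, 54.3)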

The thing that is immediately evident in the solution to Example 9 is that the range within which the true variance and standard deviation lie is very wide. This is a consequence of the relatively small number of measurements (10) in the sample. It’s therefore highly desirable, wherever possible, to use a considerably larger sample when making predictions of the true variance and standard deviation of some measured quantity.

The solution to Example 10 shows that, as expected, the width of the estimated range in which the true value of the standard deviation lies gets wider as we increase the confidence level from 90 to 99%. It’s also interesting to compare the results in Examples 9 and 10 for the same confidence level of 95%. The ratio between maximum and minimum values of estimated variance is much greater for the 10 samples in Example 9 than for the 25 samples in Example 10. This shows the benefit of having a larger sample size when predicting the variance of the whole population that the sample is drawn from.

Example 9

The length of each rod in a sample of 10 brass rods is measured, and the variance of the length measurements in the sample is found to be 16.3 mm². Estimate the true variance and standard deviation for the whole batch of rods from which the sample of 10 was drawn, expressed to a confidence level of 95%.

Example 10---The length of each brick in a sample of 25 bricks is measured, and the variance of the sample is calculated as 6.8 mm². Estimate the true variance for the whole batch of bricks from which the sample of 25 was drawn, expressed to confidence levels of (a) 90%, (b) 95%, and (c) 99%.

Goodness of Fit to a Gaussian Distribution

All of the analysis of random deviations presented so far only applies when data being analyzed belong to a Gaussian distribution. Hence, the degree to which a set of data fits a Gaussian distribution should always be tested before any analysis is carried out. This test can be carried out in one of three ways:

(a) Inspecting the shape of the histogram: The simplest way to test for Gaussian distribution of data is to plot a histogram and look for a "bell shape" of the form shown earlier in Fgr. 6. Deciding whether the histogram confirms a Gaussian distribution is a matter of judgment. For a Gaussian distribution, there must always be approximate symmetry about the line through the center of the histogram, the highest point of the histogram must always coincide with this line of symmetry, and the histogram must get progressively smaller on either side of this point. However, because the histogram can only be drawn with a finite set of measurements, some deviation from the perfect histogram shape described previously is to be expected even if the data really are Gaussian.

(b) Using a normal probability plot: A normal probability plot involves dividing the data values into a number of ranges and plotting the cumulative probability of summed data frequencies against the data values on special graph paper. This plot should form a straight line if the data distribution is Gaussian. However, careful judgment is required, as only a finite number of data values can be used, and therefore the line drawn won’t be entirely straight even if the distribution is Gaussian. Considerable experience is needed to judge whether the line is straight enough to indicate a Gaussian distribution. This will be easier to understand if the data in measurement set C are used as an example. Using the same five ranges as used to draw the histogram, the following table is first drawn:

The normal probability plot drawn from this table is shown in Fgr. 12. This is sufficiently straight to indicate that data in measurement set C are Gaussian.
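Special graph paper is no longer necessary; the same plot can be produced programmatically. A minimal sketch, assuming SciPy and matplotlib are available and using hypothetical stand-in data for set C:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Hypothetical measurements standing in for set C (mm).
x = np.array([403, 404, 404, 405, 405, 405, 405, 406, 406, 406, 406, 406,
              407, 407, 407, 407, 407, 408, 408, 408, 409, 409, 410])

# probplot orders the data and plots it against the quantiles expected of a
# Gaussian distribution; an approximately straight line indicates Gaussian data.
stats.probplot(x, dist="norm", plot=plt)
plt.show()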

(c) The χ² test: The χ² distribution provides a more formal method for testing whether data follow a Gaussian distribution. The principle of the χ² test is to divide the data into p equal-width bins and to count the number of measurements n_i in each bin, using exactly the same procedure as for drawing a histogram. The expected number of measurements n_i' in each bin for a Gaussian distribution is also calculated. Before proceeding any further, a check must be made at this stage to confirm that at least 80% of the bins have a data count greater than a minimum number for both n_i and n_i'. We will apply a minimum number of four, although some statisticians use the smaller minimum of three and some use a larger minimum of five. If this check reveals that too many bins have data counts less than the minimum number, it’s necessary to reduce the number of bins by redefining their widths. The test that at least 80% of the bins exceed the minimum number then has to be reapplied. Once the data count in the bins is satisfactory, a χ² value is calculated for the data according to the following formula:

$$\chi^2 = \sum_{i=1}^{p} \frac{(n_i - n_i')^2}{n_i'} \quad (25)$$

The χ² test then examines whether the calculated value of χ² is greater than would be expected for a Gaussian distribution according to some specified level of chance.

This involves reading off the expected value from the χ² distribution table (Table 2) for the specified confidence level and comparing this expected value with that calculated via Equation (25). This procedure will become clearer if we work through an example.

Example 11 --- A sample of 100 pork pies produced in a bakery is taken, and the mass of each pie (grams) is measured. Apply the χ² test to examine whether the data set formed by the set of 100 mass measurements shown here conforms to a Gaussian distribution.

Solution:

Applying the Sturgis rule, the recommended number of data bins p for N data points is given by

p = 1 + 3.3 log10(N) = 1 + 3.3 log10(100) = 7.6

This rounds to 8.

Mass measurements span the range from 480 to 519. Hence, we will choose data bin widths of 5 grams, with bin boundaries set at 479.5, 484.5, 489.5, 494.5, 499.5, 504.5, 509.5, 514.5, and 519.5 (boundaries set so that there is no ambiguity about which bin any particular data value fits in). The next step involves counting the number of measurements in each bin. These are the n_i values, i = 1, ..., 8, for Equation (25).

Results of this counting are set out in the following table.

Because none of the bins has a count less than our stated minimum threshold of four, we can now proceed to calculate the n_i' values. These are the expected numbers of measurements in each data bin for a Gaussian distribution. The starting point for this calculation is knowing the mean value (m) and standard deviation (σ) of the 100 mass measurements. These are calculated using Equations (4) and (10) as m = 499.53 and σ = 8.389. We now calculate the z values corresponding to the measurement values (x) at the upper end of each data bin using Equation (16), and then use the error function table (Table 1) to calculate F(z). F(z) gives the proportion of z values that are ≤ z, which gives the proportion of measurements less than the corresponding x values. This then allows calculation of the expected number of measurements n_i' in each data bin. These calculations are shown in the following table.

In case there is any confusion about the calculation of numbers in the final column, let us consider rows 1 and 2. Row 1 shows that the proportion of data points less than 484.5 is 0.037. Because there are 100 data points in total, the actual estimated number of data points less than 484.5 is 3.7. Row 2 shows that the proportion of data points less than 489.5 is 0.116, and hence the total estimated number of data points less than 489.5 is 11.6. This total includes the 3.7 data points less than 484.5 calculated in the previous row. Hence, the number of data points in this bin between 484.5 and 489.5 is 11.6 minus 3.7, that is, 7.9.

We can now calculate the χ² value for the data using Equation (25). The steps of the calculation are shown in the following table.

The value of χ² is now found by summing the values in the final column to give χ² = 1.96.

The final step is to check whether this value of χ² is greater than would be expected for a Gaussian distribution. This involves looking up χ² in Table 2. Before doing this, we have to specify the number of degrees of freedom, k. In this case, k is the number of bins minus 2, because the data are manipulated twice to obtain the m and σ statistical values used in the calculation of n_i'. Hence, k = 8 − 2 = 6.

Table 2 shows that, for k = 6, χ² = 1.64 for a 95% confidence level and χ² = 2.20 for a 90% confidence level. Hence, our calculated value for χ² of 1.96 shows that the confidence level that the data follow a Gaussian distribution is between 90% and 95%.
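The whole procedure of Example 11 can be expressed compactly in code. A minimal Python sketch of Equation (25), using the bin boundaries, m, and σ from Example 11 but hypothetical bin counts (the measurement table is not reproduced here):

import math

def F(z):  # standard Gaussian c.d.f.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chi_squared(counts, boundaries, m, sigma):
    # Equation (25): sum over bins of (n_i - n_i')^2 / n_i', where the expected
    # Gaussian count n_i' comes from differences of F(z) at the bin boundaries.
    n_total = sum(counts)
    total = 0.0
    for i, n_i in enumerate(counts):
        p = F((boundaries[i + 1] - m) / sigma) - F((boundaries[i] - m) / sigma)
        expected = n_total * p
        total += (n_i - expected) ** 2 / expected
    return total

boundaries = [479.5, 484.5, 489.5, 494.5, 499.5, 504.5, 509.5, 514.5, 519.5]
counts = [4, 8, 16, 22, 23, 15, 8, 4]   # hypothetical n_i values
print(chi_squared(counts, boundaries, m=499.53, sigma=8.389))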

We will now look at a slightly different example, where we meet the problem that our initial division of the data into bins produces too many bins that don’t contain the minimum number of data points necessary for the χ² test to work reliably.

Example 12:

Suppose that the production machinery used to produce the pork pies featured in Example 11 is modified to try to reduce the amount of variation in mass. The mass of a new sample of 100 pork pies is then measured. Apply the χ² test to examine whether the data set formed by the set of 100 new mass measurements shown here conforms to a Gaussian distribution.

Solution:

The recommended number of data bins for 100 measurements according to the Sturgis rule is eight, as calculated in Example 11. Mass measurements in this new data set span the range from 481 to 517. Hence, data bin widths of 5 grams are still suggested, with bin boundaries set at 479.5, 484.5, 489.5, 494.5, 499.5, 504.5, 509.5, 514.5, and 519.5.

The number of measurements in each bin is then counted, with the counts given in the following table:

Looking at these counts, we see that there are two bins with a count less than four. This amounts to 25% of the data bins. We have said previously that no more than 20% of the data bins can have a data count less than the threshold of four if the χ² test is to operate reliably. Hence, we must combine the bins and count the measurements again. The usual approach is to combine pairs of bins, which in this case reduces the number of bins from eight to four. The boundaries of the new set of four bins are now 479.5, 489.5, 499.5, 509.5, and 519.5. New data ranges and counts are shown in the following table.

Now, none of the bins has a count less than our stated minimum threshold of four, and so we can proceed to calculate the n_i' values as before. The mean value (m) and standard deviation of the new mass measurements are m = 499.39 and σ = 6.979. We now calculate the z values corresponding to the measurement values (x) at the upper end of each data bin, read off the corresponding F(z) values from Table 1, and so calculate the expected number of measurements n_i' in each data bin:

We now calculate the χ² value for the data using Equation (25). The steps of the calculation are shown in the following table.

The value of χ² is now found by summing the values in the final column to give χ² = 0.091. The final step is to check whether this value of χ² is greater than would be expected for a Gaussian distribution. This involves looking up χ² in Table 2. This time, k = 2, as there are four bins and k is the number of bins minus 2 (as explained in Example 11, the data were manipulated twice to obtain the m and σ statistical values used in the calculation of n_i').

Table 2 shows that, for k = 2, χ² = 0.10 for a 95% confidence level. Hence, our calculated value for χ² of 0.091 shows that the confidence level that the data follow a Gaussian distribution is slightly better than 95%.

Out of interest, if the two bin counts less than four had been ignored and χ² had been calculated for the eight original data bins, a value of χ² = 2.97 would have been obtained. (It would be a useful exercise for the reader to check this.) For six degrees of freedom (k = 8 − 2), the predicted value of χ² for a Gaussian population from Table 2 is 2.20 at a 90% confidence level. Thus, the confidence that the data fit a Gaussian distribution would be substantially less than 90%, given the χ² value of 2.97 calculated for the data. This result arises because of the unreliability associated with calculating χ² from data bin counts of less than four.

Rogue Data Points (Data Outliers)

In a set of measurements subject to random error, measurements with a very large error sometimes occur at random and unpredictable times, where the magnitude of the error is much larger than could reasonably be attributed to the expected random variations in measurement value. These are often called rogue data points or data outliers. Sources of such abnormal error include sudden transient voltage surges on the mains power supply and incorrect recording of data (e.g., writing down 146.1 when the actual measured value was 164.1). It’s accepted practice in such cases to discard these rogue measurements, and a threshold level of ±3σ deviation is often used to determine what should be discarded. It’s rare for measurement errors to exceed ±3σ limits when only normal random effects are affecting the measured value.

While the aforementioned represents a reasonable theoretical approach to identifying and eliminating rogue data points, the practical implementation of such a procedure needs to be done with care. The main practical difficulty that exists in dealing with rogue data points is in establishing what the expected standard deviation of the measurements is....

When a new set of measurements is being taken where the expected standard deviation is not known, the possibility exists that a rogue data point exists within the measurements.

Simply applying a computer program to the measurements to calculate the standard deviation will produce an erroneous result because the calculated value will be biased by the rogue data point. The simplest way to overcome this difficulty is to plot a histogram of any new set of measurements and examine it manually to spot any data outliers. If no outliers are apparent, the standard deviation can be calculated and then used in a ±3σ threshold against which to test all future measurements. However, if this initial data histogram shows up any outliers, these should be excluded from the calculation of the standard deviation.

It’s interesting at this point to return to the problem of ensuring that there are no outliers in the set of data used to calculate the standard deviation of the data and hence the threshold for rejecting outliers. We have suggested that a histogram of some initial measurements be drawn and examined for outliers. What would happen if the set of data given in Example 13 (below) was the initial data set that was examined for outliers by drawing a histogram? What would happen if we did not spot the outlier of 4.59? This question can be answered by looking at the effect on the calculated value of the standard deviation if this rogue data point of 4.59 is included in the calculation. The standard deviation calculated over the 19 values, excluding the 4.59 measurement, is 0.052. The standard deviation calculated over the 20 values, including the 4.59 measurement, is 0.063, and the mean data value changes to 4.42. This gives a 3σ threshold of 0.19, and the boundaries for the ±3σ threshold operation are now 4.23 and 4.61. This does not exclude the data value of 4.59, which we identified previously as a rogue data point! This confirms the necessity of looking carefully at the initial set of data used to calculate the thresholds for rejecting rogue data points, to ensure that the initial data don’t contain any rogue data points. If drawing and examining a histogram does not clearly show that there are no rogue data points in the "reference" set of data, it’s worth taking another set of measurements to see whether a reference set of data can be obtained that is more clearly free of rogue data points.

Example 13:

A set of measurements is made with a new pressure transducer. Inspection of a histogram of the first 20 measurements does not show any data outliers. The standard deviation of the measurements is calculated as 0.05 bar after this check for data outliers, and the mean value is calculated as 4.41. The following further set of measurements is then obtained:

4.35 4.46 4.39 4.34 4.41 4.52 4.44 4.37 4.41 4.33 4.39 4.47 4.42 4.59 4.45 4.38 4.43 4.36 4.48 4.45

Use the ±3σ threshold to determine whether there are any rogue data points in the measurement set.

Solution:

Because the calculated σ value for a set of "good" measurements is given as 0.05, the ±3σ threshold is ±0.15. With a mean data value of 4.41, the thresholds for rogue data points are values below 4.26 (mean value minus 3σ) or above 4.56 (mean value plus 3σ).

Looking at the set of measurements, we observe that the measurement of 4.59 is outside the ±3σ threshold, indicating that it is a rogue data point.
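The same check in Python, using the reference statistics quoted in the example:

# Reference statistics from the outlier-free reference set (Example 13).
mean, sigma = 4.41, 0.05
lower, upper = mean - 3 * sigma, mean + 3 * sigma   # 4.26 and 4.56

readings = [4.35, 4.46, 4.39, 4.34, 4.41, 4.52, 4.44, 4.37, 4.41, 4.33,
            4.39, 4.47, 4.42, 4.59, 4.45, 4.38, 4.43, 4.36, 4.48, 4.45]

outliers = [x for x in readings if not lower <= x <= upper]
print(outliers)  # [4.59] -- the rogue data point identified above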

Student t Distribution

When the number of measurements of a quantity is particularly small (less than about 30 samples) and statistical analysis of the distribution of error values is required, the possible deviation of the mean of the measurements from the true measurement value (the mean of the infinite population that the sample is part of) may be significantly greater than is suggested by analysis based on a z distribution. In response to this, a statistician called William Gosset developed an alternative distribution function that gives a more accurate prediction of the error distribution when the number of samples is small. He published this under the pseudonym "Student," and the distribution is commonly called the Student t distribution. It should be noted that the t distribution has the same requirement as the z distribution in terms of the necessity for data to belong to a Gaussian distribution.

The Student t variable expresses the difference between the mean of a small sample (x_mean) and the population mean (m) in terms of the following ratio:

$$t = \frac{|\text{error in mean}|}{\text{standard error of the mean}} = \frac{|x_{\text{mean}} - m|}{\sigma/\sqrt{N}} \quad (26)$$

Because we don’t know the exact value of σ, we have to use the best approximation to σ that we have, which is the standard deviation of the sample, s_x. Substituting this value for σ in Equation (26) gives

$$t = \frac{|x_{\text{mean}} - m|}{s_x/\sqrt{N}} \quad (27)$$

Note that the modulus operation (| |) on the error in the mean in Equations (26) and (27) means that t is always positive.

The shape of the probability distribution curve F(t) of the t variable varies according to the value of the number of degrees of freedom, k (= N − 1), with typical curves shown in Fgr. 13. As k → ∞, F(t) → F(z); that is, the distribution becomes a standard Gaussian one. For finite values of k, the curve of F(t) against t is both wider and less high in the center than a standard Gaussian curve, but it has the same properties of symmetry about t = 0 and a total area under the curve of unity.

In a similar way to the z distribution, the probability that t will lie between two values t1 and t2 is given by the area under the F(t) curve between t1 and t2. The t distribution is published in the form of a standard table (see Table 3) that gives values of the area under the curve α for various values of k, where

$$\alpha = \int_{t_3}^{\infty} F(t)\,dt \quad (28)$$

The area α, shown in Fgr. 14, corresponds to the probability that t will have a value greater than t3 at some specified confidence level. Because the total area under the F(t) curve is unity, there is also a probability of (1 − α) that t will have a value less than t3. Thus, for a value α = 0.05, there is a 95% probability (i.e., a 95% confidence level) that t < t3.

Because of the symmetry of the t distribution, α is also given by

$$\alpha = \int_{-\infty}^{-t_3} F(t)\,dt \quad (29)$$

as shown in Fgr. 15. Here, α corresponds to the probability that t will have a value less than −t3, with a probability of (1 − α) that t will have a value greater than −t3.

Equations (28) and (29) can be combined to express the probability (1 − α) that t lies between two values −t4 and +t4. In this case, α is the sum of two areas of α/2, as shown in Fgr. 16. These two areas can be represented mathematically as

$$\frac{\alpha}{2} = \int_{-\infty}^{-t_4} F(t)\,dt \quad \text{and} \quad \frac{\alpha}{2} = \int_{t_4}^{\infty} F(t)\,dt \quad (30)$$

The values of t4 can be found in any t distribution table, such as Table 3.

Referring back to Equation (27), this can be expressed in the form

$$|x_{\text{mean}} - m| = \frac{t\,s_x}{\sqrt{N}}$$

Hence, upper and lower bounds on the expected value of the population mean m (the true value of x) can be expressed as

$$x_{\text{mean}} - \frac{t_4\,s_x}{\sqrt{N}} \le m \le x_{\text{mean}} + \frac{t_4\,s_x}{\sqrt{N}}$$

Out of interest, let us examine what would have happened if we had calculated the error bounds on m in Example 14 (below) using standard Gaussian (z distribution) tables. For 95% confidence, the maximum error is given as ±1.96σ/√N = ±1.96 × 1.9/√15 = ±0.96, which rounds to ±1.0 mm, meaning the mean internal diameter would be given as 105.4 ± 1.0 mm. The effect of using the t distribution instead of the z distribution is clearly to expand the magnitude of the likely error in the mean value, to compensate for the fact that our calculations are based on a relatively small number of measurements.

Example 14:

The internal diameter of a sample of hollow castings is measured by destructive testing of 15 samples taken randomly from a large batch of castings. If the sample mean is 105.4 mm with a standard deviation of 1.9 mm, express the upper and lower bounds, to a confidence level of 95%, on the range in which the mean internal diameter lies for the whole batch.
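The bounds of Example 14 can also be computed without consulting Table 3. A minimal sketch, assuming SciPy is available (its t.ppf function returns the t value below which a given fraction of the area lies):

import math
from scipy.stats import t

N, x_mean, sx = 15, 105.4, 1.9
confidence = 0.95

# Two-sided bound: the t4 value leaving alpha/2 in each tail, with k = N - 1.
t4 = t.ppf(1.0 - (1.0 - confidence) / 2.0, df=N - 1)
half_width = t4 * sx / math.sqrt(N)

print(f"{x_mean - half_width:.1f} <= m <= {x_mean + half_width:.1f}")
# Roughly 104.3 <= m <= 106.5, a wider band than the +/-1.0 mm obtained
# from the z distribution above.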
