Measurement System Analysis: Are Your Gauges Capable?
- carystraley
- Jun 5
Your parts might be in spec. Or they might not be. Without a valid measurement system analysis, you genuinely cannot tell the difference. Studies referenced in the AIAG MSA Reference Manual consistently show that measurement error can consume 20 to 30 percent of your total process tolerance, which means a gauge that appears functional is silently corrupting accept/reject decisions on the shop floor. For precision machining operations serving automotive, aerospace, and industrial customers, that is not a theoretical risk. It is a daily liability that drives scrap, warranty returns, and failed PPAPs.
Quick Takeaways
| Key Insight | Explanation |
| --- | --- |
| %GRR below 10% is acceptable | When measurement variation is less than 10% of the total study variation or tolerance, the gauge system is considered capable for production use. |
| %GRR between 10% and 30% requires judgment | This range may be acceptable depending on application criticality, cost of the gauge, and customer requirements. Document the decision. |
| %GRR above 30% means the gauge is failing you | Reject the measurement system for production decisions until the source of variation is identified and corrected. |
| Bias and stability are separate from repeatability and reproducibility | A gauge can pass a Gage R&R study and still have significant bias. Run linearity and bias studies on instruments with wide measurement ranges. |
| Number of distinct categories (ndc) must be at least 5 | If your gauge cannot distinguish at least 5 categories within the part variation, it lacks the resolution to support process control decisions. |
| Operator technique contributes to reproducibility error | When the reproducibility component is large relative to repeatability, the problem is training or fixture design, not the gauge itself. |
| MSA is required for PPAP Level 3 and above | AIAG PPAP documentation requirements explicitly include Gage R&R studies for all measurement systems used to verify characteristics on the control plan. |
What Is Measurement System Analysis

Measurement system analysis is a structured methodology for quantifying and separating the sources of variation within a measurement process. The measurement process includes the gauge itself, the operators using it, the fixturing that holds the part, the environment, and the interaction between all of these elements. MSA does not simply ask whether a gauge is calibrated. It asks whether the total measurement system is statistically capable of making reliable decisions relative to your part tolerance.
The methodology originates from the AIAG Measurement System Analysis Reference Manual, now in its fourth edition, which is the governing document for MSA requirements in IATF 16949-certified supply chains. The manual defines a measurement system as the complete process of obtaining a measurement, not just the instrument. This distinction matters because in practice, a perfectly calibrated micrometer operated inconsistently by two different machinists is a broken measurement system regardless of its calibration certificate.
At Summit City Precision Machining, MSA is not a paperwork exercise conducted before a PPAP submission. It directly informs which gauges are assigned to which features, how CMM programs are validated, and how the MetroLab division structures calibration intervals based on observed stability data. A gauge that performs well in a controlled calibration environment may perform very differently on a production floor with coolant mist, vibration, and multiple shifts.

Why Gage R&R Is the Foundation of MSA
Gage R&R, which stands for Gage Repeatability and Reproducibility, is the most commonly conducted MSA study and the one most directly tied to production decisions. It isolates two components of measurement variation: repeatability, which is the variation produced by the gauge itself when the same operator measures the same part multiple times, and reproducibility, which is the variation introduced when different operators use the same gauge on the same part.
These two components are not interchangeable, and treating them as a single number misses the diagnostic power of the study. A common mistake is reporting only the combined %GRR value without examining the ratio of repeatability to reproducibility. If reproducibility is the dominant source of variation, the fix is operator standardization or fixture improvement, not a new gauge. If repeatability is the dominant source, the gauge itself or its mounting needs attention.
The Crossed vs. Nested Study Design
A crossed Gage R&R study, where every operator measures every part, is the standard approach when all operators have access to all parts. This is the correct design for most production gauges used on discrete parts. A nested design is used when destructive testing is required, meaning the part cannot be re-measured after the first measurement. Tensile testing and surface finish measurements on soft substrates are common nested study applications.
In practice, the crossed study with 3 operators, 10 parts, and 2 replications provides the statistical power required by AIAG guidelines. Deviating from this sample structure by using fewer parts or fewer replications reduces the precision of your variance estimates, and customers reviewing your PPAP will notice.
Pro tip: Select parts for your Gage R&R study that span the full expected range of production variation, not just parts that are near nominal. If all your study parts are close to the same dimension, the part variation in the study is understated, so your ndc value will be artificially low and your %GRR (as a percentage of study variation) will appear worse than it actually is in production.
The Five Properties MSA Actually Measures
The AIAG MSA manual defines five statistical properties of a measurement system, and a complete MSA program evaluates all five. Most quality teams only run Gage R&R and consider their obligation fulfilled. That approach leaves significant risk on the table, particularly for gauges used across wide measurement ranges or in environments with temperature variation.
The five properties are bias, repeatability, reproducibility, stability, and linearity. Bias is the systematic difference between the average measured value and the true reference value. Repeatability and reproducibility are captured in the Gage R&R study. Stability is the change in bias over time, evaluated by measuring a reference standard repeatedly over weeks or months. Linearity is the change in bias across the operating range of the gauge.
Linearity Studies for Wide-Range Gauges
A linearity study is particularly important for gauges that measure parts across a wide range, such as a micrometer used to measure features from 0.5 inches to 4 inches. A gauge may have acceptable bias at one end of its range and significant bias at the other. Without a linearity study, you will never catch this pattern through routine calibration checks, which typically only verify accuracy at one or two reference points.
For CMM qualification, linearity is built into the volumetric accuracy assessment required by ISO 10360, but for manual gauges and indicator-based fixtures, linearity studies are frequently skipped. This is a gap that experienced customers and auditors will flag.
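As a rough illustration of what a linearity study looks for, the sketch below fits a straight line to bias versus reference size by ordinary least squares. The function name `linearity_fit`, the four reference sizes, and all readings are hypothetical values invented for the example. A formal linearity assessment would also test whether the slope and intercept differ significantly from zero, which is omitted here.

```python
from statistics import mean

def linearity_fit(reference_values, readings_per_reference):
    """Fit bias = intercept + slope * reference by ordinary least squares.

    reference_values: known sizes of the masters (for example gauge blocks)
    readings_per_reference: one list of repeated readings per master
    """
    xs, ys = [], []
    for ref, readings in zip(reference_values, readings_per_reference):
        for reading in readings:
            xs.append(ref)
            ys.append(reading - ref)  # bias of one individual reading
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: one micrometer checked against four masters (inches).
refs = [0.5, 1.5, 2.5, 4.0]
readings = [
    [0.5000, 0.5001, 0.4999],
    [1.5002, 1.5001, 1.5003],
    [2.5004, 2.5003, 2.5005],
    [4.0007, 4.0006, 4.0008],
]
slope, intercept = linearity_fit(refs, readings)
for ref in refs:
    print(f"predicted bias at {ref:.1f} in: {intercept + slope * ref:+.5f} in")
```

In this made-up data set the bias grows by about 0.0002 inches per inch of range, a pattern that a single-point calibration check near 0.5 inches would not reveal.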
Stability Tracking as a Calibration Driver
Stability data is the most defensible basis for setting calibration intervals. Instead of assigning arbitrary 6-month or 12-month intervals, stability charts show you when a specific gauge begins to drift. In practice, a well-maintained shop CMM operating in a temperature-controlled environment may demonstrate 18 months of stability, while a handheld indicator in a production cell shows drift within 90 days. The data should drive the schedule.
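One way to turn stability data into a drift signal is a simple individuals control chart on repeated readings of a reference master. The sketch below assumes that chart type (other chart types are equally valid), and the readings, function names, and weekly cadence are hypothetical.

```python
from statistics import mean

def individuals_limits(baseline):
    """Control limits for an individuals chart built from baseline readings."""
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    center = mean(baseline)
    # 2.66 = 3 / d2, where d2 = 1.128 for moving ranges of size 2
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread

def first_out_of_control(readings, lower, upper):
    """Index of the first reading outside the limits, or None if all are inside."""
    for index, value in enumerate(readings):
        if value < lower or value > upper:
            return index
    return None

# Hypothetical weekly readings of one reference master (inches).
baseline = [1.0000, 1.0001, 0.9999, 1.0000, 1.0001, 1.0000, 0.9999, 1.0000]
later = [1.0001, 1.0002, 1.0002, 1.0003, 1.0004, 1.0005]

lcl, center, ucl = individuals_limits(baseline)
drift_index = first_out_of_control(later, lcl, ucl)
print(f"LCL={lcl:.5f}  CL={center:.5f}  UCL={ucl:.5f}  first signal at later[{drift_index}]")
```

The elapsed time before the first signal, observed across several calibration cycles, is the kind of evidence that can justify lengthening or shortening an interval for that specific gauge.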
How to Conduct a Crossed Gage R&R Study
The procedural steps for a valid crossed Gage R&R study are specific and non-negotiable. Shortcuts produce misleading results, and misleading results produce bad measurement decisions. The following is how the study is actually executed in a production machining environment, not how it is described in a textbook summary.
First, select 10 parts that represent the full range of production variation for the feature being measured. Mark the parts so that operators cannot see the part identification during measurement, which prevents memory bias. Second, randomize the measurement order. Do not hand operators parts in sequential order. AIAG guidance is explicit that randomization is required to prevent operators from remembering previous readings.
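The blinding and randomization steps can be prepared ahead of time as a run sheet. The following is a minimal sketch, assuming blind labels of the form `X01` to `X10` and a coordinator who keeps the key; `build_run_sheet` and the label scheme are illustrative, not part of any AIAG template.

```python
import random

def build_run_sheet(part_ids, operators, replications, seed=None):
    """Return (sheet, blind_key).

    sheet is a list of (replication, operator, blind_label) in measurement order.
    Each operator measures every part once per replication, in a fresh random
    order. blind_key maps each blind label back to the real part ID and stays
    with the study coordinator.
    """
    rng = random.Random(seed)
    shuffled = list(part_ids)
    rng.shuffle(shuffled)  # so label order reveals nothing about part order
    blind_key = {f"X{idx + 1:02d}": part for idx, part in enumerate(shuffled)}

    sheet = []
    for rep in range(1, replications + 1):
        for operator in operators:
            labels = list(blind_key)
            rng.shuffle(labels)  # new order for every operator and replication
            sheet.extend((rep, operator, label) for label in labels)
    return sheet, blind_key

sheet, key = build_run_sheet(range(1, 11), ["A", "B", "C"], replications=2, seed=42)
print(len(sheet), "measurements planned")  # 3 operators x 10 parts x 2 replications = 60
```

Fixing the seed makes the sheet reproducible for the study record; leaving it as `None` draws a new order each time.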
Running the Study Without Introducing Bias
Each of the three operators measures all 10 parts in a randomized order, completes all measurements, and then the sequence is repeated for the second replication. Do not allow operators to see each other's measurements. Do not allow the study coordinator to announce results between replications. Any communication of measurement values between operators contaminates the reproducibility component of the study.
Record all raw values. Do not round or adjust readings to align with expectations. The value of the study depends entirely on capturing the true behavior of the measurement system, not the behavior you want to see. Rounding errors alone can artificially improve or degrade your %GRR, and this is one of the most common ways Gage R&R studies are unknowingly corrupted.
Pro tip: Use a worksheet or software that forces operators to enter data one measurement at a time, with previous values hidden. Minitab, SPC software with Gage R&R modules, and even well-designed Excel templates can enforce this sequence. The discipline of data entry is not a minor detail.
Interpreting Your Results: The Numbers That Matter
Once your Gage R&R data is analyzed using ANOVA or the range method, you will receive several output metrics. The three that drive action are %GRR (the percentage of study variation or tolerance consumed by measurement error), the number of distinct categories (ndc), and the ratio of repeatability to reproducibility variance components.
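For readers who want to see what the ANOVA method does under the hood, here is a minimal sketch of the variance-component calculation for a balanced crossed study. It assumes a two-way model with an operator-by-part interaction, truncates negative estimates to zero, and does not pool a non-significant interaction into the error term as some software does. The data set is synthetic and every name in the block is hypothetical.

```python
import math
import random
from statistics import mean

def crossed_grr(data, parts, operators):
    """Variance components for a balanced crossed Gage R&R study.

    data[(part, operator)] is the list of repeated readings for that cell.
    """
    p, o = len(parts), len(operators)
    r = len(data[(parts[0], operators[0])])
    grand = mean(x for cell in data.values() for x in cell)
    part_mean = {i: mean(x for j in operators for x in data[(i, j)]) for i in parts}
    op_mean = {j: mean(x for i in parts for x in data[(i, j)]) for j in operators}

    ss_part = o * r * sum((part_mean[i] - grand) ** 2 for i in parts)
    ss_op = p * r * sum((op_mean[j] - grand) ** 2 for j in operators)
    ss_cell = r * sum((mean(cell) - grand) ** 2 for cell in data.values())
    ss_total = sum((x - grand) ** 2 for cell in data.values() for x in cell)
    ss_inter = ss_cell - ss_part - ss_op
    ss_error = ss_total - ss_cell

    ms_part = ss_part / (p - 1)
    ms_op = ss_op / (o - 1)
    ms_inter = ss_inter / ((p - 1) * (o - 1))
    ms_error = ss_error / (p * o * (r - 1))

    # Negative variance estimates are truncated to zero.
    repeatability = ms_error
    interaction = max(0.0, (ms_inter - ms_error) / r)
    operator = max(0.0, (ms_op - ms_inter) / (p * r))
    part = max(0.0, (ms_part - ms_inter) / (o * r))
    reproducibility = operator + interaction
    grr = repeatability + reproducibility
    total = grr + part
    return {
        "repeatability": repeatability,
        "reproducibility": reproducibility,
        "grr": grr,
        "part": part,
        "pct_grr_sv": 100 * math.sqrt(grr / total),  # study-variation basis
        "ndc": math.floor(1.41 * math.sqrt(part / grr)) if grr > 0 else None,
    }

# Synthetic study: 10 parts, 3 operators, 2 replications (inches).
rng = random.Random(7)
parts = list(range(10))
operators = ["A", "B", "C"]
true_size = {i: 0.9970 + 0.0006 * i for i in parts}
op_offset = {"A": -0.0001, "B": 0.0, "C": 0.0001}
data = {
    (i, j): [true_size[i] + op_offset[j] + rng.gauss(0, 0.0002) for _ in range(2)]
    for i in parts
    for j in operators
}

result = crossed_grr(data, parts, operators)
print(f"%GRR (study variation) = {result['pct_grr_sv']:.1f}%, ndc = {result['ndc']}")
print(f"repeatability share of GRR variance = {result['repeatability'] / result['grr']:.0%}")
```

The last line prints the split between repeatability and reproducibility, which is the diagnostic ratio that a combined %GRR figure hides.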
The AIAG acceptance criteria are clear. A %GRR below 10% indicates an acceptable measurement system. A %GRR between 10% and 30% falls in a conditional zone where the decision to accept or reject the gauge must be documented and justified based on application risk and cost. A %GRR above 30% is an unacceptable measurement system for production decisions, full stop.
"The number of distinct categories represents the number of non-overlapping confidence intervals that will span the range of product variation. The ndc should be greater than or equal to five for an adequate measurement system." (AIAG Measurement System Analysis Reference Manual, Fourth Edition)
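The ndc is an often-ignored output that is actually one of the most practically meaningful. A gauge with a tolerance-based %GRR of 12% might appear borderline acceptable, but if the parts use only a small fraction of the tolerance and its ndc is 3, it can only reliably distinguish three groups of parts within the production range. That gauge cannot support statistical process control or meaningful capability analysis. The tolerance-based %GRR alone does not tell you that. (On a study-variation basis the two metrics move together: a %GRR of 12% corresponds to an ndc of 11.)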
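The two checks can be combined into one small helper. This is a sketch under stated assumptions: `ndc` uses the commonly cited formula 1.41 × (part variation / GRR) truncated to an integer, `assess` encodes the 10%, 30%, and ndc ≥ 5 thresholds quoted above, and both function names are hypothetical.

```python
import math

def ndc(part_sd, grr_sd):
    """Number of distinct categories, truncated to an integer."""
    return math.floor(1.41 * part_sd / grr_sd)

def assess(pct_grr, ndc_value):
    """Return (acceptance band for %GRR, whether ndc meets the minimum of 5)."""
    if pct_grr < 10:
        band = "acceptable"
    elif pct_grr <= 30:
        band = "conditional"
    else:
        band = "unacceptable"
    return band, ndc_value >= 5

print(assess(12.0, 3))                     # ('conditional', False)
print(ndc(part_sd=0.0018, grr_sd=0.0004))  # 1.41 * 4.5 = 6.345, truncated to 6
```

Returning the two results separately keeps the conditional %GRR band from masking an inadequate ndc, which is the situation described in the paragraph above.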
MSA for CMM and Automated Gauging
CMM-based measurement systems require MSA just as manual gauges do, and many quality programs treat CMM results as ground truth without ever validating the measurement system. This is a significant oversight. A CMM program with poorly defined datum structures, inadequate stylus qualification, or insufficient point density on curved surfaces will reproduce the same error on every run. Repeatability on a CMM is not a substitute for accuracy.
For CMM qualification, the study design needs to account for the absence of operator-to-operator variation as a primary contributor. When a single CMM program runs the same part, reproducibility between operators is replaced by reproducibility between fixturing load cycles. The variation introduced by how the part is loaded, clamped, and seated in the fixture is frequently the largest source of measurement error in CMM-based Gage R&R studies, and it is entirely addressable through fixture design.
Automated Gauging in High-Volume Production
Automated in-process gauging, post-process gauging cells, and vision systems used in production require attribute or variable MSA depending on whether they produce discrete pass/fail outputs or continuous measurements. For variable automated gauges, the crossed Gage R&R study is replaced by a study that characterizes repeatability across multiple load cycles, environmental conditions, and time periods.
SCPM's MetroLab division supports customers in designing MSA studies for both CMM programs and custom inspection fixtures, including the development of dedicated MSA fixtures that hold reference parts in controlled orientations to eliminate loading variation as a confounding factor in the study.

Comparison of MSA Study Types
| Study Type | What It Measures | When to Use It |
| --- | --- | --- |
| Crossed Gage R&R (Variable) | Repeatability and reproducibility for continuous measurement data across multiple operators and parts | Standard production gauges, CMM programs, inspection fixtures used by multiple operators on re-measurable parts |
| Nested Gage R&R (Variable) | Repeatability and reproducibility when parts cannot be re-measured after initial measurement | Destructive tests, surface roughness on soft materials, any measurement that alters the part |
| Attribute Agreement Analysis (Attribute) | Consistency of pass/fail decisions between operators and against a reference standard | Go/no-go gauges, visual inspection, functional fixtures that produce binary outputs rather than dimensional values |
Common MSA Failures and How to Avoid Them
MSA studies fail in predictable ways, and the root causes are almost always process failures rather than statistical failures. The mathematics of a Gage R&R study are straightforward. The discipline required to execute the study correctly is where most problems originate.
The most common failure is using parts with insufficient variation. When all study parts measure within 0.0003 inches of each other on a feature with a 0.010-inch tolerance, the gauge variation appears enormous relative to the part variation. The %GRR skyrockets, the ndc drops to 1 or 2, and the gauge appears to have failed. In reality, the study design failed. The gauge may be perfectly appropriate for the application.
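The effect is easy to reproduce with arithmetic. The sketch below holds one hypothetical gauge fixed (a measurement standard deviation of 0.0001 inches) and changes only the spread of the parts chosen for the study. The part standard deviations are illustrative guesses, and reporting an ndc below 1 as 1 is a convention assumed for this example.

```python
import math

def study_metrics(part_sd, grr_sd):
    """%GRR on a study-variation basis and ndc for given standard deviations."""
    total_sd = math.hypot(part_sd, grr_sd)  # sqrt(part_sd**2 + grr_sd**2)
    pct_grr = 100 * grr_sd / total_sd
    ndc = max(1, math.floor(1.41 * part_sd / grr_sd))  # values below 1 shown as 1
    return pct_grr, ndc

GRR_SD = 0.0001  # hypothetical gauge, inches

for label, part_sd in [("narrow part spread", 0.0001), ("full production spread", 0.0015)]:
    pct, ndc = study_metrics(part_sd, GRR_SD)
    print(f"{label}: %GRR = {pct:.1f}%, ndc = {ndc}")
```

With these assumed numbers the same gauge scores about 70.7% with an ndc of 1 on the narrow sample and about 6.7% with an ndc of 21 on the full-range sample. Nothing about the gauge changed between the two rows.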
Operator Coaching During the Study
A second common failure is coaching operators during the study. When study coordinators tell operators to take their time, check their technique, or re-measure a suspicious reading, the repeatability component is artificially compressed. The study produces results that reflect best-case performance rather than typical production performance. The study should replicate exactly how measurement happens on the floor, not how you want it to happen.
Using the Wrong Reference for %GRR Calculation
There are two valid references for calculating %GRR: total study variation (the spread of all measurements in the study) and tolerance (the engineering tolerance of the feature). Using study variation tends to produce better-looking %GRR numbers when part variation is wide, while using tolerance is more conservative and more relevant for production quality decisions. Know which basis your customers expect and document it clearly in your PPAP submission. Submitting a %GRR calculated on study variation when your customer expects tolerance-based calculation is a common audit finding.
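Written out, the two bases differ only in the denominator. The symbols below are notation introduced for this illustration, and the multiplier of 6 standard deviations is an assumption: some customers and older studies use 5.15 instead, so confirm which one applies.

```latex
% Study-variation basis
\%GRR_{SV} = 100 \times \frac{\sigma_{GRR}}{\sigma_{total}},
\qquad
\sigma_{total} = \sqrt{\sigma_{GRR}^{2} + \sigma_{part}^{2}}

% Tolerance basis (USL and LSL are the upper and lower specification limits)
\%GRR_{tol} = 100 \times \frac{6\,\sigma_{GRR}}{USL - LSL}
```

Because the part variation appears only in the first formula, a study with widely spread parts lowers the study-variation figure while leaving the tolerance-based figure unchanged.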
MSA in the Context of PPAP and A2LA Accreditation
PPAP Level 3 submissions require Gage R&R data for all measurement systems referenced in the control plan. This is not optional, and it is not satisfied by submitting a calibration certificate. The calibration certificate tells the customer that your gauge reads accurately at a reference standard. The Gage R&R tells the customer that your measurement process is statistically capable of correctly classifying their parts in production conditions.
SCPM's A2LA accreditation through the MetroLab division means that MSA requirements are not just customer-driven but are baked into the laboratory quality management system. A2LA accreditation under ISO/IEC 17025 requires documented evidence of measurement uncertainty for every calibration and inspection activity, which is the calibration world's equivalent of MSA. The underlying principle is identical: quantify and control the sources of measurement error before you trust the numbers.
For customers who require first article inspection reports, PPAP documentation packages, or ongoing production measurement support, working with an A2LA-accredited metrology lab eliminates the need to qualify the measurement system independently. The accreditation body has already audited the lab's MSA practices, uncertainty budgets, and calibration traceability, which is documentation your customers can rely on without performing their own supplier audit of the lab.
The interaction between gauge manufacturing, CMM programming, and MSA is tighter than most manufacturers realize. A gauge that is manufactured to tight tolerances but not validated through an MSA study is an untested assumption in your quality system. SCPM manufactures custom gauges and validates them through complete MSA studies before they are deployed, which means customers receive a gauge and a measurement system, not just a piece of hardened steel.
Frequently Asked Questions
What is the difference between calibration and measurement system analysis?
Calibration verifies that a gauge reads accurately at one or more known reference values under controlled conditions. Measurement system analysis evaluates the entire measurement process, including operator technique, fixturing, environmental effects, and gauge variation, to determine whether the system can reliably distinguish conforming from non-conforming parts in actual production conditions. A calibrated gauge can still fail an MSA study.
How often should a Gage R&R study be conducted?
At a minimum, a Gage R&R study should be conducted before any gauge is deployed for a new application, after any gauge repair or significant maintenance, after operator personnel changes, and whenever process capability data shows unexplained variation. Some customer contracts specify annual re-qualification of measurement systems. Stability data can inform the re-qualification interval: if stability charts show no drift, a longer interval is defensible.
Can a Gage R&R study be run with fewer than 3 operators or 10 parts?
Technically yes, but the statistical power of the study decreases significantly, and AIAG guidelines are built around the standard structure of 3 operators, 10 parts, and 2 replications. Using 2 operators or fewer parts produces wider confidence intervals on your variance estimates, meaning your %GRR could be either better or worse than the true value by a larger margin. For PPAP submissions, deviating from the standard structure requires explicit customer approval.
What causes high reproducibility variation in a Gage R&R study?
High reproducibility variation means different operators are getting substantially different results when measuring the same part with the same gauge. The most common causes are inconsistent measurement technique (where the gauge is placed, how much force is applied, where the part contacts the datum), inadequate fixturing that allows the part to be positioned differently by different operators, and insufficient operator training on the specific measurement method. Fixing reproducibility is a training and fixture problem, not a gauge problem.
What does %GRR tell you that the ndc does not?
The %GRR tells you how large measurement error is relative to either the tolerance or the total study variation. The ndc tells you how many distinct groups of parts the measurement system can reliably separate within the production variation. You need both numbers because a gauge with a borderline %GRR of 20% might still have an ndc of 6, making it adequate for process monitoring, while a gauge with a tolerance-based %GRR of 8% might have an ndc of 4 if the part variation is very tight relative to the tolerance, making it insufficient for SPC purposes. Neither number alone gives you the full picture.
Is MSA required for attribute gauges like go/no-go pins?
Yes. Attribute gauges require attribute agreement analysis, which is a different study structure than a variable Gage R&R but serves the same purpose. In an attribute agreement analysis, multiple operators each assess a set of parts multiple times, and the results are compared against each other and against a reference determination, typically made using a calibrated variable gauge. The study quantifies within-operator consistency, between-operator consistency, and accuracy against the reference.
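The three quantities named above can be computed as simple agreement percentages. The sketch below is a minimal illustration with a tiny hypothetical data set (real studies use many more parts), and it omits the kappa statistics that a full attribute agreement analysis usually reports. The function name and data layout are assumptions made for the example.

```python
def attribute_agreement(calls, reference):
    """Percent agreement for an attribute study.

    calls[operator][part] is that operator's list of repeated pass/fail calls.
    reference[part] is the reference determination for the part.
    Returns (within-operator %, between-operator %, operator-versus-reference %).
    """
    parts = list(reference)
    within, vs_reference = {}, {}
    for operator, by_part in calls.items():
        # Parts on which this operator gave the same call every time
        consistent = [p for p in parts if len(set(by_part[p])) == 1]
        within[operator] = 100 * len(consistent) / len(parts)
        # Of those, parts where the consistent call matches the reference
        correct = [p for p in consistent if by_part[p][0] == reference[p]]
        vs_reference[operator] = 100 * len(correct) / len(parts)
    # Parts on which every call from every operator is identical
    unanimous = [
        p for p in parts
        if len({call for by_part in calls.values() for call in by_part[p]}) == 1
    ]
    between = 100 * len(unanimous) / len(parts)
    return within, between, vs_reference

# Hypothetical study: 5 parts, 2 operators, 2 trials each.
reference = {1: "pass", 2: "pass", 3: "fail", 4: "fail", 5: "pass"}
calls = {
    "A": {1: ["pass", "pass"], 2: ["pass", "pass"], 3: ["fail", "fail"],
          4: ["fail", "pass"], 5: ["pass", "pass"]},
    "B": {1: ["pass", "pass"], 2: ["pass", "pass"], 3: ["fail", "fail"],
          4: ["fail", "fail"], 5: ["fail", "fail"]},
}

within, between, vs_reference = attribute_agreement(calls, reference)
print("within-operator:", within)
print("between-operator:", between)
print("versus reference:", vs_reference)
```

In this example operator B is perfectly self-consistent yet wrong on part 5, which shows why consistency and accuracy against the reference are reported separately.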
Have you encountered a gauge that passed calibration but failed a Gage R&R study? Share what you found as the root cause in the comments or reach out to the team at Summit City Precision Machining to discuss how the MetroLab division approaches measurement system qualification for your specific application.



