To qualify a measurement process in a shop/factory you DO need to evaluate multiple inspectors, using test articles that are worthy of being measured.
That means parts with good surface finish, taper, and roundness.
The measuring instrument MUST perform to specs with a GOOD inspector, on good parts, in a good environment.
At the factory where I worked as a senior Metrologist, we checked calibration on hundreds of calipers, cal cycle after cal cycle, for years. We cleaned them, stoned the jaws, and adjusted the grub screws for proper jaw tension, all in a controlled environment. IF the Extreme Spread, in the lab, showed outside +/- 0.001", the caliper was rejected. We checked for beam errors (per inch up to full travel), and spot checked for errors per 0.025" tooth and per 0.1" gear (dirt and wear).
Send those same calipers out to the shop, with 100+ machinists/quality inspectors using any one of a few hundred calipers in shop environments, and the process tolerance was expanded.
Caliper performance is the sum of Human, Tool, Part, and environment.
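To put rough numbers on that idea, here is a minimal Python sketch. The four contribution values are made up purely for illustration (they are not measured data), and the root-sum-square line is a common metrology convention brought in as an assumption; the statement above only says the contributions add up.

```python
import math

# Hypothetical error contributions in inches (illustrative values only).
contributions = {
    "human": 0.0005,
    "tool": 0.0005,
    "part": 0.0003,
    "environment": 0.0002,
}


def straight_sum(errors):
    """Worst case: every contribution stacks in the same direction."""
    return sum(errors.values())


def root_sum_square(errors):
    """Assumed alternative: independent contributions combined by RSS."""
    return math.sqrt(sum(e ** 2 for e in errors.values()))


print(f"straight sum:    {straight_sum(contributions):.4f} in")
print(f"root sum square: {root_sum_square(contributions):.4f} in")
```

Either way, the point is the same: the tool is only one of four terms, so a caliper that passes in the lab can still be part of a much looser process out in the shop.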
For the Home Reloader, errors across multiple people and multiple instruments are not normally included in a process evaluation. Measuring parts like brass cases might not be the best way to evaluate an individual or an instrument, especially after the Home Reloader has messed with the cases.
Case length, or COAL with pointed bullets, will have variances due to poor part design (not suitable for precision measurement). CBTO and CBTDatum measurements are plagued by having to measure a tapered part, usually with an accessory. Accept the fact that some measurements will always be in the good enough range.
I propose a home test, run over months and months.
You, and only you, and one set of decent calipers. If you must, compare two calipers.
Clean and adjust, and check ZERO. Once, 3 times, 10 times, 50 times.
If you EVER get a whole thousandth of spread, something is WRONG.
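If you log your zero checks, the spread test can be sketched in a few lines of Python. This is only a sketch under assumptions: the function names and the sample readings are hypothetical, and the 0.001" limit is the "whole thousandth" from the rule above.

```python
def extreme_spread(readings):
    """Return max minus min of a list of readings, in inches."""
    if not readings:
        raise ValueError("need at least one reading")
    return max(readings) - min(readings)


def zero_check_ok(readings, limit=0.001):
    """True if the spread stays under the limit (a whole thousandth by default)."""
    # Round to 0.00001" so floating-point fuzz does not flip the result.
    return round(extreme_spread(readings), 5) < limit


# Hypothetical zero readings from one session (illustrative values only).
zero_readings = [0.0000, 0.0000, 0.0005, 0.0000, 0.0000]

print("spread:", round(extreme_spread(zero_readings), 5))
print("zero check OK:", zero_check_ok(zero_readings))
```

Run it on the 3-reading, 10-reading, and 50-reading sessions separately, then on everything together. A spread that only shows up in the long runs still counts.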
Select 3 gage pins, separated by 0.001", like 0.223", 0.224" and 0.225".
Put a small piece of masking tape over the etched size.
Zero, measure one pin, the next, and the next.
Do this 7000 times.
Kidding. If you can't pick out the three pin sizes from your measurements 99.999% of the time, something is WRONG.
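One way to score the blind pin test, sketched in Python. Assumptions: you write down each reading, then peel the tape afterwards to record which pin it really was. The function names and the sample log are hypothetical; the pin sizes are the ones suggested above.

```python
PINS = (0.223, 0.224, 0.225)  # the three gage pin sizes, in inches


def nearest_pin(reading, pins=PINS):
    """Return the nominal pin size closest to a caliper reading."""
    return min(pins, key=lambda p: abs(reading - p))


def hit_rate(trials, pins=PINS):
    """trials: list of (true_pin, reading) pairs. Returns the fraction called correctly."""
    if not trials:
        raise ValueError("need at least one trial")
    hits = sum(1 for true_pin, reading in trials
               if nearest_pin(reading, pins) == true_pin)
    return hits / len(trials)


# Hypothetical log (illustrative values only); the last entry is a miss.
trials = [
    (0.223, 0.2230),
    (0.224, 0.2240),
    (0.225, 0.2250),
    (0.223, 0.2240),
]

print(f"hit rate: {hit_rate(trials):.1%}")
```

With a real log the hit rate should sit at or extremely close to 100%. Anything less points back at zero, jaw condition, or technique.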
There might be a small amount of taper or out-of-roundness in the bullets that you didn't see with the pins, but good bullets should be fairly uniform (probably better than +/- 0.001"). My Hornady Ogive tool ALWAYS (since new) measures almost 0.001" short, requiring a zero reset. Ogive measurements will also include errors between bullet taper and the tool.
Not best for accurate measurements, but oh well.
So, back to the bullets.
Check ZERO, measure the pins, measure a few bullets.
Bullet diameters and loaded neck diameters are just made for caliper measurements.
Now check your bullet diameter measurements with a micrometer that reads to one tenth (0.0001") or better.
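A short Python sketch for that comparison, assuming you measure each bullet once with the caliper and once with the micrometer and record the pair. The function name and the sample pairs are hypothetical.

```python
def caliper_vs_mic(pairs):
    """pairs: list of (caliper_reading, micrometer_reading) in inches.

    Returns the per-bullet differences (caliper minus micrometer)
    and the largest absolute difference.
    """
    if not pairs:
        raise ValueError("need at least one pair")
    diffs = [round(cal - mic, 5) for cal, mic in pairs]
    return diffs, max(abs(d) for d in diffs)


# Hypothetical readings (illustrative values only).
pairs = [
    (0.2240, 0.2241),
    (0.2245, 0.2242),
]

diffs, worst = caliper_vs_mic(pairs)
print("differences:", diffs)
print("worst disagreement:", worst)
```

The micrometer is treated as the reference here, so the differences show how far the caliper (and your hands) wander from it.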
