This underlined statement is kinda my point - sufficient analysis has not been done for us to be able to TRUST that these group sizes are actually different, but a lot of folks fall into a trap of believing they MUST be different.
The targets were presented and a bunch of guesses were chased based on one small 3-shot group, which was assumed to be different from one bigger 3-shot group - "different" meaning "differentiated" by the variable change, rather than simply "coincidentally not the same," which happens because one of any two groups almost inevitably has to be smaller than the other. A scatter plot was presented to better visualize a trend, and the observation bias was confirmed for a lot of folks here, again, without any actual analysis being done...
Contrary to the quote here, I WOULD argue that there is evidence that there really is no "strong indication for followup" - because there are no strong indicators of differentiation within the dataset presented, and rather, there ARE strong indicators that the samples presented are NOT differentiated.
Here's a visual depiction of what I'm describing (this isn't even evaluating the Mean Radius or the Standard Deviation of the Radii of the groups, just taking a simplified look at the extreme spread of the groups, since that was the data presented):
I've plotted here the group sizes, as was done above, but with +/-1SD error bars added to each group, along with the average of all groups and the ranges for +/-1SD and +/-2SD from that average. As a reminder, for a Normally Distributed population - meaning NO differentiation at all - we predict 68.2% of samples to fall within +/-1SD of the Mean, and we expect 95.4% of samples to fall within +/-2SD of the Mean.
View attachment 1530286
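Those 68.2% and 95.4% figures come straight from the Normal CDF - here's a quick stdlib check of them (no data from this thread involved, just the textbook percentages):

```python
from math import erf, sqrt

def normal_coverage(k):
    """P(|Z| < k) for a standard Normal variable Z, via the error function."""
    return erf(k / sqrt(2))

print(round(normal_coverage(1) * 100, 1))  # 68.3 (68.2 if truncated)
print(round(normal_coverage(2) * 100, 1))  # 95.4
```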
So evaluating ONLY the data presented, because that's all we have:
Mean Group size ("average") = 0.37" (I estimated ~0.35 earlier today)
Standard Deviation of Group Size = 0.136" (I estimated ~0.1" earlier today)
Range = 0.52" (largest group minus smallest group)
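For anyone who wants to reproduce those summary numbers, here's a minimal sketch. The 21 extreme spreads weren't posted as raw numbers, so the `groups` list below is a hypothetical placeholder - swap in the real values:

```python
from statistics import mean, stdev

def summarize(groups):
    """Mean, sample standard deviation, and range (max - min) of group sizes."""
    return mean(groups), stdev(groups), max(groups) - min(groups)

# Hypothetical extreme spreads in inches -- NOT the actual data from the thread.
groups = [0.25, 0.31, 0.44, 0.37, 0.52, 0.29, 0.41]
m, sd, rng = summarize(groups)
print(f"mean={m:.3f}  SD={sd:.3f}  range={rng:.3f}")
```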
The Mean group size is depicted by the light blue dashed line. Note here that the +/-1SD error bars on 14 of the samples cross the mean line - so 14 of those groups COULD have been 0.37" within just 1SD of where their group size actually landed...
The +/-1SD band for this data set runs from 0.234" to 0.507" groups, depicted by the span between the orange dashed lines - again, this captures 14 of the 21 data points, or 66%, while a Normal Distribution would predict 68.2%...
The +/-2SD band is represented between the purple dashed lines, running from 0.098" to 0.643". This span captures 100% of the sample points, while a Normal Distribution would predict it to capture 95.4%... Since 4.6% of 21 is only 0.966 samples, I guess we should have expected ONE sample outside of the purple bracket (but within a +/-3SD bracket), but statistical predictions being what they are, it's fair to expect we might be off on ONE observation in 21...
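The 14/21 and 21/21 counts can be checked mechanically with a few lines of stdlib Python. Again, the data here is a hypothetical placeholder since the raw spreads weren't posted:

```python
from statistics import mean, stdev

def sd_coverage(xs, k):
    """Fraction of samples falling within k sample-SDs of the sample mean."""
    m, sd = mean(xs), stdev(xs)
    return sum(abs(x - m) <= k * sd for x in xs) / len(xs)

# Hypothetical extreme spreads -- substitute the real 21 values to check 14/21 and 21/21.
xs = [0.25, 0.31, 0.44, 0.37, 0.52, 0.29, 0.41, 0.35, 0.48, 0.22]
print(f"within 1 SD: {sd_coverage(xs, 1):.0%}, within 2 SD: {sd_coverage(xs, 2):.0%}")
```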
So it sure seems uncanny to me that when we run statistics on this entire sample set as if the groups are NOT differentiated, they follow Normal Distribution predictions NEARLY perfectly. So it REALLY doesn't look to me like there is any support for a hypothesis of "strong indications for follow up" to be derived from this data set.
It was mentioned above that we could consider a rolling average of the group sizes, which isn't a good means to determine differentiation between samples, BUT, just for some fun... I've added here a 5-point and a 7-point rolling average over the plot - the orange curve averaging each sample with its 2 nearest neighbors on each side (5 points), the other averaging each sample with its 3 nearest neighbors on each side (7 points). The class favorite of #13 isn't supported as favorable, as the general trend of the data shows smaller and smaller groups at longer and longer jumps, so 21 would be the best option if we believed this were a valid means of analysis.
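A centered rolling average like those 5- and 7-point curves can be sketched as below (placeholder data again; edge points where a full window doesn't fit are simply skipped, which is one reasonable convention):

```python
def rolling_mean(xs, window):
    """Centered rolling mean; returns values only where a full window fits."""
    half = window // 2
    return [sum(xs[i - half:i + half + 1]) / window
            for i in range(half, len(xs) - half)]

# Hypothetical group sizes across increasing jump settings.
sizes = [0.5, 0.45, 0.4, 0.42, 0.38, 0.35, 0.33, 0.36, 0.3, 0.31, 0.28]
print(rolling_mean(sizes, 5))
print(rolling_mean(sizes, 7))
```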
View attachment 1530288
Given more information about each group - enough to run a T-Test on Mean Radius - we could REALLY be cooking on determining whether these groups truly are differentiated or just coincidentally different. Repeating this test to compile multiple group sizes for each jump distance would also create enough sample size for a T-Test. But from this quick and dirty analysis, it looks like we can't say ANY of these are truly different from one another.
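For reference, the Welch's t statistic for comparing two groups' radii is easy to compute with the stdlib. The per-shot radii below are hypothetical (the thread doesn't include them), and turning t into a p-value still needs a t-distribution table or stat software:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    return (mean(a) - mean(b)) / (va + vb) ** 0.5

# Hypothetical per-shot radii (inches) for two jump settings -- made-up numbers.
radii_a = [0.10, 0.18, 0.14]
radii_b = [0.12, 0.20, 0.16]
print(round(welch_t(radii_a, radii_b), 3))
```

With only 3 shots per group, |t| would need to be quite large (well past ~2) before anyone should believe the groups are truly differentiated.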
The SD should be calculated based on deviations among replicates (which don't exist), not between the test items. There is sufficient data to carry out a proper analysis if one has stat software for ANOVA, but not many are interested.
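For what it's worth, the one-way ANOVA F statistic itself doesn't require dedicated stat software - here's a stdlib sketch, using made-up replicate data (actual replicates don't exist in this thread, which is the point above):

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic: between-group mean square over within-group mean square."""
    all_x = [x for g in groups for x in g]
    grand = mean(all_x)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_x) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Made-up replicate group sizes for three jump settings -- illustrative only.
print(round(anova_f([[0.30, 0.35, 0.40], [0.33, 0.38, 0.43], [0.45, 0.50, 0.55]]), 2))
```

Converting F to a p-value still takes an F-distribution table or a stats package, which is where software like the ANOVA tools mentioned above earns its keep.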