MWJB Yes, extraction time was not the target; it was a variable that was kept constant. EY was the only dependent variable. My point is that the variability in EY cannot be due to variable extraction times (as Lance points out too). So if there is variability, there must be other variables at play affecting EY that you cannot control.
How do you control for this variability? By taking multiple measurements. N=8 is not enough to make any statistical judgments. Especially if you are doing multiple comparisons, the probability of finding a difference purely by chance rises quickly with the number of comparisons.
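To put a rough number on the multiple-comparisons problem, here is a minimal sketch. It assumes every pair of methods is compared at the same significance level and that the comparisons are independent, which is only an approximation for pairwise tests on shared data. The function names and the group counts are my own, purely for illustration.

```python
from math import comb


def familywise_error_rate(n_groups, alpha=0.05):
    """Return (number of pairwise comparisons, chance of at least one false positive).

    Assumes independent comparisons, each tested at level alpha (an approximation).
    """
    n_comparisons = comb(n_groups, 2)
    fwer = 1 - (1 - alpha) ** n_comparisons
    return n_comparisons, fwer


def bonferroni_alpha(n_groups, alpha=0.05):
    """Per-comparison threshold that keeps the familywise rate at or below alpha."""
    return alpha / comb(n_groups, 2)


if __name__ == "__main__":
    # Hypothetical group counts; use however many methods were actually compared.
    for k in (3, 4, 5, 6):
        m, fwer = familywise_error_rate(k)
        print(f"{k} methods: {m} comparisons, "
              f"P(at least one false positive) ~ {fwer:.2f}, "
              f"Bonferroni threshold = {bonferroni_alpha(k):.4f}")
```

The second function is the simplest correction (Bonferroni). Proper post hoc tests do something similar, but less conservatively.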
Lance claims that the Blind Shaker is superior overall as the mean difference in EY is 0.7%. This is based on the difference in mean values between different groups. I do not think this is a valid statement. For example, if you do a t-test comparing Blind Shaker vs Moonraker, you do not get a statistically significant difference (p = 0.10768).
Post hoc testing shows a statistically significant difference only for Blind Shaker vs Autocomb.
Also, what is the minimum difference in EY that can be detected by taste? Ultimately that is what matters.
The inference you could make is that all methods are equal. But this completely ignores the probability of a type II error: not finding a difference because the sample size was too small.
The method that I think makes more sense is to do a double-blind study. Someone else plays with the parameters, another person brings it over to Lance, and his job would be just to judge the flavours. Use a different coffee for each extraction and try out the different methods. I'm not entirely sure what the appropriate sample size would be; this will need a power analysis.
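For a feel of what a power analysis would look like, here is a rough sketch for the simpler case of comparing mean EY between two methods. It uses the standard normal-approximation formula for a two-sided, two-sample test. A taste-based study would need a different design and a different calculation. The 0.7 difference is the figure quoted above; the standard deviations are pure guesses on my part, since I do not have the raw data, and the function name is my own.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate shots needed per method to detect a mean difference `delta`.

    Normal approximation for a two-sided, two-sample comparison of means with
    a common standard deviation `sd`. It slightly underestimates the n that a
    t-test would need when samples are small.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


if __name__ == "__main__":
    delta = 0.7  # EY difference (percentage points) quoted in the thread
    # Hypothetical shot-to-shot standard deviations, NOT measured values.
    for sd in (0.5, 1.0, 1.5):
        print(f"sd = {sd}: about {n_per_group(delta, sd)} shots per method")
```

The required sample size grows with the square of sd/delta, so a noisy measurement or a small expected difference pushes the number of shots up very quickly.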