January 20, 2019

It isn’t unusual for readers to skip the Results section when reading a research report. The Results, in a quantitative research paper, consist of statistics (numerical representations of collected data). In many cases a lack of understanding, or an underappreciation, of the relevant stats is the reason the Results section isn’t read. A wide range of stats is used across different domains. Frequentist stats are the type most often used in fitness and health research. Frequentists derive inferences from sample data, emphasizing the proportions or frequencies of collected data. If you were taught stats in college, it was probably of the Frequentist type. Frequentist stats are discussed in this article.

This video provides an excellent overview comparing Frequentist and Bayesian stats (a primary difference is the definition of probability): Bayesian and Frequentist

Two general types of stats are used in Frequentist models: descriptive and inferential. Descriptive statistics are numerical measures that describe a distribution by providing information on its central tendency, its width (dispersion, or variability), and its shape (Jackson, 2009). Inferential statistics are procedures that allow us to make an inference from a sample to the population. That is, we are able to make generalizations about a population based on the information derived from the sample.
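To make the three descriptive ideas concrete, here is a minimal sketch using only Python's standard library. The ratings are made up for illustration (they are not data from any study), and the `describe` helper is a hypothetical name. Skewness is computed with the simple moment-based formula, which is one of several conventions.

```python
import statistics

# Hypothetical liking ratings on a 1-5 scale (illustrative only).
ratings = [4, 5, 3, 4, 5, 4, 2, 4, 5, 4]

def describe(data):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(data)
    mean = statistics.mean(data)      # central tendency
    median = statistics.median(data)  # central tendency, robust to outliers
    sd = statistics.stdev(data)       # dispersion (sample SD, n - 1 denominator)
    # Shape: moment-based skewness (negative = longer tail on the low side).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    skewness = m3 / m2 ** 1.5
    return {"n": n, "mean": mean, "median": median, "sd": sd, "skewness": skewness}

summary = describe(ratings)
for name, value in summary.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

A negative skewness here would indicate that a few low ratings pull the tail toward the bottom of the scale, which is information the mean and SD alone do not convey.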

The concise Results and Discussion below are from the study “Expectations Do Not Always Influence Food Liking.”

Results & Discussion: Concise (Hale, 2012)

To test the hypothesis that participants in the positive expectation group would rate the crackers higher in liking than those in the neutral expectation group, an independent samples t-test was conducted. The results did not show a significant difference between cracker ratings from those in the positive expectation group (M = 4.22, SD = .60) and those in the neutral expectation group (M = 4.00, SD = .52), t(44) = 1.31, p > .05, d = .39 (symbols described at bottom).

The independent samples t-test evaluated the mean difference between the two groups. If the t statistic had reached the critical region (the numerical cutoff set by the alpha level) we would have said statistical significance was found. However, not finding significance doesn’t mean there was no difference. It just means the critical region criterion was not met at the chosen alpha level (significance level). We used standard null hypothesis statistical testing. The t statistic needed to be large enough (extreme, towards the end of the tail) to correspond to a p-value of less than .05 (the standard threshold, although .01 or even .001 is sometimes used). The .05 threshold is not determined by logical strictures or statistics, but that value has become common in significance testing.

A p-value of .05 is not synonymous with saying “the null hypothesis has only a 5% chance of being true”; textbooks promote this misinformation, and students are taught it. “This is, without a doubt, the most pervasive and pernicious of the many misconceptions about the P value. It perpetuates the false idea that the data alone can tell us how likely we are to be right or wrong in our conclusions. The simplest way to see that this is false is to note that the P value is calculated under the assumption that the null hypothesis is true. It therefore cannot simultaneously be a probability that the null hypothesis is false” (Goodman, 2008, p. 136). The p-value is the probability, assuming the null hypothesis is true, of obtaining a result equal to or more extreme than the one observed. The p-value is not synonymous with effect size. We calculated Cohen’s d, which is a measure of effect size: the magnitude of the difference between groups.
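For readers who want to see where numbers like t, p, and d come from, the sketch below recomputes them from the reported means and standard deviations using only Python's standard library. It rests on assumptions that are not in the paper: the group sizes are taken to be equal (23 and 23, chosen only because 44 degrees of freedom implies 46 participants), the test is treated as two-tailed, and the p-value is obtained by numerically integrating the t distribution. The function names are illustrative.

```python
import math

def pooled_variance(sd1, n1, sd2, n2):
    """Pooled variance of two independent groups."""
    return ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Effect size: mean difference in pooled-SD units."""
    return (m1 - m2) / math.sqrt(pooled_variance(sd1, n1, sd2, n2))

def t_statistic(m1, sd1, n1, m2, sd2, n2):
    """Independent samples t statistic (equal variances assumed)."""
    se = math.sqrt(pooled_variance(sd1, n1, sd2, n2) * (1 / n1 + 1 / n2))
    return (m1 - m2) / se

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|) under the null, via Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

# Reported summary statistics; n1 = n2 = 23 is an ASSUMPTION (only df = 44 is reported).
n1 = n2 = 23
d = cohens_d(4.22, 0.60, n1, 4.00, 0.52, n2)
t = t_statistic(4.22, 0.60, n1, 4.00, 0.52, n2)
p = two_tailed_p(t, n1 + n2 - 2)
print(f"d = {d:.2f}, t({n1 + n2 - 2}) = {t:.2f}, two-tailed p = {p:.3f}")
```

The recomputed values should land close to the reported d = .39 and t(44) = 1.31; small differences would be expected from rounding in the published means and SDs and from the assumed group sizes. Note that the p-value is computed entirely under the assumption that the null hypothesis is true, which is why it cannot be read as the probability that the null is true.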

The findings in this study suggest that positive suggestions do not always lead to a statistically significant increase in ratings of food when compared to a neutral expectation. One possible explanation is that participants in the positive expectation group did not actually have a positive expectation regarding the flavor of the crackers. Another possible explanation is that the sensory properties of the food were inconsistent with the expectations. That is, participants in the positive expectation group expected the crackers to have a good flavor, but their expectations were disconfirmed when eating the crackers.

It is reasonable to suggest that if the sample had been larger there may have been a different outcome. A statistical power analysis conducted after the study shows that, with the effect size we obtained (near a moderate effect size), we needed a larger sample to find a significant difference (in terms of the standard null hypothesis testing used in frequentist models). Sample size is an important factor in obtaining statistical significance. The short time frame of seven minutes may have influenced the outcome. The type of food used in this study may also place limitations on the outcome. Positive expectations may be hard to induce for a neutral food such as crackers.
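The link between effect size, sample size, and power can be sketched with a common normal approximation. This is not the power analysis run for the study; it is a rough illustration that assumes a two-tailed test at alpha = .05, a conventional target power of 80%, and, for the power estimate, 23 participants per group (an assumption based only on the reported 44 degrees of freedom). The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-group comparison."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-tailed critical z
    z_power = STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

def approx_power(d, n, alpha=0.05):
    """Approximate power with n participants per group (normal approximation)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(d * math.sqrt(n / 2) - z_alpha)

needed = n_per_group(0.39)
power_at_23 = approx_power(0.39, 23)  # 23 per group is an assumption
print(f"Approximate n per group for 80% power at d = .39: {needed}")
print(f"Approximate power with 23 per group: {power_at_23:.2f}")
```

Under these assumptions, an effect of d = .39 calls for a sample several times larger than 46 participants in total, which is consistent with the point that a real difference of this size could easily go undetected in a small study.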

If the objective is to thoroughly analyze the study, don’t skip over the Results section when reading the paper. The study outcomes are presented there. A key guideline for the Results section is that numerical findings should be stated clearly, concisely and accurately. Learn at least the basics of the theoretical implications of the stats used. Consider the stats in the context of the paper; an understanding of the relevant literature is important. Also, appreciate uncertainty and consider other types of validity and their relevance to the study.

Better thinking about stats:

Don’t confuse statistical significance with effect size

Don’t overestimate the applicability of findings

Appreciate the limitations of stats; don’t over-speculate

Recognize that a non-significant finding doesn’t mean there was no difference

Always consider sample size; small samples are often unrepresentative of the population

Consider the stats in the context of the study; don’t neglect the importance of statistical and other sorts of validity and reliability

Stats are derived from samples, and the stats in the Results section do not show individual scores and variability

Statistical prediction is superior to clinical prediction, but it isn’t perfect prediction

What has been presented here is a concise look into what is involved with the Results section in quantitative research.

References available upon request

To learn more about stats and research methods, read In Evidence We Trust.

In Evidence We Trust contains 76 questions and answers regarding scientific research methods and stats. It also contains practice problems involving statistical procedures.