Year : 2019 | Volume : 6 | Issue : 2 | Page : 51-52
Statistical need of the hour
Sanjeev Kumar Jain, Nidhi Sharma, Sonika Sharma
Department of Anatomy, Teerthanker Mahaveer Medical College, Teerthanker Mahaveer University, Moradabad, Uttar Pradesh, India
Dr. Sonika Sharma
Department of Anatomy, Teerthanker Mahaveer Medical College, Teerthanker Mahaveer University, Moradabad - 244 001, Uttar Pradesh
How to cite this article:
Jain SK, Sharma N, Sharma S. Statistical need of the hour. Acta Med Int 2019;6:51-52
How to cite this URL:
Jain SK, Sharma N, Sharma S. Statistical need of the hour. Acta Med Int [serial online] 2019 [cited 2022 Aug 17];6:51-52
Available from: https://www.actamedicainternational.com/text.asp?2019/6/2/51/271116
It would not be erroneous to say that statistics has established itself as an indispensable tool in the course of epistemological inquiry. Given our desire to learn as much as possible about our world and beyond, formulating accurate inferences assumes special importance. Conventionally, hypothesis testing (which involves a set of tentative statements about the “truth” of a certain phenomenon) has occupied a dominant position in inferential statistics. However, various factors other than the effect we are trying to measure have an undesirable influence on the test statistic (such as z, t, or F), which makes it likely that we commit errors while deriving conclusions.
For the purpose of illustration, let us assume that a researcher wants to examine the efficacy of a newly invented drug and to see whether administering it leads to better results than a placebo alone would offer. For this purpose, she employs an independent-samples t-test and finds a statistically significant difference in participants' health over a week (with the “drug group” faring better than the “placebo group”), leading her to conclude that the drug has a genuine therapeutic advantage. It should be noted that the t-statistic she obtains (which forms the basis of this conclusion) could be inflated by the large sample size chosen for the study, which in turn lowers the probability that so large a t-statistic occurred by mere chance/random factors.
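The effect of sample size on the t-statistic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group means (70.5 and 70.0), the common standard deviation (5.0), and the function name `independent_t` are hypothetical and do not come from any real trial; both groups are assumed to have the same size and the same standard deviation.

```python
from math import sqrt


def independent_t(mean_drug, mean_placebo, sd, n_per_group):
    """Equal-variance t-statistic for two groups of equal size.

    Assumes both groups share the same standard deviation `sd`,
    so the pooled standard deviation is simply `sd`.
    """
    standard_error = sd * sqrt(2 / n_per_group)
    return (mean_drug - mean_placebo) / standard_error


# Hypothetical numbers: the mean difference (0.5) and the SD (5.0) are held
# fixed, and only the number of participants per group changes.
for n in (20, 200, 2000):
    t_value = independent_t(70.5, 70.0, 5.0, n)
    print(f"n per group = {n:4d}  t = {t_value:.2f}")
```

With the same half-point difference between the groups, the t-statistic rises from roughly 0.32 at 20 participants per group to roughly 3.16 at 2000 per group, even though the size of the effect itself has not changed.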
As the example above shows, the inherent limitation of hypothesis testing is that our power to draw conclusions is bounded by probabilistic reasoning. In other words, we can talk about the results only in terms of how likely or unlikely they are to occur, given that the null hypothesis (of no difference) is true. In our example, the large t-statistic corresponds to a low probability of observing such a difference by chance alone if the drug had no effect, which is why we conclude that the drug is therapeutically effective.
However, the possibility remains that this conclusion is false and that the researcher is making a Type I error (believing that there is a significant effect when there actually is none); the probability of such an error is conventionally set at 0.05 or 0.01. Thus, we cannot make definitive statements about the truth or falsity of the phenomena that we are testing; we can only speak of them in terms of probability/likelihood.
To overcome the problem of deciding whether a difference in any measurable dependent variable is merely statistically significant or actually practically important, confidence intervals (CIs) appear promising. Instead of supplying us only with a probability value attributable to chance factors, they provide an estimated range of values for the parameter of interest. However, how do we arrive at an interval and its associated degree of confidence? We know that, in a standard normal distribution, 95% and 99% of the scores lie within 1.96 and 2.58 standard deviations of the mean of 0, respectively. Depending on the degree of confidence required, we plug the corresponding value into the CI formula and obtain a range of values. An interval constructed in this way will contain the population parameter in 95% (or 99%) of the samples drawn from the population. Because researchers generally draw only one sample at a time, they can claim that “they can be 95% or 99% confident” that the population parameter lies within the given interval.
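The editorial does not spell out the formula it refers to. One common large-sample form for the CI of a single mean, given here as an assumption rather than as the authors' own notation, is shown below. The symbols are introduced for illustration: \bar{x} is the sample mean, s the sample standard deviation, n the sample size, and z^{*} the critical value (1.96 for 95%, 2.58 for 99%). For small samples, the corresponding critical value of the t distribution would normally replace z^{*}.

```latex
\[
  \text{CI} \;=\; \bar{x} \;\pm\; z^{*} \cdot \frac{s}{\sqrt{n}},
  \qquad
  z^{*} =
  \begin{cases}
    1.96 & \text{for 95\% confidence} \\
    2.58 & \text{for 99\% confidence}
  \end{cases}
\]
```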
With respect to the above example, the researcher would be interested in obtaining an interval estimate, at the 95% or 99% confidence level, of the difference between the means of the two groups. As can be inferred, there are evident advantages of using a CI over a t-test. First, we get a direct estimate of the population parameter we are interested in (rather than an associated statistic such as z or t). Second, it is up to the researcher to decide whether the estimated population parameter (the difference, in this case) is practically important or not (unlike the t-value, which only provides information about statistical significance). Third, increasing the sample size (n) gives a more precise (narrower) range of values within which the population parameter is likely to fall, whereas in hypothesis testing a larger sample can make even a practically trivial difference statistically significant.
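As a rough illustration of the third point, the following Python sketch computes a large-sample CI for the difference between two group means at several sample sizes. All numbers (means of 70.5 and 70.0, standard deviations of 5.0) and the function name `mean_difference_ci` are hypothetical, and the normal approximation is an assumption; with small samples, a t-based critical value would normally be used instead.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_ci(mean_a, mean_b, sd_a, sd_b, n_a, n_b, confidence=0.95):
    """Large-sample (normal approximation) CI for mean_a - mean_b."""
    # Two-sided critical value, e.g. about 1.96 for 95% and 2.58 for 99%.
    z_critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    difference = mean_a - mean_b
    margin = z_critical * standard_error
    return difference - margin, difference + margin


# Hypothetical drug vs. placebo scores; only the group size changes.
for n in (20, 200, 2000):
    low, high = mean_difference_ci(70.5, 70.0, 5.0, 5.0, n, n)
    print(f"n per group = {n:4d}  95% CI = ({low:.2f}, {high:.2f})")
```

The interval narrows as n grows, while the point estimate of the difference stays at 0.5. The reader can then judge whether a difference of that size matters in practice, which a t-value alone does not convey.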
Thus, by overcoming the problem of dichotomous reasoning (i.e., outright acceptance/rejection of the null hypothesis), which is prone to error, CIs help us reach a clearer range of values that can be expected to contain the target parameter.
We would like to acknowledge Sanya Jain (Psychology major), ex-student of Daulat Ram College, Delhi University, Delhi, for giving important feedback on this editorial.