# Misconceptions of the p-value - let us use new approaches and procedures
##### **Siniša Opić, Majda Rijavec** *University of Zagreb, Faculty of Teacher Education*
**Pedagogy, didactics and inclusion in education** | Number of the paper: 21 | **Original scientific paper**
##### **Abstract**
In 1925, in the book *Statistical Methods for Research Workers*, Ronald Fisher defined statistical significance as p<0.05. Today, almost a hundred years later, there is a growing need to redefine this arbitrary and often misinterpreted value, which is taken to indicate the existence or nonexistence of differences, correlations or effects. The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming that the null hypothesis is true, i.e. the probability that the result was accidental; it is not related to whether the alternative hypothesis is true or false. The p-value also depends on sample size. For a given effect, the larger the sample, the smaller the associated p-value and the higher the risk of "accidental" significance at the 5% threshold. Therefore, in certain cases, it is suggested to use Bayesian statistics, whose parameters are more informative than the commonly used statistical significance (p<0.05). On the basis of various simulations, this paper proposes a tripartite standard for statistical inference that includes confidence intervals (CI), effect size, and a Bayesian procedure. The p-value should be one of the inference approaches, but not necessarily the only one. The dichotomous (yes/no) approach based on the rejection or confirmation of a hypothesis should be replaced by a polystochastic one.
***Key words***
*Bayesian inference; confidence intervals; effect size; p-value; statistical significance*

| Sample | Undergraduates: M | SD | 95% CI | Graduates: M | SD | 95% CI | Integrated: M | SD | 95% CI | p | η² |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Smaller sample (N=110) | 3.63 | 1.19 | 3.22-4.04 | 3.83 | .92 | 3.54-4.12 | 4.06 | .69 | 3.82-4.30 | .179 | 0.032 |
| Larger sample (N=212) | 3.54 | 1.17 | 3.26-3.81 | 3.81 | .91 | 3.60-4.01 | 3.97 | .84 | 3.76-4.18 | .038 | 0.038 |

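
The contrast between the two rows of this table can be illustrated with a few lines of arithmetic. The sketch below is not the authors' simulation. It assumes a one-way ANOVA with exactly three groups, in which case the F test has 2 numerator degrees of freedom and the p-value reduces to the closed form `p = (1 - η²) ** ((N - 3) / 2)`. The function name `p_from_eta_squared` is hypothetical and introduced only for this illustration.

```python
def p_from_eta_squared(eta_squared: float, n_total: int) -> float:
    """p-value of a three-group one-way ANOVA with effect size eta_squared.

    Assumption: exactly three groups (2 numerator degrees of freedom), so the
    survival function of the F distribution simplifies to
    (1 - eta^2) ** ((N - 3) / 2).
    """
    if not 0.0 <= eta_squared < 1.0:
        raise ValueError("eta_squared must lie in [0, 1)")
    if n_total <= 3:
        raise ValueError("n_total must exceed the number of groups (3)")
    return (1.0 - eta_squared) ** ((n_total - 3) / 2)


# Hold the effect size fixed at the value reported for the smaller sample
# and let only the total sample size grow.
ETA_SQUARED = 0.032
for n in (110, 212, 500, 1000):
    p = p_from_eta_squared(ETA_SQUARED, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"N={n:5d}  eta^2={ETA_SQUARED}  p={p:.4f}  ({verdict} at 0.05)")
```

With η² held at 0.032, the result is not significant at N=110 (p is roughly 0.18, in line with the .179 reported above) and becomes significant at N=212, although the effect itself has not changed.
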

Table 2. ANOVA with Bayes Factor

| Dependent variable | Sum of Squares | df | Mean Square | F | Sig. | Bayes Factor<sup>a</sup> |
|---|---|---|---|---|---|---|
| Between Groups | 6.494 | 2 | 3.247 | 3.331 | .038 | .122 |
| Within Groups | 203.714 | 209 | .975 | | | |
| Total | 210.208 | 211 | | | | |

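
The frequentist columns of Table 2 can be checked from the sums of squares and degrees of freedom alone. The following is a minimal sketch using only those reported values. The variable names are chosen for this example, the closed-form p-value is valid only because the between-groups df equals 2, and the Bayes factor is not recomputed because its prior settings are not stated here.

```python
# Sums of squares and degrees of freedom copied from Table 2.
ss_between, df_between = 6.494, 2
ss_within, df_within = 203.714, 209

# Mean squares and the F ratio.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

# Effect size: share of the total variability attributable to group membership.
eta_squared = ss_between / (ss_between + ss_within)

# Survival function of F(2, df_within). This closed form holds only when the
# numerator degrees of freedom equal 2, as they do for three groups.
p_value = (1.0 + df_between * f_stat / df_within) ** (-df_within / 2)

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}")
print(f"F = {f_stat:.3f}, p = {p_value:.3f}, eta^2 = {eta_squared:.3f}")
```

The Mean Square, F and Sig. columns are reproduced to rounding, and the ratio of the between-groups to the total sum of squares comes to about 0.031.
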

Table 3. Bayesian Estimates of Coefficients

| Parameter | Posterior Mode | Posterior Mean | Posterior Variance | 95% Credible Interval: Lower Bound | 95% Credible Interval: Upper Bound |
|---|---|---|---|---|---|
| undergraduate | 3.535 | 3.535 | .014 | 3.304 | 3.766 |
| graduate | 3.808 | 3.808 | .013 | 3.587 | 4.028 |
| integrated undergraduate and graduate | 3.968 | 3.968 | .016 | 3.723 | 4.213 |


If we compare the prior means (µ1=3.54; µ2=3.81; µ3=3.97), which are the sample means of the larger sample, with the posterior means, there is almost no difference. Additionally, the 95% CIs of the prior means, µ1 (LB=3.26; UB=3.81), µ2 (LB=3.60; UB=4.01) and µ3 (LB=3.76; UB=4.18), differ only slightly from the posterior credible intervals (Table 3). The CI of a mean is very informative because it gives the range in which the population mean is expected to lie with 95% confidence. Where the posterior interval is narrower, as for undergraduates (3.304 to 3.766 against 3.26 to 3.81), the mean is estimated slightly more precisely. The posterior, log-likelihood and prior distributions are shown in Figures 2, 3 and 4.
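
The link between the posterior mean, the posterior variance and the credible bounds in Table 3 can be sketched as follows. This is an illustration under a stated assumption, not the procedure used by the authors' software: each posterior is treated as approximately normal, so the 95% interval is the mean plus or minus about 1.96 posterior standard deviations. The helper `normal_interval` is hypothetical.

```python
from statistics import NormalDist

# Posterior mean and variance for each group, copied from Table 3.
posterior = {
    "undergraduate": (3.535, 0.014),
    "graduate": (3.808, 0.013),
    "integrated undergraduate and graduate": (3.968, 0.016),
}


def normal_interval(mean: float, variance: float, level: float = 0.95):
    """Central interval of a normal distribution with the given mean and variance.

    Assumption: the posterior is close to normal; the true posterior may have
    heavier tails, so the bounds are approximate.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    half_width = z * variance ** 0.5
    return mean - half_width, mean + half_width


for group, (mean, variance) in posterior.items():
    lower, upper = normal_interval(mean, variance)
    print(f"{group}: {lower:.3f} to {upper:.3f}")
```

Under this approximation the computed bounds fall within about 0.005 of the lower and upper bounds printed in Table 3; the small gaps are consistent with rounding of the reported variances and with a posterior that is not exactly normal.
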

***2nd International Scientific and Art Faculty of Teacher Education University of Zagreb Conference*** *Contemporary Themes in Education – CTE2 - in memoriam prof. emer. dr. sc. Milan Matijević, Zagreb, Croatia*
##### **Misconceptions about the p-value: let us use new approaches and procedures** (*Zablude o p-vrijednosti – upotrijebimo nove pristupe i postupke*)
##### **Summary**
Back in 1925, in the book *Statistical Methods for Research Workers*, Ronald Fisher defined statistical significance as p<0.05, and today, almost a hundred years later, there is an increasingly evident need to redefine this arbitrarily set and very often misinterpreted value concerning the existence or nonexistence of differences, associations and effects. The p-value indicates the probability of obtaining the result by chance if the null hypothesis is true, but it does not answer the question of whether the alternative hypothesis is true or false. The p-value also depends on sample size. The larger the sample, the smaller the p-value and thus the greater the risk of "accidental" significance at the 5% level. On the basis of various simulations, this paper proposes a tripartite standard of statistical inference that includes confidence intervals (CI), effect size and a Bayesian procedure. The p-value can be one of the inferential approaches, but not the only one. The dichotomous (yes/no) approach based on accepting or rejecting a hypothesis should be replaced by a polystochastic approach.
***Key words***
*Bayesian statistics; confidence intervals; effect size; statistical significance*