Inter-rater reliability (IRR, also known as inter-rater agreement) assessment is often necessary for research projects that collect data through ratings provided by trained or untrained coders. However, many studies use incorrect statistical procedures to compute IRR, misinterpret the results of IRR analyses, or misrepresent the implications that IRR estimates have for the statistical power of subsequent analyses.

The mathematical foundations of Cohen's kappa (1960) make the statistic suitable for only two coders, so IRR statistics for nominal data with three or more coders are generally formalized either as extensions of Scott's Pi statistic (e.g., Fleiss, 1971) or as the arithmetic mean of kappa or P(e) computed over coder pairs (e.g., Light, 1971; Davies & Fleiss, 1982). To calculate P(e), the probability of chance agreement, kappa uses the coders' marginal proportions: for two coders, P(e) is the sum over categories of the product of the proportions of items each coder assigned to that category. Unfortunately, the marginal totals may or may not accurately estimate the amount of chance agreement, so it is questionable whether the reduction that the kappa statistic applies to the observed agreement truly represents the amount of agreement attributable to chance. In theory, P(e) estimates the rate of agreement that would be obtained if the coders rated every item by guessing, guessed at rates matching their marginal proportions, and guessed entirely independently of one another (11). Arguably, none of these assumptions is justified, which is why there is wide disagreement about the use of kappa among researchers and statisticians.
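To make the role of the marginal proportions concrete, here is a minimal Python sketch that computes Cohen's kappa for two coders directly from its definition, kappa = (P(o) - P(e)) / (1 - P(e)). The coder names and ratings are hypothetical, and the function is an illustration of the formula rather than a replacement for a statistics package.

```python
from collections import Counter

def cohens_kappa(ratings1, ratings2):
    """Cohen's kappa for two coders rating the same items on a nominal scale."""
    assert len(ratings1) == len(ratings2)
    n = len(ratings1)
    categories = set(ratings1) | set(ratings2)

    # Observed agreement P(o): proportion of items the two coders label identically.
    p_o = sum(a == b for a, b in zip(ratings1, ratings2)) / n

    # Chance agreement P(e): sum over categories of the product of the coders'
    # marginal proportions for that category.
    marg1, marg2 = Counter(ratings1), Counter(ratings2)
    p_e = sum((marg1[k] / n) * (marg2[k] / n) for k in categories)

    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings from two coders on ten items (categories "A" and "B").
coder1 = ["A", "A", "B", "B", "A", "B", "A", "A", "B", "A"]
coder2 = ["A", "B", "B", "B", "A", "A", "A", "A", "B", "B"]
print(round(cohens_kappa(coder1, coder2), 3))
```

With these invented ratings, the observed agreement is 0.70, the chance agreement implied by the marginals is 0.50, and kappa is 0.40, which shows how heavily the marginal proportions drive the chance correction.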
There are several operational definitions of "inter-rater reliability," reflecting different views of what constitutes reliable agreement between raters. [1] One common approach summarizes agreement through the differences between the raters' scores (see the sketch at the end of this section): if the raters tend to agree, the differences between their ratings will be close to zero; if one rater is consistently higher or lower than the other, the bias (the mean difference) will differ from zero; and if the raters tend to disagree, but without a consistent pattern of one rating exceeding the other, the mean difference will again be close to zero even though the individual differences are large. Confidence limits (usually 95%) can be calculated for the bias and for each of the limits of agreement.

Intraclass correlation (ICC) analysis is one of the most commonly used statistics for assessing IRR for ordinal, interval, and ratio variables. The ICC is suitable for studies with two or more coders, and it can be used when all subjects in a study are rated by multiple coders or when only a subset of subjects is rated by multiple coders and the rest are rated by a single coder.
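The sketch below, referenced in the paragraph on rater differences above, illustrates that difference-based summary under the assumption of two hypothetical raters scoring the same ten subjects on an interval scale: it computes the bias (mean difference), the 95% limits of agreement (bias plus or minus 1.96 standard deviations of the differences, as in a Bland-Altman analysis), and 95% confidence limits for the bias. All names and scores are invented for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical paired scores from two raters on the same ten subjects.
rater_a = np.array([4.0, 5.5, 3.0, 6.0, 7.5, 5.0, 4.5, 6.5, 5.0, 3.5])
rater_b = np.array([4.5, 5.0, 3.5, 6.5, 7.0, 5.5, 4.0, 7.0, 5.5, 4.0])

diffs = rater_a - rater_b
n = len(diffs)

# Bias: the mean difference between the two raters.
bias = diffs.mean()
sd = diffs.std(ddof=1)

# 95% limits of agreement: bias +/- 1.96 standard deviations of the differences.
loa_lower, loa_upper = bias - 1.96 * sd, bias + 1.96 * sd

# 95% confidence limits for the bias itself (t distribution, n - 1 degrees of freedom).
t_crit = stats.t.ppf(0.975, n - 1)
ci_lower, ci_upper = bias - t_crit * sd / np.sqrt(n), bias + t_crit * sd / np.sqrt(n)

print(f"bias = {bias:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
print(f"95% limits of agreement: {loa_lower:.2f} to {loa_upper:.2f}")
```

The ICC itself rests on a variance-components decomposition and is normally obtained from a statistics package rather than computed by hand; the appropriate ICC form depends on the study design, for example on whether every subject is rated by the same set of coders.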