Terminology 101: Reliability of psychometric instruments
Reliability: The extent to which a psychometric instrument yields accurate scores
In psychometric research, it is important to use instruments that are both appropriate and accurate. In the May 2016 column, I explained validity testing, which indicates the extent to which an instrument measures what it is supposed to measure (i.e., its appropriateness). In this column, I address reliability testing, which indicates how accurately and consistently an instrument measures what it measures.
A reliable instrument is one that demonstrates internal consistency, stability and equivalency. Internal consistency is a measure of how well the items in the instrument produce a coherent reflection of the concept being studied; in other words, how well the items hang together as a unit. For example, in an instrument designed to measure job satisfaction, one expects that a respondent who agrees with an item that states “The work environment is positive” will disagree with another item that states “Bullying and incivility are common practice in the workplace.” A respondent who answers in this way provides evidence of good internal consistency among the items that make up the instrument.
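One common way to quantify internal consistency is Cronbach's alpha, a statistic this column does not name; the sketch below assumes it as the coefficient of choice. The responses, the 1-to-5 agreement scale and the function names are all hypothetical. The negatively worded item (such as the bullying item in the example) is reverse-scored first so that every item points in the same direction.

```python
from statistics import variance

def reverse_score(score, low=1, high=5):
    """Flip a negatively worded item on an assumed low-to-high agreement scale."""
    return low + high - score

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all of the same length k."""
    k = len(rows[0])
    item_columns = list(zip(*rows))                      # scores grouped by item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])  # variance of total scores
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)

# Hypothetical raw responses: items 1 and 2 are positively worded,
# item 3 is negatively worded (e.g., the bullying item).
raw = [
    [5, 4, 1],
    [4, 4, 2],
    [2, 3, 4],
    [1, 2, 5],
    [3, 3, 3],
]

# Reverse-score item 3 before computing alpha.
scored = [[a, b, reverse_score(c)] for a, b, c in raw]

alpha = cronbach_alpha(scored)
print(f"Cronbach's alpha: {alpha:.2f}")
```

With these made-up responses alpha is roughly 0.96, because respondents who agree with the positive items also tend to disagree with the negative one.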
Stability is the degree to which the instrument yields similar results when administered more than once to the same group of respondents under the same conditions. It is established by correlating test and retest scores, usually collected two weeks apart. An instrument with highly correlated test-retest scores is considered stable.
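As a minimal sketch of the test-retest calculation, the code below correlates two sets of total scores from the same respondents. The column does not specify which correlation coefficient to use; Pearson's r is assumed here, and the scores and function name are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Note: undefined (division by zero) if either list has no variation.
    return cross / sqrt(ss_x * ss_y)

# Hypothetical total scores for six respondents, listed in the same order
# at the first administration and at the retest.
test_scores = [32, 25, 40, 28, 36, 22]
retest_scores = [30, 27, 41, 26, 35, 24]

stability = pearson_r(test_scores, retest_scores)
print(f"Test-retest correlation: {stability:.2f}")
```

The same calculation could be applied to scores from an original and an alternative form when examining equivalency.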
Equivalency is the degree to which different forms of the same instrument yield similar results. To test for this, instrument developers create an alternative form of the instrument (by shuffling the order of items, for instance) and then ask a group of respondents to complete it a few days or weeks after completing the original form. The higher the correlation between the results obtained with the two forms, the stronger the evidence of equivalency.
Each of these tests generates a correlation coefficient. Within the context of psychometric testing, such coefficients are better known as reliability coefficients. A reliability coefficient ranges from zero to one; the closer the value is to one, the stronger the evidence of reliability. If any of these tests produces a reliability coefficient below 0.6, researchers should have serious doubts about employing the instrument.
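The rule of thumb above can be sketched as a simple screening step. The 0.6 cutoff comes from this column; the coefficient values and the function name are hypothetical.

```python
CUTOFF = 0.6  # threshold stated in the column

def flag_weak_evidence(coefficients, cutoff=CUTOFF):
    """Return the reliability tests whose coefficient falls below the cutoff."""
    return [name for name, value in coefficients.items() if value < cutoff]

# Hypothetical coefficients reported for one instrument.
reported = {
    "internal consistency": 0.82,
    "stability": 0.74,
    "equivalency": 0.55,
}

weak = flag_weak_evidence(reported)
print(weak)  # ['equivalency']
```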
It is important to remember that validity indicates the appropriateness of the instrument for measuring a specific concept, while reliability indicates the accuracy of the instrument in measuring this concept. A good psychometric instrument is both valid and reliable. Readers should be skeptical of study results generated from instruments that lack evidence of validity or of reliability.
NurseONE.ca resource on this topic
- Fain, J. A. (2013). Reading, Understanding, and Applying Nursing Research (4th ed.).