Reliability and Validity of Evalart Tests

At Evalart, we follow a rigorous, well-founded procedure to ensure the validity and reliability of our psychometric tests. The most important aspects we consider when preparing a psychometric test are summarized below.

1. Exhaustive selection of the items and factors: The first thing we do is ensure content validity, which refers to the theoretical and empirical grounding of the scales and items. This means they must be supported by theoretical deductions drawn from studies and/or theories about the construct to be measured (Magnusson, 1972). Nunnally and Bernstein (1994) argue that the development of a test should be based on the principle of parsimony, both for the content and the final structure (length). This implies covering as much information as possible with the fewest factors and items that can explain the construct. Furthermore, the length and duration of the test should be adapted to the target population. In personality tests, the items should be worded so as to mask the content being evaluated, to avoid the effect of social desirability.

2. Choose a criterion to determine the items that will be used and the weighting given to the different ranges of each factor. Normally we select a normative group of at least 100 respondents and, based on their percentiles, estimate the ranges we will consider for each factor.

3. Calculate the internal consistency of the first set of data by obtaining the total Cronbach's alpha coefficient, the Cronbach's alpha coefficient by factor, and the Cronbach's alpha coefficient when an item is eliminated. This coefficient reflects how strongly the items correlate with one another on average, and it separates the random variance (the effect of uncontrolled variables) from the true variance. Following Nunnally and Bernstein (1994), we require a Cronbach's alpha coefficient of at least 0.7, which is an adequate value for research purposes.

4. Evaluate the criterion validity, determined by the correlation between the test's scores and an external criterion that is considered a reliable measure of the construct. Normally we take the candidates' job performance as a valid criterion, asking each client to give us an honest ranking of the candidates based on their performance (Anastasi, 1973).

5. Evaluate the construct validity to determine whether the construct is unitary or has several sub-factors (Hernández-Nieto, 2011). We do this by obtaining an inter-correlation matrix, which shows the correlations between the different factors. If the test contains distinct factors, the correlations between them are expected to be non-significant. Once this analysis has been carried out, it should be confirmed whether the theoretically hypothesized factors correspond to the ones obtained by statistical methods (Yela, 1996).
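
The calculations named in steps 2 to 5 can be sketched in a few lines of Python. This is only an illustrative sketch, not Evalart's actual implementation: every function name and all of the demo data are hypothetical, the 25th and 75th percentile cut-offs are an assumed choice of ranges, and the use of Spearman's coefficient for the performance ranking and Pearson's coefficient for the factor matrix is an assumption, since the text does not say which correlation coefficient is used.

```python
from statistics import mean, quantiles, variance


def cronbach_alpha(scores):
    """Cronbach's alpha; rows are respondents, columns are items.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    item_vars = [variance(col) for col in zip(*scores)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(scores):
    """Alpha recomputed with each item dropped in turn (needs >= 3 items)."""
    k = len(scores[0])
    return [
        cronbach_alpha([row[:i] + row[i + 1:] for row in scores])
        for i in range(k)
    ]


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            result[idx] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    return pearson(ranks(x), ranks(y))


def percentile_cutoffs(norm_scores, cuts=(25, 75)):
    """Score values at the given percentiles of a normative group."""
    pcts = quantiles(norm_scores, n=100, method="inclusive")
    return [pcts[c - 1] for c in cuts]


def correlation_matrix(factors):
    """Pearson inter-correlation matrix for a dict of factor scores."""
    names = list(factors)
    return {a: {b: pearson(factors[a], factors[b]) for b in names}
            for a in names}


if __name__ == "__main__":
    # Hypothetical answers: 5 respondents x 3 items on a 1-5 scale.
    answers = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2], [3, 3, 4]]
    totals = [sum(row) for row in answers]
    # Hypothetical client ranking of the same people (1 = best performer).
    performance_rank = [2, 4, 1, 5, 3]

    print("alpha:", cronbach_alpha(answers))
    print("alpha if item deleted:", alpha_if_item_deleted(answers))
    print("range cut-offs:", percentile_cutoffs(totals))
    print("criterion (Spearman):", spearman(totals, performance_rank))
```

Because a ranking assigns 1 to the best performer, a strongly negative Spearman value in this sketch means that higher test scores go with better performance. A real analysis would also report significance levels for each correlation, which this sketch omits.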

If you are interested in the research behind a specific test from the catalog, or in its validity and reliability data, please contact us.

Bibliographic references

Anastasi, A. (1973). Tests Psicológicos (3rd ed.). Madrid: Aguilar S.A. de Ediciones.
Hernández-Nieto, R. (2011). Instrumentos de recolección de datos en ciencias sociales y ciencias biomédicas. Venezuela: Universidad Los Andes.
Magnusson, D. (1972). Teoría de los Test. México: Trillas.
Nunnally, J. and Bernstein, I. (1995). Teoría Psicométrica. México: McGraw-Hill.
Yela, M. (1996). Los Test y el Análisis Factorial. Psicothema, 8, 73-88. Retrieved from: http://www.psicothema.com/psicothema.asp?id=654