
3 Foundations of Psychological Testing


Learning Outcomes

After reading this chapter, you should be able to

• Identify the purpose and uses of psychological testing.

• Describe the characteristics of a high-quality psychological assessment tool or selection method.

• Explain the importance of reliability and validity.

• Identify commonly used psychological test design formats.

• Recognize the types of tests used to assess individual differences.

• List the steps needed to develop and administer tests most effectively.

• Discuss special issues in testing, including applicants’ reactions to testing and online administration.

• Summarize the importance of testing for positivity.


3.1 The Importance of Testing

When you hear the word test, what comes to mind? For many people, tests are not a pleasant activity. Most of us can remember wishing, as children, to grow up and finish school so we would never have to take a test again. Of course, as adults, we discover that tests affect our lives long after we earn our diplomas. Tests determine whether we can drive a car, get into a graduate or job-training program, or earn a professional certification. They influence our career choices and, quite often, our career advancement.

What profession do you plan to pursue? Do you want to be a doctor or a lawyer? How about a police officer or firefighter? Perhaps you would like to earn your MBA and start your own business? Each of these examples, along with most professions, requires many years of tests, demands high levels of knowledge and skills, and requires continued education and recertification testing.

Businesses use tests to help determine whether job applicants possess the skills and abilities needed to perform a job. After an applicant is hired, tests help determine placement in an appropriate training and development program. Throughout an employee’s career, the organization may require testing for new job placements or promotions.

As you can see, tests can have a significant influence on people’s lives; they can help identify talent and promote deserving candidates. But they can also be misused. Unfortunately, there are many poorly designed psychological tests on the market. They seduce organizations with promises of fantastic results but do little to identify quality employees. I/O psychologists possess the knowledge, skills, and education to design, implement, and score measures that meet the legal and ethical standards for an effective psychological test.

The goal of this chapter is not to teach you how to design quality psychological tests, but rather to acquaint you with the requirements, challenges, and advantages of doing so. Furthermore, understanding the test-making methods that I/O psychologists use will make you a more informed consumer of tests for your own personal and professional goals.

3.2 What Are Tests?

In general, a test is an instrument or procedure that measures samples of behavior or performance. In an employment situation, tests measure an individual’s employment and career-related qualifications and characteristics. The Uniform Guidelines on Employee Selection Procedures (1978) defines a test as any method used to make a decision about whether to hire, retain, promote, place, demote, or dismiss an employee or potential employee. By this definition, then, any procedure that eliminates an applicant from the selection process would be defined as a test. As discussed in Chapter 2, examples include application forms that evaluate education and experience; résumé screening processes; interviews; reference checks; performance in training programs; and psychological, physical ability, cognitive, or knowledge-based tests.


I/O psychologists are concerned with designing and implementing selection systems that identify quality job candidates. Clearly, organizations want to hire the best workers, but it is a real challenge to screen candidates who vary widely in their KSAOs, behaviors, traits, and attitudes—or what is known as their individual differences. This is especially true when hiring decisions are made with only basic tools such as application forms and short interviews. To help organizations better measure applicants’ personal characteristics, I/O psychologists have developed psychological measurements to identify how and to what extent people vary regarding individual differences (Berry, 2003).

What Is the Purpose of Psychological Testing and Selection Methods?

In employment, tests and other selection tools are used to predict job performance. Keep in mind that job performance can never be predicted with 100% accuracy. The only way employers could reach such a high level of accuracy would be to hire all the applicants for a particular job, have them perform the job, and then choose those with the highest performance. Of course, this approach is neither practical nor cost effective. Moreover, even if an organization could afford to hire a large number of applicants and retain only those who performed best, performance prediction is still not perfectly accurate. For example, many organizations have probationary periods in which the employer and the employee try each other out before a more permanent arrangement is established. Employees may be motivated to perform at a much higher level during the probationary period in order to secure permanent employment, but performance levels may drop once the probationary period is over. Moreover, job performance may be influenced over time by a myriad of factors that cannot be predicted or managed.

Although it is impossible to perfectly predict job performance, psychological testing and selection methods can provide reasonable levels of prediction if they accurately and consistently assess predictors that are related to specific performance criteria. As briefly introduced in Chapter 2, accurately predicting performance criteria—usually referred to as validity—ensures that test results indicate performance outcomes, so that those who score favorably on the test are more likely to be high performers than those who do not. Simply put, validity reflects the correlation between applicants’ scores on the test or selection tool and their actual performance. A high correlation indicates that test scores can accurately predict performance. A low correlation indicates that test scores are poorly related to performance.

Students take tests to demonstrate their knowledge of a particular subject. Similarly, employers administer exams to job applicants to measure employment and career-related qualifications and characteristics.

In order to assess the validity of a selection tool, job performance must be quantifiable, meaning that there is a set of numbers associated with applicants’ test scores so one can calculate a correlation. Performance scores are usually obtained from performance appraisal systems, which will be discussed in more detail in Chapter 4. Poorly designed performance appraisal systems can hinder an organization’s ability to assess the validity of its selection methods. For example, performance evaluations are sometimes highly subjective. Some managers tend to score all of their employees similarly regardless of performance in order to ensure they all receive a raise. Some do so to sidestep confrontation or to avoid having to justify the decision. This lack of variability in scores can bias the statistical analysis underlying validity, making it impossible to adequately calculate or compare the validity of various selection methods.
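To see how this lack of variability biases a validity estimate, consider the following minimal simulation, a sketch in Python using numpy with entirely hypothetical scores. It compares the test–performance correlation computed from ratings with their full variability against the same correlation after the ratings are compressed into a narrow band, as happens when managers rate everyone similarly.

import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 200 employees whose test scores moderately predict performance.
test_scores = rng.normal(50, 10, 200)
performance = 0.5 * test_scores + rng.normal(0, 10, 200)

# Validity estimate using the full range of performance ratings.
full_r = np.corrcoef(test_scores, performance)[0, 1]

# Lenient raters compress everyone into a narrow band of ratings,
# removing most of the variability in the criterion.
squeezed = np.clip(performance,
                   np.percentile(performance, 40),
                   np.percentile(performance, 60))
restricted_r = np.corrcoef(test_scores, squeezed)[0, 1]

print(f"validity with full ratings:       r = {full_r:.2f}")
print(f"validity with compressed ratings: r = {restricted_r:.2f}")

The compressed ratings yield a noticeably smaller correlation, understating the selection tool's true validity.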

Another important determining factor of tests and other selection tools is reliability. Also referred to as consistency, reliability is the extent to which test scores can be replicated over time and across situations. A reliable test will reflect an applicant’s aptitude rather than the influence of other factors such as the interviewer, room temperature, or noise level. For example, many organizations use multiple interviews or panel interviews to evaluate applicants so that there are multiple raters scoring each applicant. Such processes have scorers assign both an absolute score, which measures how the applicant did in relation to the highest possible score, and a relative score, which measures how the applicant did in relation to the rest of the interviewees. When the scores assigned by these multiple raters are comparable in terms of absolute scores for each applicant, as well as relative scores and rankings across applicants, the interview process is considered reliable. On the other hand, if different raters score the same applicant very differently, and if the interview process yields different rankings across applicants and thus different hiring recommendations, then the process is unreliable. Similar to validity, no test or selection method has perfect reliability, but the more reliable and consistent a selection tool is, the more accurate it will be in determining quality candidates, and the more legally defensible it will be if the organization is sued for discriminatory hiring. An objective and systematic selection process that leads to consistent results across candidates and raters is an organization’s first line of defense against such accusations.

Ensuring that tests are both valid and reliable is an important part of the assessment process. Of course, the more accurate the testing process, the more likely the best candidates will be selected, promoted, or matched with the right job. However, accuracy is not the only reason to do so: invalid or unreliable tests can be costly. Many tests must be purchased or require a license to use. Testing also takes time, both for the candidate and the organization. Tests must be administered and scored, and their results reported, all of which requires managers’ and HR professionals’ time and effort. Tests that are not valid or reliable also carry opportunity costs, such as the time spent using them and the lower productivity of those who were hired or promoted using the wrong test.

Finally, there are legal implications for ineffective testing. An invalid test may not be related to performance, but it may be discriminatory. It may favor certain protected classes over others. For example, if younger job applicants consistently score higher than older ones on a test, and these scores are not related to job performance, then that test may be found discriminatory. Similarly, if a particular test favors men over women or places minority applicants at a disadvantage, it can be considered discriminatory and thus illegal. For example, complaints were filed against the pharmacy chain CVS Caremark for using a discriminatory personality test. The test included questions about the applicant’s propensity to get angry, trust others, and build friendships. These questions were found to be potentially discriminatory against applicants with mental disabilities or emotional disorders (Tahmincioglu, 2011). Although the organization may have had no intent to discriminate, using invalid discriminatory tests can result in what was referred to in Chapter 2 as “disparate impact,” which is also illegal.


Uses of Tests

Companies of all sizes are integrating tests into their employment practices. In 2001 a study by the American Management Association found that 68% of large U.S. companies used job-skill testing as part of their employment process; psychological (29%) and cognitive (20%) measurements were used less frequently. More recent studies, however, show that the testing trend is on the rise: nearly 80% of Fortune 500 organizations use assessments of some sort (Dattner, 2008).

Consider This: Tests and Testing

Make a list of as many tests as you can remember having taken during times when you were up for selection from a pool of potential candidates (e.g., jobs, volunteering opportunities, college admissions, scholarships, military service). Remember that a test can be any instrument or procedure that measures samples of behavior or performance. It does not have to be a written, proctored exam.

Questions to Consider

1. What do you think each of those tests was attempting to measure?
2. What is your opinion of each of those tests?
3. Did the tests adequately measure what they were trying to measure?
4. What were some strengths and weaknesses of each test?

Find Out for Yourself: The Validity and Reliability of Commonly Used Selection Tools

Visit the following websites to read about the different types of validity and reliability, as well as how they are measured.

Validity & Reliability of Methods

Psychometric Assessment Validity

Psychometric Test Reliability

What Did You Learn?

1. Compare the validity of various selection methods such as interviews, reference checks, and others. If you were a recruiter, which ones would you use? Why?
2. If you were to use an actual test, how long would you design the test to be? Why?
3. If you were to design an interview process, how many interviewers would you use for each candidate? What are the benefits of using more than one interviewer (rater)?
4. What are the key factors in increasing the validity of the selection process?
5. What are the key factors in increasing the reliability of the selection process?


Moreover, about a quarter of employers utilize online personality testing to weed out applicants early in the recruitment process, even before any other screening tool is used; this trend is expected to increase by about 20% annually. Examples include large, nationwide employers such as McDonald’s and CVS Caremark (Tahmincioglu, 2011). The growing popularity of testing within organizations has resulted in the use of tests not only for selection but also for a number of other HR functions.

One of the most important ways in which organizations use tests is to evaluate job applicant or employee fit. Organizations often administer tests to accurately evaluate an applicant’s job-related characteristics or determine an employee’s suitability for promotion or placement in a new position within the company. Because promotions and training are expensive, organizations place high importance on being able to determine which employees possess the higher level abilities and skills needed to assume advanced job positions. Similarly, during job reorganizations, companies must be able to place individuals into new jobs that align with their skills and abilities. Keep in mind that selection, promotion, and job-placement processes all involve employment decisions and thus must be well designed in order to meet the requisite legal and professional standards.

HR professionals also make use of tests in areas outside the realm of employment selection. Training and development is one such example. Trainees are often tested on their job knowledge and skills to determine the level of training that will fit their proficiency. At the end of training, they may take tests to assess their mastery of the training materials or to identify areas where they need to be retrained. Other types of tests help individuals identify areas for self-improvement, and sometimes job teams take tests to help facilitate team-building activities. Finally, tests can help individuals make educational or vocational choices. People who work at jobs that utilize their skills and interests are more likely to be successful and satisfied, so it is important that vocational and educational tests make accurate matches and predictions. Note that tests used solely for career exploration or counseling need not meet the same strict legal standards as tests used for employee selection.

3.3 Requirements of Psychological Measurement

Tests designed by I/O psychologists possess a number of important characteristics that set them apart from other tests you may have taken. Scientifically designed tests differ from magazine quizzes or informal tests that you find online in that they are more than a set of questions related to a specific topic. Instead, they must meet standards related to administration (the way in which the test is given), scoring methods, score interpretation, reliability, and validity. Unfortunately, many employers and consultants think they can simply put together a test or an interview protocol that they believe measures what they feel is necessary and start using the selection tool without any statistical analysis to assess its quality. Besides failing to distinguish applicants with higher performance potential, which wastes time and resources, this approach can also yield inconsistent and discriminatory results, which makes it illegal.


Standardized Administration

To administer a selection test properly, the conditions under which applicants complete the test must be standard—that is, they must be the same every time the test is given. These conditions include the test materials, instructions, testing facilities, time allowed for testing, and allowed resources and materials. To ensure standardization, organizations put instructions into written form or administer the test to large groups so that all applicants hear the same instructions. Additionally, applicants all complete the test in the same location using well-functioning equipment and comfortable seating. Test administrators are also careful to keep the testing environment comfortable in terms of temperature and humidity as well as free from extraneous noise or other distractions. Variations in testing conditions can significantly interfere with results, making it impossible to accurately compare applicants who were tested under different conditions.

Consider how changing even one aspect of testing conditions can affect test performance. What would happen if, on a cold day in the middle of winter, the heat stopped working partway through a series of applicant evaluations? Applicants might not perform well on a typing test because their hands were cold and stiff, or they might not complete a written test because they were shivering and could not concentrate. Now think about how differently two groups of test takers would perform if one group accidentally received incomplete instructions from an inexperienced administrator, while a second group received the proper instructions from an experienced administrator. You can easily see how unfair it would be to compare test results of applicants who were not all tested under equal conditions!

Of course, it is sometimes not only appropriate but also necessary to alter the testing conditions. Applicants with disabilities may need specific accommodations, such as a sign language interpreter for a person with a hearing impairment or a reader or Braille version of a written test for a person with a visual impairment. For applicants with disabilities, then, not allowing for changes in the testing conditions would make it difficult, if not impossible, for them to perform their best.

A real-life example of this occurred when an on-campus recruiter for a highly coveted internship program noticed that, despite performing as well as other students when interviewed on campus, minority students from very low-income areas fared poorly when invited to an on-site interview. The recruiter suspected that the organization’s luxurious office building and extravagant furnishings may have intimidated those students and caused their poor performance. To test this suspicion, he changed the location of the interview to a local community youth center—everything else about the interview process was kept identical. Under these conditions, the students’ performance improved significantly and was no different from the overall applicant pool. In other words, the change was necessary to neutralize the distracting effects of an otherwise unrelated testing condition.

It is important that all applicants take the same test under the same conditions.


Objective Scoring

Just as performance on a test can be affected by testing conditions, results can be affected by the way in which the test is scored. To eliminate the possibility of bias, test scoring must be standardized, which means that a test must be scored in the same way for everyone who takes it. Establishing a scoring key or a clear scoring criterion before administering a test will reduce an evaluator’s subjective judgments and produce the same or similar test scores no matter who evaluates the test.

Organizations can utilize both objective tests and subjective tests. Objective tests, such as achievement tests and some cognitive ability tests, have one clearly correct answer—multiple-choice tests are an example. As long as a scoring key is available, scoring an objective test should be free from bias. In contrast, subjective tests, such as résumé evaluations, interviews, some personality tests, and work simulations, have no definitive right or wrong answers; their scores rely on the interpretation and judgment of the evaluator. To minimize the influence of personal biases and increase scoring accuracy, the evaluator uses a predetermined scoring guide or template, also sometimes referred to as a rubric, which establishes a very specific set of scoring criteria, usually with examples of what to look for in order to assign a particular score. For example, if a rater is scoring an applicant on professionalism using a scale of 1 to 5, with 3 being “average” or “meeting expectations,” this score can be accompanied by a description of what counts as “meeting expectations” so the assessment is not subject to the rater’s interpretation of what is expected. Although subjective tests are commonly used to make employment decisions, objective tests are preferred for making fair, accurate evaluations of and comparisons between job candidates. In most situations, not everything can be objectively measured, so subjective and objective tests are used together, combining the specificity of objective tests with the richness of subjective tests.
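To illustrate how a predetermined scoring key removes evaluator judgment from objective tests, here is a minimal sketch in Python; the items and answer key are hypothetical. Any evaluator who applies the function to the same answer sheet obtains the same score.

# Hypothetical answer key for a five-item multiple-choice test.
SCORING_KEY = {1: "B", 2: "D", 3: "A", 4: "C", 5: "B"}

def score_answer_sheet(answers: dict[int, str]) -> int:
    """Count correct responses against the fixed key; no rater judgment is involved."""
    return sum(1 for item, correct in SCORING_KEY.items()
               if answers.get(item) == correct)

applicant = {1: "B", 2: "D", 3: "C", 4: "C", 5: "A"}
print(score_answer_sheet(applicant))  # prints 3, regardless of who runs the scoring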

Score Interpretation

After a test has been scored, the score needs to be interpreted. Once again, this process must be standardized. Additionally, to be interpreted properly, a person’s score on a test must be compared to other people’s scores on the same test. I/O psychologists use a standardization sample, which is a large group of people who have taken the test against whose scores an individual’s score can be compared. The comparison scores provided by the standardization sample are called test norms. The demographics of the standardization sample can be used to establish test norms for various racial and ethnic groups, men and women, and groups of different ages and education levels.

Let’s look at an example of how test-score interpretation works. As part of a selection process, an applicant might answer 30 out of 40 questions correctly on a multiple-choice cognitive ability test. By itself, this score provides little information about the applicant’s level of cognitive ability. However, if we can compare it to how others performed on the same test—specifically, the test norms established by a standardization sample—we will be able to ascribe some meaning to the score.

Often, raw scores, the number of points a person scores on a test, are transformed into percentile scores, which tell the evaluator the percentage of people in the standardization sample who scored below an individual’s raw score. Continuing the example above, if the score of 30 falls in the 90th percentile, 90% of the people in the standardization sample scored lower than our fictional applicant.
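Once norms are available, computing a percentile score is simple arithmetic. The sketch below is in Python, with a tiny hypothetical norm sample standing in for a real standardization sample, which would be far larger.

# Hypothetical raw scores from a standardization sample.
norm_sample = [12, 18, 21, 22, 24, 25, 26, 27, 28, 29, 29, 31, 33, 35]

def percentile_rank(raw_score, norms):
    """Percentage of the standardization sample scoring below the raw score."""
    below = sum(1 for score in norms if score < raw_score)
    return 100 * below / len(norms)

print(f"{percentile_rank(30, norm_sample):.0f}th percentile")  # about the 79th percentile here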


Test Reliability

As introduced earlier, test reliability refers to the dependability and consistency of a test’s measurements. If a person takes a test several times and scores similarly on it each time, that test is said to measure consistently. If a test measures inconsistently, then outside factors must be influencing the results. For example, if an applicant takes a mechanical ability test one week and correctly answers 90 out of 100 items, but then takes another form of the test the following week and gets only 50 out of 100 items correct, the test evaluator must ask whether the tests are actually doing what they’re supposed to be doing—measuring mechanical ability—or if something else is influencing the scores. Examples of common but often unreliable interview questions include “tell me about yourself” and “discuss your strengths and weaknesses.” Such questions are unreliable because the answer may vary widely depending on the applicant’s mood or recollection of recent events. Moreover, interpretation of the answers is subjective and depends on whether the interviewer likes or dislikes what the applicant says. There is also a very limited scope for comparing answers across applicants to determine which answers are higher or lower quality. On the other hand, more targeted and job-related questions—such as “tell me about a situation where you had to . . .” or “what would you do if you were faced with this situation”—are more likely to yield consistent and comparable responses. Thus, before trusting scores from any test that measures inconsistently, I/O psychologists must discover what is influencing the scores.

The test taker’s emotional and physical state can influence his or her score. A person’s mood, state of anxiety, and level of fatigue may change from one test-taking time to another, and these factors can have a profound effect on test performance. Illness can also impact performance. If a person is healthy the first time she takes a test, but then has a cold when she takes the test again, her score will likely be lower the second time around.

Changing environmental factors can also make a test measure inconsistently. Differences from one testing environment to another, such as room lighting, noise level, temperature, humidity, and equipment, will affect a person’s performance, as will the completeness of the instructions and the manner in which they are given.

Differences between versions of the same test also influence reliability. Many tests have more than one version or form (the written portion of a driver’s license test is an example). Although the test versions aim to measure the same knowledge, the test items or questions vary. If the questions in one version are more difficult than those in another, the test taker may perform better on one version of the test.

Finally, some inconsistency in test scores stems from real changes in the test taker’s KSAOs. These changes often appear if a significant period of time passes between tests. High school students taking the SAT, for example, may show vast increases in scores from their junior year to their senior year because they have improved their cognitive ability, subject knowledge, and/or test-taking skills.

Measures of Reliability

Typically, reliability is measured by gathering scores from two sets of tests and then determining their association. The reliability coefficient states the correlation—or relationship—between the two score sets and ranges from 0 to +1.00. Although we won’t go into the mathematical details of calculating correlations here, it is important to understand that the closer the score sets come to a perfect +1.00 correlation, the more reliable the test. For employment tests, reliability coefficients above +.90 are considered excellent, and those above +.70 are considered adequate. Tests with reliability estimates lower than +.70 may contain enough measurement error to make them unusable in employment situations. I/O psychologists measure test reliability with the internal consistency, test–retest, interrater, alternate-form (or parallel-form), and split-halves methods described below.
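Computationally, the reliability coefficient is an ordinary Pearson correlation between the two score sets. The following minimal sketch in Python (using scipy, with hypothetical scores) applies the rules of thumb above.

from scipy.stats import pearsonr

# Hypothetical scores for eight applicants on two administrations of the same test.
first_administration = [78, 85, 62, 90, 71, 88, 66, 80]
second_administration = [75, 88, 65, 92, 70, 85, 70, 78]

r, _ = pearsonr(first_administration, second_administration)
if r >= 0.90:
    verdict = "excellent"
elif r >= 0.70:
    verdict = "adequate"
else:
    verdict = "likely too unreliable for employment use"
print(f"reliability coefficient r = {r:.2f} ({verdict})")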

Internal Consistency Reliability

Internal consistency reliability assesses the extent to which different test items or questions are correlated and thus consistently measure the same trait or characteristic. For example, if a 20-question test is used to measure extroversion, then scores on these 20 items should be highly correlated. If they are not, then some of the items may be measuring a different concept. The most common measure of internal consistency reliability is Cronbach’s alpha, an overall statistical summary of the intercorrelations across all items in a particular test. Item-level analysis can also pinpoint which items are performing poorly and should be removed to improve the test’s internal consistency.
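For readers who want to see the computation, Cronbach's alpha follows a standard formula: alpha = k/(k − 1) × (1 − sum of the item variances / variance of the total scores), where k is the number of items. A minimal sketch in Python using numpy, with hypothetical responses:

import numpy as np

def cronbach_alpha(items):
    """items: rows are respondents, columns are test items (e.g., 1-5 ratings)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Hypothetical responses: six people answering four extroversion items.
responses = np.array([
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 1],
    [3, 4, 3, 3],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")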

Test–Retest Reliability

Test–retest reliability involves administering the same test to the same group of people at two different times and then correlating the two sets of scores. To the extent that the scores are consistent over time, the reliability coefficient shows the test’s stability (see Figure 3.1). This method has a few limitations. First, it can be uneconomical, because it requires considerable time for employees to complete the tests on two or more occasions. Second, it can be difficult to determine the optimal length of time between one test-taking session and the next. If the interval is short, say, a few hours, test takers may remember all the questions and simply answer everything the same way on the retest, which could artificially inflate the reliability coefficient. Conversely, waiting too long between tests, say, 6 months to 1 year, could allow retesting scores to be affected by changes that result from outside learning. This can artificially reduce the test’s reliability. The best time interval is therefore relatively short, such as a few weeks to a few months.

Interrater Reliability

As the name implies, and as introduced in earlier examples, interrater reliability involves having more than one evaluator (rater) assess each candidate’s performance and then correlating the raters’ scores across candidates. To the extent that the scores are consistent across raters, the test is considered more reliable. This method is particularly relevant for subjective tests that are prone to personal interpretation. Even with well-designed rubrics and valid questions, these methods introduce some variability in assessment across candidates and raters. Interrater reliability ensures that such variability is under control and insufficient to bias the results of the process or alter hiring decisions. Various advanced statistical methods are available to take into account both the consistency of candidates’ absolute scores across raters and the consistency of their relative scores and rankings compared to each other. Both types of consistency are important. For example, absolute scores can affect selection decisions in situations where there is a cutoff score, such as in the case of college admissions and certifications. Relative scores and rankings can affect promotion decisions or come into play when only a predetermined number of candidates (e.g., the top five) can be selected.
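Both kinds of consistency can be checked with simple statistics, as in the hypothetical Python sketch below; formal interrater analyses typically rely on more advanced indices, such as intraclass correlations.

from scipy.stats import pearsonr, spearmanr

# Hypothetical interview scores (0-100) from two raters for six candidates.
rater_a = [82, 74, 91, 65, 78, 88]
rater_b = [80, 70, 93, 60, 75, 90]

absolute_r, _ = pearsonr(rater_a, rater_b)   # agreement on absolute scores
rank_rho, _ = spearmanr(rater_a, rater_b)    # agreement on candidate rankings
print(f"absolute-score agreement: r = {absolute_r:.2f}")
print(f"rank-order agreement:   rho = {rank_rho:.2f}")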


Figure 3.1: Test–retest reliability

The figure presents two scatterplots of mechanical comprehension scores at a first and second test administration: one showing high reliability (consistent scores across administrations) and one showing low reliability.

One way to determine a test’s reliability is to conduct a test–retest. The more consistent test scores are over time, the more reliable the test is thought to be.

From Levy, P.E. (2016). Industrial/organizational psychology: Understanding the workplace (5th ed.), p. 26, Fig. 2.3. Copyright 2017 by Worth Publishers. All rights reserved. Reprinted by permission of Worth Publishers.

Alternate-Form Reliability

Alternate- (or parallel-) form reliability has to do with how consistent test scores are likely to be if a person takes two similar, but not identical, forms of a test. As with test–retest reliability, this measure requires two sets of scores from a group of people who have taken two variations of a test. If the score sets yield a high parallel-form reliability coefficient, then the tests are not materially different. If the reverse happens, the tests are probably not equivalent and therefore cannot be used interchangeably.




The major limitations of parallel-form reliability are that it is time consuming and costly. Furthermore, it requires the test developer to design two versions of a test that both cover the same subject matter and are equivalent in difficulty and reading level.

Split-Halves Reliability

Split-halves reliability is more cost-effective to assess than either test–retest or parallel-form reliability because it can be estimated from scores on one test administered a single time. After the test has been given, it is split in half, and scores from the two halves of the test are correlated. A high reliability coefficient indicates that each section of the test is consistently measuring similar content, whereas the reverse is true with a low reliability coefficient.

The tricky part of split-halves reliability is determining how best to split the test. For example, if test items increase in difficulty as the test progresses, or if the first half of the test contains fewer difficult questions than the second, it won’t work to simply split the test down the middle and compare the scores from each half. To solve this dilemma, tests are often split by odd- and even-numbered questions.
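Here is a minimal sketch of the odd/even split in Python using numpy, with hypothetical item scores. Analysts commonly also apply the Spearman–Brown correction, because each half is only half the length of the full test and shorter tests are less reliable.

import numpy as np

# Hypothetical item scores (1 = correct, 0 = incorrect); rows are test takers.
items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
])

odd_half = items[:, 0::2].sum(axis=1)    # items 1, 3, 5, 7
even_half = items[:, 1::2].sum(axis=1)   # items 2, 4, 6, 8

r = np.corrcoef(odd_half, even_half)[0, 1]
corrected = 2 * r / (1 + r)              # Spearman-Brown correction for test length
print(f"half-test r = {r:.2f}, corrected reliability = {corrected:.2f}")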

Find Out for Yourself: Test Reliability

This exercise is designed to help you grasp the challenges involved in designing a reliable test. To begin, choose a trait or skill in which you are interested. For example, consider an academic subject, mastery of an online game, or a personality trait that you admire. Then, write 10 statements you believe to be good measures of the selected trait or skill. Ask three friends or family members to rate themselves on each statement on a scale of 1–5 (1 = strongly disagree, 5 = strongly agree). Finally, without looking at their scores, rate each individual on the same 10 statements based on your own perceptions of that individual.

What Did You Learn?

1. Assess interrater reliability: Add up each individual’s scores based on his or her own assessment. Rank order the scores. Add up each individual’s scores based on your assessment of that individual. Rank order the scores. Did the rankings change?

2. Assess split-halves reliability: Add up each individual’s scores on the first five questions. Add up each individual’s scores on the second set of five questions. Are the scores of each individual on the two test halves similar? Rank order the three individuals based on their first five questions. Rank order them again based on the second set of five questions. Did the rankings change?

3. Ask the same three individuals to rate the same statements again a week later. Assess test–retest reliability: Add up each individual’s scores from the first time he or she took the test. Add up each individual’s scores from the second time. Are the scores similar? Rank order the three individuals based on their first test. Rank order them again based on their second test. Did the rankings change?

As you can probably appreciate from this exercise, anyone can “whip up” a test, but the test may be highly subjective and unreliable if the scores are not consistent across raters, questions, and times of test administration. You probably now have some idea as to how to improve your test. Similarly, scientific test design requires numerous iterations of writing and rewriting items and statistically examining results with multiple samples to ensure reliability before the test can be used.


Test Validity

Validity is the most important aspect of an employment test. Whereas a test is reliable if it makes consistent measurements, a test is valid if it truly measures the characteristics it is supposed to measure. An employment test may yield reliable scores, but if it doesn’t measure the skills that are needed to perform a job successfully, it is not very useful. I/O psychologists use three methods to establish test validity: criterion-related validity, content validity, and construct validity.

Criterion-Related Validity

The purpose of criterion-related validity is to establish a predictive, empirical (number-based) link between test scores and actual job performance (see Figure 3.2). To do this, I/O psychologists compare applicants’ employment test scores with their subsequent job performance. This correlation is called the validity coefficient, and it ranges from 0 to ±1.00. Tests that yield validity coefficients ranging from +.35 to +.45 are considered useful for making employment decisions, whereas those with validity coefficients of less than +.10 probably have little relationship to job performance.

I/O psychologists use two different methods to establish criterion-related validity: predictive validity and concurrent validity. Predictive validity involves administering a new test to all job applicants but not using the test scores in hiring decisions. Instead, the scores are filed away to be analyzed later. After a time, managers will have accumulated performance ratings or other information that indicates how the new hires are performing on the job. At that point, the new hires’ preemployment test scores are correlated with their performance, and the organization can see how successfully the test predicted performance. If the test proves to be a valid predictor of performance, it can be used in future hiring decisions. Although this approach is considered the gold standard of test validation, many organizations are unwilling to use the predictive validity method because filing away employment test scores lets some unqualified applicants slip through the preemployment screening process. Developers of scientific tests, however, regularly use this method to validate their tests prior to release.

The concurrent validation approach is more popular because, instead of job applicants, current employees are used to establish the test’s validity. With this method, the employees’ test scores and their job performance ratings are gathered at the same time, and test validity is established by correlating these measures. Organizations appreciate the cost-effectiveness offered by concurrent validation. Tests that show high concurrent validity based on the results from current employees are then used to assess job applicants.

Other forms of concurrent validation include convergent and divergent validity. Convergent validity refers to the correlation between test scores and scores on other related tests. For example, SAT and ACT tests are expected to correlate, so the validation of new SAT questions may involve examining their correlation with ACT questions. Divergent validity refers to the correlation between test scores and scores on other tests or factors that should not be related. For example, test scores should not be related to gender, race, religion, or other protected-class characteristics. To ensure that a test is not discriminatory, the correlation between its scores and each of these factors can be examined. Divergent validity is supported when these correlations are low or statistically insignificant. Taken together, convergent and divergent validity examine the extent to which a test relates to what it should be related to and does not relate to what it should not be related to, respectively.
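In practice, both checks reduce to correlations that should come out high and near zero, respectively. A minimal sketch in Python using scipy; the scores and the protected-class indicator are hypothetical.

from scipy.stats import pearsonr

# Hypothetical data for ten applicants.
new_test = [62, 75, 80, 55, 90, 70, 66, 85, 58, 77]
established_test = [60, 78, 82, 50, 88, 72, 64, 80, 61, 75]  # related measure
group = [1, 1, 0, 0, 0, 1, 1, 0, 0, 1]  # protected-class membership indicator

convergent_r, _ = pearsonr(new_test, established_test)  # should be high
divergent_r, _ = pearsonr(new_test, group)              # should be near zero
print(f"convergent r = {convergent_r:.2f}, divergent r = {divergent_r:.2f}")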

Figure 3.2: Criterion-related validity

The figure presents two scatterplots of cognitive ability scores plotted against job performance: one showing high validity and one showing low validity.

In order to establish a connection between test scores and actual job performance, I/O psychologists use criterion-related validity. Tests with high correlations between scores and job performance are considered to be high-validity tests and can be useful for making employment decisions. Tests with low correlations between scores and job performance are considered low-validity tests and would not be ideal for assessing job performance.

From Levy, P.E. (2016). Industrial/organizational psychology: Understanding the workplace (5th ed.), p. 29, Fig. 2.4. Copyright 2017 by Worth Publishers. All rights reserved. Reprinted by permission of Worth Publishers.

Despite its time- and cost-saving advantages, concurrent validation does have a number of drawbacks. First, the group of employees who validate a test could be very different from the group of applicants who will actually take it. The former would likely skew the validation because lower performing employees (and their low scores) would have already been removed from their positions and thus would not be part of the test-validating process. This can cause the validity of the test to appear higher than it really is. On the other hand, employees may also skew the validation by not trying as hard as job applicants. Employees who already have jobs might not be as motivated to do their best as applicants eager for a job would be. This can cause the test’s validity to appear lower than it really is.

Content-Related Validity

Content-related validity is the rational link between the test content and critical job-related behaviors. In other words, test items should be directly related to the important requirements and qualifications for the job. The rationale behind content-related validation is that if a test samples actual job behaviors, then individuals who perform well on it will also perform well on the job. Remember that the goal of testing is to predict performance.

As you can probably guess, content-related validation studies rely heavily on information gathered from a job analysis. If test questions are directly related to the specific skills needed to perform a job, the test will have high content-related validity. For example, a test for administrative professionals might ask questions related to effective filing methods, schedule management, and typing techniques. Because these skills are important for the administrative professional position, this test would have high content-related validity. Content validity cannot be evaluated numerically or statistically as readily as criterion validity. It is often qualitatively evaluated by subject matter experts. However, qualitative evaluations can be quantified using well-designed rubrics and assessed for reliability and consistency.

Construct Validity

Construct validity is the extent to which a test accurately assesses the abstract personal attributes, or constructs, that it intends to measure. Although there are valid measures of many personality traits, numerous invalid measures of the same traits can be found in less scientific sources such as magazines or the Internet. Because constructs are intangible, it can be challenging to design tests that measure them.

How do we know if tests of personality, reasoning, or motivation actually measure these intangible, unobservable characteristics? One way to establish construct validity is to correlate a new test with an established test that is known to measure the construct in question. If the new test correlates highly with the established test, the new test is likely measuring the construct it is intended to measure.

Validity Generalization

Initially, I/O psychologists thought that validity evidence was situation specific; that is, a test that was validated and used for applicants for one job could not be used for applicants for a different job unless an additional job-specific validation study was performed. Further, I/O psychologists believed that tests that had been validated for a position at one company could not be used for the same position at a different company—again, unless the test was validated for the second company, a tedious and costly process.


However, research over the past few decades has shown that validity specificity is unfounded (Lubinski & Dawis, 1992). I/O psychologists now believe that validity evidence transfers across situations, a notion that is referred to as validity generalization. Researchers discovered that the studies that supported validity specificity were flawed (Schmidt & Hunter, 1981).

Validity generalization has been a huge breakthrough for organizations. Establishing validity evidence for every employment test in every situation was, for most organizations, both cost and time prohibitive. Because of its cost- and time-saving benefits, the advent of validity generalization has meant that organizations are more willing to integrate quality tests into their employment practices.

Face Validity

Face validity is not a form of validity in a technical sense; rather, it is the applicant’s subjective impression of how job relevant a test appears to be. For example, a bank teller would find nothing strange about taking an employment test that dealt with numerical ability or money-counting skills, because these skills are obviously related to job performance. On the other hand, the applicant may not see the relevance of a personality test that asks questions about personal relationships. That test would thus have low face validity for this job.

Organizations need to pay close attention to their applicants’ perceptions of a test’s face validity, because low face validity can cause an applicant to feel negatively about the organization (Chan, Schmitt, DeShon, Clause, & Delbridge, 1997; Smither, Reilly, Millsap, Pearlman, & Stoffey, 1993). If organizations have the opportunity to pick between two tests that are otherwise equally valid, they should use the test with the greater level of face validity.

Find Out for Yourself: Your Personality Type

Visit the 16 Personalities website and take the personality-type assessment provided. Read your results, then visit the Encyclopedia Britannica entry on personality assessment and read about the reliability and validity of assessment methods.

16 Personalities

Reliability and Validity of Assessment Methods

What Did You Learn?

1. What have you learned about yourself through this assessment?
2. Is this assessment accurate? Which types of validity and reliability apply to it?
3. Based on this assessment, what are some examples of jobs that would fit your type?


Qualitative Methods

Most of the methods discussed in this chapter are quantitative in nature. However, employers often find it necessary to take into consideration other factors that are not necessarily quantifiable but can be very important in employee selection. Qualitative selection methods may include observation, unstructured interview questions, consultation with references, or solicitation of opinions from past or future managers and coworkers. Qualitative methods can yield a much richer and broader array of information about an applicant that may be impossible to capture using quantitative methods.

Unfortunately, however, qualitative methods are much harder to assess in terms of validity and reliability, and thus they may lead to erroneous decisions. Their subjectivity can also lead to discriminatory decisions. Qualitative methods can be particularly problematic in ranking applicants. Without a predetermined set of evaluation criteria and a rating scale, interrater reliability can be very low. That is why psychologists attempt to quantify even what would be considered qualitative criteria—such as person–organization fit and person–job fit—by creating survey measures of these factors. With the help of I/O psychologists, employers can also create quantitative scoring schemes for qualitative data to increase the integrity and legality of these methods.

3.4 Test Formats

Thousands of employment tests are on the market today. No two tests are exactly alike; they may differ in their construction or administration. Tests vary in quality depending on the rigor of their validation processes. They also vary in cost depending on their extensiveness and popularity. However, it is important to note that quality and cost do not always go hand in hand. Some of the most valid and reliable tests are available in the scientific literature free of charge, but they are not very popular among practitioners, who are often unfamiliar with the scholarly literature. On the other hand, some of the popular and expensive tests marketed by well-known consulting companies have questionable validity and reliability. In some cases these tests are accepted on face validity alone and are never statistically analyzed. In other cases the test developers make sure their tests are valid and reliable but are reluctant to publicly share their analyses for proprietary reasons. In all cases prudent employers should demand evidence of validity and reliability in order to ensure that they are (a) investing their time and resources in selection tools that will yield the most qualified workforce and (b) using legally defensible and nondiscriminatory methods. Commonly used test-design formats include assessment centers, computer-adaptive tests, speed and power tests, situational judgment tests, and work-sample tests.

Assessment Centers

An assessment center is one of the most comprehensive tests available and is often used to select management and sales personnel because of its ability to assess interpersonal, communication, and managerial skills. A typical assessment center includes personality inventories, cognitive assessments, and interviews, as well as simulated activities that mimic the types of activities performed on the job. Common types of simulated activities include in-basket tasks, leaderless group discussions, and role-play exercises.


Although assessment centers can predict the level of success both in training and on the job, they have a number of limitations. First, assessment centers are expensive to design and administer, because administrators must be specifically trained to evaluate discussions and perform role plays. Assessment centers for senior management positions can cost more than $10,000, a price tag that is prohibitive for many organizations. Second, because scoring an assessment center relies on the judgment of its assessors, it can be difficult to standardize scores across time and location. This issue can be mitigated by training assessors to evaluate behaviors against an established set of scoring criteria.

Computer-Adaptive Tests

Typical tests include items that sample all levels of a candidate’s ability. In other words, they contain some questions that are easy and will be answered correctly by almost all test takers, some that are difficult and will be answered correctly by only a few, and some that are in between. A computer-adaptive test, however, tailors the test to each test taker’s individual ability.

In this type of test, the candidate begins by answering a question that has an average level of difficulty. If he or she answers correctly, the next question will be more difficult; if he or she answers incorrectly, the next question will be easier. This process continues until the candidate’s proficiency level is determined. The fact that candidates do not waste time answering questions of inappropriate difficulty is a clear advantage of the computer-adaptive test. Additionally, because each test is tailored to the individual, test security (i.e., cheating) is less of a concern.
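The branching logic can be sketched in a few lines of Python. This toy example is only illustrative: the difficulty levels, stopping rule, and proficiency estimate are hypothetical simplifications of real computer-adaptive algorithms, which rely on item response theory.

def adaptive_test(answers_correctly, lowest=1, highest=5, num_items=6):
    """Toy adaptive loop: difficulty rises after a correct answer, falls after a miss.
    answers_correctly(level) -> bool simulates the candidate's responses."""
    level = (lowest + highest) // 2  # start at average difficulty
    history = []
    for _ in range(num_items):
        correct = answers_correctly(level)
        history.append((level, correct))
        level = min(level + 1, highest) if correct else max(level - 1, lowest)
    return history  # the difficulty level oscillates around the candidate's proficiency

# Simulate a candidate who can handle items up to difficulty level 4.
for level, correct in adaptive_test(lambda lvl: lvl <= 4):
    print(f"difficulty {level}: {'correct' if correct else 'incorrect'}")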

Speed and Power Tests

Tests can be designed to assess either an individual’s depth of knowledge or rate of response. The first type of test is called a power test. Power tests are designed to be difficult, and very few individuals are able to answer all of the items correctly. Test takers receive either a generous time limit or no time limit at all. The overall purpose of the power test is to evaluate depth of knowledge in a particular domain; therefore, the focus is on response accuracy rather than response speed.

Speed tests contain a homogeneous content set, and test takers receive a limited amount of time to complete the test. These tests are well suited to jobs in which tasks must be performed both quickly and accurately, such as bookkeeping or word processing. For these jobs, a data-entry test would be an appropriate speed test for measuring an applicant’s potential for success.

Situational Judgment Tests

A situational judgment test is a type of job simulation composed of a number of job-related situations designed to assess the applicant’s judgment. Each situation includes multiple options for how to respond. The applicant must select the options that would produce the most and least effective outcomes. Statistically, situational judgment tests have validities comparable to structured interviews, biographical data, and assessment centers (Schmidt & Hunter, 1998).


Situational judgment tests are frequently administered to candidates for management positions, although research shows them to be predictive of performance for a wide variety of jobs. Studies have found validity evidence for situational judgment tests’ ability to predict supervisory performance (Motowidlo, Hanson, & Craft, 1997) and to predict job performance for sales professionals (Phillips, 1992), insurance agents (Dalessio, 1994), and retail store employees (Weekley & Jones, 1997).

Work-Sample Tests

Work-sample tests evaluate an applicant’s level of performance when demonstrating a small sample set of a job’s specific tasks. The two general areas for work-sample tests are motor skills and verbal abilities. A test that requires a machinist applicant to properly operate a drill press is an example of a motor skills work-sample test; a test that asks a training applicant to present a portion of the organization’s training program is a verbal ability work-sample test.

One advantage of work-sample tests is that they generally show a high degree of job relatedness, so applicants perceive that they have a high degree of face validity. Additionally, these tests provide applicants with a realistic job preview. The disadvantage is that they can be expensive to develop and administer.


Consider This: The Costs of Testing

1. For most of the tests discussed in this section, expense is a major drawback. Why do you think an organization would go to the trouble of developing and using employment tests?
2. How might the expense of a test be justified, offset, or overcome?

3.5 Testing for Individual Differences

People differ in psychological and physical characteristics, and identifying and categorizing people with respect to these differences is important for successfully predicting both job performance and job satisfaction. The most commonly tested categories of individual differences are cognitive ability, physical ability, personality, integrity, and vocational interests. Each of these categories has an important theoretical foundation as well as its own set of advantages and disadvantages.

Cognitive Abilities

The past hundred years of study have produced two distinct concepts of cognitive ability. Beginning with Spearman’s seminal research in 1904 on general intelligence, one concept is based on the belief that cognitive ability is a single, unitary construct (called the g, or general, factor) along with multiple subfactors (called s factors). According to this two-factor, or hierarchical, theory of cognitive ability, the g factor is important to all cognitive performance, whereas s factors influence specific intelligence domains. For example, your performance on a math test will be influenced by both your general, overall intelligence (the g factor) and your knowledge of the specific math topic being tested (an s factor). From test to test, your scores will be strongly correlated, because all performance is influenced by the g factor; however, the influence of s factors will keep the correlation from being perfect. So, although a high level of general intelligence (a strong g factor) might mean that you would score higher than most on both a math test and a verbal reasoning test, your math test scores could be lower than your verbal reasoning scores simply because you never took a class that covered the specific topics on the math test (an s factor).

Led by Thurstone’s research, begun in 1938, scientists challenged Spearman’s theories by proposing that cognitive ability was a combination of multiple distinct factors, with no overarching factor. Using a statistical technique called factor analysis, Thurstone and his colleagues identified seven primary mental abilities: spatial visualization, number facility, verbal comprehension, word fluency, associative memory, perceptual speed, and reasoning (Thurstone, 1947). This theory suggests that employment tests should evaluate the primary mental abilities that are most closely linked to a specific job. For example, a test for engineering applicants would focus on spatial and numerical abilities.
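For readers curious about the mechanics, the sketch below shows the kind of factor analysis Thurstone applied to batteries of test scores, run here on simulated data. The data, the choice of two factors, and the use of scikit-learn are illustrative assumptions, not a reconstruction of his study.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Simulated scores for 200 test takers on 6 subtests (columns): the
# first three are driven by a "verbal" ability, the last three by a
# "spatial" one, plus noise.
verbal = rng.normal(size=(200, 1))
spatial = rng.normal(size=(200, 1))
scores = np.hstack([
    verbal + 0.5 * rng.normal(size=(200, 3)),
    spatial + 0.5 * rng.normal(size=(200, 3)),
])

fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(scores)
# Each row is a factor; large loadings show which subtests cluster
# together as a distinct mental ability.
print(np.round(fa.components_, 2))
```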

Although there is no consensus, research has supported Spearman’s hierarchical model of cognitive ability (Carroll, 1993; Schmid & Leiman, 1957). Consequently, employers tend to use tests to measure both general intelligence and specific mental domains. Those that focus on the g factor are called general cognitive ability tests. They measure one or more broad mental abilities, such as verbal, mathematical, or reasoning skills. General cognitive ability tests can be used to evaluate candidates for almost any job, especially those in which cognitive abilities such as reading, computing, analyzing, or communicating are involved. Specific cognitive ability tests measure the s factors and focus on discrete mental abilities such as reaction time, written comprehension, and mathematical reasoning. These tests must be closely linked to the job’s specific functions.

Cognitive ability tests are among the most widely used tests because they are highly effective at predicting job and training success across many occupations. In their landmark study, Schmidt and Hunter (1998) examined validity evidence for 19 different selection processes from thousands of studies over an 85-year period. After compiling a meta-analysis (a combination of the results of several studies that address a set of related research hypotheses), Schmidt and Hunter found that cognitive ability test scores correlated with job performance at .51 and training success at .53, the highest validity coefficients among all the types of tests they examined. Other researchers found similar validities using data from European countries (Bertua, Anderson, & Salgado, 2005; Salgado, Anderson, Moscoso, Bertua, & de Fruyt, 2003). Interestingly, additional research has found that a job’s complexity positively affects the validity of cognitive ability tests. In other words, the more complex the job, the better the test is at predicting future job performance. For jobs with low complexity, on the other hand, high cognitive ability test scores are less important for predicting successful job performance (Hunter, 1980; Schmidt & Hunter, 2004).
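To make figures like .51 concrete: a validity coefficient is simply the correlation between test scores and a criterion such as later job performance. The toy computation below uses fabricated numbers purely for illustration.

```python
import numpy as np

# Hypothetical data: selection-test scores and later performance ratings.
test_scores = np.array([72, 85, 90, 60, 78, 95, 66, 88])
job_performance = np.array([3.1, 3.9, 4.2, 2.8, 3.5, 4.5, 3.0, 4.0])

# The off-diagonal entry of the correlation matrix is the validity
# coefficient for this (tiny, fabricated) sample.
validity = np.corrcoef(test_scores, job_performance)[0, 1]
print(f"validity coefficient: {validity:.2f}")
```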

The most popular general cognitive ability test is the Wonderlic Cognitive Ability Test. Developed by Eldon F. Wonderlic in the 1930s, this test contains 50 items, has a 12-minute time limit, and is used by both private businesses and government agencies. Test norms have been set by more than 450,000 working adults and are available for over 140 different jobs. Test content covers numerical reasoning, verbal comprehension, and spatial ability. The test begins with very easy questions and progresses to very difficult ones; due to this range and the short time limit, few people are able to answer all 50 questions correctly. The test is offered in a variety of versions, including computer- and paper-based formats, and is now available in nine languages, including French, Spanish, British English, and American English. In all, more than 130 million job applicants have taken the Wonderlic.

The Wechsler Adult Intelligence Scale-Revised (WAIS-R), developed by David Wechsler in 1955 and currently in its fourth edition, is another commonly used general cognitive ability test. It differs from the Wonderlic in both length and scope and is composed of 11 different tests (6 verbal and 5 performance), requiring 75 minutes to complete. The 6 verbal tests are comprehension, information, digit span, vocabulary, arithmetic, and similarities. The performance tests are picture completion, picture arrangement, object assembly, digit symbol, and block design. Naturally, this complex psychological assessment requires well-trained administrators to ensure accurate scoring and score interpretation. The WAIS-R is typically used when selecting for senior management or other positions that require complex cognitive thinking.

Outside the world of work, a person’s cognitive ability also predicts his or her academic success. Using meta-analytic research, psychologists examined the relationship between the Miller Analogies cognitive ability test (a test commonly used to select both graduate students and professional-level employees) and student and professional success. Interestingly, results showed that there was no significant difference between the cognitive abilities required for academic and business success (Kuncel, Hezlett, & Ones, 2004).

Although they are valid performance predictors for many jobs, cognitive ability tests can produce different selection rates for individuals in select classes. Whites typically score higher than African Americans, and there is much concern that these differences are due to bias within the test. If a test is biased to favor one ethnic group or population over others, then any employment selection or promotion program that utilizes it will be inherently flawed. Discovering bias within a test can be extremely difficult, but I/O psychologists have been able to reduce the impact of potentially biased cognitive ability tests by adding noncognitive tests, such as personality tests, to selection processes (Olson-Buchanan et al., 1998).

Physical Abilities

Many jobs require a significant amount of physical effort. Firefighters and police officers, for example, may need physical strength, and mechanics may need finger dexterity. Other examples of hazardous or physically demanding work environments include factories, power plants, and hospitals. Organizations must be careful to use information from a job analysis to understand the specific physical requirements of a job before purchasing or developing any work-sample tests to use in the selection process. Fleishman (1967) identifies nine physical ability characteristics present in many jobs (see Table 3.1). Measures of each physical ability are not strongly correlated, however, which suggests that there is no overall measure of general physical ability.

Table 3.1: Fleishman’s physical ability dimensions

Static strength: Maximum muscle force exerted by muscle groups (e.g., legs, arms, hands) over a continuous period of time
Explosive strength: Explosive bursts of muscle energy over a very short duration to move the individual or an object
Dynamic strength: Repeated use of a single muscle over an extended period of time
Trunk strength: Ability of back and core muscles to support the body over repeated lifting movements
Extent flexibility: Depth of flexibility of arms, legs, and body
Dynamic flexibility: Speed of flexibility of arms, legs, and body
Gross body coordination: Ability to coordinate arms, legs, and body to perform activities requiring whole-body movements
Gross body equilibrium: Ability to coordinate arms, legs, and body to maintain balance and remain upright in unstable positions
Stamina: Ability to exert oneself physically over a long duration

Research consistently demonstrates that physical ability tests predict job performance for physically demanding jobs (Hogan, 1991). Identifying individuals who cannot perform the essential physical functions of a job—especially in hazardous positions such as police officer, firefighter, and military personnel—can minimize the risk of physical harm to the job candidate, other employees, and civilians. Another positive feature of physical ability tests is that they are not strongly correlated with cognitive ability tests, which, as mentioned earlier, can be biased. Thus, using physical ability tests in conjunction with cognitive ability tests can help decrease potential bias and make job performance predictions more accurate (Carroll, 1993).

I/O psychologists must be careful to design and measure physical ability tests so they do not discriminate against minority groups. Unfortunately, although a job may legitimately need candidates who possess specific physical skills, the standards and measures of those skills are often arbitrarily or inaccurately made. For example, height and weight are often used as a proxy for physical strength. Even though these measurements are quick and easy to make, they are not always the most accurate, and they have resulted in the underselection of women for physically demanding jobs (Meadows v. Ford Motor Co., 1975). Companies that fail to implement accurate, careful measurements of an applicant’s ability to perform a job’s necessary, specific physical requirements run the risk of discriminating against protected classes—something that will likely land them in court, where judges and juries consistently rule in favor of the plaintiff and award large settlements.


One example of a valid, commercially available physical ability test is the Crawford Small Parts Dexterity Test. This test is used to assess the fine motor skills and manual dexterity required for industrial jobs. It examines applicants’ ability to place small objects into small holes on a board. For their first task, test takers use a pair of tweezers to place 36 pins into holes and position a collar around each pin. In their second task, test takers use a screwdriver to insert 36 screws into threaded holes. The test is scored in two ways: by measuring the amount of time taken to complete both parts of the test and by measuring the number of items completed during a set time limit (3 minutes for part 1 and 5 minutes for part 2).

Personality

I/O psychologists have studied how personality affects the prediction of job performance since the early 20th century. After examining 113 studies published from 1913 to 1953, Ghiselli and Barthol (1953) found positive but small correlations between personality and the prediction of job performance. The researchers were surprised that the correlation was not stronger and suggested that the companies had used personality tests with weak validity evidence. Developing and using tests with stronger validity evidence, they concluded, would facilitate better predictions of applicants’ future job performance.

Guion and Gottier (1965) disagreed with this notion, suggesting instead that personality measures were unrelated to job performance—even though the two noted that their data came from studies that used poorly designed or theoretically unfounded personality measures. In fact, most researchers at the time developed their own concepts of personality and created tests to match them, which naturally led to considerable inconsistency in the ways they measured personality constructs. Thus, although organizations continued to use personality tests to select candidates for management and sales positions, academic research in this area waned for the next 20 years.

In the early 1990s Barrick and Mount’s (1991) landmark meta-analysis established the five-factor model of personality, now commonly referred to as the Big Five personality factors—extraversion, agreeableness, conscientiousness, neuroticism, and openness to experience—which are described in detail in Table 3.2. This model is broader than earlier theories and therefore lends itself more readily to a useful classification for the interpretation of personality constructs (Digman, 1990).

Table 3.2: Big Five personality factors

Neuroticism (also referred to as adjustment): Insecure, untrusting, worried, guilty
Extraversion (sociability): Sociable, gregarious, fun, a people person
Openness to experience (inquisitiveness): Risk taking, creative, independent
Agreeableness (interpersonal sensitivity): Empathetic, approachable, courteous
Conscientiousness (mindfulness): Dependable, organized, hardworking


The most important advantage of the five-factor model is the functional structure it provides for predicting the relationships between personality and job performance. Barrick and Mount (1991) reviewed 117 criterion-related validation studies published between 1952 and 1988 that measured at least one of the five personality factors. Results showed that conscientiousness (a measure of dependability, planfulness, and persistence) was predictive of job performance for all types of jobs. Further, extraversion (a measure of energy, enthusiasm, and gregariousness) predicted performance in sales and management jobs. The other three factors were found to be valid but were weaker predictors of some dimensions of performance in some occupations. That same year, Tett, Jackson, and Rothstein (1991) found strong positive predictive evidence not only for conscientiousness and extraversion but also for agreeableness and openness to experience. On the other hand, they found neuroticism to be negatively related to job performance. These researchers also discovered that validity was higher among studies that referenced job analysis information to create tests that linked specific personality traits with job requirements. In summary, then, measures of the Big Five personality factors can significantly predict job performance, but to do so, they must be carefully aligned with critical job functions.

In addition to strong criterion-related validity, measures of the Big Five factors also generally show very little bias. Across a number of personality factors, score differences across racial groups and/or genders are minor and would be unlikely to adversely impact employment decisions; two areas that fall outside this generalization are agreeableness, in which women score higher than men, and dominance (an element of extraversion), in which men score higher (Feingold, 1994; Foldes, Duehr, & Ones, 2008). Because they produce almost no adverse impact, personality tests can be used in conjunction with cognitive ability tests during selection processes to increase validity while reducing the threat of potential bias (Hough, Oswald, & Ployhart, 2001).

The first test to examine the Big Five personality factors was the NEO Personality Inventory, named for the first three factors of neuroticism, extraversion, and openness to experience. Composed of 240 items, the test can be completed in 30 to 40 minutes and is available in a number of languages, including Spanish, German, and British English. However, research now supports much shorter versions, as short as 10 items (Gosling, Rentfrow, & Swann, 2003), which are ideal for use in work settings, either separately or in combination with other tests and survey measures.
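As an illustration of how short self-report inventories of this kind are typically scored, the sketch below averages the 1–7 item responses within each factor after flipping reverse-keyed items. The item numbers and keying are hypothetical, not the published key of the NEO or any other instrument.

```python
# Hypothetical scoring key: each factor has two items on a 1-7 scale;
# items marked True are reverse-keyed and must be flipped (8 - rating).
SCORING_KEY = {
    "extraversion":      [(1, False), (6, True)],
    "agreeableness":     [(2, True),  (7, False)],
    "conscientiousness": [(3, False), (8, True)],
    "neuroticism":       [(4, False), (9, True)],
    "openness":          [(5, False), (10, True)],
}

def score_big_five(responses):
    """responses: dict mapping item number to a 1-7 rating."""
    scores = {}
    for factor, items in SCORING_KEY.items():
        values = [(8 - responses[i]) if reverse else responses[i]
                  for i, reverse in items]
        scores[factor] = sum(values) / len(values)
    return scores

ratings = dict(enumerate([5, 6, 7, 2, 6, 3, 5, 4, 2, 6], start=1))
print(score_big_five(ratings))
```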

Types of Personality Tests

There are two basic types of personality tests: projective tests and self-report inventories. The former presents test takers with an ambiguous image, such as an inkblot, and asks them to describe what they see. Trained psychologists then interpret the descriptions. The rationale for this type of test is that test takers will project their unconscious personalities into their descriptions of the image. Two examples of projective tests are the Rorschach (inkblot) test and the Thematic Apperception Test.

Projective tests are most often used by clinical psychologists and are rarely used in employee selection processes because they are expensive, are time consuming, and require professional interpretation. Instead, employers make use of self-report personality inventories, which ask individuals to identify how a situation, behavior, activity, or feeling is related to them. The rationale for this type of test is that test takers know themselves well enough to make an accurate report of their own personality. Some advantages of self-report inventories over projective tests are cost-effectiveness, standardization of administration practices, and ease in scoring and interpreting results. For example, the website below can help you assess your Big Five personality traits.

Find Out for Yourself: Your Big Five Personality Traits

Visit the following website to find extensive information and research about the Big Five personality traits, take the Big Five personality test, and get instant feedback.

The Big Five Project Personality Test

Unfortunately, self-report inventories have drawbacks. A major one is the tendency of test takers to distort or fake their responses. Because test items usually have no right or wrong answers, test takers can easily choose to provide socially acceptable responses instead of true answers in order to make themselves look better to the hiring organization. Indeed, in one controlled study, researchers instructed test takers to try to respond to a personality inventory in a way they felt would create the best impression, which resulted in more positive scores (Hough, Eaton, Dunnette, Kamp, & McCloy, 1990). Furthermore, a significant number of actual applicants who took personality inventories as part of a selection process were found to have distorted their responses to appear more attractive, even without having been told to do so (Stark, Chernyshenko, Chan, Lee, & Drasgow, 2001).

The real question for I/O psychologists is whether response distortion significantly affects the validity of personality inventories. Ones, Viswesvaran, and Reiss (1996) conducted a meta-analysis that examined the effects of social desirability on the relationship between measures of Big Five factors and both job performance and counterproductive behaviors. They found that test takers’ attempts to provide socially acceptable—not necessarily truthful—answers did affect their scores in the areas of emotional stability and conscientiousness but did not seriously influence test validity. However, faking answers can influence hiring decisions by changing the rank ordering of job candidates (Christiansen, Goffin, Johnston, & Rothstein, 1994).

One interesting and paradoxical finding of personality test research is that people who are able to recognize the socially acceptable answers, whether or not those answers are true of them, tend to perform better on the job than people who are unable to do so (Ones et al., 1996). How can this be? One explanation is that people in the former group are better at reading a situation’s subtle social cues and are therefore more able to extrapolate what they need to do to fulfill coworkers’ and managers’ expectations.

To balance the advantages and disadvantages of projective and self-report tests, a new type of test, called implicit measures, has emerged. Implicit measures are self-report tests in which the questions are intentionally designed to make the purpose of the test less obvious and thus less amenable to faking and social desirability biases. For example, the test taker may be given a few seemingly neutral situations and a list of thoughts, feelings, and actions and be directed to select the ones that most closely represent him or her in each situation. This intentional vagueness allows implicit measures to assess a construct more accurately and comprehensively (Bing, LeBreton, Davison, Migetz, & James, 2007; LeBel & Paunonen, 2011).


Honesty and Integrity

Organizations need to be able to identify individuals who are likely to engage in dishonest behaviors. Employee misconduct is more serious than distortion of answers on a personality test and can have a significant impact on both coworkers and the organization as a whole. Employee theft, embezzlement, and other forms of dishonesty may cost American businesses billions of dollars annually. According to the National White Collar Crime Center, embezzlement alone is estimated to cost companies as much as $90 billion each year (Bressler, 2009). In the past, organizations used polygraph tests, but polygraph test results are not always accurate, and applicants may find them to be an invasion of privacy. When the Employee Polygraph Protection Act was passed in 1988, most private employers became unable to use these tests, thus requiring them to find another way to assess applicant honesty.

A more valid way to measure employee dishonesty is with an integrity test. Integrity tests fall into two categories: overt integrity tests and personality-based integrity tests. The first type assesses an individual’s direct attitudes and actions toward theft and employment dishonesty. Test items typically ask individuals to consider their opinions about theft behaviors or to think of their own dishonest behaviors. Sample questions include “Is it OK to take money from someone who is rich?” and “Have you taken illegal drugs in the past year?”

Personality-based integrity tests typically contain disguised-purpose, or covert, questions that measure various personality factors—such as responsibility, virtue, rule following, excitement seeking, anger, hostility, and social conformity—that are related to both productive and counterproductive employee behaviors. Although overt integrity tests can predict theft and other glaring forms of dishonesty, personality-based integrity tests are able to predict behaviors that are more subtly or secretly dishonest, such as absenteeism, insubordination, and substance abuse.

Vocational Interests

Unlike most of the tests we have discussed so far, vocational interest inventories are designed for career counseling and should not be used for employee selection. In these inventories, test takers respond to a series of statements pertaining to various interests and preferences. In theory, people who share the preferences and interests of successful workers in a given occupation should experience a high level of job satisfaction if they pursue that line of work. Vocational interest scores predict future occupational choices reasonably well, with between 50% and 60% of test takers subsequently working in jobs consistent with their vocational interests (Hansen & Dik, 2005). However, even though people are likely to get a job doing something in which they are interested, research does not support the notion that vocational interest will always lead to high job performance or satisfaction. In fact, interest congruence is only weakly related to job satisfaction and does not effectively predict either job or training performance (Tranberg, Slane, & Ekeberg, 1993; Schmidt & Hunter, 1998). Keep in mind that just because someone is interested in a certain job does not mean he or she will be able to perform it well.

One frequently used measure of vocational interest is the Strong Interest Inventory (SII), previously known as the Strong Vocational Interest Blank. The SII is a self-report inventory composed of 291 items divided into six themes: artistic, conventional, social, realistic, enterprising, and investigative. The test requires 25 minutes to complete and is administered and scored by computer. The results help test takers identify occupations, leisure activities, and work preferences that match their interests. It possesses norms for 211 different occupations. Because interests tend to remain stable throughout a person’s life, it makes sense for high school and college students to take an interest inventory like the SII as they begin the process of developing their professional careers.

Another example is the Armed Services Vocational Aptitude Battery (ASVAB). This test assesses a wide range of abilities that predict future success in the military. More than 1 million military applicants, high school students, and postsecondary students take this test every year. Test subcategories include reading comprehension, word knowledge, science, math, electronics, and mechanics.

Find Out for Yourself: Occupational Interests and Personal Abilities

Complete the Interest Profiler available on the O*NET website.

O*NET Interest Profiler

What Did You Learn?

1. What did you find out about your occupational interests and personal abilities?
2. Are you currently working in a job that aligns with your occupational interests? Why or why not?

3.6 Developing a Testing Program

Although creating, identifying, and using valid tests are essential for any quality testing program, organizations also face a number of administrative decisions that can affect the program’s overall success.

Deciding When Not to Test

Most important, an organization must decide whether to use testing in the selection process. Time and cost are extremely important considerations, and they include test development and design, necessary equipment, facility usage, and administrator/evaluator training and pay. Naturally, organizations will need to ensure that the benefits of their testing program outweigh the costs.

Sometimes, the level of employee productivity that a test is able to identify is not high enough to warrant the expense of a testing program. In these cases other measures, such as improving employee training and development, can help advance new hires’ performance. Alternatively, conducting a more careful review of applicants’ educational backgrounds or asking more in-depth interview questions can provide greater insight into the job-related skills and abilities of potential employees, without having to add tests to the preemployment process.


In summary, then, it is important for an organization both to establish its employment needs and to determine the potential benefits and expected costs of testing programs before implementing these useful selection tools.

Finding Quality Tests

Over the past few decades, the volume and variety of selection tests have increased dramatically. Unfortunately, the test publishers and consulting companies that design them use varying levels of rigor and expertise. How, then, is an organization to know which tests have met acceptable standards of validity, reliability, and test norming? Two periodicals, the Mental Measurements Yearbook and Tests in Print, are excellent reference sources that provide descriptions and expert reviews of a large number of tests, including those used for selection. As you learned in Chapter 1, you can also consult the original scientific research that was conducted to develop and validate various tests in order to assess their quality and rigor.

Find Out for Yourself: Quality of Selection Methods

Research various selection methods with which you are familiar or that you have undergone in the past. Examples may include job applications, interviews, reference checks, medical exams, and referrals. Try to find validity and reliability scores for each method. Which ones are more valid? Which ones are more reliable? Why do you think that is the case?

Test Administrators

A test’s usefulness depends in part on its proper administration, scoring, and interpretation. Organizations must not only train their testing administrators on these key functions but also establish quality controls and retrain administrators when necessary. The requirements for administrator qualifications and abilities vary from test to test, so it is important for organizations to be aware of the requirements outlined in each test’s manual when selecting and administering tests.

Addressing Ethical and Privacy Issues

Maintaining ethical testing practices requires that test security be a major concern for I/O psychologists and organizations. Tests and scores must remain confidential. Questions should never be published or distributed to the public, and tests should only be administered to qualified individuals.

Some applicants may view tests as an invasion of privacy, particularly ones that assess personality and integrity or screen for drugs. As we have noted, fear or mistrust in the selection process can have adverse consequences. Organizations can alleviate some of these concerns by communicating to applicants the reasons for the test, how test results will be used, and how confidentiality will be maintained.


Testing People With Cultural and Language Differences

Differences in cultural backgrounds can shape how test takers interpret and respond to test questions. As the American workforce becomes more racially and ethnically diverse, it is critical that organizations emphasize the use of culturally unbiased tests. Moreover, English is no longer the primary language for a growing number of applicants. Naturally, applicants who cannot fluently read or speak the language used in a test will likely return artificially low scores, not because they lack skills or knowledge but simply because they cannot comprehend the instructions or understand the test questions. To overcome language barriers, a test can be translated into a variety of languages. However, a common problem with this approach is that expressions and phrases used in the test items may be lost in translation, which decreases the test’s validity generalization. Thus, additional validation is necessary whenever a test will be used for a different racial or ethnic group or translated into a different language, and the test may need to be adapted accordingly.

Testing People With Disabilities

The ADA protects qualified individuals with disabilities from discrimination in all areas of employment, including employment testing. It can be challenging for organizations to accommodate individuals with disabilities; they must aim to be sensitive to the needs of the individual while also maintaining the integrity of the testing process and avoiding undue hardship. Test administrators require training to understand and properly respond to accommodation requests. Examples of reasonable accommodations include modifying test equipment or seating, ensuring accessibility to the testing facility, and providing a Braille or large-print version of a test to visually impaired candidates.

Establishing Appeals and Retest Processes

Every applicant should have the opportunity to perform at his or her best on a test. Despite every intention to create this opportunity, sometimes it is simply not possible. Equipment can malfunction, the testing environment can be poor (noise, temperature, bad odors, or even disasters such as fire or flood), and candidates can be affected by outside stressors (illness or hospitalization of a family member, among others). In each of these situations, candidates could perform significantly better if given the opportunity to retake the test under conditions in which the negative influences are not present.

Test administrators should be trained to identify situations that produce invalid results and then implement a specific process to retest the candidate based on information and guidance provided by the test publisher. The organization should also establish policies for handling complaints regarding testing in order to resolve concerns fairly and consistently.

3.7 Psychological Testing: Special Issues

Over the past decade, I/O psychologists have become interested in a number of questions related to employment testing. How do applicants feel about being tested? Do these feelings affect their perceptions of the company? Do online tests show the same validity as paper-and-pencil tests, and how can organizations keep applicants from cheating on them? Recent research findings shed some light on these interesting questions.

Applicants’ Reactions to Tests

Most research about testing has focused on technical aspects—content, type, statistical measures, scoring, and interpretation—and not on social aspects. The fact is, no matter how useful and important tests are, applicants generally do not like taking them. According to a study by Schmit and Ryan (1997), 1 out of 3 Americans has a negative perception of employment testing. Another study found that students, after completing a number of different selection measures as part of a simulated application process, preferred hiring processes that excluded testing (Rosse, Ringer, & Miller, 1996). Additional research has shown that applicants’ negative perceptions about tests significantly affect their opinions about the organization giving the test. It is important for I/O psychologists to understand how and why this occurs so they can adopt testing procedures that are more agreeable to test takers and that reflect a more positive organizational image.

Typically, negative reactions to tests lower applicants’ perceptions of several organizational outcome variables, including organizational attraction (how much they like a company), job acceptance intentions (whether they will accept a job offer), recommendation intentions (whether they will tell others to patronize or apply for a job at the company), and purchasing intentions (whether they will shop at or do business with the company). As Noon (2006) framed it, “[e]mployment tests provide organizations with a serious dilemma: . . . how can [they] administer assessments in order to take advantage of their predictive capabilities without offending the applicants they are trying to attract?” (p. 2). With the increasing war for top-quality talent, organizations must develop recruitment and selection tools that attract, rather than drive away, highly qualified personnel.

What Can Be Done?

I/O psychologists have addressed this dilemma by identifying several ways to improve testing perceptions. One is to increase the test’s face validity. For example, applicants typically view cognitive ability tests negatively, but their reactions change if the test items are rewritten to reflect a business-situation perspective. Similarly, organizations can use test formats that already tend to be viewed positively, such as assessment centers and work samples, because they are easily relatable to the job (Smither et al., 1993).

Providing applicants with information about the test is another, less costly way to improve perceptions. Applicants can be told what a test is intended to measure, why it is necessary, who will see the results, and how these will be used (Ployhart & Hayes, 2003). Doing so should lead applicants to view both the organization and its treatment of them during the selection process more favorably (Gilliland, 1993). Noon’s 2006 study investigated applicants’ reactions to completing cognitive ability and personality tests as part of the selection process for a data-processing position. Half of the applicants received detailed information explaining the testing process, while the other half received the standard information normally provided by the company. Applicants who received the detailed information found the company more attractive and were more likely to recommend it to others, even if they did not receive a job offer. Although this tactic is not always used during the testing process, providing detailed information about a test is a quick, cost-effective, and practical way for organizations to improve applicants’ test perceptions (Lounsbury, Bobrow, & Jensen, 1989).

Online Administration

Over the past decade, online testing has increased dramatically. Also referred to as unproctored Internet testing, the online test has replaced many traditional paper-and-pencil alternatives, and almost every type of test is now available through online administration. Online tests have a number of advantages over paper-and-pencil tests. They can be taken by anyone from nearly anywhere in the world, increasing the pool of applicants for a job. Brick-and-mortar testing facilities and proctors no longer need to be a part of the testing program, because applicants can complete the test from home or a public library. The amount of administrative support also decreases, further lowering costs. Hiring decisions are made faster using online administration, because the testing software provides immediate scoring and feedback on test performance. Finally, online tests often take advantage of new interactive technology, such as video clips, simulations, and gaming platforms, and thus test takers find them more engaging.

Despite the ease and mass marketability of Internet tests, however, I/O psychologists struggle to reach a consensus on their efficacy and ethicality as well as the validity of the test scores they yield. Online tests present a number of challenges, including applicant cheating, potential for subgroup differences, and the inability to verify the test taker’s identity (Tippins et al., 2006). Current research suggests that unproctored Internet tests are not compromised by cheating (Nye, Do, Drasgow, & Fine, 2008), although undoubtedly there will be occasions when an applicant will feel the stakes are high enough to rationalize cheating. One solution to this potential problem is to require candidates to complete a portion of the test under proctored conditions at the organization prior to receiving a job offer.

Regardless of the challenges, online tests are here to stay. The advantages far outweigh the disadvantages, leaving I/O psychologists with the “delicate task of balancing best ‘academic’ practice with often conflicting practical and organizational needs to implement valid and reliable online assessments” (Kaminski & Hemingway, 2009, p. 26).



3.8 Testing for Positivity

Selecting for KSAOs is important, but so is selecting for positivity. Psychological testing can be an effective way to assess applicants’ positivity levels. Although valid and reliable tests are available to assess many physical, cognitive, social, and psychological traits and abilities, tests are just starting to emerge that adequately assess positivity in general as well as specific positive psychological qualities in an applicant or an employee.

This is not to say that established psychological assessments are predominantly negative. To the contrary, many of the most recognized psychological tests are positive in nature. For example, the Big Five personality traits include four clearly positive traits—conscientiousness, extraversion, agreeableness, and openness to experience—and only one negative trait, neuroticism. However, many of the existing psychological tests are primarily based on problem-oriented models and processes, which mainly focus on detecting what an applicant may be lacking rather than on assessing what makes an applicant flourish and thrive in the workplace.

Inherently, an applicant’s positivity and negativity are often considered to be opposite sides of the same coin. For example, an employer might assume that an employee who tests high on optimism will be low on pessimism, or that an employee who is high on positive affect will also be low on negative affect. These assumptions have gone unchallenged for years, and many organizations and consultants have readily made extrapolations from positive to negative and from negative to positive psychological characteristics, as if they were two ends of the same continuum. However, recent studies show that these assumptions may not always hold true. For example, Schaufeli and Bakker’s (2004) study showed strong evidence that experiencing burnout and engagement at work are two distinct psychological constructs rather than polar opposites of the same continuum, and each is affected by different job characteristics. Another example is work behavior. Positive and negative work behaviors have been shown to be distinct and to yield different, not just opposite, performance outcomes (Dunlop & Lee, 2004; Sackett, Berry, Wiemann, & Laczo, 2006).

Several resources are available for those employers searching for valid and reliable tests of positivity. The reference Positive Psychological Assessment: A Handbook of Models and Measures (Lopez & Snyder, 2003), published by the American Psychological Association, evaluates the validity and reliability of numerous tests of positive psychological constructs such as optimism, hope, confidence, creativity, and courage. Of course, for each psychological construct, there are several alternative tests. Some offer higher validity or reliability than others, so employers should carefully select not only which constructs they should test and evaluate but also which specific tests to use.



Three specific positivity tests have been found particularly relevant and predictive of work performance and other desirable outcomes such as job satisfaction, organizational commitment, and overall well-being. The first of these is Gallup’s StrengthsFinder (Rath, 2007), which assesses test takers on 34 different “talents” and is widely used in the United States and around the world to select, place, and fit job candidates in the right jobs, usually based on the test taker’s top five talents.

The second is the Psychological Capital Questionnaire (PCQ-24), which has been recently developed and validated by Fred Luthans and colleagues (Luthans, Avolio, Avey, & Norman, 2007; Luthans, Youssef-Morgan, & Avolio, 2015) at the University of Nebraska. It is a 24-item measure that assesses the test taker’s levels of hope, confidence, resilience, and optimism and combines these four psychological resources into one higher order positive construct. There is also a short, 12-item version of this test (PCQ-12; Avey, Avolio, & Luthans, 2011; Luthans, Youssef, Sweetman, & Harms, 2013) that has been translated into a number of languages and tested in numerous cultures (Wernsing, 2014), as well as an implicit version that uses adaptable positive, negative, and neutral situations to assess test takers’ reactions (Harms & Luthans, 2012).

The third is Barbara Fredrickson’s (2009) positivity ratio assessment, which measures positivity and negativity as two independent constructs and then calculates the ratio of positive-to-negative responses. See the feature box Find Out for Yourself: How Positive Are You? to learn more about this assessment.
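The arithmetic behind such a ratio is simple, as the sketch below shows. The 0–4 rating scale, the “felt it at least moderately” threshold, and the handling of zero negatives are illustrative assumptions, not Fredrickson’s actual instrument.

```python
def positivity_ratio(positive_ratings, negative_ratings, felt_threshold=2):
    """Each list holds 0-4 ratings of how strongly an emotion was felt.

    An emotion counts toward the tally if rated at or above felt_threshold;
    positives and negatives are tallied separately because they are
    treated as independent constructs, not opposite ends of one scale.
    """
    positives = sum(1 for r in positive_ratings if r >= felt_threshold)
    negatives = sum(1 for r in negative_ratings if r >= felt_threshold)
    # Avoid division by zero when no negative emotions were reported.
    return positives / max(negatives, 1)

print(positivity_ratio([3, 2, 4, 1, 3], [0, 2, 1, 0, 1]))  # -> 4 / 1 = 4.0
```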

Find Out for Yourself: How Positive Are You?

Visit the Positivity website to complete Barbara Fredrickson’s positivity ratio assessment and instantly obtain your own positivity ratio. Keep in mind that this assessment is somewhat volatile and will change depending on the situations you encountered the day before. To get a more accurate assessment, it is recommended that you complete this test multiple times over several days and take an average of your scores. Also keep in mind that some of the statistical analysis behind positivity ratios has been criticized, so be sure to review the 2013 update posted by Fredrickson on the same website.

Positivity Ratio Assessment


Summary and Conclusion

Selecting the right candidates from the available pool of job applicants is critical for employee and organizational success. When the right employee is selected and placed in the right job, performance is higher, which can translate into higher productivity and financial returns for the organization. For example, well-placed employees tend to go above and beyond the immediate job requirements, which can positively influence coworkers and promote a positive organizational culture. Properly selected employees are also likely to stay with the organization longer and be absent less often, which can translate into enormous cost savings. Equally important, well-placed employees will likely experience more satisfaction with their jobs and the organization, have higher work engagement levels, and perceive their jobs as more meaningful, all of which contribute to higher employee well-being.

Because effective employee selection relies on predicting subsequent performance on the job, it is beneficial for managers to use the most accurate and consistent predictors available. Psychological testing affords managers the opportunity to use valid and reliable tests that can fulfill this important role. However, many managers continue to use highly subjective approaches to selection, which often ends up wasting their time and their organization’s resources on selecting, training, and managing the wrong job applicants, or worse, exposing their organization to discrimination-based lawsuits that can be time consuming and costly and can compromise its reputation.

The role an I/O psychologist plays in the test-selection process is threefold. First, I/O psychologists can educate managers and organizational decision makers on the importance of finding evidence for a test’s validity and reliability before attempting to use it. Second, they can use available evidence to help organizations discern among multiple tests, a process that managers often perceive as intimidating or difficult to understand. Third, I/O psychologists contribute to the development of more valid and reliable tests and selection tools in areas where none currently exist, helping create the most appropriate and efficient methods for selecting the right candidates in an ever-changing workplace.

Key Terms

constructs: Abstract, intangible personal attributes.

Cronbach’s alpha: A statistical measure of the intercorrelations across items in a test.

meta-analysis: A combination of the results of several studies that address a set of related research hypotheses.

objective tests: Tests that have one clearly correct answer.

percentile scores: The percentage of people in the standardized sample who scored below an individual’s raw score.

raw scores: The number of points a person scores on a test.

reliability: The extent to which the results from a predictor such as a selection tool, method, or procedure can be consistently replicated over time and across situations.

standardization sample: A large group of people who take a test and act as a comparison group against whose scores an individual applicant’s scores can be compared.

subjective tests: Tests that have no definitive right or wrong answer; thus, scores rely heavily on the evaluator’s interpretation and judgment.

test: A method used to make an employment decision about whether to hire, retain, promote, place, demote, or dismiss someone; an instrument or procedure that measures an individual’s employment and career-related qualifications and characteristics or measures samples of behavior or performance.

test norms: The comparison scores provided by a standardization sample.

validity: The extent to which a selection tool or procedure can accurately predict subsequent performance.

validity generalization: The notion that validity evidence transfers across situations.

