PSYC 421
ITEM DEVELOPMENT AND ANALYSIS WORKSHEET
Student Name:
Section: PSYC421-
PART 1: Writing Multiple Choice Test Items
Develop one multiple choice question that covers content from each of the four chapters listed below.
When writing your sample questions, please keep in mind the specifications regarding item
construction discussed in the textbook. Also, remember the importance of carefully crafted distractor
options. Finally, please limit the number of response options to 4 (1 correct response and 3
distractors), and avoid the options of “all of the above,” “none of the above,” or the like. Be sure to
indicate which of the response options is the correct one.
Chapter 3 Multiple Choice Question (5 points)
Which one of the following is NOT a scale of measurement?
A. Normal Scale
B. Nominal Scale
C. Interval Scale
D. Ordinal Scale
Answer: A
Chapter 4 Multiple Choice Question (5 points)
A “good test” must have all of the following characteristics EXCEPT:
A. Be valid
B. Serve a useful purpose
C. Be reliable
D. Be simple
Answer: D
Chapter 5 Multiple Choice Question (5 points)
All of the following are approaches to estimating reliability EXCEPT:
A. Parallel-Forms
B. Split-Half
C. Line of Best Fit
D. Test-Retest
D. Test-Retest
Answer: C
Chapter 6 Multiple Choice Question (5 points)
All of the following are qualities of a good criterion EXCEPT:
A. Uncontaminated
B. Relevant
C. Valid
D. Expert tested
Answer: D
PART 2: Item Analysis: Item Difficulty Index (Cohen & Swerdlik, 2017, p. 248)
A test is only as good as its questions! When researchers, test constructors, and educators create items
for ability or achievement tests, we have a responsibility to evaluate the items and make sure that they
are useful and high-quality. The process that we use to evaluate test items is known as Item Analysis.
When bad items are identified and eliminated from a test, the efficiency, reliability, and
validity of the entire test increase! One way that we can distinguish between good and bad items is with the Item
Difficulty Index.
Part 2A: Calculating Item Difficulty
Using the data below, calculate the Item Difficulty Index for the first 6 items on Quiz 1 from a recent
section of PSYC101. For each item, “1” means the item was answered correctly and “0” means it was
answered incorrectly. Type your answers in the spaces provided at the bottom of the table. (2 pts. each)
PSYC101 Quiz 1 Item Distribution and Total Scores
Examinee         Item 1  Item 2  Item 3  Item 4  Item 5  Item 6
Andre            1       1       1       1       1       1
Allison          0       1       1       1       0       0
Heather          1       1       1       1       0       0
Corey            1       1       0       1       1       0
Christina        0       0       1       0       0       1
Jeffrey          0       1       1       1       0       0
Shawn            1       1       1       1       0       1
Dana             0       0       1       1       0       1
Megan            1       1       1       1       0       1
David            0       1       1       1       0       1
Isabel           0       1       0       1       0       0
Lance            1       1       1       1       0       0
Aliyah           0       1       1       1       0       1
Blaire           0       1       1       1       1       1
Gabriel          0       0       1       1       0       0
Item Difficulty  0.4     0.8     0.8667  0.9333  0.2     0.5333
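The item difficulty values in the bottom row can be checked with a short script. This is a minimal sketch, assuming the item difficulty index is the proportion of examinees who answered the item correctly; the function name `item_difficulty` is hypothetical, and the responses are copied from the table above.

```python
# Responses copied from the PSYC101 Quiz 1 table (1 = correct, 0 = incorrect).
responses = {
    "Andre":     [1, 1, 1, 1, 1, 1],
    "Allison":   [0, 1, 1, 1, 0, 0],
    "Heather":   [1, 1, 1, 1, 0, 0],
    "Corey":     [1, 1, 0, 1, 1, 0],
    "Christina": [0, 0, 1, 0, 0, 1],
    "Jeffrey":   [0, 1, 1, 1, 0, 0],
    "Shawn":     [1, 1, 1, 1, 0, 1],
    "Dana":      [0, 0, 1, 1, 0, 1],
    "Megan":     [1, 1, 1, 1, 0, 1],
    "David":     [0, 1, 1, 1, 0, 1],
    "Isabel":    [0, 1, 0, 1, 0, 0],
    "Lance":     [1, 1, 1, 1, 0, 0],
    "Aliyah":    [0, 1, 1, 1, 0, 1],
    "Blaire":    [0, 1, 1, 1, 1, 1],
    "Gabriel":   [0, 0, 1, 1, 0, 0],
}


def item_difficulty(rows):
    """Return, for each item, the proportion of examinees who answered correctly.

    Assumption: item difficulty is (number correct) / (number of examinees).
    """
    n_examinees = len(rows)
    # zip(*rows) turns the per-examinee rows into per-item columns.
    return [round(sum(column) / n_examinees, 4) for column in zip(*rows)]


p_values = item_difficulty(list(responses.values()))
for item_number, p in enumerate(p_values, start=1):
    print(f"Item {item_number}: {p}")
```

A higher value means an easier item: Item 4 was answered correctly by 14 of the 15 examinees, while Item 5 was answered correctly by only 3.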
Part 2B: Calculating Optimal Item Difficulty (1 pt. each)
1. For a test item with two response options (e.g., true/false), what is the probability of selecting the
correct answer by chance?
50%
2. Calculate the optimal level of difficulty for a test question with two response options.
0.75
3. For a test item with three response options, what is the probability of selecting the correct answer
by chance?
33%
4. Calculate the optimal level of difficulty for a test question with three response options.
0.67
5. For a test item with four response options, what is the probability of selecting the correct answer
by chance?
25%
6. Calculate the optimal level of difficulty for a test question with four response options.
0.625
7. For a test item with five response options, what is the probability of selecting the correct answer by
chance?
20%
8. Calculate the optimal level of difficulty for a test question with five response options.
0.6
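The chance and optimal-difficulty values for Part 2B can be worked through in a few lines. This is a sketch that assumes the midpoint rule: optimal difficulty is taken to be halfway between the probability of a correct guess and 1.0 (everyone answering correctly). The function names are hypothetical.

```python
from fractions import Fraction


def chance_success(n_options):
    """Probability of guessing correctly when one of n_options is correct."""
    return Fraction(1, n_options)


def optimal_difficulty(n_options):
    """Midpoint between chance success and 1.0 (assumed midpoint rule)."""
    return (chance_success(n_options) + 1) / 2


for n_options in (2, 3, 4, 5):
    chance = chance_success(n_options)
    optimal = optimal_difficulty(n_options)
    print(f"{n_options} options: chance = {float(chance):.3f}, "
          f"optimal difficulty = {float(optimal):.3f}")
```

Exact fractions avoid rounding drift: with three options the midpoint is 2/3 (about 0.67), and with four options it is 5/8 (0.625).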
PART 3: Item Analysis: Item Discrimination Index (Cohen & Swerdlik, 2017, pp. 250–253)
Another way that test creators can distinguish between good and bad items is with an analysis called
the Discrimination Index. The discrimination index measures how well an individual test item
distinguishes between high scorers and low scorers on the test. An item is considered to be “good” if
most of the high scorers get it right, and most of the low scorers get it wrong.
Interpreting the Discrimination Index (d)
• The discrimination index can range from -1.0 to 1.0.
• The closer d is to 1.0, the better the item discriminates between high and low scorers.
• The closer d is to 0, the more poorly the item discriminates between high and low scorers.
• An item with a negative discrimination index is considered a “negative discriminator” because
more low scorers than high scorers get the item correct.
• A discrimination index of 1.0 means all of the high scorers got the item correct and all of the low
scorers got it incorrect.
• A discrimination index of -1.0 means all of the low scorers got the item correct and all of the
high scorers got it incorrect.
• Items with d values close to 0 or with negative d values ought to be eliminated from the test!
Calculating the Item Discrimination Index (d)
Calculate the item discrimination index (d) for the 7 hypothetical test items presented below. Type
your answers in the spaces provided at the right of the table (2 pts. each).
Item #   U    L    n    d
Item 1   0    30   30   -1
Item 2   25   8    30   0.5667
Item 3   23   19   30   0.1333
Item 4   26   3    30   0.7667
Item 5   28   1    30   0.9
Item 6   19   5    30   0.4667
Item 7   3    26   30   -0.7667
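The d column can be reproduced with a short script. This is a minimal sketch, assuming d = (U - L) / n, where U and L are the numbers of upper-group and lower-group examinees who answered the item correctly and n is the number of examinees in each group. The function name `discrimination_index` is hypothetical.

```python
# (U, L, n) for each hypothetical item, copied from the table above.
items = {
    "Item 1": (0, 30, 30),
    "Item 2": (25, 8, 30),
    "Item 3": (23, 19, 30),
    "Item 4": (26, 3, 30),
    "Item 5": (28, 1, 30),
    "Item 6": (19, 5, 30),
    "Item 7": (3, 26, 30),
}


def discrimination_index(upper_correct, lower_correct, group_size):
    """Assumed formula: d = (U - L) / n, with n the size of each group."""
    return (upper_correct - lower_correct) / group_size


d_values = {name: round(discrimination_index(*counts), 4)
            for name, counts in items.items()}

# Largest d: the item that best separates high scorers from low scorers.
best_item = max(d_values, key=d_values.get)
# d closest to 0: the item that barely separates the two groups.
closest_to_zero = min(d_values, key=lambda name: abs(d_values[name]))
# Negative d: more low scorers than high scorers answered correctly.
negative_items = [name for name, d in d_values.items() if d < 0]

for name, d in d_values.items():
    print(f"{name}: d = {d}")
print("Best discriminator:", best_item)
print("Closest to zero:", closest_to_zero)
print("Negative discriminators:", negative_items)
```

The script only flags candidates. Deciding which items to drop still depends on the interpretation guidelines listed earlier in Part 3.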
Based on your calculations above, answer the following questions (2 pts. each).
1. Which item discriminates the best?
Item 5
2. Which item discriminates most poorly?
Item 3
3. Based on your analysis, identify which two items you would choose to eliminate from this test and
explain why you would eliminate each.
Part 4: Item Characteristic Curves (Cohen & Swerdlik, pp. 253–255)
Another method that test creators can use to assess the usefulness of test items is the Item
Characteristic Curve. Item characteristic curves provide a graphical depiction of examinees’
performance on individual test items. As indicated in the figure below, Total Test Score is plotted on
the x-axis of the curve, while the proportion of examinees who got the item correct is plotted on the y-axis.
Using the figure above, provide a written description of how test items A–E discriminate among
examinees at various levels of performance. In your responses, discuss why each item would be
considered a “good” or a “bad” item. EXAMPLE: “This item discriminates well among high scorers,
but doesn’t discriminate well among low scorers. So this item would be considered a good item
because it discriminates at the highest levels of performance.” (4 pts. each)
Item A: This is a bad item. It does not discriminate among low scorers or among high scorers.
Item B: This would be regarded as a bad item because it discriminates among low scorers but not
among high scorers: the curve rises steeply at first and then descends.
Item C: This is considered a good item because it discriminates little among low scorers and
discriminates well among high scorers. It discriminates at the highest levels of performance.
Item D: This is considered a good item. It discriminates little among low scorers, and its curve
slopes upward among high scorers.
Item E: This is a good item. Its difficulty increases across the range of scores, so it stays
challenging along the way.
Part 5: Qualitative Item Analysis (Cohen & Swerdlik, pp. 258–260)
Qualitative item analysis refers to a set of non-statistical procedures used to gather information about
the usefulness of test items. These analyses typically involve interviews, panel discussions,
questionnaires and other forms of verbal exchange with test-takers to explore how individual test items
work.
As an online student, you have a very different test-taking experience than residential students. Based
on your readings from Chapter 8, identify 4 topics related to online test taking, and create 4 qualitative
questions that you could ask online test-takers to gain an understanding of their experiences with
test-taking. Also, as students at a Christian institution of higher education, course assignments/assessments
are supposed to give students an opportunity to integrate course content with their Christian
worldview. Given the topic of faith and learning, create one qualitative question that you could ask
test-takers.
Qualitative Item Analysis
Topic (2 pts. each)   Sample Question for Test-Takers (2 pts. each)
Applicability         Do you believe this exam tested you on facts that you could use
                      outside of class?
Faith and Learning    In your opinion, does this test adequately incorporate a Christian
                      viewpoint in connection with the learning process?
Relevance             Do you think this test was relevant to the content of the lesson,
                      as well as to the profession you want to pursue?
Difficulty            Do you think this test was more or less challenging than the
                      course lessons?
…