The document evaluates a set of language assessment scenarios against five key principles: reliability, validity, practicality, washback, and authenticity.
Scenario 1, a standardized multiple-choice test, is rated high in reliability, validity, practicality, and washback but low in authenticity. Scenario 2, a timed impromptu writing test, is rated medium in reliability and washback but low in validity, practicality, and authenticity. Scenario 3, an oral interview, is rated medium in reliability and washback but low in validity, practicality, and authenticity. Scenario 5, multiple drafts of an essay with peer review, is rated high in reliability and medium in validity and practicality but low in washback and authenticity.
Week 2 exercise_2015 (9)
Week 2 Exercise
Name: Saida Afandiyeva
Azerbaijan
Think about the seven key concepts (usefulness, reliability, validity, practicality, washback,
authenticity, and transparency) defined in this week's reading from Dawn Rogier (2014), and as
defined by Coombe et al. (2007) in last week's reading (their eighth cornerstone was "security").
We have borrowed six testing scenarios from an exercise by Brown & Abeywickrama (2010) and
selected 5 key principles. In the chart below, evaluate the assessment scenarios by rating how well
they fulfill the principles. Use H (high), M (medium), or L (low). An H indicates that the principle
is highly fulfilled and a score of L indicates very low or no fulfillment. Evaluate the scenarios by
using your best intuition in the absence of complete information for each context.
SCENARIO | (Rater) Reliability | (Content) Validity | Practicality | Washback | Authenticity
1. Standardized multiple-choice
proficiency test, no oral or written
production
Ss (Students) receive a report which
includes a total score and subscores
for listening, grammar, proofreading,
and reading comprehension.
Reliability: high | Validity: high | Practicality: high | Washback: high | Authenticity: low
Comments: This is the best type of test among the scenarios, because it meets all the demands of a good test except authenticity.
2. Timed impromptu test of written
English (TWE® Test)
Ss receive a report listing one holistic
score ranging between 0 and 6. No
additional comment is provided.
Reliability: medium | Validity: low | Practicality: low | Washback: medium | Authenticity: low
Comments: This is not a good type of test, because it fails to meet three of the five demands of a good test.
3. One-on-one oral interview to assess
overall oral production ability
Ss receive one holistic score ranging
between 0 and 5. No additional
comment is provided.
Reliability: medium | Validity: low | Practicality: low | Washback: medium | Authenticity: low
Comments: This is a bad type of assessment.
5. S creates multiple drafts of a three-
page essay. Early drafts are peer-
reviewed. S turns in a near-final
version to the T.
T comments on grammatical/rhetorical
errors only and returns it to S (no
grade).
Reliability: high | Validity: medium | Practicality: medium | Washback: low | Authenticity: low
Comments: This is a good type of assessment.
6. S assembles a portfolio of materials
over a semester-long course.
T conferences with S on the portfolio
at the end of the semester, assigning
an overall grade.
Reliability: high | Validity: medium | Practicality: high | Washback: high | Authenticity: low
Comments: This is a very good type of assessment.
Exercise adapted from Brown, H.D. & Abeywickrama, P. (2010). Language Assessment. Pages 48-50.