The document proposes a new metric to measure website accessibility by combining results from automatic accessibility evaluations with manual evaluations from multiple human experts. The metric combines accessibility scores from tools and experts as a weighted average, and factors in the number and variance of the evaluations, to produce a single accessibility score between 0 and 1. A preliminary test evaluated barriers on local public administration websites to demonstrate the approach. Future work will refine the metric and the evaluation process.
1. Getting one voice:
tuning up experts assessment in
measuring accessibility
Silvia Mirri
Ludovico A. Muratori
Paola Salomoni
Matteo Battistelli
Department of Computer Science
University of Bologna
W4A 2012, April 16th & 17th, 2012 - Lyon, France
2. Summary
Introduction
Automatic and manual accessibility evaluations
Our proposed metric
Conclusions and future work
3. Introduction
Web accessibility evaluations:
automatic tools + human assessment
Metrics quantify the accessibility level or the barriers, providing a numerical synthesis:
automatic tools return binary values
human assessments are subjective and take values from a continuous range
4. Our main goal
Providing a metric to measure how far a Web
page is from its accessible version, taking into
account:
the integration of human assessments with automatic
evaluations on the same target
many human assessments
5. Steps
1. Mixing the manual evaluations together with the
automatic ones
2. Combining the assessments coming from different
human evaluators
Values are distributed into a given range
The more experts' assessments contribute to computing a
value, the more stable and reliable that value is
6. Automatic and manual evaluations: an example
Combination of the IMG element and its ALT
attribute:
1. If the ALT attribute is omitted, the automatic check outputs 1
2. If the ALT attribute is present, the automatic check outputs 0
Manual evaluation might state that:
there is no loss of information once the image is hidden (this
can happen in case 1, if the image is purely decorative)
there is a loss of information once the image is hidden (this can
happen in case 2, e.g. if the ALT text is a meaningless placeholder)
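To make this binary behavior concrete, here is a minimal sketch of such an automatic check in Python (the class and variable names are illustrative, not taken from the paper):

from html.parser import HTMLParser

# Minimal sketch of the automatic check described above: it outputs 1
# when the ALT attribute is omitted and 0 when it is present.
class ImgAltCheck(HTMLParser):
    def __init__(self):
        super().__init__()
        self.results = []  # one 0/1 value per IMG element

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            has_alt = any(name == "alt" for name, _ in attrs)
            self.results.append(0 if has_alt else 1)

checker = ImgAltCheck()
checker.feed('<img src="logo.png"><img src="photo.jpg" alt="Image">')
print(checker.results)  # [1, 0]: the first IMG fails, the second passes

Note that the second IMG passes the automatic check even though its ALT text ("Image") is a mere placeholder: exactly the kind of case that only a manual evaluation can catch.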
7. Our metric
A first version of our metric (Barriers Impact Factor) is
computed on the basis of a barrier-error association
table
This table reports the list of assistive
technologies/disabilities affected by each error:
screen reader/blindness
screen magnifier/low vision
color blindness
input device independence/movement impairments
deafness
cognitive disabilities
photosensitive epilepsy
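As a rough illustration, such an association table could be represented as a plain mapping from detected errors to the affected assistive technologies/disabilities (the error codes and their associations below are hypothetical examples, not the paper's actual table):

# Hypothetical fragment of a barrier-error association table.
BARRIER_TABLE = {
    "img-missing-alt": ["screen reader/blindness", "cognitive disabilities"],
    "low-contrast-text": ["screen magnifier/low vision", "color blindness"],
    "keyboard-trap": ["input device independence/movement impairments"],
    "audio-without-captions": ["deafness"],
    "flashing-content": ["photosensitive epilepsy"],
}

def affected_groups(error_code):
    # Return the assistive technologies/disabilities affected by an error.
    return BARRIER_TABLE.get(error_code, [])

print(affected_groups("img-missing-alt"))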
8. Our metric
Comparing automatic checks with WCAG 2.0 success
criteria, we identified relationships:
a check fails → a certain error occurs, or a manual control is necessary
Each barrier is related to one success criterion and to
one conformance level (A, AA or AAA)
Manual evaluations take values in the real interval [0, 1]:
1 means that an accessibility error occurs
0 means the absence of that accessibility error
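For instance, the barrier-to-criterion association could be stored as follows (the structure, one success criterion and one conformance level per barrier, is from the slide; the entries are merely examples):

# Hypothetical mapping: barrier -> (WCAG 2.0 success criterion, level).
BARRIER_CRITERIA = {
    "img-missing-alt": ("1.1.1 Non-text Content", "A"),
    "low-contrast-text": ("1.4.3 Contrast (Minimum)", "AA"),
}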
10. Weighting automatic and manual checks
1. m(i) = a(i): the formula is a mere average of the automatically
and manually detected errors
2. m(i) > a(i): a failure in the manual assessment is considered
more significant than one in the automatic assessment
3. m(i) < a(i): a failure in the automatic assessment is considered
more significant than one in the manual assessment
[Two 2×2 quadrant tables: crossing the AUTOMATIC result (0 or 1) with the MANUAL result (in [0, 1]) yields four combinations, labeled I-IV in a different order in each table.]
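The combination formula itself does not appear in this transcript (slide 9 is missing), but the worked example on slide 12 (CBIF = 0.53 from a manual average of 0.8, an automatic result of 0, m = 2 and a = 1) is consistent with a weighted average of the two sources. A minimal sketch under that assumption, showing the three weighting regimes above:

# Assumed form of the combination: a weighted average of the manual
# score (in [0, 1]) and the binary automatic score.
def combine(manual_score, automatic_score, m=1.0, a=1.0):
    return (m * manual_score + a * automatic_score) / (m + a)

M, A = 0.8, 0
print(round(combine(M, A, m=1, a=1), 2))  # case 1, m = a: 0.4
print(round(combine(M, A, m=2, a=1), 2))  # case 2, m > a: 0.53
print(round(combine(M, A, m=1, a=2), 2))  # case 3, m < a: 0.27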
11. Some considerations
The more human operators provide evaluations of an
accessibility barrier, the more reliable the computed
accessibility level is
This behavior is similar to that of online rating systems:
a new user's rating can be influenced by the evaluations
already expressed by other users
Variance must be considered so as to reinforce the
computed accessibility level
12. A first assessment
PAGE CONTENT: an IMG element with ALT="Image" (no link, no TITLE)
MANUAL EVALUATIONS:
Expert A: 0.7
Expert B: 1
Expert C: 0.8
Expert D: 1
Expert E: 0.5
AUTOMATIC EVALUATION: 0 (no known errors; 1 alert: placeholder detected)
Weights: m = 2, a = 1
Average = 0.8, Variance = 0.036, CBIF = 0.53
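These figures can be reproduced as follows (again assuming the weighted-average form of CBIF inferred above; the variance is the population variance of the expert scores):

# Reproducing the reported values for this example.
scores = [0.7, 1.0, 0.8, 1.0, 0.5]   # Experts A-E
automatic = 0                         # automatic evaluation result
m, a = 2, 1                           # weights (manual, automatic)

average = sum(scores) / len(scores)
variance = sum((s - average) ** 2 for s in scores) / len(scores)
cbif = (m * average + a * automatic) / (m + a)

print(f"Average={average:.2f}")    # Average=0.80
print(f"Variance={variance:.3f}")  # Variance=0.036
print(f"CBIF={cbif:.2f}")          # CBIF=0.53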
13. Conclusions
We have defined an accessibility metric with the aim of
evaluating barriers as a whole, combining results
provided by automatic tools with manual evaluations
done by experts
The metric has been preliminarily tested by measuring
accessibility barriers on several local public
administration Web sites
Five experts are manually evaluating barriers related to
WCAG 2.0 success criterion 1.1.1 (using an automatic
monitoring system to verify the page content and to
collect data from the manual evaluations)
14. Future Work
Propose and discuss weights for the whole WCAG 2.0
set of barriers
Investigate how the number of experts involved in the
evaluation, together with their rating variance, could
influence the reliability of the computed values
15. Contacts
Thank you for your attention!
For further information:
silvia.mirri@unibo.it