The document proposes a new metric to measure website accessibility by combining results from automatic accessibility evaluations with manual evaluations from multiple human experts. The metric combines accessibility scores from tools and experts as a weighted average, and factors in the number and variance of the evaluations, to produce a single accessibility score between 0 and 1. A preliminary test evaluated barriers on local public administration websites to demonstrate the approach. Future work will refine the metric and the evaluation process.
1. Getting one voice:
tuning up experts assessment in
measuring accessibility
Silvia Mirri
Ludovico A. Muratori
Paola Salomoni
Matteo Battistelli
Department of Computer Science
University of Bologna
W4A 2012, April 16th & 17th, 2012 - Lyon, France
2. Summary
Introduction
Automatic and manual accessibility evaluations
Our proposed metric
Conclusions and future work
3. Introduction
Web accessibility evaluations:
automatic tools + human assessment
Metrics quantify the accessibility level or the barriers, providing a numerical synthesis:
automatic tools return binary values
human assessments are subjective and take values from a continuous range
4. Our main goal
Providing a metric to measure how far a Web
page is from its accessible version, taking into
account:
the integration of human assessments with automatic
evaluations on the same target
many human assessments
5. Steps
1. Mixing the manual evaluations together with the
automatic ones
2. Combining the assessments coming from different
human evaluators
Values are distributed into a given range
The more experts' assessments contribute to computing a
value, the more stable and reliable that value is
6. Automatic and manual evaluations: an example
Combination of the IMG element and its ALT
attribute:
1. If the ALT attribute is omitted, the automatic check outputs 1
2. If the ALT attribute is present, the automatic check outputs 0
Manual evaluation might state that:
there is no loss of information once the image is hidden (this
can happen in case 1, if the image is purely decorative)
there is a loss of information once the image is hidden (this can
happen in case 2, e.g. if the ALT text is a meaningless placeholder)
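To make this binary behavior concrete, here is a minimal sketch of such an automatic check in Python (the class and variable names are illustrative, not taken from the paper):

from html.parser import HTMLParser

# Minimal sketch of the automatic check described above: it outputs 1
# when the ALT attribute is omitted and 0 when it is present.
class ImgAltCheck(HTMLParser):
    def __init__(self):
        super().__init__()
        self.results = []  # one 0/1 value per IMG element

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            has_alt = any(name == "alt" for name, _ in attrs)
            self.results.append(0 if has_alt else 1)

checker = ImgAltCheck()
checker.feed('<img src="logo.png"><img src="photo.jpg" alt="Image">')
print(checker.results)  # [1, 0]: the first IMG fails, the second passes

Note that the second IMG passes the automatic check even though its ALT text ("Image") is a mere placeholder: exactly the kind of case that only a manual evaluation can catch.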
7. Our metric
A first version of our metric (Barriers Impact Factor) is
computed on the basis of a barrier-error association
table
This table reports the list of assistive
technologies/disabilities affected by each error:
screen reader/blindness
screen magnifier/low vision
color blindness
input device independence/movement impairments
deafness
cognitive disabilities
photosensitive epilepsy
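As a rough illustration, such an association table could be represented as a plain mapping from detected errors to the affected assistive technologies/disabilities (the error codes and their associations below are hypothetical examples, not the paper's actual table):

# Hypothetical fragment of a barrier-error association table.
BARRIER_TABLE = {
    "img-missing-alt": ["screen reader/blindness", "cognitive disabilities"],
    "low-contrast-text": ["screen magnifier/low vision", "color blindness"],
    "keyboard-trap": ["input device independence/movement impairments"],
    "audio-without-captions": ["deafness"],
    "flashing-content": ["photosensitive epilepsy"],
}

def affected_groups(error_code):
    # Return the assistive technologies/disabilities affected by an error.
    return BARRIER_TABLE.get(error_code, [])

print(affected_groups("img-missing-alt"))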
8. Our metric
Comparing automatic checks with WCAG 2.0 success
criteria, we identified relationships:
a check fails → a certain error occurs, or a manual control is necessary
Each barrier is related to one success criterion and to
one conformance level (A, AA or AAA)
Manual evaluations take values in the real interval [0, 1]:
1 means that an accessibility error occurs
0 means the absence of that accessibility error
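For instance, the barrier-to-criterion association could be stored as follows (the structure, one success criterion and one conformance level per barrier, is from the slide; the entries are merely examples):

# Hypothetical mapping: barrier -> (WCAG 2.0 success criterion, level).
BARRIER_CRITERIA = {
    "img-missing-alt": ("1.1.1 Non-text Content", "A"),
    "low-contrast-text": ("1.4.3 Contrast (Minimum)", "AA"),
}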
10. Weighting automatic and manual checks
1. m(i) = a(i): the formula is a mere average of the automatically
and manually detected errors
2. m(i) > a(i): a failure in the manual assessment is considered
more significant than one in the automatic assessment
3. m(i) < a(i): a failure in the automatic assessment is considered
more significant than one in the manual assessment
[Two 2×2 quadrant tables: crossing the AUTOMATIC result (0 or 1) with the MANUAL result (in [0, 1]) yields four combinations, labeled I-IV in a different order in each table.]
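The combination formula itself does not appear in this transcript (slide 9 is missing), but the worked example on slide 12 (CBIF = 0.53 from a manual average of 0.8, an automatic result of 0, m = 2 and a = 1) is consistent with a weighted average of the two sources. A minimal sketch under that assumption, showing the three weighting regimes above:

# Assumed form of the combination: a weighted average of the manual
# score (in [0, 1]) and the binary automatic score.
def combine(manual_score, automatic_score, m=1.0, a=1.0):
    return (m * manual_score + a * automatic_score) / (m + a)

M, A = 0.8, 0
print(round(combine(M, A, m=1, a=1), 2))  # case 1, m = a: 0.4
print(round(combine(M, A, m=2, a=1), 2))  # case 2, m > a: 0.53
print(round(combine(M, A, m=1, a=2), 2))  # case 3, m < a: 0.27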
11. Some considerations
The more human operators provide evaluations of an
accessibility barrier, the more reliable the computed
accessibility level is
This behavior is similar to that of online rating systems:
a new user's rating can be influenced by the evaluations
already expressed by other users
Variance must be considered so as to reinforce the
computed accessibility level
12. A first assessment
PAGE CONTENT: an IMG element with ALT="Image" (no link, no TITLE)
MANUAL EVALUATIONS:
Expert A: 0.7
Expert B: 1
Expert C: 0.8
Expert D: 1
Expert E: 0.5
AUTOMATIC EVALUATION: 0 (no known errors; 1 alert: placeholder detected)
Weights: m = 2, a = 1
Average = 0.8, Variance = 0.036, CBIF = 0.53
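These figures can be reproduced as follows (again assuming the weighted-average form of CBIF inferred above; the variance is the population variance of the expert scores):

# Reproducing the reported values for this example.
scores = [0.7, 1.0, 0.8, 1.0, 0.5]   # Experts A-E
automatic = 0                         # automatic evaluation result
m, a = 2, 1                           # weights (manual, automatic)

average = sum(scores) / len(scores)
variance = sum((s - average) ** 2 for s in scores) / len(scores)
cbif = (m * average + a * automatic) / (m + a)

print(f"Average={average:.2f}")    # Average=0.80
print(f"Variance={variance:.3f}")  # Variance=0.036
print(f"CBIF={cbif:.2f}")          # CBIF=0.53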
13. Conclusions
We have defined an accessibility metric with the aim of
evaluating barriers as a whole, combining results
provided by automatic tools with manual evaluations
done by experts
The metric has been preliminarily tested by measuring
accessibility barriers on several local public
administration Web sites
Five experts are manually evaluating barriers related to
WCAG 2.0 success criterion 1.1.1 (using an automatic
monitoring system to verify the page content and to
collect data from the manual evaluations)
14. Future Work
Propose and discuss weights for the whole WCAG 2.0
set of barriers
Investigate how the number of experts involved in the
evaluation, together with their rating variance, could
influence the reliability of the computed values
15. Contacts
Thank you for your attention!
For further information:
silvia.mirri@unibo.it