John Foliot, a video accessibility expert and contributor to the W3C's new Media Accessibility User Requirements (MAUR), will go over the latest updates to HTML5 video as well as new guidelines for making media accessible on the web.
Getting ahead of the curve - Scalable, Accessible, Enterprise-class Video on the Web
1. Getting Ahead of the Curve
SCALABLE, ACCESSIBLE, ENTERPRISE-CLASS VIDEO ON THE WEB
4. Production and Delivery Requirements
Alternative Content Technologies
Captioning
Described video
Text video description
Transcripts
Extended video descriptions
Enhanced captions/subtitles
Sign translation
Clean audio
Content navigation by content structure
System Requirements
Access to interactive controls / menus
Discovery and activation/deactivation of available alternative content by the user
Granularity level control for structural navigation
Time-scale modification
Requirements on making properties available to the accessibility interface
Requirements on the use of the viewport
Requirements on secondary screens and other devices
6. Alternative Content Technologies:
Captions
Captioning is the process of converting the audio
content of a television broadcast, webcast, film, video,
live event, or other productions into text and displaying
the text on a screen or monitor.
Captions not only display words as the textual
equivalent of spoken dialogue or narration, but they
also include speaker identification, sound effects, and
music description.
(http://www.dcmp.org/captioningkey/)
7. Alternative Content Technologies:
Captions
It is important that the captions are:
synchronized and appear at approximately the same time as
the audio is delivered
equivalent and equal in content to that of the audio, including
speaker identification and sound effects
accessible and readily available to those who need or want
them
Captions can be either open (always visible, burned into the video) or
closed (the viewer can turn them on or off).
While similar in construction, captions and subtitles are not the
same thing.
8. Alternative Content Technologies:
Captions
WCAG Success Criteria
S.C. 1.2.2 Captions (Prerecorded) (A)
S.C. 1.2.4 Captions (Live) (AA)
S.C. 3.1.2 Language of Parts (AA)
S.C. 1.4.8 Visual Presentation (AAA)
UAAG Success Criteria
S.C. 1.1.1 Render Alternative Content (A)
S.C. 1.1.3 Replace Non-Text Content (A)
S.C. 3.1.3 Retrieval Progress (A)
S.C. 2.11.8 Video Contrast and Brightness (AAA)
9. Alternative Content Technologies:
Captions (Timing and Display)
Formats for captions, subtitles or foreign-language subtitles must:
Render text in a time-synchronized manner, using the media resource as
the time-base master.
Be available in a text encoding. (e.g. UTF-8)
Support positioning in all parts of the screen: inside the media
viewport, and possibly also in a determined space next to the media
viewport.
Permit a range of font faces and sizes. Permit rendered text and
backgrounds in a range of colors, supporting a full range of opacities.
Additional requirements supporting internationalization and visual display
properties
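As a rough illustration of the timing, encoding and positioning requirements above, here is a minimal Python sketch that writes cues in WebVTT, the format used by the .vtt files elsewhere in this deck. The helper names (format_timestamp, build_webvtt) and the sample cue text are assumptions made for this example, not part of MAUR or of any library.

```python
def format_timestamp(ms: int) -> str:
    """Format integer milliseconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    if ms < 0:
        raise ValueError("timestamp must be non-negative")
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def build_webvtt(cues):
    """Build WebVTT text from (start_ms, end_ms, text, settings) tuples.

    Cue times are relative to the media resource, which stays the time base.
    """
    lines = ["WEBVTT", ""]
    for start_ms, end_ms, text, settings in cues:
        if end_ms <= start_ms:
            raise ValueError("a cue must end after it starts")
        timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
        if settings:
            # Cue settings such as line/align control on-screen positioning.
            timing += " " + settings
        lines.extend([timing, text, ""])
    return "\n".join(lines)


# Hypothetical sample cues: the second starts exactly where the first ends,
# so there is no gap between them.
sample = build_webvtt([
    (1000, 4000, "<v Narrator>Welcome to the session.", "line:85% align:center"),
    (4000, 6500, "[applause]", ""),
])

# Caption files are delivered in a text encoding such as UTF-8.
encoded = sample.encode("utf-8")
```

Working in integer milliseconds avoids floating-point rounding in the timestamps.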
11. Alternative Content Technologies:
Descriptions
Described video contains descriptive narration of key visual
elements designed to make visual media accessible to people
who are blind or visually impaired.
The descriptions include actions, costumes, gestures, scene
changes or any other important visual information that someone
who cannot see the screen might ordinarily miss.
12. Alternative Content Technologies:
Descriptions
WCAG Success Criteria
S.C. 2.4.6 Headings and Labels (AA)
S.C. 1.2.5 Audio Description (Prerecorded) (AA)
S.C. 1.4.7 Low or No Background Audio (AAA)
UAAG Success Criteria
S.C. 1.1.1 Render Alternative Content (A)
S.C. 1.1.3 Replace Non-Text Content (A)
S.C. 3.1.3 Retrieval Progress (A)
S.C. 2.11.8 Video Contrast and Brightness (AAA)
13. Alternative Content Technologies:
Transcripts
A full transcript supports different user needs and is not
a replacement for captioning. A transcript can either be
presented simultaneously with the media material,
which can assist slower readers or those who need
more time to reference context, but it should also be
made available independently of the media.
14. Alternative Content Technologies:
Transcripts
WCAG Success Criteria
S.C. 1.2.1 Audio-only and Video-only (Prerecorded) (A)
S.C. 2.4.6 Headings and Labels (AA)
S.C. 3.1.1 Language of Page (A)
UAAG Success Criteria
S.C. 1.1.1 Render Alternative Content (A)
S.C. 1.1.3 Replace Non-Text Content (A)
S.C. 3.1.3 Retrieval Progress (A)
S.C. 2.11.8 Video Contrast and Brightness (AAA)
15. Alternative Content Technologies:
Extended and Enhanced
Extended descriptions work by pausing the video and program audio at key
moments, playing a longer description than would normally be permitted,
and then resuming playback when the description is finished playing. This will
naturally extend the timeline of the entire presentation.
Enhanced captions are timed text cues that have been enriched with further
information - examples are glossary definitions for acronyms and other
initialisms, foreign terms (for example, Latin), jargon or descriptions for other
difficult language. They may be age-graded, so that multiple caption tracks
are supplied, or the glossary function may be added dynamically through
machine lookup.
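To make the timeline effect of extended descriptions concrete, the following Python sketch adds the length of each pause to the running time. The function names and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
def extended_duration(video_seconds, descriptions):
    """Total presentation length when playback pauses for each description.

    descriptions: list of (pause_at_seconds, description_seconds) pairs.
    """
    total_pause = 0
    for pause_at, length in descriptions:
        if not 0 <= pause_at <= video_seconds:
            raise ValueError("pause point lies outside the video")
        if length < 0:
            raise ValueError("description length must be non-negative")
        total_pause += length
    return video_seconds + total_pause


def presentation_time(video_time, descriptions):
    """Elapsed presentation time when the video reaches video_time.

    Counts only the pauses that happen strictly before that point.
    """
    return video_time + sum(
        length for pause_at, length in descriptions if pause_at < video_time
    )


# Hypothetical example: a 120 s video with two extended descriptions.
pauses = [(30, 12), (75, 20)]
total = extended_duration(120, pauses)      # 120 + 12 + 20
at_one_minute = presentation_time(60, pauses)
```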
19. What is an HTML5 Media Player?
<video width="800" height="600" controls>
<source src="MyVideo.mp4" type="video/mp4" />
<source src="MyVideo.webm" type="video/webm" />
<track src="MyCaptions.vtt" kind="captions" srclang="en" label="English" />
<!-- fallback for legacy browsers -->
</video>
The browser is the video player; the video player is the browser!
HTML5 anticipates author scripted & customized controls
Many HTML5 video players today use Flash players as a fallback
20. System Requirements
Access to interactive controls / menus
Interactive controls and menus must be available
to all users for all means in which the controls
are exposed - no matter whether they are
exposed by the user agent, or are scripted.
Controls must be device independent, so that
control may be achieved by keyboard, pointing
device, speech, etc.
21. System Requirements
Discovery and activation/deactivation of
available alternative content by the user
Alternative content must be both discoverable
by the user, and accessible in device agnostic
ways.
The development of APIs and user-agent
controls should adhere to UAAG guidance.
22. System Requirements
Discovery & activation/deactivation of available alternative content
In cases where the alternative content has different dimensions than the
original content, the user has the option to specify how the layout/reflow
of the document should be handled.
The user can browse the alternatives and switch between them.
Synchronized alternatives for time-based media can be rendered at the
same time as their associated audio tracks and visual tracks
Non-synchronized alternatives can be rendered as replacements for the
original rendered content.
Provide the user with the option to load time-based media content such
that the first frame is displayed (if video), but the content is not played
until explicit user request.
23. System Requirements
Granularity level control for structural
navigation
A real-time control mechanism must be
provided for adjusting the granularity of the
specific structural navigation point next and
previous.
Users must be able to set the range/scope of
next and previous in real time.
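One way to picture a granularity-level control is as a filter on the set of navigation points that "next" and "previous" can land on. The Python sketch below assumes a hypothetical list of (start time, level, title) entries; the data and function names are invented for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical structure: (start_seconds, level, title).
# Level 1 = chapter, level 2 = section.
POINTS = [
    (0, 1, "Chapter 1"),
    (0, 2, "Section 1.1"),
    (40, 2, "Section 1.2"),
    (95, 1, "Chapter 2"),
    (95, 2, "Section 2.1"),
    (130, 2, "Section 2.2"),
]


def starts_at_level(points, granularity):
    """Sorted start times of every point at the chosen level or coarser."""
    return sorted({start for start, level, _ in points if level <= granularity})


def next_point(points, current, granularity):
    """Start time of the next point after current, or None at the end."""
    starts = starts_at_level(points, granularity)
    i = bisect_right(starts, current)
    return starts[i] if i < len(starts) else None


def previous_point(points, current, granularity):
    """Start time of the closest point before current, or None at the start."""
    starts = starts_at_level(points, granularity)
    i = bisect_left(starts, current)
    return starts[i - 1] if i > 0 else None
```

With the playhead at 10 seconds, "next" jumps to 95 at chapter granularity but to 40 at section granularity, which is the real-time range change the requirement describes.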
24. System Requirements
Time-scale modification
While not all devices may support the capability,
a standard control API must support the ability
to speed up or slow down content presentation
without altering audio pitch.
(This feature has been present on many devices,
especially audiobook players, for years. )
25. System Requirements
Making properties available to the
accessibility interface (AAPI)
For user agents supporting accessibility APIs
implemented for a platform, any media controls
need to be connected to that API.
26. System Requirements
Requirements on the use of the viewport
The video viewport traditionally provides a
bounding box for many of the visually
represented alternative-content technologies,
although some alternative content does not rely
on a viewport. One key principle to remember when
designing player skins is that the lower third of the
video may be needed for caption text.
27. Summary of System Requirements
Controls must be accessible, via platform AAPIs
Alternate content must be discoverable, and able to
be modified by the end user
Content navigation and content display should allow
for personalization
29. Let's Do Video!
Producing Accessible videos
Streaming your Accessible videos
Managing your Accessible video library
30. Caption & Description Resources
Outsourcing
Captions
3PlayMedia (http://3playmedia.com)
Automatic Sync (http://www.automaticsync.com/caption)
CaptionMax (http://captionmax.com/)
National Captioning Institute (http://www.ncicap.org)
Described Video
Bridge Multimedia (http://www.bridgemultimedia.com/audiodescription/ )
Dicapta (http://www.dicapta.com/)
CaptionMax (http://captionmax.com/)
More resources (http://www.acb.org/adp/services.html)
31. Producing Accessible videos
Video Captioning Software
MAGpie (http://ncam.wgbh.org/invent_build/web_multimedia/tools-guidelines/magpie)
Media Access Generator is the original free caption and audio-description
authoring tool for making multimedia accessible to persons with sensory
disabilities.
Subtitle Workshop
(http://www.urusoft.net/products.php?cat=sw&lang=1)
The most complete, efficient and convenient freeware subtitle editing tool.
It supports all the subtitle formats you need and has all the features you
would want from a subtitle editing program.
Caption Generator (http://www.vttcaptions.com/)
Caption Generator lets you create, synchronize and edit .vtt captions.
32. Producing Accessible videos
Video Description Software
YouDescribe (http://www.youdescribe.org/)
Free web-based tool that allows anyone to record descriptions of YouTube
videos and/or play previously described YouTube videos.
CapScribe Open (http://www.inclusivemedia.ca/captionme/)
Free Mac-based video editor for captioning and description
MAGpie (http://ncam.wgbh.org/invent_build/web_multimedia/tools-guidelines/magpie)
Media Access Generator is the original free caption and audio-description
authoring tool for making multimedia accessible to persons with sensory
disabilities.
Livedescribe (http://www.livedescribe.com/)
Video description software designed, prototyped and developed at The
Center for Learning Technology.
33. Producing Accessible videos
HTML5
Out of band
<video controls>
<source src="movie.webm" type="video/webm">
<source src="movie.mp4" type="video/mp4">
<track src="english.vtt" kind="captions" srclang="en" label="English">
<track src="french.vtt" kind="captions" srclang="fr" label="French">
<p>Fallback content here with links to download video files</p>
</video>
36. Streaming your Accessible videos
Adaptive Streaming
Media Server
Adaptive Streaming - HTTP
(http://en.wikipedia.org/wiki/Adaptive_bitrate_streaming)
MPEG-DASH
Apple HTTP Adaptive Streaming
Microsoft Smooth Streaming
Adobe Dynamic Streaming for Flash
http://www.streamingmedia.com/Articles/Editorial/Featured-Articles/How-to-Produce-for-Adaptive-Streaming-81020.aspx
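The core idea behind adaptive streaming is that the client keeps choosing the best rendition its measured bandwidth can sustain. The Python sketch below shows that selection step only. The bitrate ladder, the headroom factor and the function name are assumptions for illustration and do not reflect how any of the listed technologies is implemented.

```python
# Hypothetical bitrate ladder: (label, kilobits per second).
RENDITIONS = [("240p", 400), ("480p", 1200), ("720p", 2500), ("1080p", 5000)]


def pick_rendition(measured_kbps, renditions=RENDITIONS, headroom=0.8):
    """Pick the highest-bitrate rendition that fits the bandwidth budget.

    headroom leaves a safety margin so small dips do not stall playback.
    Falls back to the lowest rendition when nothing fits.
    """
    budget = measured_kbps * headroom
    affordable = [r for r in renditions if r[1] <= budget]
    if not affordable:
        return min(renditions, key=lambda r: r[1])
    return max(affordable, key=lambda r: r[1])


choice = pick_rendition(4000)   # budget is about 3200 kbps
```

Caption and description tracks are comparatively tiny, so they would normally be delivered alongside whichever rendition is chosen rather than being switched with it.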
37. Streaming your Accessible videos
Real Time Streaming Protocol (RTSP)
Media Server
Helix DNA Server / Helix Universal Server :
RealNetworks' streaming server. Comes in both open-source and proprietary
flavors.
RealNetworks commercial streaming server for RTSP, RTMP, iOS, Silverlight
and HTTP streaming media clients
QuickTime Streaming Server:
Apple's closed-source streaming server that ships with Mac OS X Server.
Windows Media Services:
Microsoft streaming server previously included with Windows Server that uses
RTSP modified with Windows Media extensions
38. Streaming your Accessible videos
Real Time Streaming Protocol (RTSP)
Media Server
Wowza Media Server:
Multi-format streaming server for RTSP/RTP,
RTMP, MPEG-TS, ICY, HTTP (HTTP Live Streaming,
HTTP Dynamic Streaming,
Smooth Streaming)
39. Managing your Accessible video library
Not all Media CMS solutions are equal.
An accessible Media CMS supports the following:
Integrated upload and tracking of Captions, Video Descriptions, Transcripts
and of course Videos
Integrated Adaptive Streaming
Supports/converts multiple codecs (.mp4, WebM)
1.2.2 Captions (Prerecorded): Captions are provided for all prerecorded audio content in synchronized media.
1.2.4 Captions (Live): Captions are provided for all live audio content in synchronized media.
3.1.2 Language of Parts: The human language of each passage or phrase in the content can be programmatically determined
1.4.8 Visual Presentation: For the visual presentation of blocks of text, a mechanism is available to achieve the following (among others): foreground and background colors can be selected by the user; text can be resized without assistive technology up to 200 percent.
1.1.1 Render Alternative Content: The user can choose to render any type of recognized alternative content that is present for a content element.
1.1.3 Replace Non-Text Content: The user can request a placeholder that incorporates recognized text alternative content instead of recognized non-text content, until explicit user request to render the non-text content. (Level A)
3.1.3 Retrieval Progress: By default, the user agent shows the state of content retrieval activity.
2.11.8 Video Contrast and Brightness: Users can adjust the contrast and brightness of visual time-based media.
Render text in a time-synchronized manner, using the media resource as the time-base master.
Allow the author to specify erasures, i.e., times when no text is displayed on the screen (no text cues are active).
Allow the author to assign timestamps so that one caption/subtitle follows another, with no perceivable gap in between.
Be available in a text encoding. (e.g. UTF-8)
Support positioning in all parts of the screen - either inside the media viewport but also possibly in a determined space next to the media viewport.
Support the display of multiple regions of text simultaneously.
Display multiple rows of text when rendered as text in a right-to-left or left-to-right language.
Allow the author to specify line breaks.
Permit a range of font faces and sizes.
Render a background in a range of colors, supporting a full range of opacities.
Render text in a range of colors.
Enable rendering of text with a thicker outline or a drop shadow to allow for better contrast with the background.
Where a background is used, it is preferable to keep the caption background visible even in times where no text is displayed, such that it minimizes distraction.
Allow the use of mixed display styles e.g., mixing paint-on captions with pop-on captions within a single caption cue or in the caption stream as a whole.
Support positioning such that the lowest line of captions appears at least 1/12 of the total screen height above the bottom of the screen.
Use conventions that include inserting left-to-right and right-to-left segments within a vertical run when rendered as text in a top-to-bottom oriented language.
Represent content of different natural languages. (i.e. support mixed languages)
Represent content of at least those specific natural languages that may be represented with [Unicode 3.2], including common typographical conventions of that language (e.g., through the use of furigana and other forms of ruby text).
Present the full range of typographical glyphs, layout and punctuation marks normally associated with the natural language's print-writing system.
Permit in-line mark-up for foreign words or phrases.
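The positioning requirement in the list above (lowest caption line at least 1/12 of the screen height above the bottom) comes down to simple arithmetic. This Python sketch computes the limit in pixels; the function names are hypothetical and pixel coordinates measured from the top of the screen are an assumption.

```python
import math
from fractions import Fraction

BOTTOM_MARGIN = Fraction(1, 12)  # share of screen height kept clear below captions


def lowest_caption_bottom(screen_height_px):
    """Largest y (pixels from the top) where the lowest caption line may end."""
    # Round the margin up so the cleared band is never smaller than 1/12.
    margin = math.ceil(screen_height_px * BOTTOM_MARGIN)
    return screen_height_px - margin


def caption_fits(caption_bottom_px, screen_height_px):
    """True when a caption ending at caption_bottom_px respects the margin."""
    return caption_bottom_px <= lowest_caption_bottom(screen_height_px)
```

On a 1080-pixel-high screen the margin is 90 pixels, so the lowest caption line should end at or above y = 990.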
2.4.6 Headings and Labels: Headings and labels describe topic or purpose. (Level AA)
1.2.5 Audio Description (Prerecorded): Audio description is provided for all prerecorded video content in synchronized media. (Level AA)
1.4.7 Low or No Background Audio: For prerecorded audio-only content that (1) contains primarily speech in the foreground, (2) is not an audio CAPTCHA or audio logo, and (3) is not vocalization intended to be primarily musical expression such as singing or rapping, at least one of the following is true: (Level AAA)
No Background: The audio does not contain background sounds.
Turn Off: The background sounds can be turned off.
20 dB: The background sounds are at least 20 decibels lower than the foreground speech content, with the exception of occasional sounds that last for only one or two seconds.
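The 20 dB condition can be checked from measured signal levels. The Python sketch below assumes RMS amplitudes for the speech and the background have already been measured. It does not handle the exception for occasional sounds lasting one or two seconds, and the function names are illustrative.

```python
import math


def level_difference_db(foreground_rms, background_rms):
    """Difference in decibels between two RMS amplitudes."""
    if foreground_rms <= 0 or background_rms <= 0:
        raise ValueError("RMS amplitudes must be positive")
    return 20 * math.log10(foreground_rms / background_rms)


def meets_20_db_rule(foreground_rms, background_rms):
    """True when background is absent or at least 20 dB below the speech."""
    if background_rms == 0:
        return True  # corresponds to the "No Background" option
    return level_difference_db(foreground_rms, background_rms) >= 20.0
```

A 20 dB difference corresponds to the speech amplitude being ten times that of the background.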
1.2.1 Audio-only and Video-only (Prerecorded): For prerecorded audio-only and prerecorded video-only media, the following are true, except when the audio or video is a media alternative for text and is clearly labeled as such: (Level A)
Prerecorded Audio-only: An alternative for time-based media is provided that presents equivalent information for prerecorded audio-only content.
Prerecorded Video-only: Either an alternative for time-based media or an audio track is provided that presents equivalent information for prerecorded video-only content.
2.4.6 Headings and Labels: Headings and labels describe topic or purpose. (Level AA)
3.1.1 Language of Page: The default human language of each Web page can be programmatically determined. (Level A)
A relatively recent development in television accessibility is the concept of clean audio, which takes advantage of the increased adoption of multichannel audio. This is primarily aimed at audiences who are hard of hearing, and consists of isolating the audio channel containing the spoken dialog and important non-speech information that can then be amplified or otherwise modified, while other channels containing music or ambient sounds are attenuated.
Effective navigation of a multi-level hierarchy will require an additional control not typically available using current media players. This mechanism, which we are calling a "granularity-level control," will allow the user to adjust the level of granularity applied to "next" and "previous" controls. (Ref: Interactive transcripts)
Acknowledging that not all devices will be capable of handling multiple video streams, this is a should requirement for browsers where hardware is capable of support. Strong authoring guidance for content creators will mitigate situations where user-agents are unable to support multiple video streams (WCAG) - for example, on mobile devices that cannot support multiple streams, authors should be encouraged to offer two versions of the media stream, including one with signed captions burned into the media.
Support operation of all functionality via the keyboard on systems where a keyboard is (or can be) present, and where focusable elements are used for interaction.
Support a rich set of native controls for media operation, including but not limited to play, pause, stop, jump to beginning, jump to end, scale player size (up to full screen), adjust volume, mute, captions on/off, descriptions on/off, selection of audio language, selection of caption language, selection of audio description language, location of captions, size of captions, video contrast/brightness, playback rate, content navigation on same level (next/prev) and between levels (up/down) etc. This is also a particularly important requirement on mobile devices or devices without a keyboard.
The user has the ability to have indicators rendered along with rendered elements that have alternative content (e.g., visual icons rendered in proximity of content which has short text alternatives, long descriptions, or captions). In cases where the alternative content has different dimensions than the original content, the user has the option to specify how the layout/reflow of the document should be handled.
The user has a global option to specify which types of alternative content are rendered by default and, in cases where the alternative content has different dimensions than the original content, how the layout/reflow of the document should be handled.
The user can browse the alternatives and switch between them.
* All identified structures, including ancillary content as defined in "Content Navigation", must be accessible with the use of "next" and "previous," as refined by the granularity control.
* Users must be able to discover, skip, play-in-line, or directly access ancillary content structures.
* Users need to be able to access the granularity control using any input mode, e.g., keyboard, speech, pointer, etc.
* Producers and authors may optionally provide additional access options to identified structures, such as direct access to any node in a table of contents.
The user can adjust the playback rate of the time-based media tracks to between 50% and 250% of real time.
Speech whose playback rate has been adjusted by the user maintains pitch in order to limit degradation of the speech quality.
All provided alternative media tracks remain synchronized across this required range of playback rates.
The user agent provides a function that resets the playback rate to normal (100%).
The user can stop, pause, and resume rendered audio and animation content (including video and animated images) that last three or more seconds at their default playback rate.
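The playback-rate requirements above (a 50% to 250% range, a reset to 100%, and tracks that stay synchronized) can be sketched as a small controller. The class and method names in this Python example are hypothetical, and pitch preservation is left to the audio pipeline of a real player.

```python
MIN_RATE = 0.5     # 50% of real time
MAX_RATE = 2.5     # 250% of real time
NORMAL_RATE = 1.0  # 100%


class PlaybackRateControl:
    """Hypothetical rate control that keeps the rate inside the required range."""

    def __init__(self):
        self.rate = NORMAL_RATE

    def set_rate(self, requested):
        # Clamp instead of rejecting, so stepping past either end is harmless.
        self.rate = max(MIN_RATE, min(MAX_RATE, requested))
        return self.rate

    def reset(self):
        """Return to normal (100%) playback."""
        self.rate = NORMAL_RATE
        return self.rate

    def wall_clock_offset(self, media_seconds):
        """Real-time offset at which media_seconds is reached at the current rate.

        Applying the same mapping to every track keeps captions and
        descriptions synchronized with the main media.
        """
        return media_seconds / self.rate
```

At 250%, a caption cue authored at 50 seconds of media time appears 20 seconds after playback starts.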
http://john.foliot.ca/demos/NCAM/hgp.html
The existence of alternative-content tracks for a media resource must be exposed to the user agent.
Since authors will need access to the alternative content tracks, the structure needs to be exposed to authors as well, which requires a dynamic interface.
Accessibility APIs need to gain access to alternative content tracks no matter whether those content tracks come from within a resource or are combined through markup on the page.
The video viewport plays a particularly important role with respect to alternative-content technologies. Mostly it provides a bounding box for many of the visually represented alternative-content technologies (e.g., captions, hierarchical navigation points, sign language), although some alternative content does not rely on a viewport (e.g., full transcripts, descriptive video).
One key principle to remember when designing player skins is that the lower third of the video may be needed for caption text. Caption consumers rely on being able to make fast eye movements between the captions and the video content. If the captions are in a non-standard place, this may cause viewers to miss information.
MAGpie (free) (Windows and Mac) A tool for creating captions and audio description for multiple formats and media types.
Subtitle Workshop (free) (Windows only) The most complete, efficient and convenient freeware subtitle editing tool. It supports all the subtitle formats you need and has all the features you would want from a subtitle editing program.
CaptionMaker and MacCaption Video closed captioning for any Mac/PC digital workflow.
MovCaptioner (Mac only [Windows version in development]) Utilizes a GUI to create and synchronize captions in a number of popular formats. Single- and multi-user licenses available.
YouDescribe
Free web-based tool that allows anyone to record descriptions of YouTube videos and/or play previously described YouTube videos. Developed by The Smith-Kettlewell Video Description Research and Development Center (VDRDC)
CapScribe Open
Free Mac-based video editor for captioning and description developed by Inclusive Media and Design.
MAGpie
Media Access Generator is the original free caption and audio-description authoring tool for making multimedia accessible to persons with sensory disabilities, developed by NCAM, the National Center for Accessible Media.
Livedescribe
Video description software designed, prototyped and developed at The Center for Learning Technology by developer Carmen Branje.