Nomad3D                                                    White Paper




  White Paper: Mobile Interactive Online 3D
            Video Requirements

                            Dr. Alain Fogel, CEO Nomad3D
                                  www.nomad3D.com




     Contact: Dr. Alain Fogel, CEO Nomad3D
     Email: alain.fogel@nomad3D




Date: 29.07.2012


1 Nomad3D Executive and Technical Summary
     Nomad3D has developed a revolutionary 3D video codec that runs as an efficient
     extension on existing 2D video codecs. This codec, referred to as Nomad3D 3D+F, or
     3D+F for short, is able to provide interactive 3D video capabilities on top of existing 2D
     video infrastructure, where traditional 3D extensions of 2D video codecs fail. Nomad3D
     3D+F provides efficient, low-latency coding and decoding of stereoscopic 3D
     video as a thin software layer on top of existing 2D video codec infrastructure. It is well
     positioned for use in interactive 3D video gaming, telemedicine, 3D teleconferencing,
     3D military communication and mobile 3D environments.
     3D+F represents the stereoscopic left/right views as a single fused 2D view
     plus additional fusion metadata. The fused 2D view is coded using a traditional 2D
     video codec (H.264, VP8), whereas the fusion metadata (FusionData) is efficiently
     encoded using Nomad3D's fusion codec. Nomad3D IP protects the fusion and de-fusion
     technology, as well as technical aspects of the fusion codec.
     Traditional stereoscopic 3D video codecs essentially operate by separately encoding left
     and right views. This approach leads to an increase by a factor of two compared to a 2D
     video codec in terms of complexity, required resources and bandwidth. The 3D+F codec
     resolves this complexity issue by transforming a stereoscopic pair into (1) a single fused
     (Cyclops) 2D view, and (2) FusionData. The FusionData encode technical elements of
     the fusion process and assist in recreating left-right views at decoding time. The
     complexity of fusion and de-fusion, as well as of the fusion codec, is very small.
     Moreover, the bandwidth required for FusionData is small. The overall complexity of
     the 3D+F codec is therefore mainly determined by the complexity of the 2D video
     codec in the lower branch of Figure 1. The 2D video codec in 3D+F is a free parameter,
     and may be chosen from a number of well-performing video codecs, such as
     H.264/AVC (Wikipedia, 2012) and VP8 (WebM, 2012).
     For a given video quality target, 3D+F may reduce coding complexity, required
     resources and bandwidth by close to a factor of two compared to a standard 3D video
     codec. 3D+F is therefore very well suited to deployment in demanding and/or
     resource-constrained ecosystems.







   Upper branch (Fusion Codec, software):
      Left-Right Views -> Fusion -> FusionData -> FusionData Encoding
                       -> FusionData Decoding -> De-Fusion -> Left-Right Views

   Lower branch (Standard 2D Codec):
      Left-Right Views -> Fusion -> Cyclops (2D) View -> 2D Video Encoding
                       -> 2D Video Decoding -> De-Fusion -> Left-Right Views

                                         Figure 1 - 3D+F Overview
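
      The two-branch structure of Figure 1 can be sketched in a few lines of Python.
      This is only an illustration of the data flow, not Nomad3D's fusion method, which
      is not disclosed here. The stand-in fusion below (a per-pixel floor average as the
      Cyclops view and a per-pixel difference as FusionData) is an assumption chosen
      because it is exactly invertible; it does not reproduce the small FusionData
      bandwidth claimed for 3D+F. The names fuse, defuse, passthrough and round_trip are
      hypothetical, and the fusion codec of the upper branch is omitted.

```python
from typing import Callable, List, Tuple

Frame = List[List[int]]  # rows of integer samples (e.g. 8-bit luma)


def fuse(left: Frame, right: Frame) -> Tuple[Frame, Frame]:
    """Stand-in fusion (NOT Nomad3D's method): floor average plus difference."""
    cyclops = [[(l + r) >> 1 for l, r in zip(lrow, rrow)]
               for lrow, rrow in zip(left, right)]
    fusion_data = [[l - r for l, r in zip(lrow, rrow)]
                   for lrow, rrow in zip(left, right)]
    return cyclops, fusion_data


def defuse(cyclops: Frame, fusion_data: Frame) -> Tuple[Frame, Frame]:
    """Inverse of the stand-in fusion: recovers left and right exactly."""
    left = [[m + ((d + 1) >> 1) for m, d in zip(mrow, drow)]
            for mrow, drow in zip(cyclops, fusion_data)]
    right = [[l - d for l, d in zip(lrow, drow)]
             for lrow, drow in zip(left, fusion_data)]
    return left, right


def passthrough(frame: Frame) -> Frame:
    """Placeholder for a real 2D encoder or decoder (e.g. H.264, VP8)."""
    return frame


def round_trip(left: Frame, right: Frame,
               encode_2d: Callable[[Frame], Frame] = passthrough,
               decode_2d: Callable[[Frame], Frame] = passthrough
               ) -> Tuple[Frame, Frame]:
    """Run both branches of Figure 1 and return the reconstructed views."""
    cyclops, fusion_data = fuse(left, right)
    coded_view = encode_2d(cyclops)   # lower branch: standard 2D codec
    coded_side = fusion_data          # upper branch: fusion codec omitted here
    return defuse(decode_2d(coded_view), coded_side)
```

      The 2D codec enters only through encode_2d and decode_2d, which mirrors the
      statement that the 2D video codec is a free parameter of 3D+F.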




2 Interactive Video Overview
      Interactive video involves two or more parties interacting in real time via graphical,
      video, textual or audio interfaces. Interactive video is relevant in video
      gaming, telemedicine, and military communications, among others. The respective
      markets are growing rapidly due to the increasing availability of powerful
      mobile devices and faster connectivity. As stereoscopic 3D (S3D)
      becomes more available due to maturing technologies such as auto-stereoscopic
      smartphones and tablets (no glasses), interactive video will migrate to S3D devices. The
      use of interactive 3D video is believed to lead to a richer user experience and/or a better
      understanding of context and situation.

2.1 Interactive Video Gaming
      Interactive video gaming over the Internet is a large and growing market, including the
      developing Online Cloud Gaming (OCG) market. According to PwC (PWC, 2012), the
      online market will overtake the traditional gaming market, as shown in Figure 2, and
      this trend is clearly supported by the deployment of mobile devices. Social gaming
      companies such as Zynga and Playfish and Game on Demand companies such as
      OnLive, Microsoft, and Sony Computer Entertainment are forming the bulk of the
      ecosystem of OCG.

2.2 Telemedicine
      Telemedicine is a developing area, benefitting from the ubiquity of smartphones and
      tablets. The technology leaders in this market (e.g. Philips, Cisco, Alcatel-Lucent, and
     Honeywell) are cooperating with leading carriers (e.g. Orange and Vodafone), leading
     device manufacturers (e.g. Apple, Samsung, and LG) and providers of video
     conferencing systems (e.g. Cisco, Polycom, and Vidyo).

2.3 Military communications
     Military communications have specific requirements for interactive video, in particular
     in the area of airborne surveillance (e.g. drones), target identification, tracking and
     engagement. The use of interactive 3D video is critical to a better understanding of context
     and situation.




           Figure 2 - Online/Wireless Gaming Market vs. Console/PC Market




3 Interactive Online Mobile Video: The Pain and Requirements
     In the following, we present the pain points and requirements of interactive online
     mobile video. We point out the differences between 2D and 3D, how they change the
     requirements, and the consequences of the choice of video codec.

3.1 3D Cursor
     3D interactive applications very often require a 3D cursor, i.e. a cursor that is
     perceived at the depth of the object it is pointing to. The correspondence between the
     3D cursor and the actual pixel positions in the left and right views is hard to implement

        without the help of a depth or disparity map. This depth or disparity map must
        therefore be provided in addition to the H.264 baseline streams, which requires
        significant additional computational, power and bandwidth resources.
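
        Once a disparity map is available, placing the cursor reduces to a lookup. The
        sketch below is a minimal illustration under stated assumptions: the views are
        rectified so that disparity is purely horizontal, the disparity map is referenced
        to a central view, and disparity is defined as d = x_left - x_right. The function
        name cursor_positions and the map layout (rows of per-pixel disparities) are
        hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, int]


def cursor_positions(x: int, y: int,
                     disparity_map: List[List[float]]) -> Tuple[Point, Point]:
    """Return the cursor position in the left and right views.

    (x, y) is the cursor position in the central view. The cursor is drawn
    at the disparity of the pixel it points to, so it is perceived at the
    depth of that object. Assumes d = x_left - x_right (a convention chosen
    for this sketch).
    """
    d = disparity_map[y][x]
    half = d / 2.0
    return (x + half, y), (x - half, y)
```

        For example, with a disparity of 4 pixels at central-view position (4, 1), the
        cursor is drawn at x = 6 in the left view and x = 2 in the right view.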

3.2     Latency
        Latency is the time between capturing and displaying a video frame. Latency is a
        primary issue in interactive video and should preferably be less than 100 ms
        end-to-end. The standard H.264 codec cannot achieve such latency in its highest
        profiles, where latency is often above 1 second. Hence, the standard codec is
        typically used in baseline profile. Latency gets worse for 3D video content with
        separate or multi-view encoding of left and right views.
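
        The arithmetic behind the latency budget can be made explicit. The sketch below
        treats end-to-end latency as a number of buffered frames plus a network delay;
        this decomposition, the function names and the example values for buffered
        frames and network delay are assumptions for illustration, while the 100 ms
        target comes from the text above.

```python
BUDGET_MS = 100.0  # preferred end-to-end latency from section 3.2


def frame_period_ms(fps: float) -> float:
    """Duration of one video frame in milliseconds."""
    return 1000.0 / fps


def end_to_end_latency_ms(fps: float, buffered_frames: int,
                          network_ms: float) -> float:
    """Simplified model: each buffered frame adds one frame period."""
    return buffered_frames * frame_period_ms(fps) + network_ms


def within_budget(fps: float, buffered_frames: int, network_ms: float) -> bool:
    """True if the modelled latency meets the 100 ms target."""
    return end_to_end_latency_ms(fps, buffered_frames, network_ms) <= BUDGET_MS
```

        At 30 fps one frame lasts about 33 ms, so under this model a hypothetical 40 ms
        network delay leaves room for one buffered frame but not for two.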

3.3 Bit Rate
        The required bit rate (or bandwidth) for video transmission depends on the resolution,
        frame rate, quality requirement and video codec capability to compress and decompress
        within the limitations of the communication channel bandwidth (network capability).
        For 3D content, the required bit rate for a given quality may increase by as much as a
        factor of two.

3.3.1    Cloud gaming
        For OCG, the required bit rate for good video quality with H.264 exceeds 3 Mbps. In the
        US, available bit rates on the public Internet are usually between 1 and 7 Mbps, making
        3D gaming with H.264 problematic. In addition, 3D OCG very often needs a depth or
        disparity map to implement a 3D cursor, which requires significant additional
        transmission bandwidth.

3.3.2    Telemedicine and military communication
        The required bit rate for interactive telemedicine and military applications is on the
        order of 10 Mbps. This rate is difficult to sustain on public US networks. In Europe and
        Asia the situation is slightly better, but still at the limit of what is possible. For 3D
        interactive video that requires a 3D cursor and uses a standard H.264 codec,
        network performance is insufficient to sustain acceptable 3D quality.

3.4 Computing Power

3.4.1    Cloud gaming and military
        Strong computing capabilities are required to decode a flow of H.264-coded 2D video.
        3D video decoding using a standard H.264 solution requires doubling of this compute

        power. On a 4.3-inch smartphone, this implies up to a 40% reduction in battery life,
        and on a tablet, up to 20%.

3.4.2    Telemedicine and military
        In addition to decoding computing capabilities, telemedicine and military
        communication also require HD encoding capabilities. The H.264 encoder consumes
        at least twice as much power as the decoder, so the impact on
        computing power and battery life is severe.

3.5 Conclusion
        The requirements for interactive 3D video are at or over the limit of current capabilities
        of public networks and terminals using standard H.264 coding techniques.




4 The Solution: NOMAD3D 3D+F Codec

4.1 Features
        Nomad3D has developed an innovative 3D codec with the following features:
              3D+F is compatible with 2D codecs (e.g. H.264, VP8).
              3D+F is a layer on top of an underlying 2D codec, which can be reused for 3D as is.
              3D+F is high performance and power efficient.
              3D+F requires no change in the hardware of 2D application processors.
              3D+F is enabled by pre-installed or downloadable software.

4.2      Compliance of 3D+F with Interactive 3D Online Video

        The Nomad3D 3D+F codec is designed to meet the requirements of
        3D interactive online video. Specifically, compared to 2D:
              3D+F adds at most one video frame of latency (about 33 ms at 30 fps).
              3D+F requires at most a 30% increase in bit rate (H.264: 130%).
              3D+F does not need an additional depth map to implement a 3D cursor.
              3D+F increases decoder power consumption by less than 15%.
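
        The bit rate figures quoted in this paper can be combined in a short calculation.
        The sketch below uses the 2D rates from section 3.3 (3 Mbps for cloud gaming,
        10 Mbps for telemedicine and military), the worst-case factor of two for
        separately coded 3D, and the 30% upper bound stated above for 3D+F. The names
        required_mbps and compare are hypothetical, and the 7 Mbps link is simply the
        upper end of the US range quoted in section 3.3.1.

```python
SEPARATE_VIEWS_FACTOR = 2.0  # section 3.3: up to a factor of two for 3D
FUSION_FACTOR = 1.3          # section 4.2: at most a 30% increase over 2D


def required_mbps(rate_2d_mbps: float, factor: float) -> float:
    """Scale a 2D bit rate by the 3D overhead factor."""
    return rate_2d_mbps * factor


def compare(rate_2d_mbps: float, available_mbps: float) -> dict:
    """Required 3D bit rates and whether each fits the available link."""
    separate = required_mbps(rate_2d_mbps, SEPARATE_VIEWS_FACTOR)
    fused = required_mbps(rate_2d_mbps, FUSION_FACTOR)
    return {
        "separate_mbps": separate,
        "fused_mbps": fused,
        "separate_fits": separate <= available_mbps,
        "fused_fits": fused <= available_mbps,
    }


if __name__ == "__main__":
    for name, rate in (("cloud gaming", 3.0), ("telemedicine/military", 10.0)):
        print(name, compare(rate, available_mbps=7.0))
```

        With these inputs, cloud gaming needs up to 6 Mbps with separately coded views and
        up to about 3.9 Mbps with 3D+F. The 10 Mbps applications exceed a 7 Mbps link in
        both cases (20 Mbps versus 13 Mbps), so the calculation shows a reduced
        requirement rather than a guarantee that any given network suffices.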






     The advantages of 3D+F vs. H.264 for interactive online 3D video are summarized in
     Table 1 below:
                                              3D H.264              NOMAD3D
                                              Baseline Profile      3D+F
       Latency                                Low                   Low
       Decoder Computing Power                High                  Very Low
       Encoder Computing Power                Very High             Low
       Software Implementation                No                    Yes
       Hardware Impact                        High                  None
       Required Bit Rate                      High                  Low
       3D Cursor Complexity                   High                  Low
       Additional Depth or Disparity Map      Yes                   No


       Table 1 - Comparison of H.264 and 3D+F for Interactive Online 3D Video



     REFERENCES
[1] WebM. (2012, July 25). Welcome to the WebM Project. Retrieved July 25, 2012, from
http://www.webmproject.org/

[2] Wikipedia. (2012, July 25). H.264/MPEG-4 AVC. Retrieved July 25, 2012, from
http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC

[3] PwC. Internet source:
http://www.pwc.com/gx/en/global-entertainment-media-outlook/segment-insights/video-games.jhtml



