(12) United States Patent
Ramasastry et al.
(10) Patent No.: US 7,522,774 B2
(45) Date of Patent: Apr. 21, 2009
(54) METHODS AND APPARATUSES FOR COMPRESSING DIGITAL IMAGE DATA
(75) Inventors: Jayaram Ramasastry, Woodinville, CA (US); Partho Choudhury, Maharashtra (IN); Ramesh Prasad, Maharashtra (IN)
(73) Assignee: Sindhara Supermedia, Inc., Redmond, WA (US)
Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 607 days.
(21) Appl. No.: 11/077,106
(22) Filed: Mar. 9, 2005
(65) Prior Publication Data
US 2005/0207664 A1 Sep. 22, 2005
Related U.S. Application Data
(60) Provisional application No. 60/552,153, filed on Mar. 10, 2004, provisional application No. 60/552,356, filed on Mar. 10, 2004, provisional application No. 60/552,270, filed on Mar. 10, 2004.
(51) Int. Cl.
G06K 9/36 (2006.01)
(52) U.S. Cl.: 382/232
(58) Field of Classification Search: 382/232, 382/239, 240, 248; 708/317, 400-401; 375/240, 375/240.01, 240.02, 240.11, 240.18, 240.19; 348/384.1, 398.1, 404.1
See application file for complete search history.
[Representative drawing: a server (acquisition, encoder, decoder) connected to a network (e.g., wired and/or wireless)]
(56) References Cited
U.S. PATENT DOCUMENTS
5,585,852 A * 12/1996 Agarwal, 375/240.11
5,881,176 A * 3/1999 Keith et al., 382/248
6,967,600 B2 * 11/2005 Kadono et al., 341/67
7,295,608 B2 * 11/2007 Reynolds et al., 375/240.01
7,333,814 B2 * 2/2008 Roberts, 455/452.2
* cited by examiner
Primary Examiner: Jose L Couso
(74) Attorney, Agent, or Firm: Blakely, Sokoloff, Taylor & Zafman LLP
(57) ABSTRACT
Methods and apparatuses for compressing digital image data are described herein. In one embodiment, a wavelet transform is performed on each pixel of a frame to generate multiple wavelet coefficients representing each pixel in a frequency domain. The wavelet coefficients of a sub-band of the frame are iteratively encoded into a bit stream based on a target transmission rate, where the sub-band of the frame is obtained from a parent sub-band of a previous iteration. The encoded wavelet coefficients satisfy a predetermined threshold based on a predetermined algorithm while the wavelet coefficients that do not satisfy the predetermined threshold are ignored in the respective iteration. Other methods and apparatuses are also described.
30 Claims, 18 Drawing Sheets
[Representative drawing, continued: a client with a decoder; remaining labels read Optional Encoder and Optional Decoder]
[Sheet 3 of 18, Fig. 3: layered architecture]
Physical Layer (W-CDMA, CDMA 1X, cdma2000, GSM/GPRS, UMTS, iDEN) (1)
Data Link Control (DLC) (2)
Third party ISO protocol stack (TCP/IP/UDP) (3)
Streaming protocol stack (RTP, RTSP, RTCP, DDP) (4)
Billing and other ancillary services (5)
Network Aware Layer (NAL) (6)
Application Layer APIs for QwikStream™, QwikVu™ and QwikTex™ (7)
Content Generation Engine (8)
Data Repository (9)
[Sheet 4 of 18: encoder block diagram, blocks in order]
Raw YUV color frame data
Wavelet Transform filter bank
Source Encoder (ARIES)
Channel encoding (Tree partitioning, CRC, RCPC)
[Sheet 5 of 18, Fig. 4B: decoder block diagram, blocks in order]
Compressed Image (.qvx file format)
Channel decoding (Tree merging, CRC, RCPC)
Source Decoder (I-ARIES)
Inverse Wavelet Transform
Raw YUV data
[Sheet 6 of 18: flowchart, steps in order]
Perform a wavelet transformation on each image pixel to transform the pixel into one or more coefficients in one or more wavelet maps.
Encode each wavelet map by representing the significance, sign and bit plane information of the pixel using a single bit in a bit stream.
Encode the significant bits into a context variable dependent upon the information represented by the bit and the location of the coefficient being coded (e.g., the probability of occurrence of a predetermined set of bits immediately preceding the current bit).
Transmit the content of the context variable as a bit stream as an output representing the encoded pixels.
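The first step of this flowchart can be illustrated with a minimal sketch. This part of the text does not name the filter bank, so the averaging Haar filter below is an assumption chosen only because it is short; the function name `haar2d_level` and the convention that HL means high-pass horizontally and low-pass vertically are likewise illustrative choices, not taken from the patent.

```python
def haar2d_level(frame):
    """One level of an averaging 2D Haar decomposition (assumed filter).

    `frame` is a list of equal-length rows with even height and width.
    Returns four sub-bands (LL, HL, LH, HH), each half the input size.
    """
    rows, cols = len(frame), len(frame[0])
    if rows % 2 or cols % 2:
        raise ValueError("frame dimensions must be even")
    ll, hl, lh, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, hl_row, lh_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 neighbourhood:  a b
            #                         d e
            a, b = frame[r][c], frame[r][c + 1]
            d, e = frame[r + 1][c], frame[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 4)  # local average
            hl_row.append((a - b + d - e) / 4)  # difference across columns
            lh_row.append((a + b - d - e) / 4)  # difference across rows
            hh_row.append((a - b - d + e) / 4)  # diagonal difference
        ll.append(ll_row)
        hl.append(hl_row)
        lh.append(lh_row)
        hh.append(hh_row)
    return ll, hl, lh, hh
```

Applying the function again to the returned LL band would give the next, coarser level, which is one way to obtain the parent and child sub-bands that the later sheets refer to.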
[Sheet 7 of 18, Fig. 6: coefficient tree with Sub-tree 1 (HL), Sub-tree 2 (LH) and Sub-tree 3 (HH)]
[Sheet 8 of 18, Fig. 7: drawing not reproduced]
[Sheet 9 of 18: flowchart, steps in order]
Determine a number of iterations (nl) based on a number of quantization levels, which may be determined on the largest wavelet coefficient, and set an initial quantization threshold T (a power of 2; the exponent is illegible in the source).
Populate all insignificant pixels in IPQ, all insignificant pixels having descendants in ISQ, and all significant pixels in SPQ.
For each type I entry of ISQ, if the entry is significant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the SPQ.
For each type I entry of ISQ, if the entry is insignificant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the IPQ.
If the respective type I entry includes descendants, remove the entry from the ISQ and append it at the end of ISQ as a type II entry for next iteration; otherwise, the entry is purged.
For each type II entry of ISQ, if the entry is significant with respect to a current quantization threshold, all offspring of the current ISQ entry are appended to the end of ISQ as type I entries for next iteration.
Remove any entry in IPQ that is significant with respect to the current quantization threshold and append it in the SPQ.
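The queue handling in this flowchart can be sketched as follows. This is one possible reading, not the patent's implementation: the helper names, the dictionary-based tree, the rule that a type II entry counts as significant when any of its descendants reaches the threshold, and the choice to keep a non-significant type II entry for the next pass are all assumptions. Bit emission is omitted.

```python
from collections import deque


def descendants(node, children):
    """Yield every descendant of `node` in the coefficient tree."""
    stack = list(children.get(node, ()))
    while stack:
        current = stack.pop()
        yield current
        stack.extend(children.get(current, ()))


def sorting_pass(coeff, children, ipq, isq, spq, threshold):
    """Run one pass over IPQ, ISQ and SPQ at a fixed threshold.

    `isq` holds (node, kind) pairs, where kind is "I" or "II".
    Returns the updated (ipq, isq, spq) queues.
    """
    def significant(node):
        return abs(coeff[node]) >= threshold

    next_isq = deque()
    for node, kind in isq:
        if kind == "I":
            # Type I: test the coefficient itself and file it in SPQ or IPQ.
            (spq if significant(node) else ipq).append(node)
            # Re-queue as type II only when descendants exist; else purge.
            if children.get(node):
                next_isq.append((node, "II"))
        elif any(significant(d) for d in descendants(node, children)):
            # Type II, significant (assumed meaning): queue the offspring.
            next_isq.extend((child, "I") for child in children[node])
        else:
            # Assumption: an insignificant type II entry waits for a later pass.
            next_isq.append((node, "II"))

    # Promote IPQ entries that are significant at the current threshold.
    spq.extend(n for n in ipq if significant(n))
    ipq = deque(n for n in ipq if not significant(n))
    return ipq, next_isq, spq
```

Calling `sorting_pass` repeatedly with a halved threshold each time gives the iterative refinement that the flowchart describes.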
[Sheet 10 of 18, Fig. 9A: video encoder block diagram. Legible labels: Wavelet Transform filter bank; Bypass ME/MC for I frames; ME/MC; motion information; Channel encoding (Tree partitioning, CRC, RCPC); compressed file; Streaming data]
[Sheet 11 of 18, Fig. 9B: video encoder block diagram. Legible labels: Raw YUV color frame data; ME/MC; I frames; motion information; Source Encoder (ARIES I/II); I-ARIES I/II; Channel encoding (Tree partitioning, CRC, RCPC); Compressed File; Optional; Streaming data]
[Sheet 12 of 18, Fig. 10A: video decoder block diagram. Legible labels: Streaming data; Optional; Compressed Video (.qsx file format); CABAC coded motion information; Channel decoding (Tree merging, CRC, RCPC); Source Decoder (I-ARIES I/II); Bypass MC for I frames; MC; Frame Buffer; Inverse Wavelet Transform; Raw YUV data]
[Sheet 13 of 18, Fig. 10B: video decoder block diagram. Legible labels: Streaming data; Optional; Compressed Video (.qsx file format); CABAC coded motion information; Channel decoding (Tree merging, CRC, RCPC); Source Decoder (I-ARIES I/II); Bypass MC for I frames; Frame Buffer; Inverse Wavelet Transform; Raw YUV data]
[Sheet 14 of 18, Fig. 11: flowchart, steps in order]
Identify a reference frame (e.g., the first frame or an I-frame).
Perform a ME/MC on the coarsest subbands as parent subbands of a current frame other than the I-frame with respect to the identified reference frame to generate one or more motion vectors for the coarsest subbands.
Estimate the spatial shifting of pixels of child subbands using the motion vectors of the parent subbands to determine a search area of the child subbands.
Perform a ME/MC for the child subbands to determine the motion vectors of the child subbands.
More child subbands?
Perform compression on the predicted/compensated data into compressed data (e.g., see Figs. 5 and 8).
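The parent-to-child refinement in Fig. 11 can be sketched as a block search centred on the parent's motion vector. The details here are assumptions for illustration only: the parent vector is doubled on the premise that a child subband has twice the resolution of its parent, the matching cost is the sum of absolute differences (SAD), and the names `sad` and `refine_motion_vector` and the `radius` parameter are hypothetical.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a block of `cur` at (top, left)
    and the block of `ref` displaced by (dy, dx)."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(cur[top + r][left + c]
                         - ref[top + dy + r][left + dx + c])
    return total


def refine_motion_vector(cur, ref, top, left, size, parent_mv, radius=1):
    """Search a small window around the scaled parent motion vector.

    Returns (best_vector, best_cost), or (None, None) if no candidate
    block lies inside the reference subband.
    """
    centre_dy, centre_dx = 2 * parent_mv[0], 2 * parent_mv[1]  # assumed scaling
    rows, cols = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(centre_dy - radius, centre_dy + radius + 1):
        for dx in range(centre_dx - radius, centre_dx + radius + 1):
            # Skip candidates whose reference block leaves the subband.
            inside = (0 <= top + dy and top + dy + size <= rows
                      and 0 <= left + dx and left + dx + size <= cols)
            if not inside:
                continue
            cost = sad(cur, ref, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost
```

Restricting the child search to a few candidates around the inherited vector is what keeps the per-level cost low compared with a full search at every resolution.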
[Sheet 15 of 18, Fig. 12: motion vector refinement. Legible labels: Boundary of the Search Area for refinement MVs; Refinement Vector for level k and a given orientation; Block Neighborhood]
[Sheet 16 of 18: Integer Motion Prediction; Half-Pel Motion Prediction]
[Sheet 17 of 18, Fig. 14: Integer Motion Prediction; Half-Pel Motion Prediction]
[Sheet 18 of 18, Fig. 15: overlapped block motion compensation (OBMC). Legible labels: Block currently being tested; Matching block; OBMC when current block being tested is in 1MV mode; OBMC when current block being tested is in 4MV mode; Motion Vector (identical colors denote MVs of the same block); Displaced MV]
METHODS AND APPARATUSES FOR
COMPRESSING DIGITAL IMAGE DATA
This application claims the benefit of U.S. Provisional Application No. 60/552,153, filed Mar. 10, 2004, U.S. Provisional Application No. 60/552,356, filed Mar. 10, 2004, and U.S. Provisional Application No. 60/552,270, filed Mar. 10, 2004. The above-identified applications are hereby incorporated by reference in their entirety.
FIELD OF THE INVENTION
The present invention relates generally to multimedia applications. More particularly, this invention relates to compressing digital image data.
BACKGROUND OF THE INVENTION
A variety of systems have been developed for the encoding and decoding of audio/video data for transmission over wireline and/or wireless communication systems over the past decade. Most systems in this category employ standard compression/transmission techniques, such as, for example, the ITU-T Rec. H.264 (also referred to as H.264) and ISO/IEC Rec. 14496-10 AVC (also referred to as MPEG-4) standards. However, due to their inherent generality, they lack the specific qualities needed for seamless implementation on low power, low complexity systems (such as hand held devices including, but not restricted to, personal digital assistants and smart phones) over noisy, low bit rate wireless channels.
Due to the likely business models rapidly emerging in the wireless market, in which cost incurred by the consumer is directly proportional to the actual volume of transmitted data, and also due to the limited bandwidth, processing capability, storage capacity and battery power, efficiency and speed in compression of audio/video data to be transmitted is a major factor in the eventual success of any such multimedia content delivery system. Most systems in use today are retrofitted versions of identical systems used on higher end desktop workstations. Unlike desktop systems, where error control is not a critical issue due to the inherent reliability of cable LAN/WAN data transmission, and bandwidth may be assumed to be almost unlimited, transmission over limited capacity wireless networks requires integration of such systems that may leverage suitable processing and error-control technologies to achieve the level of fidelity expected of a commercially viable multimedia compression and transmission system.
Conventional video compression engines, or codecs, can be broadly classified into two categories. One class of coding strategies, known as a download-and-play (D&P) profile, not only requires the entire file to be downloaded onto the local memory before playback, leading to a large latency time (depending on the available bandwidth and the actual file size), but also makes stringent demands on the amount of buffer memory to be made available for the downloaded payload. Even with the more sophisticated streaming profile, the physical limitations on current generation transmission equipment at the physical layer force service providers to incorporate a pseudo-streaming capability, which requires an initial period of latency (at the beginning of transmission), and continuous buffering henceforth, which imposes a strain on the limited processing capabilities of the hand-held processor. Most commercial compression solutions in the market today do not possess a progressive transmission capability, which means that transmission is possible only until the last integral frame, packet or bit before bandwidth drops below the minimum threshold. In case of video codecs, if the connection breaks before the transmission of the current frame, this frame is lost forever.
Another drawback in conventional video compression codecs is the introduction of blocking artifacts due to the block-based coding schemes used in most codecs. Apart from the degradation in subjective visual quality, such systems suffer from poor performance due to bottlenecks introduced by the additional de-blocking filters. Yet another drawback is that, due to the limitations in the word size of the computing platform, the coded coefficients are truncated to an approximate value. This is especially prominent along object boundaries, where Gibbs phenomenon leads to the generation of a visual phenomenon known as mosquito noise. Due to this, the blurring along the object boundaries becomes more prominent, leading to degradation in overall frame quality.
Additionally, the local nature of motion prediction in some codecs introduces motion-induced artifacts, which cannot be easily smoothed by a simple filtering operation. Such problems arise especially in cases of fast motion clips and systems where the frame rate is below that of natural video (e.g., 25 or 30 fps non-interlaced video). In either case, the temporal redundancy between two consecutive frames is extremely low (since much of the motion is lost in between the frames themselves), leading to poorer tracking of the motion across frames. This effect is cumulative in nature, especially for a longer group of frames (GoF).
Furthermore, mobile end-user devices are constrained by low processing power and storage capacity. Due to the limitations on the silicon footprint, most mobile and hand-held systems in the market have to time-share the resources of the central processing unit (microcontroller or RISC/CISC processor) to perform all their DSP, control and communication tasks, with little or no provision for a dedicated processor to take the video/audio processing load off the central processor. Moreover, most general-purpose central processors lack the unique architecture needed for optimal DSP performance. Therefore, a mobile video-codec design must have minimal client-end complexity while maintaining consistency on the efficiency and robustness front.
SUMMARY OF THE INVENTION
Methods and apparatuses for compressing digital image data are described herein. In one embodiment, a wavelet transform is performed on each pixel of a frame to generate multiple wavelet coefficients representing each pixel in a frequency domain. The wavelet coefficients of a sub-band of the frame are iteratively encoded into a bit stream based on a target transmission rate, where the sub-band of the frame is obtained from a parent sub-band of a previous iteration. The encoded wavelet coefficients satisfy a predetermined threshold based on a predetermined algorithm while the wavelet coefficients that do not satisfy the predetermined threshold are ignored in the respective iteration.
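A minimal sketch of threshold-driven, rate-limited encoding is given below. It is an illustration under stated assumptions rather than the claimed method: the target transmission rate is modelled as a plain bit budget, the threshold starts at the largest power of two not exceeding the largest coefficient magnitude and is halved each iteration, only significance bits are emitted (no sign or refinement bits), and the function name `progressive_significance` is hypothetical.

```python
def progressive_significance(coefficients, bit_budget):
    """Emit significance bits pass by pass until the bit budget is spent.

    In each pass, a coefficient that is not yet significant contributes
    one bit: 1 if its magnitude reaches the current threshold, else 0.
    Coefficients below the threshold are simply revisited in a later pass.
    Returns (bits, significant_flags).
    """
    significant = [False] * len(coefficients)
    bits = []
    largest = max((abs(c) for c in coefficients), default=0)
    if largest < 1:
        return bits, significant
    # Largest power of two that does not exceed the largest magnitude.
    threshold = 1 << (int(largest).bit_length() - 1)
    while threshold >= 1 and len(bits) < bit_budget:
        for i, c in enumerate(coefficients):
            if significant[i]:
                continue  # already reported in an earlier pass
            if len(bits) == bit_budget:
                break  # budget reached: stop mid-pass
            hit = abs(c) >= threshold
            bits.append(1 if hit else 0)
            significant[i] = hit
        threshold //= 2
    return bits, significant
```

Because the most significant information is written first, a stream cut off at any point still identifies the largest coefficients, which is the property a progressive scheme relies on when bandwidth drops.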
Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.
FIG. 1 is a block diagram illustrating an exemplary multimedia streaming system according to one embodiment.