
(12) United States Patent
Ramasastry et al.
US007522774B2
(10) Patent No.: US 7,522,774 B2
(45) Date of Patent: Apr. 21, 2009
(54) METHODS AND APPARATUSES FOR
COMPRESSING DIGITAL IMAGE DATA
(75) Inventors: Jayaram Ramasastry, Woodinville, CA
(US); Partho Choudhury, Maharashtra
(IN); Ramesh Prasad, Maharashtra (IN)
(73) Assignee: Sindhara Supermedia, Inc., Redmond,
WA (US)
Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 607 days.
(21) Appl. No.: 11/077,106
(22) Filed: Mar. 9, 2005
(65) Prior Publication Data
US 2005/0207664 A1 Sep. 22, 2005
Related U.S. Application Data
(60) Provisional application No. 60/552,153, filed on Mar.
10, 2004, provisional application No. 60/552,356,
filed on Mar. 10, 2004, provisional application No.
60/552,270, filed on Mar. 10, 2004.
(51) Int. Cl.
G06K 9/36 (2006.01)
(52) U.S. Cl. 382/232
(58) Field of Classification Search 382/232,
382/239, 240, 248; 708/317, 400-401; 375/240,
375/240.01, 240.02, 240.11, 240.18, 240.19;
348/384.1, 398.1, 404.1
See application file for complete search history.
[Representative drawing: Server with Acquisition, Encoder and Decoder
blocks, connected to a Network (e.g., wired and/or wireless)]
(56) References Cited
U.S. PATENT DOCUMENTS
5,585,852 A * 12/1996 Agarwal 375/240.11
5,881,176 A * 3/1999 Keith et al. 382/248
6,967,600 B2 * 11/2005 Kadono et al. 341/67
7,295,608 B2 * 11/2007 Reynolds et al. 375/240.01
7,333,814 B2 * 2/2008 Roberts 455/452.2
* cited by examiner
Primary Examiner: Jose L Couso
(74) Attorney, Agent, or Firm: Blakely, Sokoloff, Taylor &
Zafman LLP
(57) ABSTRACT
Methods and apparatuses for compressing digital image data
are described herein. In one embodiment, a wavelet transform
is performed on each pixel of a frame to generate multiple
wavelet coefficients representing each pixel in a frequency
domain. The wavelet coefficients of a sub-band of the frame
are iteratively encoded into a bit stream based on a target
transmission rate, where the sub-band of the frame is obtained
from a parent sub-band of a previous iteration. The encoded
wavelet coefficients satisfy a predetermined threshold based
on a predetermined algorithm while the wavelet coefficients
that do not satisfy the predetermined threshold are ignored in
the respective iteration. Other methods and apparatuses are
also described.
30 Claims, 18 Drawing Sheets
[Representative drawing, continued: Client with Decoder, Optional
Encoder and Optional Decoder blocks]

U.S. Patent Apr. 21, 2009 Sheet 3 of 18 US 7,522,774 B2
Physical Layer (W-CDMA, CDMA 1X, cdma2000, GSM/GPRS, UMTS, iDEN) (1)
Data Link Control (DLC) (2)
Third party ISO protocol stack (TCP/IP/UDP) (3)
Streaming protocol stack (RTP, RTSP, RTCP, DDP) (4)
Billing and other ancillary services (5)
Network Aware Layer (NAL) (6)
Application Layer APIs for QwikStream™, QwikVu™ and QwikTex™ (7)
Content Generation Engine (8)
Data Repository (9)
Fig. 3

Raw YUV color frame data
Wavelet Transform filter bank
Source Encoder (ARIES)
Channel encoding (Tree partitioning, CRC, RCPC)

Compressed Image (.qvx file format)
Channel decoding (Tree merging, CRC, RCPC)
Source Decoder (I-ARIES)
Inverse Wavelet Transform
Raw YUV data
Fig. 4B

Perform a wavelet transformation on each image pixel to
transform the pixel into one or more coefficients in one or
more wavelet maps.

Encode each wavelet map by representing the significance,
sign and bit plane information of the pixel using a single bit
in a bit stream.

Encode the significant bits into a context variable
dependent upon the information represented by the bit and
its location of the coefficient being coded (e.g., the
probability of occurrence of a predetermined set of bits
immediately preceding the current bit).

Transmit the content of the context variable as a bit stream
as an output representing the encoded pixels.
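The first step of the flowchart above can be illustrated with a small sketch. The source does not name the wavelet filter here, so the one-level averaging Haar filter below is an assumption chosen only because it is the simplest case; the function names (`haar_1d`, `haar_2d`, `split_subbands`) and the quadrant-to-sub-band naming convention are likewise hypothetical.

```python
def haar_1d(values):
    """One level of an averaging Haar transform on an even-length list.

    Assumption: Haar is used purely for illustration; the source does not
    specify the filter bank.
    """
    half = len(values) // 2
    low = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    high = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return low + high


def haar_2d(frame):
    """Apply haar_1d to every row, then to every column of the result."""
    rows = [haar_1d(row) for row in frame]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    # cols holds transformed columns; transpose back to row-major order.
    return [list(row) for row in zip(*cols)]


def split_subbands(coeffs):
    """Cut the coefficient map into four quadrants.

    Naming convention assumed here: top-left LL, top-right HL,
    bottom-left LH, bottom-right HH.
    """
    h = len(coeffs) // 2
    w = len(coeffs[0]) // 2
    ll = [row[:w] for row in coeffs[:h]]
    hl = [row[w:] for row in coeffs[:h]]
    lh = [row[:w] for row in coeffs[h:]]
    hh = [row[w:] for row in coeffs[h:]]
    return ll, hl, lh, hh


frame = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
ll, hl, lh, hh = split_subbands(haar_2d(frame))
```

For this piecewise-constant frame all detail coefficients are zero and the LL quadrant is a half-resolution copy of the frame, which is the energy compaction that a later threshold-based coding stage relies on.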

Sub-tree 1 (HL)   Sub-tree 2 (LH)   Sub-tree 3 (HH)
Fig. 6

Fig. 7

Determine a number of iterations (nl) based on a number of
quantization levels, which may be determined on the
largest wavelet coefficient, and set an initial quantization
threshold T = 2^nl.

Populate all insignificant pixels in IPQ, all insignificant pixels
having descendants in ISQ, and all significant pixels in
SPQ.

For each type I entry of ISQ, if the entry is significant with
respect to a current quantization threshold, remove the
respective entry from ISQ and append it in the SPQ.

For each type I entry of ISQ, if the entry is insignificant with
respect to a current quantization threshold, remove the
respective entry from ISQ and append it in the IPQ.

If the respective type I entry includes descendants, remove
the entry from the ISQ and append it at the end of ISQ as
type II entry for next iteration; otherwise, the entry is
purged.

For each type II entry of ISQ, if the entry is significant with
respect to a current quantization threshold, all offspring of
the current ISQ entry are appended to the end of ISQ as
type I entries for next iteration.

Remove any entry in IPQ that is significant with respect to
the current quantization threshold and append it in the
SPQ.
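The initialization and the final IPQ step of the flowchart above can be sketched as follows. This is an illustration under stated assumptions, not the patented encoder: taking nl as floor(log2) of the largest magnitude is an assumption borrowed from common bit-plane coders, the literal reading that an insignificant pixel with descendants appears in both IPQ and ISQ is an assumption, and `initial_threshold`, `populate_queues`, `promote_significant` and the `children` map are hypothetical names.

```python
def initial_threshold(coeffs):
    """Return (nl, T) with T = 2**nl.

    Assumption: nl = floor(log2(largest magnitude)); the flowchart only says
    nl is based on the number of quantization levels.
    """
    largest = max(abs(c) for c in coeffs)
    nl = int(largest).bit_length() - 1 if largest >= 1 else 0
    return nl, 2 ** nl


def populate_queues(coeffs, children, threshold):
    """Fill IPQ, ISQ and SPQ for the first pass.

    coeffs maps a position to its coefficient; children maps a position to
    the positions of its descendants (hypothetical tree layout).
    """
    ipq, isq, spq = [], [], []
    for pos, value in coeffs.items():
        if abs(value) >= threshold:
            spq.append(pos)              # significant pixel
        else:
            ipq.append(pos)              # insignificant pixel
            if children.get(pos):
                isq.append((pos, "I"))   # insignificant, has descendants
    return ipq, isq, spq


def promote_significant(ipq, spq, coeffs, threshold):
    """Move IPQ entries that are significant at the current threshold to SPQ."""
    still_insignificant = [p for p in ipq if abs(coeffs[p]) < threshold]
    spq = spq + [p for p in ipq if abs(coeffs[p]) >= threshold]
    return still_insignificant, spq


coeffs = {(0, 0): 37, (0, 1): -20, (1, 0): 9, (1, 1): 3}
children = {(0, 1): [(1, 1)]}
nl, T = initial_threshold(coeffs.values())
ipq, isq, spq = populate_queues(coeffs, children, T)
ipq2, spq2 = promote_significant(ipq, spq, coeffs, T // 2)
```

The type I and type II handling of ISQ entries is deliberately left out, because the flowchart does not say how the significance of an entry with descendants is evaluated.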

Wavelet Transform filter bank
Bypass ME/MC for I frames
ME/MC
motion information
Channel encoding (Tree partitioning, CRC, RCPC)
Compressed file
Streaming data
Fig. 9A

Raw YUV color frame data
ME/MC
I frames
motion information
Source Encoder (ARIES I/II)
I-ARIES I/II
Channel encoding (Tree partitioning, CRC, RCPC)
Compressed File
Optional
Streaming data
Fig. 9B

Streaming data
Optional
Compressed Video (.qsx file format)
CABAC coded motion information
Channel decoding (Tree merging, CRC, RCPC)
Source Decoder (I-ARIES I/II)
Bypass MC for I frames
MC
Frame Buffer
Inverse Wavelet Transform
Raw YUV data
Fig. 10A

Streaming data
Optional
Compressed Video (.qsx file format)
CABAC coded motion information
Channel decoding (Tree merging, CRC, RCPC)
Source Decoder (I-ARIES I/II)
Bypass MC for I frames
Frame Buffer
Inverse Wavelet Transform
Raw YUV data
Fig. 10B

Identify a reference frame (e.g., the first frame or an I-frame).

Perform a ME/MC on the coarsest subbands as parent
subbands of a current frame other than the I-frame with
respect to the identified reference frame to generate one or
more motion vectors for the coarsest subbands.

Estimate the spatial shifting of pixels of child subbands
using the motion vectors of the parent subbands to
determine a search area of the child subbands.

Perform a ME/MC for the child subbands to determine the
motion vectors of the child subbands.

More child subbands?

Perform compression on the predicted/compensated data
into compressed data (e.g., see Figs. 5 and 8).
Fig. 11
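The parent-to-child refinement in the flowchart above can be sketched with a plain block search. Several choices here are assumptions made only for illustration: a sum-of-absolute-differences (SAD) cost, a child subband at twice the parent resolution (so the parent vector and block position are doubled), and the hypothetical names `sad`, `block_search` and `child_vector`.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of cur and a displaced block of ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def block_search(cur, ref, bx, by, size, center=(0, 0), radius=1):
    """Exhaustive search for the best (dx, dy) in a window around center."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            inside = (0 <= bx + dx and bx + dx + size <= width
                      and 0 <= by + dy and by + dy + size <= height)
            if not inside:
                continue  # candidate block would fall outside the reference
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1] if best else (0, 0)


def child_vector(cur_child, ref_child, bx, by, size, parent_mv, radius=1):
    """Refine a parent motion vector on the child subband.

    Assumption: dyadic decomposition, so block position, block size and the
    parent vector are all doubled to locate the child search area.
    """
    center = (2 * parent_mv[0], 2 * parent_mv[1])
    return block_search(cur_child, ref_child, 2 * bx, 2 * by, 2 * size, center, radius)


# Synthetic child subbands: the 2x2 block at (2, 2) in cur is a copy of the
# reference block displaced by (dx, dy) = (2, -2).
ref = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in range(2, 4):
    for x in range(2, 4):
        cur[y][x] = ref[y - 2][x + 2]

mv = child_vector(cur, ref, 1, 1, 1, parent_mv=(1, -1))
```

With the parent vector (1, -1) the small window is centered on (2, -2) and finds the exact match, whereas the same window centered on (0, 0) cannot reach it. That is the benefit of seeding the child search area from the parent subband.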

Fig. 12
Boundary of the Search Area for refinement MVs
Refinement Vector for level k, orientation [illegible]
Block Neighborhood

Integer Motion Prediction
Half-Pel Motion Prediction

Integer Motion Prediction
Half-Pel Motion Prediction
Fig. 14

Block currently being tested
Matching block
OBMC when current block being tested is in 1MV mode
Motion Vector (identical colors denote MVs of the same block)
Displaced MV to translate [illegible] being tested
OBMC when current block being tested is in 4MV mode
Fig. 15

METHODS AND APPARATUSES FOR
COMPRESSING DIGITAL IMAGE DATA
This application claims the benefit of U.S. Provisional
Application No. 60/552,153, filed Mar. 10, 2004, U.S.
Provisional Application No. 60/552,356, filed Mar. 10, 2004, and
U.S. Provisional Application No. 60/552,270, filed Mar. 10,
2004. The above-identified applications are hereby
incorporated by reference in their entirety.
FIELD OF THE INVENTION
The present invention relates generally to multimedia
applications. More particularly, this invention relates to
compressing digital image data.
BACKGROUND OF THE INVENTION
A variety of systems have been developed over the past decade
for the encoding and decoding of audio/video data for
transmission over wireline and/or wireless communication systems.
Most systems in this category employ standard
compression/transmission techniques, such as, for example, the
ITU-T Rec. H.264 (also referred to as H.264) and ISO/IEC
Rec. 14496-10 AVC (also referred to as MPEG-4) standards.
However, due to their inherent generality, they lack the
specific qualities needed for seamless implementation on low
power, low complexity systems (such as hand held devices
including, but not restricted to, personal digital assistants and
smart phones) over noisy, low bit rate wireless channels.
Due to the likely business models rapidly emerging in the
wireless market, in which cost incurred by the consumer is
directly proportional to the actual volume of transmitted data,
and also due to the limited bandwidth, processing capability,
storage capacity and battery power, efficiency and speed in
compression of audio/video data to be transmitted are a major
factor in the eventual success of any such multimedia content
delivery system. Most systems in use today are retrofitted
versions of identical systems used on higher end desktop
workstations. Unlike desktop systems, where error control is
not a critical issue due to the inherent reliability of cable
LAN/WAN data transmission, and bandwidth may be
assumed to be almost unlimited, transmission over limited
capacity wireless networks requires integration of such
systems that may leverage suitable processing and error-control
technologies to achieve the level of fidelity expected of a
commercially viable multimedia compression and
transmission system.
Conventional video compression engines, or codecs, can
be broadly classified into two categories. One class of
coding strategies, known as a download-and-play (D&P)
profile, not only requires the entire file to be downloaded onto the
local memory before playback, leading to a large latency time
(depending on the available bandwidth and the actual file
size), but also makes stringent demands on the amount of
buffer memory to be made available for the downloaded
payload. Even with the more sophisticated streaming profile,
the physical limitations of current generation
transmission equipment at the physical layer force service
providers to incorporate a pseudo-streaming capability, which
requires an initial period of latency (at the beginning of
transmission), and continuous buffering henceforth, which
imposes a strain on the limited processing capabilities of the
hand-held processor. Most commercial compression
solutions in the market today do not possess a progressive
transmission capability, which means that transmission is possible
only until the last integral frame, packet or bit before
bandwidth drops below the minimum threshold. In case of video
codecs, if the connection breaks before the transmission of
the current frame, this frame is lost forever.
Another drawback in conventional video compression
codecs is the introduction of blocking artifacts due to the
block-based coding schemes used in most codecs. Apart from
the degradation in subjective visual quality, such systems
suffer from poor performance due to bottlenecks introduced
by the additional de-blocking filters. Yet another drawback is
that, due to the limitations in the word size of the computing
platform, the coded coefficients are truncated to an
approximate value. This is especially prominent along object
boundaries, where Gibbs' phenomenon leads to the generation of a
visual phenomenon known as mosquito noise. Due to this, the
blurring along the object boundaries becomes more
prominent, leading to degradation in overall frame quality.
Additionally, the local nature of motion prediction in some
codecs introduces motion-induced artifacts, which cannot be
easily smoothed by a simple filtering operation. Such
problems arise especially in cases of fast motion clips and systems
where the frame rate is below that of natural video (e.g., 25 or
30 fps non-interlaced video). In either case, the temporal
redundancy between two consecutive frames is extremely
low (since much of the motion is lost in between the frames
themselves), leading to poorer tracking of the motion across frames.
This effect is cumulative in nature, especially for a longer
group of frames (GoF).
Furthermore, mobile end-user devices are constrained by
low processing power and storage capacity. Due to the
limitations on the silicon footprint, most mobile and hand-held
systems in the market have to time-share the resources of the
central processing unit (microcontroller or RISC/CISC
processor) to perform all its DSP, control and communication
tasks, with little or no provisions for a dedicated processor to
take the video/audio processing load off the central processor.
Moreover, most general-purpose central processors lack the
unique architecture needed for optimal DSP performance.
Therefore, a mobile video-codec design must have minimal
client-end complexity while maintaining consistency on the
efficiency and robustness front.
SUMMARY OF THE INVENTION
Methods and apparatuses for compressing digital image
data are described herein. In one embodiment, a wavelet
transform is performed on each pixel of a frame to generate
multiple wavelet coefficients representing each pixel in a
frequency domain. The wavelet coefficients of a sub-band of
the frame are iteratively encoded into a bit stream based on a
target transmission rate, where the sub-band of the frame is
obtained from a parent sub-band of a previous iteration. The
encoded wavelet coefficients satisfy a predetermined
threshold based on a predetermined algorithm while the wavelet
coefficients that do not satisfy the predetermined threshold
are ignored in the respective iteration.
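The iterative, rate-driven encoding described above can be sketched as a series of threshold passes. Everything below is an assumption made for illustration and is not taken from the source: the threshold is halved after each pass, a coefficient below the threshold contributes only a 0 flag and is revisited in a later pass, each newly significant coefficient emits a 1 flag followed by a sign bit, the target transmission rate is modeled as a plain bit budget, and `encode_passes` is a hypothetical name.

```python
def encode_passes(coeffs, bit_budget):
    """Emit significance and sign bits pass by pass until the budget is used up."""
    largest = max((abs(c) for c in coeffs), default=0)
    bits = []
    if largest < 1:
        return bits
    # Assumed starting point: the largest power of two not above the peak.
    threshold = 1 << (int(largest).bit_length() - 1)
    pending = list(range(len(coeffs)))
    while threshold >= 1 and pending:
        remaining = []
        for index in pending:
            if len(bits) >= bit_budget:
                return bits[:bit_budget]  # target rate reached, stop here
            if abs(coeffs[index]) >= threshold:
                bits.append(1)                                # significant
                bits.append(0 if coeffs[index] >= 0 else 1)   # sign bit
            else:
                bits.append(0)            # skipped in this pass
                remaining.append(index)
        pending = remaining
        threshold //= 2
    return bits[:bit_budget]


coeffs = [37, -20, 9, 3]
full = encode_passes(coeffs, 100)
short = encode_passes(coeffs, 6)
```

Because large coefficients are emitted first, a stream cut off at a smaller budget is a prefix of the full stream, which is the property that makes such a stream suitable for progressive transmission.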
Other features of the present invention will be apparent
from the accompanying drawings and from the detailed
description which follows.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example and
not limitation in the figures of the accompanying drawings in
which like references indicate similar elements.
FIG. 1 is a block diagram illustrating an exemplary
multimedia streaming system according to one embodiment.
