
US 20050207495A1
(12) Patent Application Publication (10) Pub. No.: US 2005/0207495 A1
(19) United States
Ramasastry et al. (43) Pub. Date: Sep. 22, 2005
(54) METHODS AND APPARATUSES FOR
COMPRESSING DIGITAL IMAGE DATA
WITH MOTION PREDICTION
(76) Inventors: Jayaram Ramasastry, Woodinville, CA
(US); Partho Choudhury, Maharashtra
(IN); Ramesh Prasad, Maharashtra
(IN)
Correspondence Address:
BLAKELY SOKOLOFF TAYLOR & ZAFMAN
12400 WILSHIRE BOULEVARD
SEVENTH FLOOR
LOS ANGELES, CA 90025-1030 (US)
(21) Appl. No.: 11/076,746
(22) Filed: Mar. 9, 2005
Related U.S. Application Data
(60) Provisional application No. 60/552,153, filed on Mar.
10, 2004. Provisional application No. 60/552,356,
filed on Mar. 10, 2004. Provisional application No.
60/552,270, filed on Mar. 10, 2004.
Publication Classification
(51) Int. Cl.7 ............................ H04N 7/12
(52) U.S. Cl. ........ 375/240.16; 375/240.19; 375/240.12;
375/240.11; 375/240.24
(57) ABSTRACT
Methods and apparatuses for compressing digital image data
with motion prediction are described herein. In one
embodiment, for each two consecutive frames of an image
sequence, a motion prediction is performed between the
consecutive frames by tracking motion on a luminance map
of the frames to generate motion prediction information for
the luminance component. The motion prediction
information of the luminance component is then applied to
the chrominance maps. In response to the motion prediction,
the wavelet coefficients of each frame and the motion
prediction information are encoded into a bit stream based
on a target transmission rate, where the encoded wavelet
coefficients satisfy a predetermined threshold according to
a predetermined algorithm. Other methods and apparatuses
are also described.
[Figure: Server (Encoder, Data Acquisition, Optional Decoder); Network (e.g., wired and/or wireless); Client (Optional Encoder)]

Patent Application Publication Sep. 22, 2005 Sheet 1 of 18 US 2005/0207495 A1

Physical Layer (W-CDMA, CDMA 1x, cdma2000, GSM-GPRS, UMTS, iDEN) (1)
Data Link Control (DLC) (2)
Third party ISO protocol stack (TCP/IP/UDP) (3)
Streaming protocol stack (RTP, RTSP, RTCP, SDP) (4)
Billing and other ancillary services (5)
Network Aware Layer (NAL) (6)
Application Layer APIs for QwikStream™, QwikVu™ and Qwiklex™ (7)
Content Generation Engine (8)
Data Repository (9)
Fig. 3

Raw YUV color frame data
Wavelet Transform filter bank
Source Encoder (ARIES)
Channel encoding (Tree partitioning, CRC, RCPC)
Compressed File (e.g., .qvx file)
Fig. 4A

Compressed Image (.qvx file format)
Channel decoding (Tree merging, CRC, RCPC)
Source Decoder (I-ARIES)
Inverse Wavelet Transform
Raw YUV data
Fig. 4B

Perform a wavelet transformation on each image pixel to
transform the pixel into one or more coefficients in one or
more wavelet maps.
Encode each wavelet map by representing the significance,
sign and bit plane information of the pixel using a single bit
in a bit stream.
Encode the significant bits into a context variable
dependent upon the information represented by the bit and
the location of the coefficient being coded (e.g., the
probability of occurrence of a predetermined set of bits
immediately preceding the current bit).
Transmit the content of the context variable as a bit stream
as an output representing the encoded pixels.
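The first two steps of the flow above can be illustrated with a minimal Python sketch. It is not the ARIES encoder described in the application: the unnormalized Haar filter, the function names (`haar_1d`, `haar_2d`, `significance_pass`) and the one-pass bit layout are all assumptions chosen for brevity, and the context-variable (entropy coding) step is left out.

```python
def haar_1d(samples):
    """One level of an unnormalized Haar transform (assumes an even length).

    Returns the pairwise averages followed by the pairwise half-differences.
    """
    pairs = list(zip(samples[0::2], samples[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low + high


def haar_2d(block):
    """Apply haar_1d to every row, then to every column of the result."""
    rows = [haar_1d(row) for row in block]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def significance_pass(coeffs, threshold):
    """Emit one significance bit per coefficient for a given threshold.

    A significant coefficient (|c| >= threshold) emits 1 followed by a
    sign bit (0 for non-negative, 1 for negative); otherwise it emits 0.
    """
    bits = []
    for c in coeffs:
        if abs(c) >= threshold:
            bits += [1, 0 if c >= 0 else 1]
        else:
            bits.append(0)
    return bits
```

For example, `significance_pass([5, -9, 2], 4)` marks the first two coefficients as significant and attaches a sign bit to each, while the third contributes a single 0.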

Sub-tree 1 (HL)
Sub-tree 3 (HH)
Fig. 6

Fig. 7

Determine a number of iterations (nl) based on a number of
quantization levels, which may be determined on the
largest wavelet coefficient, and set an initial quantization
threshold T = 2^[exponent illegible].
Populate all insignificant pixels in IPQ, all insignificant pixels
having descendants in ISQ, and all significant pixels in
SPQ.
For each type I entry of ISQ, if the entry is significant with
respect to a current quantization threshold, remove the
respective entry from ISQ and append it in the SPQ.
For each type I entry of ISQ, if the entry is insignificant with
respect to a current quantization threshold, remove the
respective entry from ISQ and append it in the IPQ.
If the respective type I entry includes descendants, remove
the entry from the ISQ and append it at the end of ISQ as
a type II entry for next iteration; otherwise, the entry is
purged.
For each type II entry of ISQ, if the entry is significant with
respect to a current quantization threshold, all offspring of
the current ISQ entry are appended to the end of ISQ as
type I entries for next iteration.
Remove any entry in IPQ that is significant with respect to
the current quantization threshold and append it in the
SPQ.
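The queue handling in this flow can be approximated with a short Python sketch. Several simplifications are assumed here and are not taken from the application: coefficients are integers in a flat list, the initial threshold is the largest power of two not exceeding the largest magnitude (the exponent in the figure is illegible, so this rule is a guess borrowed from common bit-plane coders), and the tree-structured ISQ with its type I and type II entries is omitted, leaving only the movement of entries from IPQ to SPQ as the threshold is halved.

```python
def initial_threshold(coeffs):
    """Return (n, 2**n), where 2**n is the largest power of two that does
    not exceed the largest coefficient magnitude (assumed rule)."""
    peak = max(abs(int(c)) for c in coeffs)
    n = max(peak.bit_length() - 1, 0)
    return n, 1 << n


def sorting_passes(coeffs):
    """Move coefficient indices from IPQ to SPQ over successive thresholds.

    Returns the final SPQ and IPQ (as index lists) and a per-pass history
    of (threshold, newly significant indices).
    """
    _, threshold = initial_threshold(coeffs)
    ipq = list(range(len(coeffs)))  # indices still insignificant
    spq = []                        # indices found significant, in order
    history = []
    while threshold >= 1:
        newly = [i for i in ipq if abs(coeffs[i]) >= threshold]
        ipq = [i for i in ipq if abs(coeffs[i]) < threshold]
        spq.extend(newly)
        history.append((threshold, newly))
        threshold //= 2  # next, finer quantization level
    return spq, ipq, history
```

With `coeffs = [37, -5, 12, 0, -20]` the thresholds run 32, 16, 8, 4, 2, 1, and larger coefficients enter the SPQ earlier, which is what makes a truncated bit stream still decodable at reduced precision.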

I-ARIES I/II
Raw YUV color frame data
Wavelet Transform filter bank
Buffer
ME/MC
CABAC coded motion information
Source Encoder (ARIES I/II)
Channel encoding (Tree partitioning, CRC, RCPC)
Compressed file
Optional: Streaming data
Fig. 9A

Raw YUV color frame data
Discrete Wavelet Transform (DWT)
Inverse Discrete Wavelet Transform (I-DWT)
I-ARIES I/II
ME/MC
Bypass ME/MC for I frames
CABAC coded motion information
Source Encoder (ARIES I/II)
Channel encoding (Tree partitioning, CRC, RCPC)
Compressed File
Streaming data
Fig. 9B

Streaming data (optional)
Compressed Video (.qsx file format)
CABAC coded motion information
Channel decoding (Tree merging, CRC, RCPC)
Source Decoder (I-ARIES I/II)
Bypass MC for I frames
Frame Buffer
MC
Inverse Wavelet Transform
Raw YUV data

Streaming data
Compressed Video (.qsx file format)
CABAC coded motion information
Channel decoding (Tree merging, CRC, RCPC)
Source Decoder (I-ARIES I/II)
Bypass MC for I frames
Frame Buffer
MC
Inverse Wavelet Transform
Raw YUV data
Fig. 10B

Identify a reference frame (e.g., the first frame or an
I-frame).
Perform ME/MC on the coarsest subbands as parent
subbands of a current frame other than the I-frame with
respect to the identified reference frame to generate one or
more motion vectors for the coarsest subbands.
Estimate the spatial shifting of pixels of child subbands
using the motion vectors of the parent subbands to
determine a search area of the child subbands.
Perform ME/MC for the child subbands to determine the
motion vectors of the child subbands.
More child subbands?
Perform compression on the predicted/compensated data
into compressed data (e.g., see Figs. 5 and 8).
Fig. 11
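The parent-to-child step of this flow can be sketched in Python as follows. The sketch assumes a dyadic decomposition, so that a child subband has twice the resolution of its parent and the parent vector is doubled to centre the child search area; it also assumes a sum-of-absolute-differences (SAD) matching cost. Neither the scaling factor, the cost, nor the names `sad` and `refine_child_mv` come from the application.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def refine_child_mv(cur, ref, bx, by, size, parent_mv, radius=1):
    """Search a small window centred on twice the parent motion vector.

    Returns (best_vector, best_cost); candidates that would read outside
    the reference subband are skipped.
    """
    height, width = len(ref), len(ref[0])
    cx, cy = 2 * parent_mv[0], 2 * parent_mv[1]  # assumed dyadic scaling
    best, best_cost = None, None
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            inside_x = 0 <= bx + dx and bx + dx + size <= width
            inside_y = 0 <= by + dy and by + dy + size <= height
            if not (inside_x and inside_y):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost
```

Restricting the child search to a window of `radius` around the scaled parent vector is what keeps the per-level cost low compared with a full search at every resolution.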

Fig. 12
k = level of sub band
o = orientation (LL, HL, LH, HH)
Boundary of the Search Area for refinement MVs
Refinement Vector for level k, orientation o
Block Neighborhood
Motion Vector
Integer Motion Prediction
Half-Pel Motion Prediction
Fig. 14

Block currently being tested
Matching block
Current block being tested is in 1MV mode
Current block being tested is in 4MV mode
Motion Vector (identical colors denote MVs of the same block)
Displaced MV to translate matching block to the relative
position of macroblock currently being tested
Fig. 15

METHODS AND APPARATUSES FOR
COMPRESSING DIGITAL IMAGE DATA WITH
MOTION PREDICTION
[0001] This application claims the benefit of U.S.
Provisional Application No. 60/552,153, filed Mar. 10, 2004, U.S.
Provisional Application No. 60/552,356, filed Mar. 10,
2004, and U.S. Provisional Application No. 60/552,270,
filed Mar. 10, 2004. The above-identified applications are
hereby incorporated by reference.
FIELD OF THE INVENTION
[0002] The present invention relates generally to
multimedia applications. More particularly, this invention relates
to compressing digital image data with motion prediction.
BACKGROUND OF THE INVENTION
[0003] A variety of systems have been developed for the
encoding and decoding of audio/video data for transmission
over wireline and/or wireless communication systems over
the past decade. Most systems in this category employ
standard compression/transmission techniques, such as, for
example, the ITU-T Rec. H.264 (also referred to as H.264)
and ISO/IEC Rec. 14496-10 AVC (also referred to as
MPEG-4) standards. However, due to their inherent
generality, they lack the specific qualities needed for seamless
implementation on low power, low complexity systems
(such as hand held devices including, but not restricted to,
personal digital assistants and smart phones) over noisy, low
bit rate wireless channels.
[0004] Due to the likely business models rapidly emerging
in the wireless market, in which cost incurred by the
consumer is directly proportional to the actual volume of
transmitted data, and also due to the limited bandwidth,
processing capability, storage capacity and battery power,
efficiency and speed in compression of audio/video data to
be transmitted are major factors in the eventual success of
any such multimedia content delivery system. Most systems
in use today are retrofitted versions of identical systems used
on higher end desktop workstations. Unlike desktop
systems, where error control is not a critical issue due to the
inherent reliability of cable LAN/WAN data transmission,
and bandwidth may be assumed to be almost unlimited,
transmission over limited capacity wireless networks requires
integration of such systems that may leverage suitable
processing and error-control technologies to achieve the
level of fidelity expected of a commercially viable
multimedia compression and transmission system.
[0005] Conventional video compression engines, or
codecs, can be broadly classified into two broad categories.
One class of coding strategies, known as a download-and-play
(D&P) profile, not only requires the entire file to be
downloaded onto the local memory before playback, leading
to a large latency time (depending on the available
bandwidth and the actual file size), but also makes stringent
demands on the amount of buffer memory to be made
available for the downloaded payload. Even with the more
sophisticated streaming profile, the current physical
limitations on current generation transmission equipment at the
physical layer force service providers to incorporate a
pseudo-streaming capability, which requires an initial period
of latency (at the beginning of transmission), and continuous
buffering henceforth, which imposes a strain on the limited
processing capabilities of the hand-held processor. Most
commercial compression solutions in the market today do
not possess a progressive transmission capability, which
means that transmission is possible only until the last
integral frame, packet or bit before bandwidth drops below
the minimum threshold. In case of video codecs, if the
connection breaks before the transmission of the current
frame, this frame is lost forever.
[0006] Another drawback in conventional video
compression codecs is the introduction of blocking artifacts due to the
block-based coding schemes used in most codecs. Apart
from the degradation in subjective visual quality, such
systems suffer from poor performance due to bottlenecks
introduced by the additional de-blocking filters. Yet another
drawback is that, due to the limitations in the word size of
the computing platform, the coded coefficients are truncated
to an approximate value. This is especially prominent along
object boundaries, where Gibbs' phenomenon leads to the
generation of a visual phenomenon known as mosquito
noise. Due to this, the blurring along the object boundaries
becomes more prominent, leading to degradation in overall
frame quality.
[0007] Additionally, the local nature of motion prediction
in some codecs introduces motion-induced artifacts, which
cannot be easily smoothed by a simple filtering operation.
Such problems arise especially in cases of fast motion clips
and systems where the frame rate is below that of natural
video (e.g., 25 or 30 fps non-interlaced video). In either case,
the temporal redundancy between two consecutive frames is
extremely low (since much of the motion is lost in between
the frames themselves), leading to poorer tracking of the motion
across frames. This effect is cumulative in nature, especially
for a longer group of frames (GoF).
[0008] Furthermore, mobile end-user devices are
constrained by low processing power and storage capacity. Due
to the limitations on the silicon footprint, most mobile and
hand-held systems in the market have to time-share the
resources of the central processing unit (microcontroller or
RISC/CISC processor) to perform all its DSP, control and
communication tasks, with little or no provisions for a
dedicated processor to take the video/audio processing load
off the central processor. Moreover, most general-purpose
central processors lack the unique architecture needed for
optimal DSP performance. Therefore, a mobile video-codec
design must have minimal client-end complexity while
maintaining consistency on the efficiency and robustness
front.
SUMMARY OF THE INVENTION
[0009] Methods and apparatuses for compressing digital
image data with motion prediction are described herein. In
one embodiment, for each two consecutive frames of an
image sequence, a motion prediction is performed between
the consecutive frames by tracking motion on a luminance
map of the frames to generate motion prediction information
for the luminance component. The motion prediction
information of the luminance component is then applied to the
chrominance maps. In response to the motion prediction, the
wavelet coefficients of each frame and the motion prediction
information are encoded into a bit stream based on a target
transmission rate, where the encoded wavelet coefficients
satisfy a predetermined threshold according to a
predetermined algorithm.
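The idea of tracking motion on the luminance map and reusing the result for the chrominance maps can be illustrated with the following Python sketch. It rests on assumptions that are not in the application: a full-search block match with a sum-of-absolute-differences cost, frames stored as lists of rows, and chrominance planes at the same resolution as the luminance plane (with subsampled chrominance the vector would first have to be scaled). The function names are hypothetical.

```python
def block_sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences for one block and one candidate shift."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def luma_motion_vector(cur_y, ref_y, bx, by, size, radius):
    """Full search on the luminance plane; returns the best (dx, dy)."""
    height, width = len(ref_y), len(ref_y[0])
    candidates = [
        (dx, dy)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if 0 <= bx + dx <= width - size and 0 <= by + dy <= height - size
    ]
    return min(
        candidates,
        key=lambda mv: block_sad(cur_y, ref_y, bx, by, mv[0], mv[1], size),
    )


def chroma_residual(cur_c, ref_c, bx, by, size, mv):
    """Reuse the luminance vector on a chrominance plane and return the
    prediction residual, with no separate chrominance search."""
    dx, dy = mv
    return [
        [cur_c[by + y][bx + x] - ref_c[by + dy + y][bx + dx + x]
         for x in range(size)]
        for y in range(size)
    ]
```

Searching only on luminance means the matching cost is paid once per block rather than once per color plane, which fits the low-complexity goal stated in the background.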
