US 20050207495A1
(12) Patent Application Publication (10) Pub. No.: US 2005/0207495 A1
(19) United States
Ramasastry et al. (43) Pub. Date: Sep. 22, 2005
(54) METHODS AND APPARATUSES FOR
COMPRESSING DIGITAL IMAGE DATA
WITH MOTION PREDICTION
(76) Inventors: Jayaram Ramasastry, Woodinville, CA
(US); Partho Choudhury, Maharashtra
(IN); Ramesh Prasad, Maharashtra
(IN)
Correspondence Address:
BLAKELY SOKOLOFF TAYLOR & ZAFMAN
12400 WILSHIRE BOULEVARD
SEVENTH FLOOR
LOS ANGELES, CA 90025-1030 (US)
(21) Appl. No.: 11/076,746
(22) Filed: Mar. 9, 2005
Related U.S. Application Data
(60) Provisional application No. 60/552,153, filed on Mar. 10, 2004. Provisional application No. 60/552,356, filed on Mar. 10, 2004. Provisional application No. 60/552,270, filed on Mar. 10, 2004.
Publication Classification
(51) Int. Cl.: H04N 7/12
(52) U.S. Cl.: 375/240.16; 375/240.19; 375/240.12; 375/240.11; 375/240.24
(57) ABSTRACT
Methods and apparatuses for compressing digital image data with motion prediction are described herein. In one embodiment, for each two consecutive frames of an image sequence, a motion prediction is performed between the consecutive frames by tracking motion on a luminance map of the frames to generate motion prediction information for the luminance component. The motion prediction information of the luminance component is then applied to the chrominance maps. In response to the motion prediction, the wavelet coefficients of each frame and the motion prediction information are encoded into a bit stream based on a target transmission rate, where the encoded wavelet coefficients satisfy a predetermined threshold according to a predetermined algorithm. Other methods and apparatuses are also described.
[Front-page drawing: a server (acquisition, encoder, optional decoder) connected through a network (e.g., wired and/or wireless) to a client (optional encoder). Reference numerals not recoverable.]
Patent Application Publication Sep. 22, 2005 Sheet 1 of 18 US 2005/0207495 A1
[Drawing sheet; no text recoverable from the extraction.]
(1) Physical Layer (W-CDMA, CDMA 1.x, cdma2000, GSM-GPRS, UMTS, iDEN)
(2) Data Link Control (DLC)
(3) Third party ISO protocol stack (TCP/IP/UDP)
(4) Streaming protocol stack (RTP, RTSP, RTCP, [one entry illegible])
(5) Billing and other ancillary services
(6) Network Aware Layer (NAL)
(7) Application Layer APIs for QwikStream™, QwikVu™ and Qwiklex™
(8) Content Generation Engine
(9) Data Repository
Fig. 3
[Block diagram, top to bottom; reference numerals not recoverable:]
Raw YUV color frame data
Wavelet Transform filter bank
Source Encoder (ARIES)
Channel encoding (Tree partitioning, CRC, RCPC)
Compressed File (e.g., .qvx file)
Fig. 4A
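The CRC stage of the channel encoding box in Fig. 4A can be illustrated with a minimal sketch. The source names CRC but does not give its width or framing, so the 32-bit checksum, the big-endian trailer, and the helper names `add_crc` and `check_crc` are assumptions made for illustration only. Tree partitioning and RCPC coding are not shown.

```python
import struct
import zlib


def add_crc(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 trailer to the payload.

    Assumption: CRC-32 is used; the source does not state the CRC width.
    """
    return payload + struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF)


def check_crc(packet: bytes) -> bool:
    """Return True if the trailer matches the CRC-32 of the payload."""
    if len(packet) < 4:
        return False
    payload = packet[:-4]
    (expected,) = struct.unpack(">I", packet[-4:])
    return zlib.crc32(payload) & 0xFFFFFFFF == expected


# Protect one hypothetical partition of the coded stream, then corrupt one bit.
packet = add_crc(b"coded sub-tree bytes")
corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]
intact_ok = check_crc(packet)
corrupted_ok = check_crc(corrupted)
```

In a scheme like this, a receiver could discard or request retransmission of any partition whose check fails, which is one plausible reason to protect each partition separately.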
[Block diagram, top to bottom; reference numerals not recoverable:]
Compressed Image (.qvx file format)
Channel decoding (Tree merging, CRC, RCPC)
Source Decoder (I-ARIES)
Inverse Wavelet Transform
Raw YUV data
Fig. 4B
[Flowchart on Sheet 6 of 18; figure label and reference numerals not recoverable:]
Perform a wavelet transformation on each image pixel to transform the pixel into one or more coefficients in one or more wavelet maps.
Encode each wavelet map by representing the significance, sign and bit plane information of the pixel using a single bit in a bit stream.
Encode the significant bits into a context variable dependent upon the information represented by the bit and the location of the coefficient being coded (e.g., the probability of occurrence of a predetermined set of bits immediately preceding the current bit).
Transmit the content of the context variable as a bit stream as an output representing the encoded pixels.
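The first two boxes of the flowchart above can be sketched in a few lines. The sketch assumes a one-level, one-dimensional Haar transform, because the source says only "wavelet transformation" and does not name the filter. The helper names `haar_1d` and `significance_bits` and the threshold value are hypothetical. Context modeling and the sign and refinement bits are left out.

```python
def haar_1d(signal):
    """One-level Haar transform of an even-length sequence.

    Returns (approximation, detail) as pairwise averages and
    half-differences. The Haar filter is an assumption.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def significance_bits(coeffs, threshold):
    """Emit one bit per coefficient: 1 if |c| >= threshold, else 0."""
    return [1 if abs(c) >= threshold else 0 for c in coeffs]


pixels = [10, 12, 80, 20, 7, 7, 0, 64]
approx, detail = haar_1d(pixels)
# approx == [11.0, 50.0, 7.0, 32.0], detail == [-1.0, 30.0, 0.0, -32.0]
bits = significance_bits(approx + detail, threshold=16)
# bits == [0, 1, 0, 1, 0, 1, 0, 1]
```

Each coefficient contributes a single significance bit at the given threshold, which is the sense in which one bit per decision enters the bit stream.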
[Fig. 6: drawing; recoverable labels: Sub-tree 1 (HL), Sub-tree 3 (HH).]
[Fig. 7: drawing; no text recoverable from the extraction.]
[Flowchart on Sheet 9 of 18; figure label and reference numerals not recoverable:]
Determine a number of iterations (nl) based on a number of quantization levels, which may be determined on the largest wavelet coefficient, and set an initial quantization threshold T as a power of two [exponent illegible].
Populate all insignificant pixels in IPQ, all insignificant pixels having descendants in ISQ, and all significant pixels in SPQ.
For each type I entry of ISQ, if the entry is significant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the SPQ.
For each type I entry of ISQ, if the entry is insignificant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the IPQ.
If the respective type I entry includes descendants, remove the entry from the ISQ and append it at the end of ISQ as type II entry for next iteration; otherwise, the entry is purged.
For each type II entry of ISQ, if the entry is significant with respect to a current quantization threshold, all offspring of the current ISQ entry are appended to the end of ISQ as type I entries for next iteration.
Remove any entry in IPQ that is significant with respect to the current quantization threshold and append it in the [remainder illegible].
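The threshold set-up and the last box of the flowchart above (moving newly significant entries out of the insignificant-pixel queue) can be sketched as follows. Several things here are assumptions rather than statements from the source: the exponent of the initial threshold is illegible, so the sketch borrows the common SPIHT-style choice n = floor(log2(max |c|)); the threshold is halved on each iteration; coefficients are integers; and significant entries are appended to a second queue named `spq`. The tree handling for type I and type II entries is omitted.

```python
from collections import deque


def initial_threshold(coeffs):
    """Return (n, T) with T = 2**n and n = floor(log2(max |c|)).

    Assumption: SPIHT-style convention; the source exponent is illegible.
    Coefficients are assumed to be integers.
    """
    largest = max(abs(c) for c in coeffs)
    if largest < 1:
        raise ValueError("need at least one nonzero coefficient")
    n = largest.bit_length() - 1
    return n, 1 << n


def significance_pass(ipq, spq, coeffs, threshold):
    """Move entries of ipq that are significant at `threshold` into spq."""
    for _ in range(len(ipq)):
        idx = ipq.popleft()
        if abs(coeffs[idx]) >= threshold:
            spq.append(idx)   # newly significant at this threshold
        else:
            ipq.append(idx)   # still insignificant; relative order is kept


coeffs = [34, -3, 17, 5, -9, 60, 2, 0]
n, t0 = initial_threshold(coeffs)          # n == 5, t0 == 32
ipq, spq = deque(range(len(coeffs))), deque()
threshold = t0
while threshold >= 8:                      # three passes: 32, 16, 8
    significance_pass(ipq, spq, coeffs, threshold)
    threshold //= 2
# spq holds indices 0, 5, 2, 4; ipq holds indices 1, 3, 6, 7
```

Because large coefficients enter `spq` first, truncating the output at any point keeps the most significant information, which fits the progressive, rate-targeted encoding described in the abstract.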
[Fig. 9A: encoder block diagram. Recoverable labels: Raw YUV color frame data; Wavelet Transform filter bank; buffer; ME/MC; CABAC coded motion information; Source Encoder (ARIES I/II); I-ARIES I/II; Channel encoding (Tree partitioning, CRC, RCPC); Compressed file; Streaming data (optional).]
[Fig. 9B: encoder block diagram. Recoverable labels: Raw YUV color frame data; Discrete Wavelet Transform (DWT); Inverse Discrete Wavelet Transform (I-DWT); I-ARIES; ME/MC; Bypass ME/MC for I frames; CABAC coded motion information; Source Encoder (ARIES I/II); Channel encoding (Tree partitioning, CRC, RCPC); Compressed File; Streaming data.]
[Decoder block diagram on Sheet 12 of 18; figure label not recoverable. Recoverable labels: Streaming data (optional); Compressed Video (.qsx file format); CABAC coded motion information; Channel decoding (Tree merging, CRC, RCPC); Source Decoder (I-ARIES I/II); Bypass MC for I frames; Frame Buffer; MC frames; Inverse Wavelet Transform; Raw YUV data.]
[Fig. 10B: decoder block diagram. Recoverable labels: Streaming data; Compressed Video (.qsx file format); CABAC coded motion information; Channel decoding (Tree merging, CRC, RCPC); Source Decoder (I-ARIES I/II); Frame Buffer; MC frames; Inverse Wavelet Transform; Raw YUV data.]
[Flowchart; reference numerals not recoverable:]
Identify a reference frame (e.g., the first frame or an I-frame).
Perform a ME/MC on the coarsest subbands as parent subbands of a current frame other than the I-frame with respect to the identified reference frame to generate one or more motion vectors for the coarsest subbands.
Estimate the spatial shifting of pixels of child subbands using the motion vectors of the parent subbands to determine a search area of the child subbands.
Perform a ME/MC for the child subbands to determine the motion vectors of the child subbands.
More child subbands? [decision box; branch labels not recoverable]
Perform compression on the predicted/compensated data into compressed data (e.g., see Figs. 5 and 8).
Fig. 11
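The coarse-to-fine search in the flowchart of Fig. 11 can be sketched with a two-level stand-in. The sketch makes several assumptions that are not in the source: the parent level is approximated by 2x2 block averages of the frame (`ll_band`), the child level is the full-resolution frame rather than a finer wavelet subband, matching uses the sum of absolute differences (SAD), and the parent vector is doubled to center the child search. All function names and the search radii are hypothetical.

```python
def frame_with_blob(size, x0, y0, blob=4, value=100):
    """Square test frame: zeros with one bright blob at (x0, y0)."""
    return [[value if x0 <= x < x0 + blob and y0 <= y < y0 + blob else 0
             for x in range(size)] for y in range(size)]


def ll_band(frame):
    """2x2 block averages, used here as a stand-in for the coarsest subband."""
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
              + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) / 4
             for x in range(len(frame[0]) // 2)]
            for y in range(len(frame) // 2)]


def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between cur block (cx, cy) and ref block (rx, ry)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def best_vector(ref, cur, cx, cy, size, center, radius):
    """Exhaustive search within +/-radius of `center`; returns the best (dx, dy)."""
    h, w = len(ref), len(ref[0])
    best, best_cost = None, None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > w or ry + size > h:
                continue  # candidate block falls outside the reference
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def hierarchical_vector(ref, cur, cx, cy, size, coarse_radius=4, refine_radius=1):
    """Parent search on the coarse level, then a small refinement on the child."""
    parent = best_vector(ll_band(ref), ll_band(cur),
                         cx // 2, cy // 2, size // 2, (0, 0), coarse_radius)
    center = (2 * parent[0], 2 * parent[1])   # scale the parent vector to the child grid
    child = best_vector(ref, cur, cx, cy, size, center, refine_radius)
    return parent, child


ref = frame_with_blob(16, 4, 4)    # blob at x=4, y=4 in the reference frame
cur = frame_with_blob(16, 9, 6)    # same blob at x=9, y=6 in the current frame
parent_mv, child_mv = hierarchical_vector(ref, cur, cx=9, cy=6, size=4)
# child_mv == (-5, -2): the block at (9, 6) is found at (4, 4) in the reference
```

The child search examines only nine candidates around the scaled parent vector instead of the full window, which is the kind of saving a parent-guided search area is meant to provide.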
[Fig. 12: drawing. Recoverable legend: k = level of sub band; o = orientation (LL, HL, LH, HH); boundary of the search area for refinement MVs; refinement vector for level k, orientation o; block neighborhood; motion vector.]
[Drawing on Sheet 16 of 18; figure label not recoverable. Recoverable label: Integer Motion Prediction.]
[Fig. 14: drawing. Recoverable labels: Integer Motion Prediction; Half-Pel Motion Prediction.]
[Fig. 15: drawing. Recoverable legend: block currently being tested; matching block; current block being tested is in 1MV mode; current block being tested is in 4MV mode; motion vector (identical colors denote MVs of the same block); displaced MV to translate matching block to the relative position of macroblock currently being tested.]
METHODS AND APPARATUSES FOR
COMPRESSING DIGITAL IMAGE DATA WITH
MOTION PREDICTION
[0001] This application claims the benefit of U.S. Provisional Application No. 60/552,153, filed Mar. 10, 2004, U.S. Provisional Application No. 60/552,356, filed Mar. 10, 2004, and U.S. Provisional Application No. 60/552,270, filed Mar. 10, 2004. The above-identified applications are hereby incorporated by reference.
FIELD OF THE INVENTION
[0002] The present invention relates generally to multimedia applications. More particularly, this invention relates to compressing digital image data with motion prediction.
BACKGROUND OF THE INVENTION
[0003] A variety of systems have been developed over the past decade for the encoding and decoding of audio/video data for transmission over wireline and/or wireless communication systems. Most systems in this category employ standard compression/transmission techniques, such as, for example, the ITU-T Rec. H.264 (also referred to as H.264) and ISO/IEC Rec. 14496-10 AVC (also referred to as MPEG-4) standards. However, due to their inherent generality, they lack the specific qualities needed for seamless implementation on low power, low complexity systems (such as hand held devices including, but not restricted to, personal digital assistants and smart phones) over noisy, low bit rate wireless channels.
[0004] Due to the likely business models rapidly emerging in the wireless market, in which cost incurred by the consumer is directly proportional to the actual volume of transmitted data, and also due to the limited bandwidth, processing capability, storage capacity and battery power, efficiency and speed in compression of audio/video data to be transmitted are a major factor in the eventual success of any such multimedia content delivery system. Most systems in use today are retrofitted versions of identical systems used on higher end desktop workstations. Unlike desktop systems, where error control is not a critical issue due to the inherent reliability of cable LAN/WAN data transmission, and bandwidth may be assumed to be almost unlimited, transmission over limited capacity wireless networks requires integration of such systems that may leverage suitable processing and error-control technologies to achieve the level of fidelity expected of a commercially viable multimedia compression and transmission system.
[0005] Conventional video compression engines, or codecs, can be broadly classified into two categories. One class of coding strategies, known as a download-and-play (D&P) profile, not only requires the entire file to be downloaded onto the local memory before playback, leading to a large latency time (depending on the available bandwidth and the actual file size), but also makes stringent demands on the amount of buffer memory to be made available for the downloaded payload. Even with the more sophisticated streaming profile, the physical limitations of current generation transmission equipment at the physical layer force service providers to incorporate a pseudo-streaming capability, which requires an initial period of latency (at the beginning of transmission), and continuous buffering thereafter, which imposes a strain on the limited processing capabilities of the hand-held processor. Most commercial compression solutions in the market today do not possess a progressive transmission capability, which means that transmission is possible only until the last integral frame, packet or bit before bandwidth drops below the minimum threshold. In case of video codecs, if the connection breaks before the transmission of the current frame, this frame is lost forever.
[0006] Another drawback in conventional video compression codecs is the introduction of blocking artifacts due to the block-based coding schemes used in most codecs. Apart from the degradation in subjective visual quality, such systems suffer from poor performance due to bottlenecks introduced by the additional de-blocking filters. Yet another drawback is that, due to the limitations in the word size of the computing platform, the coded coefficients are truncated to an approximate value. This is especially prominent along object boundaries, where Gibbs phenomenon leads to the generation of a visual phenomenon known as mosquito noise. Due to this, the blurring along the object boundaries becomes more prominent, leading to degradation in overall frame quality.
[0007] Additionally, the local nature of motion prediction in some codecs introduces motion-induced artifacts, which cannot be easily smoothed by a simple filtering operation. Such problems arise especially in cases of fast motion clips and systems where the frame rate is below that of natural video (e.g., 25 or 30 fps non-interlaced video). In either case, the temporal redundancy between two consecutive frames is extremely low (since much of the motion is lost in between the frames themselves), leading to poorer tracking of the motion across frames. This effect is cumulative in nature, especially for a longer group of frames (GoF).
[0008] Furthermore, mobile end-user devices are constrained by low processing power and storage capacity. Due to the limitations on the silicon footprint, most mobile and hand-held systems in the market have to time-share the resources of the central processing unit (microcontroller or RISC/CISC processor) to perform all their DSP, control and communication tasks, with little or no provision for a dedicated processor to take the video/audio processing load off the central processor. Moreover, most general-purpose central processors lack the unique architecture needed for optimal DSP performance. Therefore, a mobile video-codec design must have minimal client-end complexity while maintaining consistency on the efficiency and robustness front.
SUMMARY OF THE INVENTION
[0009] Methods and apparatuses for compressing digital image data with motion prediction are described herein. In one embodiment, for each two consecutive frames of an image sequence, a motion prediction is performed between the consecutive frames by tracking motion on a luminance map of the frames to generate motion prediction information for the luminance component. The motion prediction information of the luminance component is then applied to the chrominance maps. In response to the motion prediction, the wavelet coefficients of each frame and the motion prediction information are encoded into a bit stream based on a target transmission rate, where the encoded wavelet coefficients satisfy a predetermined threshold according to a predetermined algorithm.
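Reusing luminance motion information for the chrominance maps can be illustrated with a short sketch. The source does not state the chroma sampling format, so the sketch assumes 4:2:0 sampling, where each chroma plane has half the luma resolution in both directions and vectors and block positions are therefore halved. The function name `chroma_vector` and the sample vectors are hypothetical.

```python
def chroma_vector(luma_mv, sub_x=2, sub_y=2):
    """Scale a luminance motion vector to a chroma plane.

    sub_x and sub_y are the chroma subsampling factors. The default of 2
    in both directions assumes 4:2:0 sampling, which the source does not state.
    """
    dx, dy = luma_mv
    return (dx / sub_x, dy / sub_y)


# Luma block origin -> motion vector found by tracking motion on the luminance map.
luma_mvs = {(0, 0): (-6, 4), (16, 0): (2, -2)}

# The same motion, re-expressed on the half-resolution chroma grid.
chroma_mvs = {(bx // 2, by // 2): chroma_vector(mv)
              for (bx, by), mv in luma_mvs.items()}
# chroma_mvs == {(0, 0): (-3.0, 2.0), (8, 0): (1.0, -1.0)}
```

Only one motion search per block is needed under this arrangement, since the chrominance planes inherit the luminance result instead of running their own search.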
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
US20050207495
Ad

Recommended

US20090263030
US20090263030
Partho Choudhury
US7522774
US7522774
Partho Choudhury
Unit 2 Complete Notes.pdf
Unit 2 Complete Notes.pdf
Joseecote
Section 1 8051 microcontroller instruction set
Section 1 8051 microcontroller instruction set
nueng-kk
03 addr mode & instructions
03 addr mode & instructions
ShubhamBakshi14
ARL-TR-7315
ARL-TR-7315
Richard Haney
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
Saumitra Rukmangad
8085 stack & machine control instruction
8085 stack & machine control instruction
prashant1271
9c 2012 FLIGHT PLAN CONTENT CHANGES
9c 2012 FLIGHT PLAN CONTENT CHANGES
guest3a43e10
8085:branching and logical instruction
8085:branching and logical instruction
Nemish Bhojani
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Hsien-Hsin Sean Lee, Ph.D.
Microprocessor system - summarize
Microprocessor system - summarize
Hisham Mat Hussin
Data transfer instruction set of 8085 micro processor
Data transfer instruction set of 8085 micro processor
vishalgohel12195
8085 branching instruction
8085 branching instruction
prashant1271
Bilbo 1
Bilbo 1
Naveenkumar G
Efficient JIT to 32-bit Arches
Efficient JIT to 32-bit Arches
Netronome
8085 data transfer instruction set
8085 data transfer instruction set
prashant1271
Axes Tech
Axes Tech
ncct
8085 assembly language programming
8085 assembly language programming
Prof. Dr. K. Adisesha
Image compression 14_04_2020 (1)
Image compression 14_04_2020 (1)
Joel P
notes_Image Compression.ppt
notes_Image Compression.ppt
HarisMasood20
REGION OF INTEREST BASED COMPRESSION OF MEDICAL IMAGE USING DISCRETE WAVELET ...
REGION OF INTEREST BASED COMPRESSION OF MEDICAL IMAGE USING DISCRETE WAVELET ...
ijcsa
notes_Image Compression_edited.ppt
notes_Image Compression_edited.ppt
HarisMasood20
Wavelet based Image Coding Schemes: A Recent Survey
Wavelet based Image Coding Schemes: A Recent Survey
ijsc
rapport
rapport
Harald Nordgren
Enhancing Image Quality in Compression and Fading Channels A Wavelet Based Ap...
Enhancing Image Quality in Compression and Fading Channels A Wavelet Based Ap...
ijtsrd
image compression ppt
image compression ppt
Shivangi Saxena
International Journal on Soft Computing ( IJSC )
International Journal on Soft Computing ( IJSC )
ijsc
Image and Video Compression Techniques
Image and Video Compression Techniques
MangaiK4
Image and Video Compression Techniques In Image Processing an Overview
Image and Video Compression Techniques In Image Processing an Overview
MangaiK4

More Related Content

What's hot (11)

9c 2012 FLIGHT PLAN CONTENT CHANGES
9c 2012 FLIGHT PLAN CONTENT CHANGES
guest3a43e10
8085:branching and logical instruction
8085:branching and logical instruction
Nemish Bhojani
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Hsien-Hsin Sean Lee, Ph.D.
Microprocessor system - summarize
Microprocessor system - summarize
Hisham Mat Hussin
Data transfer instruction set of 8085 micro processor
Data transfer instruction set of 8085 micro processor
vishalgohel12195
8085 branching instruction
8085 branching instruction
prashant1271
Bilbo 1
Bilbo 1
Naveenkumar G
Efficient JIT to 32-bit Arches
Efficient JIT to 32-bit Arches
Netronome
8085 data transfer instruction set
8085 data transfer instruction set
prashant1271
Axes Tech
Axes Tech
ncct
8085 assembly language programming
8085 assembly language programming
Prof. Dr. K. Adisesha
9c 2012 FLIGHT PLAN CONTENT CHANGES
9c 2012 FLIGHT PLAN CONTENT CHANGES
guest3a43e10
8085:branching and logical instruction
8085:branching and logical instruction
Nemish Bhojani
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Hsien-Hsin Sean Lee, Ph.D.
Microprocessor system - summarize
Microprocessor system - summarize
Hisham Mat Hussin
Data transfer instruction set of 8085 micro processor
Data transfer instruction set of 8085 micro processor
vishalgohel12195
8085 branching instruction
8085 branching instruction
prashant1271
Efficient JIT to 32-bit Arches
Efficient JIT to 32-bit Arches
Netronome
8085 data transfer instruction set
8085 data transfer instruction set
prashant1271
Axes Tech
Axes Tech
ncct
8085 assembly language programming
8085 assembly language programming
Prof. Dr. K. Adisesha

Similar to US20050207495 (20)

Image compression 14_04_2020 (1)
Image compression 14_04_2020 (1)
Joel P
notes_Image Compression.ppt
notes_Image Compression.ppt
HarisMasood20
REGION OF INTEREST BASED COMPRESSION OF MEDICAL IMAGE USING DISCRETE WAVELET ...
REGION OF INTEREST BASED COMPRESSION OF MEDICAL IMAGE USING DISCRETE WAVELET ...
ijcsa
notes_Image Compression_edited.ppt
notes_Image Compression_edited.ppt
HarisMasood20
Wavelet based Image Coding Schemes: A Recent Survey
Wavelet based Image Coding Schemes: A Recent Survey
ijsc
rapport
rapport
Harald Nordgren
Enhancing Image Quality in Compression and Fading Channels A Wavelet Based Ap...
Enhancing Image Quality in Compression and Fading Channels A Wavelet Based Ap...
ijtsrd
image compression ppt
image compression ppt
Shivangi Saxena
International Journal on Soft Computing ( IJSC )
International Journal on Soft Computing ( IJSC )
ijsc
Image and Video Compression Techniques
Image and Video Compression Techniques
MangaiK4
Image and Video Compression Techniques In Image Processing an Overview
Image and Video Compression Techniques In Image Processing an Overview
MangaiK4
MTech Dissertation.ppt
MTech Dissertation.ppt
ssuser64322e
Multimedia.pdf
Multimedia.pdf
SunayanaShivthare1
Ceis 4
Ceis 4
Alexander Decker
Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches
ijsc
Fidelity criteria in image compression
Fidelity criteria in image compression
KadamPawan
Nq2422332236
Nq2422332236
IJERA Editor
Design of Image Compression Algorithm using MATLAB
Design of Image Compression Algorithm using MATLAB
IJEEE
Image compression and its security1
Image compression and its security1
Reyad Hossain
AN OPTIMIZED BLOCK ESTIMATION BASED IMAGE COMPRESSION AND DECOMPRESSION ALGOR...
AN OPTIMIZED BLOCK ESTIMATION BASED IMAGE COMPRESSION AND DECOMPRESSION ALGOR...
IAEME Publication
Image compression 14_04_2020 (1)
Image compression 14_04_2020 (1)
Joel P
notes_Image Compression.ppt
notes_Image Compression.ppt
HarisMasood20
REGION OF INTEREST BASED COMPRESSION OF MEDICAL IMAGE USING DISCRETE WAVELET ...
REGION OF INTEREST BASED COMPRESSION OF MEDICAL IMAGE USING DISCRETE WAVELET ...
ijcsa
notes_Image Compression_edited.ppt
notes_Image Compression_edited.ppt
HarisMasood20
Wavelet based Image Coding Schemes: A Recent Survey
Wavelet based Image Coding Schemes: A Recent Survey
ijsc
Enhancing Image Quality in Compression and Fading Channels A Wavelet Based Ap...
Enhancing Image Quality in Compression and Fading Channels A Wavelet Based Ap...
ijtsrd
image compression ppt
image compression ppt
Shivangi Saxena
International Journal on Soft Computing ( IJSC )
International Journal on Soft Computing ( IJSC )
ijsc
Image and Video Compression Techniques
Image and Video Compression Techniques
MangaiK4
Image and Video Compression Techniques In Image Processing an Overview
Image and Video Compression Techniques In Image Processing an Overview
MangaiK4
MTech Dissertation.ppt
MTech Dissertation.ppt
ssuser64322e
Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches
ijsc
Fidelity criteria in image compression
Fidelity criteria in image compression
KadamPawan
Design of Image Compression Algorithm using MATLAB
Design of Image Compression Algorithm using MATLAB
IJEEE
Image compression and its security1
Image compression and its security1
Reyad Hossain
AN OPTIMIZED BLOCK ESTIMATION BASED IMAGE COMPRESSION AND DECOMPRESSION ALGOR...
AN OPTIMIZED BLOCK ESTIMATION BASED IMAGE COMPRESSION AND DECOMPRESSION ALGOR...
IAEME Publication
Ad

More from Partho Choudhury (9)

Final version - MIT FT Capstone 405 SR - MPower
Final version - MIT FT Capstone 405 SR - MPower
Partho Choudhury
Technical Report on the DVB-H and DVB-SH
Technical Report on the DVB-H and DVB-SH
Partho Choudhury
DVB-SH Link Budget Analysis
DVB-SH Link Budget Analysis
Partho Choudhury
Satellite Downlink Budget Analysis
Satellite Downlink Budget Analysis
Partho Choudhury
The Digital Video Broadcast (DVB) Project
The Digital Video Broadcast (DVB) Project
Partho Choudhury
Wireless_Video_Access_Networks_ppt
Wireless_Video_Access_Networks_ppt
Partho Choudhury
Ultra_Wide_Band_ppt
Ultra_Wide_Band_ppt
Partho Choudhury
UMAandFemtocells-MakingFMCHappen
UMAandFemtocells-MakingFMCHappen
Partho Choudhury
Final version - MIT FT Capstone 405 SR - MPower
Final version - MIT FT Capstone 405 SR - MPower
Partho Choudhury
Technical Report on the DVB-H and DVB-SH
Technical Report on the DVB-H and DVB-SH
Partho Choudhury
DVB-SH Link Budget Analysis
DVB-SH Link Budget Analysis
Partho Choudhury
Satellite Downlink Budget Analysis
Satellite Downlink Budget Analysis
Partho Choudhury
The Digital Video Broadcast (DVB) Project
The Digital Video Broadcast (DVB) Project
Partho Choudhury
Wireless_Video_Access_Networks_ppt
Wireless_Video_Access_Networks_ppt
Partho Choudhury
UMAandFemtocells-MakingFMCHappen
UMAandFemtocells-MakingFMCHappen
Partho Choudhury
Ad

US20050207495

  • 1. US 20050207495A1 (12) Patent Application Publication (10) Pub. No.: US 2005/0207495 A1 (19) United States Ramasastry et al. (43) Pub. Date: Sep. 22, 2005 (54) METHODS AND APPARATUSES FOR COMPRESSING DIGITAL IMAGE DATA WITH MOTION PREDICTION (76) Inventors: Jayaram Ramasastry, Woodinville, CA (US); Partho Choudhury, Maharashtra (IN); Ramesh Prasad, Maharashtra (IN) Correspondence Address: BLAKELY SOKOLOFF TAYLOR & ZAFMAN 12400 WILSHIRE BOULEVARD SEVENTH FLOOR LOS ANGELES, CA 90025-1030 (US) (21) Appl. No.: 11/076,746 (22) Filed: Mar. 9, 2005 Related US. Application Data (60) Provisional application No. 60/552,153, ?led on Mar. 10, 2004. Provisional application No. 60/552,356, ?led on Mar. 10, 2004. Provisional application No. 60/552,270, ?led on Mar. 10, 2004. Publication Classi?cation (51) Int. Cl? .......................................................H04N 7/12 (52) Us. 01. ........ 375/240.16; 375/240.19; 375/240.12; 375/240.11; 375/240.24 (57) ABSTRACT Methods and apparatuses for compressing digital image data With motion prediction are described herein. In one embodi ment, for each tWo consecutive frames of an image sequence, a motion prediction is performed betWeen the consecutive frames by tracking motion on a luminance map of the frames to generate motion prediction information for the luminance component. The motion prediction informa tion of the luminance component is then applied to the chrominance maps. In response to the motion prediction, the Wavelet coef?cients of each frame and the motion prediction information are encoded into a bit stream based on a target transmission rate, Where the encoded Wavelet coefficients satisfy a predetermined threshold according to a predeter mined algorithm. Other methods and apparatuses are also described. Encoder D3,"?Acqulsmon / 属 6 I vs Optional Decoder Server /0 I Network (13.9., wired and/or Wireless) [01/ Optimal Encoder Client /a]
  • 2. Patent Application Publication Sep. 22, 2005 Sheet 1 0f 18 US 2005/0207495 A1 &uE26 Q WSQMQ582wm3:250Qe
  • 4. Patent Application Publication Sep. 22, 2005 Sheet 3 0f 18 US 2005/0207495 A1 Physical Layer (W-CDMA, CDMA 1.x. cdma2000, GSM-GPRS, UMTS, iBen) (1) Data Link Control (DLC) (2) Streaming piolocol stack (RTP. RTSP. RTCP. so") (4) Third party ISO proiocoi 5m (TCP/lP/UDP) (3) Billing and other ancillary services (5) Network Aware Layer (NAL) (6) Application Layer APIs ior QwikSUeam'". QNikVu" and Qwiklex'" (7) Content Generation Engine (8) Data Repository (9) Fig. 3
  • 5. Patent Application Publication Sep. 22, 2005 Sheet 4 0f 18 US 2005/0207495 A1 l l | Raw YUV color frame data 4L0 o Wavelet Transfonn ?lter bank #07. Source Encoder (ARIES) 1H3 Channel encoding (Tree partitioning, CRC, RCPC) M Compressed File (e.g., .qvx ?le) Fig. 4A
  • 6. Patent Application Publication Sep. 22, 2005 Sheet 5 0f 18 US 2005/0207495 A1 Compressed Image (.qvx ?le format) 4; Channel decoding (Tree merging. CRC, RCPC) Source Decoder (l-ARIES) Inverse Wavelet Transfonn Raw YUV data Fig. 4B
  • 7. Patent Application Publication Sep. 22, 2005 Sheet 6 0f 18 US 2005/0207495 A1 Perform a wavelet transformation on each image pixel to _ transform the pixel into one or more coefficients in one or more wavelet maps. Encode each wavelet map by representing the signi?cance, sign and bit plane infomiation of the pixel using a single bit in a bit stream. A, 90 L Encode the signi?cant bits into a context variable dependent upon the information represented by the bit and its location of the coefficient being coded (e.g.. the probability of occurrence of a predetermined set of bits I immediately preceding the current bit). A $"o3 l Transmit the content of the context variable as a bit stream as an output representing the encoded pixels. ~ 5-04.
  • 8. Patent Application Publication Sub-tree 1 (HL) Sep. 22, 2005 Sheet 7 0f 18 m W Fig. 6 US 2005/0207495 A1 Sub-tree 3 (HH)
  • 9. Patent Application Publication Sep. 22, 2005 Sheet 8 0f 18 US 2005/0207495 A1 Fig. 7
  • 10. Patent Application Publication Sep. 22, 2005 Sheet 9 0f 18 US 2005/0207495 A1 1 Determine a number of iterations (nl) based on a number ot| quantization levels, which may be determined on the Z 9 largest wavelet coef?cient, and set an initial quantization / threshold T = 2 " l g l l Populate all insigni?cant pixels in IPQ. all insigni?cant pixel having descendants in ISO, and all signi?cant pixels in SPQ. A K a L For each type I entry of ISO, if the entry is signi?cant with respect to a current quantization threshold, remove the respective entry from ISO and append it in the SPQ l YoI l For each type I entry of ISO, if the entry is insignificant with respect to a current quantization threshold, remove the respective entry from ISO and append it in the lPQ lIt the respective type t entry includes descendants, remove the entry from the ISO and append it at the end of ISO as type II entry for next iteration; otherwise, the entry is purged. ~ g r lFor each type II entry of ISO, if the entry is signi?cant with respect to a current quantization threshold, all offspring of the current lSQ entry are appended to the end of ISO as type I entries for next iteration. I損 Z, l Remove any entry in lPQ that is signi?cant with respect to the current quantization threshold and append it in the 'xyolk
  • 11. Patent Application Publication Sep. 22, 2005 Sheet 10 0f 18 US 2005/0207495 A1 l-ARIES llil Raw YUV color frame data I 1 WaveletTransform?lterbank I b u f r MEIMC' f f e l/ 2 2 ' I 2 35 l CABAC ooded l n motion l information I. Source Encoder (ARIES llll) ' I Fig'. 9A Channel encoding (Tree paniiioning, CRC, RCPC) compressedfile I Optional Streaming data
  • 12. Patent Application Publication Sep. 22, 2005 Sheet 11 0f 18 1 Raw YUV color frame data Bynau[oritrill t Inverse Discrete Wavelet Transform (I-DWT) I l-ARIES1m } US 2005/0207495 A1 / ME/MC' Bypass ME/MC'for~- I ltrames T I I CABAC I coded motion infon-nation I l L Discrete Wavctet TranstorrMDWT) I l I Source Encoder (ARIES l/II) Fig. 9B 5 lmChannel encoding (Tree partitioning, CRC, RCPC) V Compressed File Streaming data
  • 13. Patent Application Publication Sep. 22, 2005 Sheet 12 0f 18 US 2005/0207495 A1 Streaming data I I Optional l I _ o . .___.4_i._.>_aiier t Compressed Video (.qsx ?le I fon'nat) CABAC coded I motion information Channel decoding (Tree merging. CRC, RCPC) Source Decoder (l-ARIES llll) KBypass MC for l Frame Buffer ) MC frames Inverse Wavelet Transform L RawYUVdata
  • 14. Patent Application Publication Sep. 22, 2005 Sheet 13 0f 18 US 2005/0207495 A1 , Streaming data I Compressed Video (.qsx ?le format) I CABAC coded I motion I information I Channel decoding (Tree merging, CRC, RCPC) E &Source Decoder (l-ARIES II") I l _. a ....... a Frame Buffer % MC frames [ InverseWaveletTransform I RawYUV data Fig. 108
  • 15. Patent Application Publication Sep. 22, 2005 Sheet 14 of 18 US 2005/0207495 A1 Fig. 11 [Flowchart] Identify a reference frame (e.g., the first frame or an I-frame). Perform ME/MC on the coarsest subbands as parent subbands of a current frame other than the I-frame with respect to the identified reference frame to generate one or more motion vectors for the coarsest subbands. Estimate the spatial shifting of pixels of child subbands using the motion vectors of the parent subbands to determine a search area of the child subbands. Perform ME/MC for the child subbands to determine the motion vectors of the child subbands. More child subbands? Perform compression on the predicted/compensated data into compressed data (e.g., see Figs. 5 and 8).
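The parent-to-child refinement in the Fig. 11 flowchart can be illustrated with a small block-matching sketch. Several details here are assumptions that the flowchart does not specify: subbands are taken to double in resolution from parent to child (so the parent motion vector is scaled by 2), the matching cost is a sum of absolute differences, and the refinement window is a small square of hypothetical radius `radius`. All function names are illustrative only.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def search(cur, ref, bx, by, size, center, radius):
    """Return the best (dx, dy) within `radius` of `center`.

    Candidates whose displaced block would fall outside `ref` are skipped.
    """
    height, width = len(ref), len(ref[0])
    best, best_cost = None, None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            inside_x = 0 <= bx + dx and bx + dx + size <= width
            inside_y = 0 <= by + dy and by + dy + size <= height
            if not (inside_x and inside_y):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def refine_child_mv(cur_child, ref_child, bx, by, size, parent_mv, radius=1):
    """Predict the child-subband motion vector from the parent's vector
    (scaled by 2, an assumption) and refine it in a small search area."""
    predicted = (2 * parent_mv[0], 2 * parent_mv[1])
    return search(cur_child, ref_child, bx, by, size, predicted, radius)
```

Restricting the child search to a window around the scaled parent vector keeps the number of candidates at (2·radius + 1)² per block, rather than a full search over the subband.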
  • 16. Patent Application Publication Sep. 22, 2005 Sheet 15 of 18 US 2005/0207495 A1 Fig. 12 [Figure; legend: k = level of subband; o = orientation (LL, HL, LH, HH); boundary of the search area for refinement MVs; refinement vector for level k, orientation o; block neighborhood; motion vector]
  • 17. Patent Application Publication Sep. 22, 2005 Sheet 16 of 18 US 2005/0207495 A1 [Figure; label: Integer Motion Prediction]
  • 18. Patent Application Publication Sep. 22, 2005 Sheet 17 of 18 US 2005/0207495 A1 Fig. 14 [Figure; labels: Integer Motion Prediction; Half-Pel Motion Prediction]
  • 19. Patent Application Publication Sep. 22, 2005 Sheet 18 of 18 US 2005/0207495 A1 Fig. 15 [Figure; legend: block currently being tested; matching block; current block being tested is in 1MV mode; current block being tested is in 4MV mode; motion vector (identical colors denote MVs of the same block); displaced MV to translate matching block to the relative position of the macroblock currently being tested]
  • 20. US 2005/0207495 A1
METHODS AND APPARATUSES FOR COMPRESSING DIGITAL IMAGE DATA WITH MOTION PREDICTION
[0001] This application claims the benefit of U.S. Provisional Application No. 60/552,153, filed Mar. 10, 2004, U.S. Provisional Application No. 60/552,356, filed Mar. 10, 2004, and U.S. Provisional Application No. 60/552,270, filed Mar. 10, 2004. The above-identified applications are hereby incorporated by reference.
FIELD OF THE INVENTION
[0002] The present invention relates generally to multimedia applications. More particularly, this invention relates to compressing digital image data with motion prediction.
BACKGROUND OF THE INVENTION
[0003] A variety of systems have been developed over the past decade for the encoding and decoding of audio/video data for transmission over wireline and/or wireless communication systems. Most systems in this category employ standard compression/transmission techniques, such as, for example, the ITU-T Rec. H.264 (also referred to as H.264) and ISO/IEC Rec. 14496-10 AVC (also referred to as MPEG-4) standards. However, due to their inherent generality, they lack the specific qualities needed for seamless implementation on low-power, low-complexity systems (such as hand-held devices including, but not restricted to, personal digital assistants and smart phones) over noisy, low-bit-rate wireless channels.
[0004] Due to the likely business models rapidly emerging in the wireless market, in which the cost incurred by the consumer is directly proportional to the actual volume of transmitted data, and also due to limited bandwidth, processing capability, storage capacity and battery power, efficiency and speed in compression of audio/video data to be transmitted are major factors in the eventual success of any such multimedia content delivery system. Most systems in use today are retrofitted versions of identical systems used on higher-end desktop workstations.
Unlike desktop systems, where error control is not a critical issue due to the inherent reliability of cable LAN/WAN data transmission, and bandwidth may be assumed to be almost unlimited, transmission over limited-capacity wireless networks requires integration of such systems that may leverage suitable processing and error-control technologies to achieve the level of fidelity expected of a commercially viable multimedia compression and transmission system.
[0005] Conventional video compression engines, or codecs, can be broadly classified into two categories. One class of coding strategies, known as a download-and-play (D&P) profile, not only requires the entire file to be downloaded onto the local memory before playback, leading to a large latency time (depending on the available bandwidth and the actual file size), but also makes stringent demands on the amount of buffer memory to be made available for the downloaded payload. Even with the more sophisticated streaming profile, the limitations of current-generation transmission equipment at the physical layer force service providers to incorporate a pseudo-streaming capability, which requires an initial period of latency (at the beginning of transmission) and continuous buffering thereafter, which imposes a strain on the limited processing capabilities of the hand-held processor. Most commercial compression solutions in the market today do not possess a progressive transmission capability, which means that transmission is possible only until the last integral frame, packet or bit before bandwidth drops below the minimum threshold. In the case of video codecs, if the connection breaks before the transmission of the current frame, this frame is lost forever.
[0006] Another drawback in conventional video compression codecs is the introduction of blocking artifacts due to the block-based coding schemes used in most codecs.
Apart from the degradation in subjective visual quality, such systems suffer from poor performance due to bottlenecks introduced by the additional de-blocking filters. Yet another drawback is that, due to the limitations in the word size of the computing platform, the coded coefficients are truncated to an approximate value. This is especially prominent along object boundaries, where the Gibbs phenomenon leads to the generation of a visual phenomenon known as mosquito noise. Due to this, the blurring along the object boundaries becomes more prominent, leading to degradation in overall frame quality.
[0007] Additionally, the local nature of motion prediction in some codecs introduces motion-induced artifacts, which cannot be easily smoothed by a simple filtering operation. Such problems arise especially in cases of fast-motion clips and systems where the frame rate is below that of natural video (e.g., 25 or 30 fps non-interlaced video). In either case, the temporal redundancy between two consecutive frames is extremely low (since much of the motion is lost between the frames themselves), leading to poorer tracking of the motion across frames. This effect is cumulative in nature, especially for a longer group of frames (GoF).
[0008] Furthermore, mobile end-user devices are constrained by low processing power and storage capacity. Due to the limitations on the silicon footprint, most mobile and hand-held systems in the market have to time-share the resources of the central processing unit (microcontroller or RISC/CISC processor) to perform all their DSP, control and communication tasks, with little or no provision for a dedicated processor to take the video/audio processing load off the central processor. Moreover, most general-purpose central processors lack the unique architecture needed for optimal DSP performance. Therefore, a mobile video-codec design must have minimal client-end complexity while maintaining consistency on the efficiency and robustness front.
SUMMARY OF THE INVENTION
[0009] Methods and apparatuses for compressing digital image data with motion prediction are described herein. In one embodiment, for each two consecutive frames of an image sequence, a motion prediction is performed between the consecutive frames by tracking motion on a luminance map of the frames to generate motion prediction information for the luminance component. The motion prediction information of the luminance component is then applied to the chrominance maps. In response to the motion prediction, the wavelet coefficients of each frame and the motion prediction information are encoded into a bit stream based on a target transmission rate, where the encoded wavelet coefficients satisfy a predetermined threshold according to a predetermined algorithm.
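As a rough illustration of applying luminance motion information to the chrominance maps, the sketch below reuses a luma motion vector to predict a chroma block. The summary does not say how the vector is adapted, so this assumes subsampled chroma planes (4:2:0 by default, i.e. a factor of 2 in each direction), truncation toward zero when scaling, and edge clamping for out-of-frame references. The names `luma_to_chroma_mv` and `predict_block` are hypothetical.

```python
def luma_to_chroma_mv(luma_mv, sub_x=2, sub_y=2):
    """Scale a luma motion vector to chroma resolution.

    Assumption: chroma is subsampled by (sub_x, sub_y) relative to luma,
    and fractional results are truncated toward zero.
    """
    dx, dy = luma_mv
    return (int(dx / sub_x), int(dy / sub_y))


def predict_block(ref_plane, bx, by, size, mv):
    """Copy the block displaced by `mv` from the reference plane.

    Coordinates that fall outside the plane are clamped to the nearest
    edge sample (an assumption made only for this sketch).
    """
    height, width = len(ref_plane), len(ref_plane[0])
    dx, dy = mv
    block = []
    for y in range(size):
        row = []
        for x in range(size):
            sy = min(max(by + y + dy, 0), height - 1)
            sx = min(max(bx + x + dx, 0), width - 1)
            row.append(ref_plane[sy][sx])
        block.append(row)
    return block
```

Because motion is estimated once on the luminance map and only rescaled for the two chrominance maps, no separate motion search is needed for the U and V planes in this sketch.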