We described the principle of operation of the scanners used in facsimile
machines to digitize bitonal images (such as a printed document) in Section 2.4.3.
The digital representation of a scanned page was shown in Figure 2.11(b) and,
even though only a single binary bit is used to represent each picture element, with
the resolution used, this produces an uncompressed bit stream of the order of 2
Mbits. In most cases this must be transferred using modems and the public
switched telephone network (PSTN). The relatively low bit rates available with
modems mean that it would be both costly and time consuming to transfer a total
document comprising many pages in this basic form.
With most documents, many scanned lines consist only of long strings of white
picture elements (pels) while others comprise a mix of long strings of white and
long strings of black pels. Since facsimile machines are normally used with public
carrier networks, the ITU-T has produced standards relating to them. These are T2
(Group 1), T3 (Group 2), T4 (Group 3) and T6 (Group 4). The first two are earlier
standards and are now rarely used. Group 3 is intended for use with modems and
an analog PSTN, and Group 4 is all-digital for use with digital networks such as
the ISDN. Both use data compression, and compression ratios in excess of 10:1 are
common with most document pages. The time taken to transmit a page is reduced
to less than a minute with Group 3 machines and, because of the added benefit of
a higher transmission rate (64 kbps), to less than a few seconds with a Group 4
machine.
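The timing claims above can be checked with a little arithmetic. The following is a minimal sketch: the page size, the 10:1 ratio and the 64 kbps rate come from the text, while the 9600 bps modem rate is only an assumed figure for illustration, and protocol overhead is ignored.

```python
def transmit_seconds(page_bits, compression_ratio, line_rate_bps):
    """Time to send one page after compression, ignoring protocol overhead."""
    return page_bits / compression_ratio / line_rate_bps

PAGE_BITS = 2_000_000    # about 2 Mbits per uncompressed page (from the text)
RATIO = 10               # 10:1 compression (from the text)
GROUP4_RATE = 64_000     # 64 kbps digital channel (from the text)
GROUP3_RATE = 9_600      # assumed modem rate, not given in the text

print("uncompressed, modem:", transmit_seconds(PAGE_BITS, 1, GROUP3_RATE))
print("Group 3, compressed:", transmit_seconds(PAGE_BITS, RATIO, GROUP3_RATE))
print("Group 4, compressed:", transmit_seconds(PAGE_BITS, RATIO, GROUP4_RATE))
```

Under these assumptions the compressed page takes roughly 21 seconds over the modem and about 3 seconds at 64 kbps, which is consistent with the figures quoted above.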
As part of the standardization process, extensive analyses of typical scanned
document pages were made. Tables of codewords were produced based on the
relative frequency of occurrence of the number of contiguous white and black
pels found in a scanned line. The resulting codewords are fixed and grouped into
two separate tables: the termination-codes table and the make-up codes table.
The codewords in each table are shown in Figure 3.11.
Codewords in the termination-codes table are for white or black run-lengths of
from 0 to 63 pels in steps of 1 pel; the make-up codes table contains codewords for
white or black run-lengths that are multiples of 64 pels. A technique known as
overscanning is used which means that all lines start with a minimum of one white
pel. In this way, the receiver knows the first codeword always relates to white pels
and that the codewords then alternate between black and white. Since the scheme
uses two sets of codewords (termination and make-up) they are known as modified
Huffman codes. As an example, a run-length of 12 white pels is coded directly as
001000. Similarly, a run-length of 12 black pels is coded directly as 0000111. A
run-length of 140 black pels, however, is encoded as 000011001000 + 0000111; that
is, 128 + 12 pels. Run-lengths exceeding 2560 pels are encoded using more than one
make-up code plus one termination code.
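The worked example above can be sketched in code. This is a minimal illustration rather than a full encoder: the two tables hold only the three codewords quoted in the text (the complete tables are those of Figure 3.11), and the names `split_run` and `encode_run` are hypothetical.

```python
# Tiny excerpt of the codeword tables: only the entries quoted in the text.
TERMINATION = {("white", 12): "001000", ("black", 12): "0000111"}
MAKE_UP = {("black", 128): "000011001000"}

def split_run(length, largest_make_up=2560):
    """Split a run-length into make-up multiples of 64 plus a 0..63 remainder."""
    make_ups = []
    while length >= 64:
        step = min(largest_make_up, (length // 64) * 64)
        make_ups.append(step)
        length -= step
    return make_ups, length

def encode_run(colour, length):
    """Concatenate the make-up codeword(s) and the termination codeword."""
    make_ups, remainder = split_run(length)
    bits = [MAKE_UP[(colour, m)] for m in make_ups]
    bits.append(TERMINATION[(colour, remainder)])
    return "".join(bits)

print(split_run(140))             # 128 + 12
print(encode_run("black", 140))   # make-up 128 followed by termination 12
print(split_run(5000))            # a long run needs two make-up codes
```

Note that a termination codeword is always emitted, even when the remainder is 0, so the receiver can tell where each run ends.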
There is no error-correction protocol with Group 3. From the list of codewords, we
can deduce that if one or more bits is corrupted during its transmission through
the network, the receiver will start to interpret subsequent codewords on the
wrong bit boundaries. The receiver thus becomes unsynchronized and cannot
decode the received bit string. To enable the receiver to regain synchronism, each
scanned line is terminated with a known end-of-line (EOL) code. In this way, if the
receiver fails to decode a valid codeword after the maximum number of bits in a
codeword have been scanned (parsed), it starts to search for the EOL pattern. If it
fails to decode an EOL after a preset number of lines, it aborts the reception
process and informs the sending machine. A single EOL precedes the codewords
for each scanned page and a string of six consecutive EOLs indicates the end of
each page.
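The resynchronization rule described above can be sketched as follows. Several things here are assumptions and not taken from the text: the 12-bit EOL pattern (eleven 0s followed by a 1) is the one defined in T.4 but is not quoted above, a single codebook is used instead of alternating white and black tables, and the function name and the `max_len` parameter are hypothetical. The abort-after-several-lines rule is not modelled.

```python
EOL = "000000000001"   # T.4 end-of-line pattern (not quoted in the text above)

def decode_with_resync(bits, codebook, max_len):
    """Decode run-lengths, skipping to the next EOL when no codeword matches.

    Returns (decoded run-lengths, number of resynchronizations).
    """
    runs, resyncs, pos = [], 0, 0
    while pos < len(bits):
        if bits.startswith(EOL, pos):
            pos += len(EOL)
            continue
        for n in range(1, max_len + 1):
            run = codebook.get(bits[pos:pos + n])
            if run is not None:
                runs.append(run)
                pos += n
                break
        else:
            # No valid codeword within max_len bits: search for the next EOL.
            eol_at = bits.find(EOL, pos)
            if eol_at < 0:
                break
            resyncs += 1
            pos = eol_at + len(EOL)
    return runs, resyncs

codebook = {"001000": 12}   # white run of 12 pels, from the text
stream = EOL + "001000" + "111111" + EOL + "001000"   # "111111" models corruption
print(decode_with_resync(stream, codebook, max_len=13))
```

A real receiver would also discard the partially decoded line; this sketch keeps the runs decoded before the error so that the effect of the search is easy to see.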
Because each scanned line is encoded independently, the T4 coding scheme is
known as a one-dimensional coding scheme. As we can conclude, it works
satisfactorily providing the scanned image contains significant areas of white or
black pels, as is the case with letters and line drawings. Documents containing
photographic images, however, are not satisfactory, as the different shades of
black and white are represented by varying densities of black or white pels. This,
in turn, results in many very short run-lengths which, with the T4 coding scheme,
can lead to a negative compression ratio; that is, more bits are needed to send the
scanned document in its compressed form than are needed in its uncompressed
form.
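A rough calculation shows how a negative compression ratio arises. The sketch below takes the worst case of a line whose pels alternate white and black, so every run has length 1. The two codewords are the T.4 termination codes for run-lengths of 1; they are not quoted in the text above, and the function name is hypothetical.

```python
WHITE_RUN_1 = "000111"   # T.4 termination code, white run of 1 pel
BLACK_RUN_1 = "010"      # T.4 termination code, black run of 1 pel

def alternating_line_ratio(width):
    """Compression ratio (raw bits / coded bits) for a line of alternating pels."""
    white_runs = (width + 1) // 2    # the line starts with a white pel
    black_runs = width // 2
    coded_bits = white_runs * len(WHITE_RUN_1) + black_runs * len(BLACK_RUN_1)
    return width / coded_bits

print(alternating_line_ratio(16))   # well below 1: the coded line is longer
```

Each white/black pair costs 9 coded bits against 2 raw bits, so the "compressed" line is about 4.5 times larger than the original.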
For this reason the alternative T6 coding scheme has been defined. It is an
optional feature on Group 3 facsimile machines but is compulsory in Group 4
machines. When supported in Group 3 machines, the EOL code at the end of
each line has an additional tag bit added. If this is a binary 1 then the next line has
been encoded using the T4 coding scheme; if it is 0 then the T6 coding scheme
has been used. The latter is known as modified modified READ (MMR) coding. It is
also known as two-dimensional or 2D coding since it identifies black and white
run-lengths by comparing adjacent scan lines.
MMR coding exploits the fact that most scanned lines differ from the previous line
by only a few pels. For example, if a line contains a black run then the next line
will normally contain the same run plus or minus up to three pels. With MMR
coding the run-lengths associated with a line are identified by comparing the line
contents, known as the coding line (CL), relative to the immediately preceding
line, known as the reference line (RL).
Pass mode: this is the case when the run-length in the reference line (b1b2) is to
the left of the next run-length in the coding line (a1a2), that is, b2 is to the left
of a1. An example is given in Figure 3.12 and, for this mode, the run-length b1b2
is coded using the codewords given in Figure 3.11. Note that if the next pel on the
coding line, a1, is directly below b2 then this is not pass mode.
Vertical mode: this is the case when the run-length in the reference line (b1b2)
overlaps the next run-length in the coding line (a1a2) by a maximum of plus or
minus 3 pels. Two examples are given in Figure 3.11 and, for this mode, just the
difference run-length a1b1 is coded. Most codewords are in this category.
Horizontal mode: this is the case when the run-length in the reference line (b1b2)
overlaps the run-length a1a2 by more than plus or minus 3 pels. For this mode, the
two run-lengths a0a1 and a1a2 are coded using the codewords in Figure 3.11.
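The three cases can be summarized as a small decision function. This is a sketch under the assumption that a1, b1 and b2 are given as pel positions counted from the left of the line; the function name and the `max_vertical_offset` parameter are hypothetical.

```python
def classify_mode(a1, b1, b2, max_vertical_offset=3):
    """Pick the coding mode from the positions of a1, b1 and b2."""
    if b2 < a1:                                # b2 lies to the left of a1
        return "pass"
    if abs(a1 - b1) <= max_vertical_offset:    # a1 within 3 pels of b1
        return "vertical"
    return "horizontal"

print(classify_mode(a1=5, b1=1, b2=3))   # reference run ends before a1
print(classify_mode(a1=3, b1=2, b2=5))   # a1 is one pel to the right of b1
print(classify_mode(a1=2, b1=8, b2=8))   # a1 and b1 are far apart
print(classify_mode(a1=5, b1=3, b2=5))   # a1 directly below b2: not pass mode
```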
A flowchart of the coding procedure is shown in Figure 3.13. Note that the first a0
is set to an imaginary white pel before the first pel of the line and hence the first
run-length will be a0a1 - 1. If, during the coding of a line, a1, a2, b1, or b2 are not
detected, then they are set to an imaginary pel positioned immediately after the
last pel on the respective line.
Once the first/next position of a0 has been determined, the positions of a1, a2, b1,
and b2 for the next codeword are located. The mode is then determined by
computing the position of b2 relative to a1. If it is to the left, this indicates pass
mode. If it is not to the left, then the magnitude of a1b1 is used to determine
whether the mode is vertical or horizontal. The codeword for the identified mode is
then computed and the start of the next codeword position, a0, is updated to the
appropriate position. This procedure repeats alternately between white and black
runs until the end of the line is reached. This is an imaginary pel positioned
immediately after the last pel of the line and is assumed to have a different color
from the last pel. The current coding line then becomes the new reference line
and the next scanned line the new coding line.
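The procedure above can be sketched for a single line as follows. This is an illustrative reading of the description, not the flowchart of Figure 3.13 itself. It assumes pels are stored as 0 (white) and 1 (black), that b1 is the first change on the reference line to the right of a0 with the opposite color to a0, and that after pass mode a0 moves to the position below b2. The output is a list of symbolic tokens rather than bit strings, pass tokens carry no payload here, and all names are hypothetical.

```python
def changes(line):
    """Positions where the color differs from the pel to its left.

    An imaginary white pel (0) is assumed before the first pel.
    """
    previous, found = 0, []
    for position, pel in enumerate(line):
        if pel != previous:
            found.append(position)
        previous = pel
    return found

def code_line(coding, reference):
    """Return the sequence of mode tokens for one coding line."""
    width = len(coding)                  # also the imaginary pel after the line
    cl, rl = changes(coding), changes(reference)
    a0, tokens = -1, []                  # a0 starts on the imaginary white pel
    while a0 < width:
        color = coding[a0] if a0 >= 0 else 0
        after_a0 = [p for p in cl if p > a0]
        a1 = after_a0[0] if after_a0 else width
        a2 = after_a0[1] if len(after_a0) > 1 else width
        ref_after = [p for p in rl if p > a0 and reference[p] != color]
        b1 = ref_after[0] if ref_after else width
        later = [p for p in rl if p > b1]
        b2 = later[0] if later else width
        if b2 < a1:
            tokens.append(("pass",))
            a0 = b2
        elif abs(a1 - b1) <= 3:
            tokens.append(("vertical", a1 - b1))
            a0 = a1
        else:
            # max(a0, 0) gives the first run-length as a0a1 - 1.
            tokens.append(("horizontal", a1 - max(a0, 0), a2 - a1))
            a0 = a2
    return tokens

reference = [0, 0, 1, 1, 1, 0, 0, 0]
coding    = [0, 0, 0, 1, 1, 1, 0, 0]    # same black run, shifted right by 1
print(code_line(coding, reference))
```

In the example the black run has simply moved one pel to the right, so every token is a vertical-mode token, which illustrates why most codewords fall into that category.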
Since the coded run-lengths relate to one of the three modes, additional
codewords are used either to indicate to which mode the following codewords
relate (pass or horizontal) or to specify the length of the codeword directly
(vertical). The additional codewords are given in a third table known as the
two-dimensional code table. Its contents are as shown in Table 3.1. The final entry
in the table, known as the extension mode, is a unique codeword that aborts the
encoding operation prematurely before the end of the page. This is provided to
allow a portion of a page to be sent in its uncompressed form or possibly with
a different coding scheme.
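As an illustration of how such a table is used, the sketch below maps a mode token to its mode codeword. The bit patterns are the mode codes of the ITU-T two-dimensional code table as recalled here, not copied from Table 3.1, so they should be checked against that table. The token format, the sign convention (positive offsets mean a1 is to the right of b1) and the function name are assumptions, and the extension-mode entry is omitted.

```python
PASS = "0001"
HORIZONTAL_PREFIX = "001"   # followed by the codewords for a0a1 and a1a2
VERTICAL = {
    0: "1",
    1: "011", 2: "000011", 3: "0000011",       # a1 to the right of b1
    -1: "010", -2: "000010", -3: "0000010",    # a1 to the left of b1
}

def mode_codeword(token):
    """Return the mode codeword for a token such as ("vertical", 1)."""
    mode = token[0]
    if mode == "pass":
        return PASS
    if mode == "horizontal":
        return HORIZONTAL_PREFIX
    if mode == "vertical":
        return VERTICAL[token[1]]
    raise ValueError(f"unknown mode: {mode}")

tokens = [("vertical", 1), ("vertical", 1), ("vertical", 0)]
print("".join(mode_codeword(t) for t in tokens))
```

The shortest codeword is assigned to a vertical offset of 0, the most common case when consecutive lines are nearly identical.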