
Network measurement

Jeromy Fu

Presentation_ID © 2008 Cisco Systems, Inc. All rights reserved. Cisco Confidential 1

Agenda
 Why, what, and how
 Roadmap
 Bandwidth measurement
 Current Implementation
 Future work


Why measurement is needed
 A big black cloud
 No explicit feedback


Application area
 Congestion control(QoS, transport layer etc)

 Overlay networks, (relay, overlay route etc)

 CDNs (select best server)

 Streaming(adjust encoding rate)

 And many more…


What to measure?

Metrics           Tools
RTT               ping
Jitter            iperf
Packet loss       ping
Avail bandwidth
Bottleneck
Link capacity
Throughput        iperf
Route info        traceroute
MTU               ping
Topology          GNP, Skitter


How to measure?

 Using special probing packets or application packets

 The aim:

Accuracy, even when cross traffic exists

Non-intrusiveness: do not saturate the path

Timeliness


Roadmap
 Measure bandwidth using probing packets

 Detect link congestion using probing packets

 Identify congestion groups based on congestion similarity

 Use application packets instead of probing packets

 Topology (GNP-like coordinate system, etc.)


Bandwidth measurement
 One packet model and packet pair (train) model

 link capacity, bottleneck bandwidth, available bandwidth.

 Many experimental tools exist, but none is production-grade;
most earlier tools rely on TCP flooding.


Terminology


Terminology
 Hop: link at layer 3
 Segment: link at layer 2


Terminology
 Link aggregation
 http://wenku./view/64d752a6f524ccbff12184ca.htm


Vuze uses the Network Diagnostic Tool
 http://www.measurementlab.net/measurement-lab-tools#tool1
 http://netspeed.stanford.edu/


uTorrent uses the M-Lab tool pathload2
 http://www.utorrent.com/faq#mlabs
http://www.measurementlab.net/measurement-lab-tools#tool4


Limitations of TCP throughput
 Other metrics may have a significant effect on TCP
throughput (TCP is inefficient in high-BDP networks
and on lossy links).

 Other applications and transport protocols (e.g. for video
and audio streaming) have different performance
characteristics.

 Too intrusive: it places too much additional load on the
network.


Available bandwidth
 Tools

pathload, pathChirp, IGI/PTR

 Method

Self-Induced Congestion


IGI/PTR insight


IGI/PTR Algorithm


Pathload Insight


Pathload Algorithm


(Path or hop) capacity
 Tools

pathchar, clink, etc.

 One packet Model

Measures per-hop capacity using ICMP packets, like
traceroute


One packet Model


One packet Model
 The sender sets TTL=1, sends out the packet, and waits for
the ICMP TTL-exceeded packet to come back.

 Upon receiving the ICMP reply, estimate the RTT. Repeat the
RTT estimate multiple times for packets of various sizes. The
minimum RTT for each packet size is taken as the valid
sample, since it is the least affected by queuing.

 The capacity of the first link is $C_1 = 1/b_1$, where $b_1$ is the
slope of the RTT versus packet size graph.

 Set TTL=2, 3, …, n and repeat steps 1 to 3 to calculate
$C_i = 1/(b_i - b_{i-1})$.
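
The slope-based estimate above can be sketched as follows. This is a minimal illustration, assuming the minimum RTT per packet size has already been collected for each TTL; the function names and the sample numbers are hypothetical and are not taken from pathchar or any other tool.

```python
def fit_slope(sizes, min_rtts):
    """Least-squares slope of minimum RTT (seconds) versus packet size (bytes)."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(min_rtts) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, min_rtts))
    return sxy / sxx


def hop_capacities(slopes):
    """Turn cumulative slopes b_1..b_n (s/byte) into per-hop capacities (bytes/s)."""
    capacities = []
    previous = 0.0
    for b in slopes:
        capacities.append(1.0 / (b - previous))  # C_i = 1 / (b_i - b_{i-1})
        previous = b
    return capacities


# Hypothetical measurements: hop 1 at 1,000,000 B/s, hop 2 at 500,000 B/s.
sizes = [100, 500, 1000, 1500]
rtts_ttl1 = [0.001 + s / 1_000_000 for s in sizes]
rtts_ttl2 = [0.003 + s / 1_000_000 + s / 500_000 for s in sizes]
slopes = [fit_slope(sizes, rtts_ttl1), fit_slope(sizes, rtts_ttl2)]
caps = hop_capacities(slopes)
```

With real data the slopes are noisy, so a hop whose slope is not larger than the previous one would need to be discarded rather than inverted.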

One packet Model
 Transmission delay is linear with respect to packet size.

 Most implementations use RTT instead of one-way delay.

 Linear regression is used to filter out queuing delay.

 Links are assumed to be single-channel.


Drawbacks
 Linear regression is expensive (it is done for every link and
needs many packets; this can be alleviated by stopping once
the result converges).

 ACKs may not be sent in a timely manner (ICMP packets
are often rate-limited or blocked).

 Some nodes are invisible (bridges and other layer 2 devices
do not decrement the IP TTL and send no ICMP reply); this
layer 2 effect leads to underestimating layer 3 capacity.

 The reverse path adds noise. Response packets may come
back through a different path.

Bottleneck bandwidth
 Tools: pathrate, CapProbe, UDT, etc.

 Packet pair model


Packet pair Model
 Cross traffic

Time compression: other packets queue ahead of the
first probe packet downstream of the bottleneck link,
which shrinks the spacing between the two probe packets.
This leads to high estimates.

Time extension: other packets delay the second probe
packet and extend the spacing between the two probe
packets. This leads to low estimates.

 Only supports FIFO queuing at routers

 Does not support multi-channel links

Packet pair Model
 Transmission time of an L-byte packet on a link with capacity
C: t = L/C
 Send two packets back-to-back from source to sink.
 Measure the dispersion of the packet pair at the receiver.
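
A minimal sketch of the receiver-side calculation, assuming the two arrival timestamps are already available in seconds. The function name and the sample values are illustrative only.

```python
def packet_pair_capacity(packet_size_bytes, first_arrival_s, second_arrival_s):
    """Estimate bottleneck capacity (bytes/s) as C = L / dispersion."""
    dispersion = second_arrival_s - first_arrival_s
    if dispersion <= 0:
        raise ValueError("dispersion must be positive")
    return packet_size_bytes / dispersion


# Hypothetical pair: 1500-byte packets arriving 1.2 ms apart.
estimate = packet_pair_capacity(1500, 0.0, 0.0012)
```

A single pair is easily distorted by cross traffic, so in practice many pairs would be sent and the results filtered.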


Drawbacks
 Though simple, the packet pair technique can produce
widely varying and erroneous estimates, mainly
due to cross traffic on the path and measurement
error (it relies on high-precision timestamps).


Packet Train Model
 A packet train has length N: the source sends N back-to-back
packets of size L to the sink.

 The sink measures the total dispersion D and computes the
bandwidth estimate as b = (N-1)L/D.

 Reduces the impact of measurement error, but is more likely
to be interfered with by cross traffic.
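
The train estimate can be sketched as below, assuming the sink has recorded one arrival timestamp per packet. The function name and sample data are hypothetical.

```python
def packet_train_bandwidth(arrival_times_s, packet_size_bytes):
    """Bandwidth estimate b = (N - 1) * L / D for a train of N packets."""
    n = len(arrival_times_s)
    if n < 2:
        raise ValueError("a train needs at least two packets")
    total_dispersion = arrival_times_s[-1] - arrival_times_s[0]
    if total_dispersion <= 0:
        raise ValueError("total dispersion must be positive")
    return (n - 1) * packet_size_bytes / total_dispersion


# Hypothetical train: four 1000-byte packets arriving 1 ms apart.
train_estimate = packet_train_bandwidth([0.0, 0.001, 0.002, 0.003], 1000)
```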


Quick Review
 Active probing includes three kinds of method: one
packet, packet pair, and packet train.
 All assume store-and-forward behavior at the
intermediate nodes.
 All work on single-channel links only.
 They are either receiver-based or sender-based.
 All have their pros and cons.


Which one is better
 Most authors say their tool is better than the others.
No commercial product uses them yet.
 We tested those tools ourselves.
 We need a benchmark tool first. Here comes iperf.


Which one is better


Which one is better
 We ran a full test of the currently implemented tools in
various environments, including a nistnet environment
and an ADSL environment.
 Unfortunately, none of them gives reasonable results
in both environments.
 Iperf works well.
 For more information, please refer to “iperf.doc” and “bw
tech report.docx”.


Previous implementation
 More or less like iperf: it measures throughput, but not
over TCP.
 Based on UDT, which uses UDP for reliable data
transfer. UDT has its own flow/congestion control
algorithm, which is more efficient for data transfer than
TCP.

 UDT has a very flexible design that allows a
user-defined flow/congestion control algorithm. This is good
for later optimization, for example slowing down when
an increasing OWD trend is detected, so that normal
traffic is not affected.

Remaining problems
 Flooding affects normal traffic and
disturbs the user.
 UDT is too aggressive, even for a constant-rate UDP
stream.
 We should review the earlier research material to find a
better solution.
 Besides, flooding is not feasible when we need
network metrics most of the time, for
example in QoS.
 Further research is needed in these areas, so a new version
of netdect is being developed.


We need a better solution
 Think a little about what bandwidth is and how bandwidth
is limited.

 Some experiments.

 Implementation details.


Physical layer net bit rate

56 kbit/s Modem / Dialup
1.5 Mbit/s ADSL Lite
11 Mbit/s Wireless 802.11b
54 Mbit/s Wireless 802.11g
100 Mbit/s Fast Ethernet
155 Mbit/s OC3
300 Mbit/s Wireless 802.11n
622 Mbit/s OC12
1 Gbit/s Gigabit Ethernet
2.5 Gbit/s OC48
9.6 Gbit/s OC192
10 Gbit/s 10 Gigabit Ethernet
100 Gbit/s 100 Gigabit Ethernet


Physical layer net bit rate
Version   Common name           Downstream rate   Upstream rate
ADSL      ADSL                  8.0 Mbit/s        1.0 Mbit/s
ADSL      ADSL (G.DMT)          12.0 Mbit/s       1.3 Mbit/s
ADSL      ADSL over POTS        12.0 Mbit/s       1.3 Mbit/s
ADSL      ADSL over ISDN        12.0 Mbit/s       1.8 Mbit/s
ADSL      ADSL Lite (G.Lite)    1.5 Mbit/s        0.5 Mbit/s
ADSL2     ADSL2                 12.0 Mbit/s       1.3 Mbit/s
ADSL2     ADSL2                 12.0 Mbit/s       3.5 Mbit/s
ADSL2     RE-ADSL2              5.0 Mbit/s        0.8 Mbit/s
ADSL2     splitterless ADSL2    1.5 Mbit/s        0.5 Mbit/s
ADSL2+    ADSL2+                24.0 Mbit/s       1.3 Mbit/s
ADSL2+    ADSL2+M               24.0 Mbit/s       3.3 Mbit/s


Bandwidth cap
 A bandwidth cap limits the transfer of a specified amount of
data over a period of time.
 Internet service providers commonly apply a cap
when a channel intended to be shared by many users
becomes overloaded, or may be overloaded, by a few
users.
 Different approaches exist, from a simple per-user rate
limit to sophisticated credit-based
strategies.
 This is what we are interested in.


What affects user’s observed throughput
 The ideal throughput is the physical layer net bit rate.
On Fast Ethernet, it is 100 Mbit/s.

 Latency is not directly related to throughput, but it
affects specific transport protocols, for example
TCP (TCP's self-clocking is based on RTT). In other
words, the congestion control algorithm affects
throughput.

 Packet drops affect throughput.

 The bottleneck link affects throughput.


Two ways of bandwidth limitation

 Drop

 Buffer and then drop


How Nistnet works?


How Nistnet works?
 Bandwidth limitation is implemented by adding delay,
just as if the packet went through a bottleneck link.

 Determine the amount of time to delay a packet. This
is the maximum of two quantities:
1. Probabilistic packet delay time
2. Bandwidth-limitation delay time


How Nistnet works?
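/* Probabilistic delay drawn from the configured delay distribution. */
probdelay = correlatedtabledist(&tableme->ltEntry.lteIDelay);

if (hitme->hitreq.bandwidth) {
    fixed_gettimeofday(&our_time);
    /* Time until the emulated link is free to send this packet. */
    bandwidthdelay = timeval_diff(&hitme->next_packet, &our_time);

    if (bandwidthdelay < 0) {
        /* Link is idle: the packet can start transmitting now. */
        bandwidthdelay = 0;
        hitme->next_packet = our_time;
    }

    /* Transmission time of this packet, rounded to nearest. */
    packettime = (long)skb->len * (MILLION / hitme->hitreq.bandwidth)
        + ((long)skb->len * (MILLION % hitme->hitreq.bandwidth)
           + hitme->hitreq.bandwidth / 2) / hitme->hitreq.bandwidth;
    timeval_add(&hitme->next_packet, packettime);
    bandwidthdelay += packettime;
}

/* The packet is held for the larger of the two delays. */
delay = probdelay > bandwidthdelay ? probdelay : bandwidthdelay;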
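
The same delay logic can be reproduced outside the kernel for experiments. The sketch below is a Python rendering of the snippet above, not nistnet code: the class name is hypothetical, and it assumes the bandwidth is given in bytes per second and all times in microseconds.

```python
MILLION = 1_000_000  # microseconds per second (assumed unit)


class BandwidthDelay:
    """Tracks when the emulated link becomes free, as next_packet does above."""

    def __init__(self, bandwidth_bytes_per_s):
        self.bandwidth = bandwidth_bytes_per_s
        self.next_packet_us = 0

    def delay_us(self, now_us, packet_len, prob_delay_us=0):
        # Backlog: how long until earlier packets have left the link.
        backlog = self.next_packet_us - now_us
        if backlog < 0:
            backlog = 0
            self.next_packet_us = now_us
        # Transmission time of this packet, rounded to nearest microsecond.
        packet_time = (packet_len * MILLION + self.bandwidth // 2) // self.bandwidth
        self.next_packet_us += packet_time
        # The packet is held for the larger of the two delays.
        return max(prob_delay_us, backlog + packet_time)


# Hypothetical run: 100 KB/s limit, 1000-byte packets.
shaper = BandwidthDelay(100_000)
d1 = shaper.delay_us(0, 1000)       # link idle
d2 = shaper.delay_us(0, 1000)       # queued behind the first packet
d3 = shaper.delay_us(50_000, 1000)  # link idle again
```

Back-to-back packets see a growing delay, which is the increasing one-way delay that the later slides use as a congestion hint.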

Ping in real life
 124.160.32.248(Netcom office DMZ) ping
216.24.133.8(pc in Denver)
 Minimum = 210ms, Maximum = 234ms, Average =
213ms


Ping in real life
 218.109.124.61(Huashu Netcom) ping
124.160.32.248(Netcom office DMZ)
31ms


Ping in real life with cross traffic
 Daisy’s ADSL ping www.google.com, adding TCP
upstream cross traffic
98ms


 Daisy’s ADSL ping www.google.com, adding 1.5 MB/s
UDP upstream cross traffic saturating the link.
 Minimum = 20ms, Maximum = 705ms, Average = 634ms


 Netcom ADSL 124.90.150.57 ping DMZ
124.160.32.248 , using TCP downstream cross traffic
saturating the link.
 Minimum = 4ms, Maximum = 8ms, Average = 4ms, Loss
0%


 Netcom ADSL 124.90.150.57 ping DMZ
124.160.32.248 , using 8 MB/s UDP downstream cross
traffic saturating the link.
 Minimum = 4ms, Maximum = 7ms, Average = 4ms, Loss
0%, UDP loss reported by iperf 74%.


Clock Skew
 This is not a big problem when using increasing OWDs
as a hint of congestion.
 According to the pathload paper, the typical clock
skew is 10-100 us per second (in my test, it is
0.1 ms = 100 us per second).
 Since 1 ms precision is used, the accumulated skew error
should be kept below 1 ms.
 So, for a clock skew of 0.1 ms per second, data should be
collected in less than 10 seconds; for a clock skew of 10 us per
second, data should be collected in less than 100 seconds.
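
The collection windows above follow from dividing the tolerated error by the skew rate. The symbols below ($T_{\max}$ for the window, $\epsilon$ for the tolerated error, $s$ for the skew rate) are introduced here for illustration only.

```latex
T_{\max} = \frac{\epsilon}{s}, \qquad
\frac{1\,\text{ms}}{0.1\,\text{ms/s}} = 10\,\text{s}, \qquad
\frac{1\,\text{ms}}{10\,\mu\text{s/s}} = 100\,\text{s}
```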


Clock Skew
 The relative OWDs can be distorted by possible
skew between the sender and receiver clocks.
 Measure from both directions.


Clock Skew
 In the test, the skew between the two different
types of machine is 1 ms per 10 s, that is 0.1 ms per
second.


OWD increasing trend
 Send rate Rs, available bandwidth Ra
buffered_size = Rs * t - Ra * t
delay = buffered_size / Ra = (Rs/Ra - 1) * t
 Test in nistnet: set the bandwidth limit to 100K with a
sending rate of 100K
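
As a quick illustration of the formula, the sketch below evaluates the expected queuing delay for a given send rate and available bandwidth. The function name is hypothetical; it assumes an unbounded FIFO buffer and a constant send rate.

```python
def queueing_delay(send_rate, avail_bw, elapsed_s):
    """Delay = (Rs/Ra - 1) * t while Rs exceeds Ra; zero otherwise."""
    if send_rate <= avail_bw:
        return 0.0  # the queue never builds up
    return (send_rate / avail_bw - 1.0) * elapsed_s


# Hypothetical case: sending at 150 units/s into a 100 units/s limit for 2 s.
delay_after_2s = queueing_delay(150, 100, 2.0)
```

Under this model the delay grows linearly with time only when the send rate is above the limit; at exactly the limit no growth is predicted.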


OWD and packet loss in China ADSL
 DMZ 124.160.32.248 -> Netcom ADSL 124.90.150.57
 Sending rate 500KB/s.
 rcv_rate = 329 KB/s , loss : 32% (158/491)


Trend algorithm (Pathload)
 PCT (Pairwise Comparison Test): measures the fraction
of consecutive pairs that are increasing. If there is a
strong increasing trend, it approaches one.

 PDT (Pairwise Difference Test): quantifies how strong
the start-to-end variation is relative to the sum of the
absolute pairwise variations. If there is a strong increasing
trend, it approaches one.
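
A minimal sketch of the two statistics, applied directly to a list of OWD samples. This is a simplification for illustration; the function names are my own, and any grouping or filtering of samples that a real tool performs beforehand is left out.

```python
def pct(samples):
    """Fraction of consecutive pairs that are increasing (0 to 1)."""
    pairs = list(zip(samples, samples[1:]))
    return sum(1 for a, b in pairs if b > a) / len(pairs)


def pdt(samples):
    """Start-to-end change divided by the sum of absolute changes (-1 to 1)."""
    total_variation = sum(abs(b - a) for a, b in zip(samples, samples[1:]))
    if total_variation == 0:
        return 0.0  # flat sequence: no trend
    return (samples[-1] - samples[0]) / total_variation


# Hypothetical OWD samples in milliseconds.
steady_rise = [1, 2, 3, 4]
noisy_rise = [1, 3, 2, 4]
```

For `noisy_rise`, two of the three pairs increase and the net change is 3 out of a total variation of 5, so both statistics fall below one even though the overall trend is upward.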


Spearman’s Rank Correlation Coefficient
 Spearman’s Rank Correlation Coefficient is used to
detect a monotonic trend. The value is in the range [-1, 1];
the closer it is to 1, the stronger the
increasing trend.


How to calculate
 The n raw scores Xi, Yi are converted to ranks xi, yi.
 di = xi - yi
 Apply the formula $\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$ (valid when there are no tied ranks).

 Refer to http://www.wikihow.com/Calculate-Spearman's-Rank-Correlation-Coefficient
 This web page will do Spearman rank correlation.
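
For trend detection, one variable is simply the sample index and the other is the measured value. The sketch below computes the coefficient as the Pearson correlation of the ranks, using average ranks for ties; the function names are hypothetical and this is not the code of trend.py.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_trend(samples):
    """Spearman correlation between sample index and sample value."""
    n = len(samples)
    x = [float(i + 1) for i in range(n)]
    y = average_ranks(samples)
    mean = (n + 1) / 2  # mean rank, identical for x and y
    sxy = sum((a - mean) * (b - mean) for a, b in zip(x, y))
    sxx = sum((a - mean) ** 2 for a in x)
    syy = sum((b - mean) ** 2 for b in y)
    if syy == 0:
        return 0.0  # constant (or single) sample: no trend
    return sxy / (sxx * syy) ** 0.5
```

Without ties this gives the same result as the $d_i$ formula; for example, the sequence 1, 3, 2, 4 has $\sum d_i^2 = 2$ and a coefficient of 0.8.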


Comparison of the two
 spearman : 0.821005081875
 PCT: 0.47619047619
 PDT: 0.393939393939


Work on better solution
 First, it should be tested and work well in a specific
nistnet environment. We need a reproducible
environment.
 Data should be collected for post-analysis.
 Make sure the collected data is not distorted by
inefficiencies such as writing logs or doing
calculations.
 Every abnormal case should be analyzed carefully for
the root cause; mostly these turn out to be bugs.
 Some tools are needed to help with the analysis.


Tools developed for analyzing
 pingtrend.py: ping.exe is used to collect data and
record the result into a file; pingtrend.py then
extracts the ping values and dumps them into ‘pingrtt.txt’,
which can be used in combination with other tools.
 trend.py: given a sequence of values, plots them in a
diagram and calculates the increasing trend, including
PCT, PDT and the Spearman correlation coefficient.
 logparser.py: analyzes the log of netdect.exe.


Sequence number wraparound
 The ISN (initial sequence number) should preferably be random.
 Many protocols and algorithms require the
serialization or enumeration of related entities. For
example, a communication protocol must know
whether some packet comes "before" or "after" some
other packet. IETF RFC 1982 defines
"Serial Number Arithmetic" for the purposes of
manipulating and comparing these sequence
numbers.
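
A minimal sketch of an RFC 1982 style comparison, so that ordering still works after the sequence number wraps. The function name and the 32-bit default are assumptions for illustration.

```python
def serial_lt(a, b, bits=32):
    """True if serial number a comes before b, allowing for wraparound."""
    modulus = 1 << bits
    half = 1 << (bits - 1)
    distance = (b - a) % modulus
    # distance == half is left undefined by RFC 1982; treated as "not before".
    return 0 < distance < half


wrapped = serial_lt(0xFFFFFFFF, 0)      # 0 follows the maximum value
not_wrapped = serial_lt(0, 0xFFFFFFFF)  # the reverse does not hold
```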


Feedback
 Too much feedback adds processing
overhead, distorts the latency, and may slow
down the sending rate.
 Feedback can be time-based or packet-based.
 Time-based feedback does not cost much reverse
traffic, but there may not be enough samples when
congestion happens.
 Which is better is still under consideration. Currently, for
bandwidth measurement, there is no need to send
feedback: everything is calculated on the server
side and the client is given the result.
 However, feedback is needed for other purposes (identifying the
shared bottleneck, making bandwidth measurement TCP friendly).

Logs
 DO NOT log to stdout, because it is a performance killer;
write the logs to a file instead.
 If needed, I think using shared memory and a
separate process to flush the logs to file would be the
best option.
 However, file logs are enough for now.
 Use plain text log files, which are convenient for later
processing.


Packet size and probe period
 As mentioned previously, the clock skew is about 1 ms
per 10 s. This is not a problem for bandwidth measurement
because the probe period is shorter than that. The shorter the
probe period, the better.
 The problem now becomes how to collect
enough samples in a limited time period while using a
specific sending rate.
 Consider 10 KB/s with 1 KB per packet: there
are only 10 packets in one second, which means 0.1
packets in 10 ms, so the packet size should be adjusted
according to the sending rate.
 Dynamically adjust the packet size to get enough samples.
pkt_size = min(t * spd / min_sample_cnt, MTU)
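
The formula can be sketched as a small helper. The parameter names, the integer units (milliseconds, bytes per second) and the 1500-byte default MTU are assumptions made for this illustration.

```python
def probe_packet_size(period_ms, rate_bytes_per_s, min_sample_cnt, mtu=1500):
    """pkt_size = min(t * spd / min_sample_cnt, MTU), in bytes."""
    budget_bytes = period_ms * rate_bytes_per_s // 1000  # bytes sent in the period
    return min(budget_bytes // min_sample_cnt, mtu)


# Hypothetical settings: 10 KB/s, 10 samples wanted.
small = probe_packet_size(10, 10_000, 10)     # 10 ms probe period
large = probe_packet_size(10_000, 10_000_000, 10)  # high rate, capped by the MTU
```

At low rates and short periods the result can become smaller than the probe's own header, so a real implementation would also need a lower bound or a longer period.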

Cope with packet loss
 As mentioned before, some bandwidth caps drop
packets without buffering them, so the OWD increasing
trend will not work.
 How can loss caused by a bandwidth cap be told apart from
loss with other causes?
 The loss rate caused by a bandwidth cap shows a strong
correlation with the sending rate.
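
One way to check this is to probe at several sending rates and correlate the rate with the observed loss. The sketch below uses a plain Pearson correlation; the function name and all sample numbers are hypothetical and are not measurements from these slides.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # one side is constant: no correlation defined
    return sxy / (sxx * syy) ** 0.5


# Made-up example: loss that grows once the rate passes a cap,
# versus loss that stays around 30% regardless of rate.
rates = [100, 200, 300, 400, 500]
capped_loss = [0.0, 0.0, 0.10, 0.25, 0.40]
random_loss = [0.30, 0.28, 0.31, 0.29, 0.30]
r_capped = pearson(rates, capped_loss)
r_random = pearson(rates, random_loss)
```

A coefficient close to one suggests the loss is driven by the sending rate, which is the signature of a cap; a value near zero suggests loss that is independent of the probe.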


Netcom -> Netcom
 Pearson similarity: 0.776851619493
 spearman : 1.0


Netcom -> Telecom
 Pearson similarity: -0.233247151714
 spearman : 0.155357142857


Nistnet 30% packet loss
 Pearson similarity: 0.171674559023
 spearman : -0.0428571428571


Current implementation
 Mainly uses the OWD increasing trend as a hint of
congestion.
 Copes with China ADSL, which drops packets once
the limit is reached: find the turning point where packet loss
starts increasing; above the limit, packet loss correlates with
the sending rate.
 Supports packet pair, but it is not accurate: in a LAN the
measured bandwidth is 5 MB/s, nearly half of the capacity.
 Uses a packet train as an initial hint of the available bandwidth.
 Then starts a binary search, with error recovery.
 Works well in the nistnet environment.
 Can cope with constant UDP cross traffic.
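
The binary search step could look roughly like the sketch below. It is an illustration under stated assumptions, not the netdect implementation: `is_congested` stands for whatever probe-and-detect routine is used (for example an OWD trend test at the given rate), and the error recovery mentioned above is omitted.

```python
def search_available_bandwidth(is_congested, low, high, tolerance):
    """Narrow [low, high] until it is within tolerance of the congestion point."""
    while high - low > tolerance:
        rate = (low + high) / 2
        if is_congested(rate):
            high = rate  # congestion seen: available bandwidth is lower
        else:
            low = rate   # no congestion: available bandwidth is at least this
    return (low + high) / 2


# Hypothetical oracle standing in for a real probe: congestion above 730 KB/s.
estimate_kb = search_available_bandwidth(lambda rate: rate > 730, 0, 1000, 10)
```

Starting the interval from a packet train estimate, as the slide suggests, would reduce the number of probing rounds.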

To-do
 Be TCP friendly. This should work if congestion can be
detected sooner than TCP detects it. That is possible when
OWD increases (using feedback packets), but if the link
drops packets directly it becomes more complicated. Even the
flooding method cannot guarantee this.
 Group paths that share congestion.
 Use application packets (streams of different types) to identify
congestion and groups.


Reference
 Topics in High-Performance Messaging
http://www.29west.com/docs/THPM/index.html
 Spearman Rank-Order Correlation Coefficient
http://faculty.vassar.edu/lowry/corr_rank.html
http://www.wikihow.com/Calculate-Spearman's-Rank-Correlation-Coefficient
http://geographyfieldwork.com/SpearmansRank.htm
 Correlation and linear regression
http://udel.edu/~mcdonald/statregression.html
http://www.statisticssolutions.com/methods-chapter/statistical-tests/correlation-pearson-kendall-spearman/
 Free Statistical Software
http://statpages.org/javasta2.html
 ISN
http://lin-style.javaeye.com/blog/156950
http://www.faqs.org/rfcs/rfc1982.html
http://en.wikipedia.org/wiki/Serial_number_arithmetic
http://kerneltrap.org/node/4654

Test bed
 Nistnet, used to mimic the wide area network environment.

 Spirent (hardware emulator), which gives more powerful control
than nistnet.

 ADSL environment in the lab

 PlanetLab (in progress)


Autotest Tool
 Not extensible yet, but it will be if there are more
requirements; it can be used by others and is not limited
to network detection.

 Users register for the test.

 Configure the test settings (autorun time, working
directory).

 Runs automatically, analyzes the results, and
generates reports.

Q and A

