�ݺ�ߣ

An Introduction

Kanishka Khandelwal-BCSE IV , JU 3/20/2012

Flip a coin

An algorithm which flip coins is called a randomized algorithm.


Making decisions could be complicated.
A randomized algorithm is simpler.

Consider the minimum cut problem

Randomized algorithm?
Pick a random edge and contract.
And Continue until two vertices are left


Making good decisions could be expensive.

A randomized algorithm is faster.

Consider a sorting procedure.
Picking an element in the middle makes the procedure very efficient,
but it is expensive (i.e. linear time) to find such an element.

5 9 13 11 8 6 7 10

5 6 7 8 9 13 11 10

Picking a random element will do.


 Avoid worst-case behavior: randomness can
(probabilistically) guarantee average case behavior
 Efficient approximate solutions to intractable
problems
 In many practical problems,we need to deal with
HUGE input,and don’t even have time to read it
once.But can we still do something useful?


Deterministic
Input Output
Computer

Random Bits

www.lavarnd.org
(doesn’t use lava lamps
anymore)


 Randomized algorithms make random rather than
deterministic decisions.
 The main advantage is that no input can reliably
produce worst-case results because the algorithm runs
differently each time.
 These algorithms are commonly used in situations
where no exact and fast algorithm is known.
 Behavior can vary even on a fixed input.


 Minimum spanning trees

A linear time randomized algorithm,
but no known linear time deterministic algorithm.
 Primality testing

A randomized polynomial time algorithm,
but it takes thirty years to find a deterministic one.
 Volume estimation of a convex body

A randomized polynomial time approximation algorithm,
but no known deterministic polynomial time approximation algorithm.


Monte Carlo Las Vegas


 Always gives the true answer.
 Running time is random.
 Running time is variable whose expectation is
bounded(say by a polynomail).
 E.g. Randomized QuickSort Algorithm


 It may produce incorrect answer!
 We are able to bound its probability.
 By running it many times on independent random
variables, we can make the failure probability
arbitrarily small at the expense of running time.
 E.g. Randomized Mincut Algorithm


 Suppose we want to find a number among n given
numbers which is larger than or equal to the median.


Suppose A1 < … < An.
We want Ai, such that i ≥ n/2.

It’s obvious that the best deterministic algorithm needs
O(n) time to produce the answer.
n may be very large!
Suppose n is 100,000,000,000 !


 Choose 100 of the numbers with equal probability.
 find the maximum among these numbers.
 Return the maximum.


 The running time of the given algorithm is O(1).
 The probability of Failure is 1/(2100).
 Consider that the algorithm may return a wrong
answer but the probability is very smaller than the
hardware failure or even an earthquake!


 QUICKSORT


 QuickSort is a simple and efficient approach to
sorting:
 Select an element m from unsorted array c and divide
the array into two subarrays:
csmall - elements smaller than m and
clarge - elements larger than m.
 Recursively sort the subarrays and combine them
together in sorted array csorted


1. QuickSort(c)
2. if c consists of a single element
3. return c
4. m  c1
5. Determine the set of elements csmall smaller
than m
6. Determine the set of elements clarge larger
than m
7. QuickSort(csmall)
8. QuickSort(clarge)
9. Combine csmall, m, and clarge into a single
array, csorted
10. return csorted


 Runtime is based on our selection of m:

-A good selection will split c evenly such that
|csmall | = |clarge |, then the runtime is O(n log n).

-For a good selection, the recurrence relation is:
T(n) = 2T(n/2) + const ·n

The time it takes to Time it takes to split the
sort two smaller array into 2 parts where
arrays of size n/2 const is a positive constant

However, a poor selection will split c unevenly and in the
worst case, all elements will be greater or less than m
so that one subarray is full and the other is empty. In
this case, the runtime is O(n2).

For a poor selection, the recurrence relation is:
T(n) = T(n-1) + const · n

The time it takes to sort Time it takes to split the array
one array containing n-1 into 2 parts where const is a
elements positive constant

 QuickSort seems like an ineffecient MergeSort
 To improve QuickSort, we need to choose m to be a
good ‘splitter.’
 It can be proven that to achieve O(nlogn) running
time, we don’t need a perfect split, just reasonably
good one. In fact, if both subarrays are at least of size
n/4, then running time will be O(n log n).
 This implies that half of the choices of m make good
splitters.


 To improve QuickSort, randomly select m.
 Since half of the elements will be good splitters, if we
choose m at random we will get a 50% chance that m
will be a good choice.
 This approach will make sure that no matter what
input is received, the expected running time is small.


1. RandomizedQuickSort(c)
2. if c consists of a single element
3. return c
4. Choose element m uniformly at random from c
5. Determine the set of elements csmall smaller
than m
6. Determine the set of elements clarge larger than
m
7. RandomizedQuickSort(csmall)
8. RandomizedQuickSort(clarge)
9. Combine csmall, m, and clarge into a single
array, csorted
10. return csorted


 Worst case runtime: O(m2)
 Expected runtime: O(m log m).
 Expected runtime is a good measure of the
performance of randomized algorithms, often more
informative than worst case runtimes.


 Making a random choice is fast.
 An adversary is powerless; randomized algorithms
have no worst case inputs.
 Randomized algorithms are often simpler and faster
than their deterministic counterparts.


 In the worst case, a randomized algorithm may be very
slow.
 There is a finite probability of getting incorrect answer.
 However, the probability of getting a wrong answer can
be made arbitrarily small by the repeated employment
of randomness.
 Getting true random numbers is almost impossible.


Assignments


 Input: a set of 2D points
 Determine the closest pair (and its dist)
 Input points are stored in an array
 Suppose we have a strange storage data structure D :
 When we give a point to D, it stores the point and
outputs the closest pair of points stored in D
 Our knowledge: Insertion time depends on whether
the closest pair is changed or not.
 If output is the same: 1 clock tick
 If output is not the same: |D| clock ticks


 With random insertion order,
show that the expected total number of clock ticks
used by D is O(n)


Suppose you are given a directed graph with n vertices
and m unit-length edges. Consider the problem of
estimating the number of vertices within distance d of
each vertex. Give a fully polynomial approximation
scheme that solves this problem simultaneously for all
vertices for any fixed d.


�ݺ�ߣ

Randomized Algorithm

More Related Content

Randomized Algorithm