Efficient Algorithm for Finding Kth Smallest Element: Median of Medians - Prof. Hyeong-Ah , Study notes of Computer Science

Class notes on the median of medians algorithm, an efficient method for finding the kth smallest element in a list of n elements in O(n) time. The algorithm recursively partitions the list around the median-of-medians, which is found by dividing the list into groups of size 7, finding the median of each group, and recursively finding the median of the list of medians. The time complexity of the algorithm is analyzed and proven to be O(n), using a recurrence that holds for lists with n ≥ 48 elements.

Typology: Study notes

Pre 2010

Uploaded on 08/19/2009

koofers-user-yah-1

CSCI 212 - Class Notes
02/10/09
Select Problem

Given: A list $A$ of $n$ elements $a_1, a_2, \ldots, a_n$ in arbitrary order and an integer $k$, $1 \le k \le n$.
Objective: Find the $k$th smallest element in $O(n)$ time.

We find the desired element by recursively partitioning the list around the median-of-medians. To find the median-of-medians, we divide the list into groups of size 7, create a list of the medians from each group, and find the median of the list of medians recursively.

The elements are grouped as shown below. When the total number of elements is not evenly divisible by 7, i.e., $n \neq 7\lfloor n/7 \rfloor$, the remaining elements $a_{7\lfloor n/7 \rfloor + 1}, \ldots, a_n$ are not included in any of the groups. Note that we used groups of size 5 when previously discussing the algorithm.

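$$\underbrace{a_1, a_2, \ldots, a_7}_{g_1},\ \underbrace{a_8, a_9, \ldots, a_{14}}_{g_2},\ \ldots,\ \underbrace{a_{7(i-1)+1}, \ldots, a_{7i}}_{g_i},\ \ldots,\ \underbrace{a_{7(\lfloor n/7 \rfloor - 1)+1}, \ldots, a_{7\lfloor n/7 \rfloor}}_{g_{\lfloor n/7 \rfloor}}$$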

Our algorithm appears below. We denote by $m_i$ the median of group $g_i$.
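The grouping rule can be illustrated with a short Python sketch. The helper name `group_medians` and its `size` parameter are assumptions introduced here for illustration; the notes do not prescribe how each group median is computed, and sorting each group of 7 is only one simple option.

```python
def group_medians(a, size=7):
    """Return the medians of the full groups of `size` consecutive elements.

    Trailing elements that do not fill a whole group are skipped here,
    as in the notes; they still belong to the list being searched.
    """
    full = len(a) // size  # floor(n / size) complete groups
    medians = []
    for i in range(full):
        group = sorted(a[i * size:(i + 1) * size])
        medians.append(group[size // 2])  # middle (4th) element when size is 7
    return medians


values = list(range(16, 0, -1))  # 16, 15, ..., 1
print(group_medians(values))  # prints [13, 6]; the elements 2 and 1 are in no group
```

Each group has constant size, so computing all the medians this way takes time linear in the length of the list.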

Select(A,k)

  1. Compute $M$, the list of the medians from each group: $M \leftarrow \{m_1, m_2, \ldots, m_{\lfloor n/7 \rfloor}\}$.
  2. Compute $m^*$, the median of the list of medians: $m^* \leftarrow \text{Select}(M, \lceil \lfloor n/7 \rfloor / 2 \rceil)$.
  3. Partition the elements of $A$ into three groups $L$, $E$, and $R$ such that (i) $\forall a \in L$, $a < m^*$; (ii) $\forall a \in E$, $a = m^*$; and (iii) $\forall a \in R$, $a > m^*$.
  4. If $|L| < k \le |L| + |E|$, return $m^*$. If $k \le |L|$, return Select($L$, $k$). Otherwise, return Select($R$, $k - |L| - |E|$).
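The four steps can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation. It assumes a base case that the notes do not spell out (lists with fewer than 7 elements are sorted directly, since they contain no full group), and it recurses into $L$ whenever $k \le |L|$. The function name `select` and the list-based partition are illustrative choices.

```python
def select(a, k):
    """Return the k-th smallest element of a, for 1 <= k <= len(a)."""
    if not 1 <= k <= len(a):
        raise ValueError("k must satisfy 1 <= k <= len(a)")
    # Assumed base case (not in the notes): no full group of 7 exists.
    if len(a) < 7:
        return sorted(a)[k - 1]

    # Step 1: medians of the floor(n/7) full groups.
    groups = len(a) // 7
    medians = [sorted(a[7 * i:7 * i + 7])[3] for i in range(groups)]

    # Step 2: median of the medians, at rank ceil(groups / 2).
    pivot = select(medians, (groups + 1) // 2)

    # Step 3: three-way partition around the pivot.
    left = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    right = [x for x in a if x > pivot]

    # Step 4: answer directly or recurse into the side that holds rank k.
    if k <= len(left):
        return select(left, k)
    if k <= len(left) + len(equal):
        return pivot
    return select(right, k - len(left) - len(equal))


print(select([9, 1, 8, 2, 7, 3, 6, 4, 5], 3))  # prints 3
```

Because the pivot is itself an element of the list, `equal` is never empty, so both `left` and `right` are strictly shorter than the input and the recursion terminates.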

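Let $T(n)$ denote the time complexity of calling Select on a list with $n$ elements. From the four steps of the algorithm, $T(n) = O(n) + T(\lfloor n/7 \rfloor) + O(n) + T(\max\{|L|, |R|\})$. Combining terms and forming an upper bound, we have $T(n) \le cn + T(n/7) + T(\max\{|L|, |R|\})$. To obtain an upper bound for $|L|$ and $|R|$, consider each group to be a sorted column of elements in a matrix containing exactly one column for each group. Now, arrange the columns so that the medians, i.e., the middle elements of the columns, are in non-decreasing order. The number of elements in $A$ that are $\le m^*$ ($\ge m^*$) is at least $4\lceil \lfloor n/7 \rfloor / 2 \rceil \ge 2\lfloor n/7 \rfloor$, because at least $\lceil \lfloor n/7 \rfloor / 2 \rceil$ columns have a median that is $\le m^*$ ($\ge m^*$), and each such column contributes its median together with the three elements on the same side of it. This implies that $|R|$ ($|L|$), the number of elements in $A$ that are greater (less) than $m^*$, is at most

$$n - 2\lfloor n/7 \rfloor \le n - 2\left(\frac{n-6}{7}\right) = \frac{5}{7}n + \frac{12}{7},$$

where the inequality uses $\lfloor n/7 \rfloor \ge (n-6)/7$.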
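The last inequality can be sanity-checked numerically. The sketch below uses exact rational arithmetic to compare $n - 2\lfloor n/7 \rfloor$ with $\frac{5}{7}n + \frac{12}{7}$ over a range of $n$. The function names `side_bound` and `closed_form` are hypothetical helpers introduced here, and a finite check supports the algebra but does not replace it.

```python
from fractions import Fraction


def side_bound(n):
    """n - 2*floor(n/7): the bound on |L| and |R| before removing the floor."""
    return n - 2 * (n // 7)


def closed_form(n):
    """(5/7)n + 12/7 as an exact fraction."""
    return Fraction(5 * n + 12, 7)


# The closed form dominates for every n tested.
assert all(side_bound(n) <= closed_form(n) for n in range(1, 10_001))

# The bound is tight exactly when n leaves remainder 6 on division by 7.
tight = [n for n in range(1, 30) if side_bound(n) == closed_form(n)]
print(tight)  # prints [6, 13, 20, 27]
```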

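We now find two constants $\alpha$ and $\beta$ such that $|L| \le \frac{5}{7}n + \frac{12}{7} \le \beta n$, $|R| \le \frac{5}{7}n + \frac{12}{7} \le \beta n$, and $T(n) \le \alpha cn$. By substitution, we have $T(n) \le cn + T(n/7) + T(\beta n)$. This implies that we need $cn + \alpha c \frac{n}{7} + \alpha c \beta n \le \alpha cn$. Simplifying, we have $1 + \frac{\alpha}{7} + \alpha\beta \le \alpha \Rightarrow 1 + \alpha\beta \le \frac{6}{7}\alpha \Rightarrow \beta < \frac{6}{7}$. For $\beta$, the previously obtained upper bound combined with the lower bound implied by $\frac{5}{7}n + \frac{12}{7} \le \beta n$ yields $\frac{5}{7} < \beta < \frac{6}{7}$. We choose $\beta = \frac{3}{4}$. Substituting for $\beta$, we have $1 + \frac{3}{4}\alpha \le \frac{6}{7}\alpha \Rightarrow \alpha \ge \frac{28}{3}$. We choose $\alpha = 10$. We derived our choices of $\alpha$ and $\beta$ based on $\frac{5}{7}n + \frac{12}{7} \le \beta n$, which can be shown, by solving for $n$, to be true for $n \ge 48$.

Therefore, $T(n) \le cn + T(\frac{n}{7}) + T(\frac{3}{4}n)$ for $n \ge 48$. Given our value of $\alpha$, we prove using induction that $T(n) \le 10cn$.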
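The arithmetic behind the constants can be reproduced with exact fractions. In this sketch the variable names `beta`, `alpha_min`, and `n_min` are illustrative, and the search range of 1 to 999 for $n$ is an arbitrary assumption that is simply large enough to contain the threshold.

```python
from fractions import Fraction

beta = Fraction(3, 4)

# 1 + beta*alpha <= (6/7)*alpha  is equivalent to  alpha >= 1 / (6/7 - beta).
alpha_min = 1 / (Fraction(6, 7) - beta)
print(alpha_min)  # prints 28/3

# Smallest n with (5/7)n + 12/7 <= beta*n. Both sides are linear in n,
# so the inequality keeps holding for every larger n.
n_min = next(n for n in range(1, 1000) if Fraction(5 * n + 12, 7) <= beta * n)
print(n_min)  # prints 48

# The chosen alpha = 10 satisfies the requirement with a little slack.
assert 1 + beta * 10 <= Fraction(6, 7) * 10
```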

Basis Case: Let $n = 48$. Then $T(n) \le 10cn$ for a suitably large value of $c$.

Inductive Hypothesis: Assume that $T(m) \le 10cm$ for all values of $m \le n$. We must now show that the claim is true for $n + 1$, i.e., we must show that $T(n + 1) \le 10c(n + 1)$.

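Inductive Step: $T(n + 1) \le c(n + 1) + T\left(\frac{n+1}{7}\right) + T\left(\frac{3(n+1)}{4}\right)$ by the recurrence for $T(n)$. Since $\frac{n+1}{7} \le n$, the inductive hypothesis gives $T\left(\frac{n+1}{7}\right) \le 10c\left(\frac{n+1}{7}\right)$. Similarly, $\frac{3(n+1)}{4} \le n$ implies $T\left(\frac{3(n+1)}{4}\right) \le 10c\left(\frac{3(n+1)}{4}\right)$.

Therefore, $T(n + 1) \le c(n + 1) + 10c\left(\frac{n+1}{7}\right) + 10c\left(\frac{3(n+1)}{4}\right) = c(n + 1)\left(1 + \frac{10}{7} + \frac{30}{4}\right) = c(n + 1)\left(\frac{278}{28}\right) \le 10c(n + 1)$.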

Therefore, it must be the case that $T(n) \le 10cn$ for all values of $n \ge 48$. The result implies that $T(n) = O(n)$.
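The induction can be complemented by evaluating the recurrence directly. The sketch below is a numerical illustration under stated assumptions, not part of the proof: it takes $c = 1$, rounds the subproblem sizes down to integers, and assumes a cost of $cn$ for lists with fewer than 48 elements, none of which is specified in the notes. The names `C` and `t` are hypothetical.

```python
from functools import lru_cache

C = 1  # assumed constant; the notes only require c to be suitably large


@lru_cache(maxsize=None)
def t(n):
    """Worst case allowed by T(n) <= cn + T(n/7) + T(3n/4), with floors."""
    if n < 48:
        return C * n  # assumed cost below the threshold n = 48
    return C * n + t(n // 7) + t(3 * n // 4)


# The bound T(n) <= 10cn holds for every n tested.
assert all(t(n) <= 10 * C * n for n in range(1, 100_000))

# Observed constant factor at a few sizes; each stays below alpha = 10.
for n in (48, 1_000, 99_999):
    print(n, t(n) / (C * n))
```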