



Material Type: Notes; Class: Data Structures and Algorithms; Subject: Computer Science; University: University of New Mexico; Term: Fall 2005;
Typology: Study notes
Jared Saia
University of New Mexico
Dictionary ADT
Hash Tables
A dictionary ADT implements the following operations:
Insert(x): puts the item x into the dictionary
Delete(x): deletes the item x from the dictionary
IsIn(x): returns true iff the item x is in the dictionary
Frequently, we think of the items being stored in the dictionary as keys
The keys typically have records associated with them, which are carried around with the key but not used by the ADT implementation
Thus we can implement functions like:
Insert(k,r): puts the item (k,r) into the dictionary if the key k is not already there, otherwise returns an error
Delete(k): deletes the item with key k from the dictionary
Lookup(k): returns the item (k,r) if k is in the dictionary, otherwise returns null
The simplest way to implement a dictionary ADT is with a linked list
Let l be a linked list data structure; assume we have the following operations defined for l:
head(l): returns a pointer to the head of the list
next(p): given a pointer p into the list, returns a pointer to the next element in the list if such exists, null otherwise
previous(p): given a pointer p into the list, returns a pointer to the previous element in the list if such exists, null otherwise
key(p): given a pointer into the list, returns the key value of that item
record(p): given a pointer into the list, returns the record value of that item
Implement a dictionary with a linked list
Q1: Write the operation Lookup(k) which returns a pointer to the item with key k if it is in the dictionary or null otherwise
Q2: Write the operation Insert(k,r)
Q3: Write the operation Delete(k)
For a dictionary with n elements, what is the runtime of all of these operations for the linked list data structure?
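One possible answer to Q1 through Q3 is sketched below in Python. The names `Node`, `ListDictionary`, `lookup`, `insert` and `delete` are illustrative choices, not taken from the notes, and raising `KeyError` is just one way to model "returns an error".

```python
class Node:
    """One list cell holding a key, its record, and both neighbour pointers."""
    def __init__(self, key, record):
        self.key = key
        self.record = record
        self.prev = None
        self.next = None


class ListDictionary:
    """Dictionary ADT backed by an unsorted doubly linked list (a sketch)."""
    def __init__(self):
        self.head = None

    def lookup(self, k):
        """Walk from the head; return the node with key k, or None."""
        p = self.head
        while p is not None:
            if p.key == k:
                return p
            p = p.next
        return None

    def insert(self, k, r):
        """Add (k, r) at the head; raise KeyError if k is already present."""
        if self.lookup(k) is not None:
            raise KeyError(f"key {k!r} already in dictionary")
        node = Node(k, r)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node

    def delete(self, k):
        """Unlink the node with key k; do nothing if k is absent."""
        p = self.lookup(k)
        if p is None:
            return
        if p.prev is not None:
            p.prev.next = p.next
        else:
            self.head = p.next
        if p.next is not None:
            p.next.prev = p.prev
```

In this sketch every operation starts with a scan of the list, so with n elements each one takes O(n) time in the worst case.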
Describe how you would use this dictionary ADT to count the number of occurrences of each word in an online book.
If m is the total number of words in the online book, and n is the number of unique words, what is the runtime of the algorithm for the previous question?
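A minimal sketch of one way to do the word count, assuming the record stored with each word is its running count. `ListDictionary` here is a hypothetical stand-in for the list-based dictionary (a plain Python list of `[key, record]` items), and `count_words` is an illustrative name.

```python
class ListDictionary:
    """Stand-in for the dictionary ADT: an unsorted list of [key, record] items."""
    def __init__(self):
        self.items = []

    def lookup(self, k):
        # Linear scan over at most n distinct keys.
        for item in self.items:
            if item[0] == k:
                return item
        return None

    def insert(self, k, r):
        if self.lookup(k) is not None:
            raise KeyError(k)
        self.items.append([k, r])


def count_words(words, dictionary):
    """Store each distinct word as a key whose record is its count so far."""
    for w in words:
        item = dictionary.lookup(w)
        if item is None:
            dictionary.insert(w, 1)   # first occurrence of w
        else:
            item[1] += 1              # seen before: bump the record
    return dictionary


counts = count_words("the cat saw the other cat and the dog".split(), ListDictionary())
```

Each of the m words triggers a constant number of scans over at most n stored keys, so with this list-backed dictionary the whole count takes O(mn) time.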
This linked list implementation of dictionaries is very slow
Q: Can we do better?
A: Yes, with hash tables, AVL trees, etc
“Key” Idea: An element with key k is stored in slot h(k), where h is a hash function mapping keys into the set {0, ..., m-1}
Main problem: Two keys can now hash to the same slot
Q: How do we resolve this problem?
A1: Try to prevent it by hashing keys to “random” slots and making the table large enough
A2: Chaining
A3: Open Addressing
In chaining, all elements that hash to the same slot are put in a linked list.
CH-Insert(T,x){Insert x at the head of list T[h(key(x))];}
CH-Search(T,k){search for elem with key k in list T[h(k)];}
CH-Delete(T,x){delete x from the list T[h(key(x))];}
CH-Insert and CH-Delete take O(1) time if the list is doubly linked and there are no duplicate keys
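The three chaining operations can be sketched in Python as follows. This is an illustration under stated assumptions, not the notes' own code: `Elem` and `ChainedTable` are hypothetical names, keys are assumed to be natural numbers, and `h(k) = k mod m` is used only as a placeholder hash function.

```python
class Elem:
    """A list element x with key(x), a record, and prev/next pointers."""
    def __init__(self, key, record=None):
        self.key = key
        self.record = record
        self.prev = None
        self.next = None


class ChainedTable:
    def __init__(self, m):
        self.m = m
        self.T = [None] * m   # T[i] is the head of the list for slot i

    def h(self, k):
        # Placeholder hash function (an assumption): natural-number keys mod m.
        return k % self.m

    def insert(self, x):
        """CH-Insert: put x at the head of list T[h(key(x))]."""
        i = self.h(x.key)
        x.prev = None
        x.next = self.T[i]
        if self.T[i] is not None:
            self.T[i].prev = x
        self.T[i] = x

    def search(self, k):
        """CH-Search: scan list T[h(k)] for an element with key k."""
        x = self.T[self.h(k)]
        while x is not None and x.key != k:
            x = x.next
        return x

    def delete(self, x):
        """CH-Delete: unlink element x from list T[h(key(x))]."""
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.T[self.h(x.key)] = x.next
        if x.next is not None:
            x.next.prev = x.prev
        x.prev = x.next = None
```

Here `insert` and `delete` only touch a constant number of pointers because `delete` is handed the element itself rather than its key, and `insert` does not check for duplicates. Only `search` walks a chain.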
Q: How long does CH-Search take?
A: It depends. In particular, it depends on the load factor α = n/m (i.e. average number of elems in a list)
Worst case analysis: everyone hashes to one slot so Θ(n)
For average case, make the simple uniform hashing assumption: any given elem is equally likely to hash into any of the m slots, indep. of the other elems
Let $n_i$ be a random variable giving the length of the list at the $i$-th slot
Then the time to do a search for key $k$ is $1 + n_{h(k)}$
Q: What is $E(n_{h(k)})$?
A: We know that $h(k)$ is uniformly distributed among $\{0, \ldots, m-1\}$
Thus, $E(n_{h(k)}) = \sum_{i=0}^{m-1} (1/m)\, n_i = n/m = \alpha$
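The averaging step above can be checked numerically. The sketch below is only an illustration: it models simple uniform hashing by drawing each slot with Python's `random` module, and `expected_chain_length` is a hypothetical helper name.

```python
import random


def expected_chain_length(n, m, seed=0):
    """Throw n keys into m slots uniformly at random, then average the
    chain lengths with weight 1/m each, as in the sum over slots."""
    rng = random.Random(seed)
    lengths = [0] * m
    for _ in range(n):
        lengths[rng.randrange(m)] += 1   # each key picks a slot uniformly
    # A fresh key k lands in slot i with probability 1/m, so the expected
    # length of its chain is sum_i (1/m) * n_i.
    return sum(n_i / m for n_i in lengths), lengths


avg, lengths = expected_chain_length(n=1000, m=50)
```

However uneven the individual chains turn out, the lengths always add up to n, so the weighted sum equals n/m (here 1000/50 = 20, up to floating-point rounding).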
Want each key to be equally likely to hash to any of the m slots, independently of the other keys
Key idea is to use the hash function to “break up” any patterns that might exist in the data
We will always assume a key is a natural number (can e.g. easily convert strings to natural numbers)
h(k) = k mod m
Want m to be a prime number, which is not too close to a power of 2
Why?
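One commonly given reason, not spelled out in the notes, is that when m is a power of 2, k mod m keeps only the low-order bits of k, so any pattern in those bits survives hashing. A small sketch with made-up keys (all multiples of 16) and illustrative table sizes 16 and 13:

```python
def h_division(k, m):
    """Division-method hash: h(k) = k mod m."""
    return k % m


# Hypothetical patterned data: every key is a multiple of 16.
keys = [16 * i for i in range(8)]

# m = 16 = 2**4 looks only at the low 4 bits, which are all zero here.
slots_pow2 = {h_division(k, 16) for k in keys}

# m = 13 is prime, so the same keys spread over distinct slots.
slots_prime = {h_division(k, 13) for k in keys}

# For m = 2**p the hash is literally a bit mask of the low p bits.
mask_matches = all(h_division(k, 16) == (k & 0b1111) for k in range(1000))
```

With m = 16 all eight keys collide in slot 0, while with m = 13 they occupy eight different slots.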
$h(k) = \lfloor m \cdot (kA \bmod 1) \rfloor$
$kA \bmod 1$ means the fractional part of $kA$
Advantage: value of $m$ is not critical, need not be a prime
$A = (\sqrt{5} - 1)/2$ works well in practice
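A minimal Python sketch of the multiplication method, assuming the commonly cited constant A = (√5 − 1)/2 ≈ 0.618 and using floating-point arithmetic as an approximation. `h_multiplication` is an illustrative name.

```python
import math

# Assumed constant: the commonly cited choice A = (sqrt(5) - 1) / 2.
A = (math.sqrt(5) - 1) / 2


def h_multiplication(k, m, a=A):
    """Multiplication-method hash: floor(m * (k*a mod 1))."""
    frac = (k * a) % 1.0           # fractional part of k*a
    return math.floor(m * frac)    # scale into the slot range


slots = [h_multiplication(k, 8) for k in range(1, 4)]
```

Here m = 8 is a power of 2, which this method tolerates. Because the fractional part is computed in floating point, this is suitable only for illustration with moderately sized keys.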