Collision Handling In Hash Table - Data Structures & Algorithms Tutorials In Python #6

codebasics

4 года назад

158,150 Просмотров

Скачать видео

Комментарии:

@shawzhang4498 - 29.09.2023 20:32

The "found" has no use

Ответить

@myrusEW - 22.08.2023 12:58

This one was tough. Lot of moving parts for me. bout to get started on these problems

Ответить

@massefromla2554 - 18.08.2023 08:50

I can not find your exercises anywhere on-site . No Datastructure folder . Where are .cvs files .

Ответить

@SaqibMehmood-ov1ns - 05.04.2023 15:00

very helpful channel for learning piurpose

Ответить

@ritiksaxena2332 - 26.02.2023 22:29

Based on the explaination, I've solved Linear Probing by myself. It runs against all the cases mentioned in the github solution.

Code:

class LinearProbing:
def __init__(self) -> None:
self.MAX = 10
self.memoryList = [None for x in range(self.MAX)]

def getHash(self, key):
h = 0
for char in key:
h += ord(char)

return h % self.MAX

def add(self, key, value):
index = self.getHash(key=key)
print('Index of ', key, ': ', index)

count = 0
while self.memoryList[index] != None and self.memoryList[index][0] != key: # index is not empty and not same key
count += 1
if index == self.MAX-1: # index from 0 to 99, so self.MAX-1 = 100-1 = 99
index = 0
else:
index += 1

if count==self.MAX:
print('Hashmap is full')
return

# this statement is generic,
# if index is None in memoryList then loop will not execute and value will be added directly
# or if index is not None but key is same, then also loop will not execute and value will be updated directly

# if index is not None and key is not same then while-loop will execute and increase the index by 1.
self.memoryList[index] = (key, value)

def get(self, key):
index = self.getHash(key=key)

try:
# loop will only execute if key at index is different
while self.memoryList[index][0] != key:
if index == self.MAX-1:
index = 0
else:
index += 1
return self.memoryList[index][1]

except Exception as e:
print('Index does not exist.', e)

def delete(self, key):
index = self.getHash(key=key)

try:
while self.memoryList[index][0] != key:
if index == self.MAX-1:
index = 0
else:
index += 1

self.memoryList[index] = None
# del self.memoryList[index]
# self.memoryList.insert(index, None) # it is required, because if we delete the value at index using del then size of memoryList will also reduce by 1. So, inserting None at the index will balance the size of list.

except Exception as e:
print('Index does not exist.', e)

if _name_ == '__main__':
obj = LinearProbing()

obj.add(key='march 6', value=20) # index 9
obj.add(key='march 17', value=88) # index 9, last index of memoryList. So stored at index 0
print(obj.memoryList)
obj.add(key='march 17', value=29) # index 9, stored at index 1 after incrementing index via loop
print(obj.memoryList)
obj.add(key='nov 1', value=1) # index 0, stored at index 2 after incrementing index
obj.add(key='march 33', value=234)

print(obj.get('dec 1')) # does not exists
print(obj.get('march 33'))

obj.add(key='march 33', value=999) # updated the value of 'march 33'
print(obj.get('march 33'))

obj.add(key='april 1', value=87)
obj.add(key='april 2', value=123)
obj.add(key='april 3', value=234234)
obj.add(key='april 4', value=91) # index 0, stored at index 4 after incrementing
print(obj.memoryList)

obj.add(key='may 22', value=4)
obj.add(key='may 7', value=47)
print(obj.memoryList) # hashmap is full

obj.add(key='jan 1', value=0) # hashmap is full
print(obj.memoryList)

obj.delete(key='april 2')
print(obj.memoryList)

obj.add(key='jan 1', value=0)
print(obj.memoryList)

Ответить

@ducanhnguyen6903 - 03.02.2023 21:55

You said "chaining" is using linked list; but I only see adding tuples to a list. Where is the linked list? Am I missing something? Thanks!

Ответить

@dikshantmadaan9116 - 20.01.2023 19:20

I am using following function:

def add(key, val):
hash_value = get_hash(key)
d[hash_value].append((key, val))

instead of your __setitem__() function and still my code is working fine, and I don't understand why?

and when I am using your __setitem__(), I am getting an error on the following line:

self.arr[h][idx] = (key,val)

Ответить

@TKZntkx - 16.12.2022 01:42

_setitem_ has a bug on python 3.11.0, the value cannot be set

Ответить

@priyanshupurohit5431 - 17.11.2022 21:28

Thankyou sir i really apreciate your way of teaching

Ответить

@jayshreedonga2833 - 16.11.2022 09:29

thanks nice session

Ответить

@55mayakeren55 - 11.09.2022 14:34

why not using dict element in the self.arr instead of linked lists?

Ответить

@svetliodoychinov5580 - 14.08.2022 16:44

Amazing video! Best explantion I have found and the exercises are very useful I just finnished the nyc_weather one. Used the built in hash function and handles collisions by adding (key, value) tuples to a list that is on the index. Thank you for the great video!

Ответить

@priyankashelke7694 - 13.07.2022 10:22

Thanks for the great explanation......but I'm getting h value for 'march 6' and 'march 17' is different that is 9 and 59 ...so how to resolve this issue?

Ответить

@johanlopez7452 - 07.07.2022 05:45

And here I am paying college tuition and students loans to "learn" what this awesome dude gives out for free. You are the best man, love your content.

Ответить

@maggiezhang145 - 26.06.2022 07:14

How come I don't see a Linked List? I only see the array was used in the collision place?

Ответить

@meralmaradia4774 - 22.06.2022 08:13

Hello Sir, can you please create a video developing of project using only DSA ?

Ответить

@MeghaSharma-du8wg - 14.06.2022 13:44

for 1 and 2 exercise

weather={}
with open("E://nyc_weather.csv","r") as f:
for line in f:
tokens=line.split(',')
weather[tokens[0]]=tokens[1]
a=[]
for i in weather.values():
try:
a.append(int(i))
except ValueError:
pass

then print

for 3rd exercise

poem=[]
with open("E://poem.txt",'r') as f:
for line in f:
t=line.split()
for h in t:
poem.append(h)
t=str(input())
count=0
for i in range(len(poem)):
if t==str(poem[i]):
count+=1
print(count)

Ответить

@erenhan - 22.05.2022 15:44

this is the best practical exercise and explanation for hashtables for python in entire web, big thank you

Ответить

@NoobTube4148 - 27.04.2022 15:46

Would be nice if you could touch on double hashing and also explain what happens when hash table itself needs to grow or the individual slot data structure has to grow. What happens to the time complexities in those cases?

Ответить

@ShashankData - 27.04.2022 05:42

Amazing video! This was by far the clearest explanation of this I've ever seen!

Ответить

@acoustictuber2227 - 09.04.2022 22:56

Sir is this best for my btech placement interview ? And coding question ?? Is it fine

Ответить

@KANISHKSAHUSIME - 03.04.2022 06:43

hello sir instead of using for loop directly if we check number of elements present in each list then going to for loop will it decrease the time complexity?

Ответить

@hoatruong2474 - 02.04.2022 21:11

Best explanation and implementation of hash table ever, understand it immediately

Ответить

@user-zn2hr3ci9z - 24.03.2022 18:11

great channel sir. Can you please make vedios on priority queue and heap data structure?

Ответить

@yashpreetkaur2042 - 01.03.2022 11:14

I understood the video well but the exercises are like made me feel am I really understanding. Please someone help me to choose the right path, do I need to go for learning advanced python side by side?

Ответить

@cyclogenisis - 20.02.2022 04:13

You didn't mention something pretty important as of 3.7 which is ordered insertion.

Ответить

@anshulhedau10 - 07.02.2022 14:18

I have seen many videos but your videos have simplest explanation among all. Respect.

Ответить

@aashi9781 - 05.02.2022 21:10

Just started on the DSA journey combining practice with your guidance and taking notes. I started my Data science journey around 3n half years back and you were the first tutorial that I clicked for ML. I remember being so naive at that point but later on just got a flow out and got a chance to be a mature professional in data science field. Now feeling the urge for Targeting the Data Structure and Algorithms next!! Thank you so much! Enjoying it!

Ответить

@karrianilkumar8123 - 03.02.2022 06:53

You are a very good teacher. Thank you Mr Dhaval

Ответить

@prithvijali2629 - 15.01.2022 22:14

Hi
i have a question here, to avoid collision we can resize the hashtable or increment it by one. So that we can get different hash values if there are any collisions. Is it possible to change the size of hashtable dynamically?

Ответить

@ashaypatil - 13.01.2022 16:13

Could someone please tell me why are we not appending the key value pair whenever we find key to be already present.

We should append the key value pair to that element, isn't it?

def __setitem__(self, key, value):
h = self.get_hash(key)
found = False
# if key exists already then append the value to the key
for idx, k in enumerate(self.arr[h]):
if len(k) == 2 and k[0] == key:
self.arr[h].append((key, value))
found = True
break
# if the key value pair doesn't exist, add the key and value
if not found:
self.arr[h].append((key, value))

Ответить

@ashaypatil - 13.01.2022 15:56

def __delitem__(self, key):
h = self.get_hash(key)
for idx, k in enumerate(self.arr[h]):
if k[0] == key:
self.arr[h].remove(self.arr[h][idx])

Ответить

@zap7140 - 02.12.2021 12:10

Thanks

Ответить

@Heck-ed6sr - 28.11.2021 12:47

Hi @codebasics,

I have looked through the solution provided for exercise 4 on linear probing, and have noticed some issues that I would like to bring up.
Using the solution provided, if i execute the following lines:

1. t=HashTable()
2. t["march 6"] = 20
3. t["march 17"] = 88
4. del t["march 6"]
5. print(t["march 17"])
6. del t["march 17"]
7. print(t.arr)

The result is not as expected.
Line 5 gives None when the key value pair for "march 17" was not deleted, and line 6 does not delete the value for the key value pair "march 17". Line 7 thus still shows the key value pair for "march 17" even though it was supposed to be deleted.

I think this has happened because in your code for _getitem_ and __delitem__, once it finds None in the index of the array (given by the hash function), it assumes that the key does not exist and returns. However, it could be the case that an earlier colliding entry was occupying the slot (march 6), causing the entry for march 17 to occupy a different slot. Once the key value pair for march 6 is deleted, the index given by the hash function for key "march 17" now gives None in the array, while the entry for march 17 still exists in the array.

I apologize for my poor explanation, please also feel free to correct me should i have made any mistakes or misconceptions.

Ответить

@adbuth4854 - 25.11.2021 21:51

hello sir, what does the idx represents while iterating through the llist

Ответить

@overlydedicated2794 - 11.11.2021 06:34

For some reason, when I tried to set march 17, it did not append to the same location at march 6, but was placed at another location and I have no idea why. If anyone knows, the help would be appreciated
def __setitem__(self, key, val):
h = self.get_hash(key)
found = False

for idx, element in enumerate(self.arr[h]):
if len(element) == 2 and element[0] == key:

self.arr[h][idx] = (key, val)
found = True
break
if not found:
self.arr[h].append((key, val))

def __getitem__(self, key):
h = self.get_hash(key)
return self.arr[h]

t = HashTable()
t["march 6"] = 1
t["march 6"] = 1.5
t["march 8"] = 2
t["march 9"] = 3
t["march 17"] = 4
print(t.arr)

Output:
[[], [], [], [], [], [], [], [], [], [('march 6', 1.5)], [], [('march 8', 2)], [('march 9', 3)], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [('march 17', 4)], [], [], [],

Ответить

@leakyroofmedia - 11.11.2021 03:56

Great videos! Thank you!

Ответить

@siddharthsinghh - 19.10.2021 20:06

Room looks clean

Ответить

@DispelTV - 22.09.2021 22:12

Great videos, thank you so much! Do you have any projects that use these algorithms and data structures to help us imbed this information into our brains?

Ответить

@thammayyaarava2259 - 21.09.2021 04:45

The code for implementing linear probing has problem with the case.....
1.March 6 and march 17 has same value of hash
2.March 6 already present in the correct position....let say at 9
3.March 17 is inserted at position 5 due to no gaps at previous position
4.if we deleted an element at position 3
5.and we want to update march 17 value ....due to gap present at position 3 it will again stored as a new element without getting updated......
................................................
As I tried by myself and i compared wit the given solution i abel to find it
Hope u change the code
....….....…............ aa..............
And i love the way you teaching i admire it thanks for this wonderful tutorials......❤️ From India,Andhra

Ответить

@santoshkumarchidambaram3130 - 26.08.2021 19:54

Hey Codebasics , Can you tell me whether the following code is right for adding items in hashmap using linear probing?

class Hashmap:
def __init__(self):
self.max=10
self.arr=[[] for i in range(self.max)]

def gethash(self,key):
h=0
for i in key:
h=h+ord(key)
return h%self.max

def __setitem__(self,key,value):
index=self.gethash(key)
if self.arr:
if len(self.arr[index])==0:
self.arr[index]=(key,value)
else:
space=True
while space:
try:
index=index+1
if len(self.arr[index])==0:
self.arr[index]=(key,value)
space=False
except(IndexError):
space=True
index=0
while space:
try:
if len(self.arr[index])==0:
self.arr[index]=(key,value)
space=False
else:
index=index+1
except(IndexError):
print("Hashmap is full")
space=False

Ответить

@poornimaumapathy9986 - 24.08.2021 20:48

Crisp and clear tutorial for hash maps sir.Thank you. Could you please explain why if len(element) ==2 is checked in setitem function

Ответить

@TheBhartiyaTrainee - 20.08.2021 18:01

Let me take the chance to click the 1000th like!! Do we have to manually deal with csv files or can the third party/standard lib modules be used?

Ответить

@AngelSanchez-fi7vf - 17.08.2021 09:15

Great video. Thank you!

Ответить

@debasismondal3619 - 04.08.2021 15:44

Thanks a lot Dhaval, I have been learning a lot from you and using this skill to search for a career as a developer or data scientist after 10 years of administration experience, hope I will find a career soon!

I did all the exercises by myself with your encouragement but I check your every solution as well. I found the solution for Linear Probing is not working as expected when I am running the following case

t = HashTable()
t["march 6"] = 20
t["march 17"] = 88
del t["march 6"]
t['march 17']

However, both of the following is working fine

def __getitem__(self, key):
h = self.get_hash(key)
prob_range = self.get_prob_range(h)
for prob_index in prob_range:
element = self.arr[prob_index]
if element is None:
continue
if element[0] == key:
return element[1]

def __getitem__(self, key):
h = self.get_hash(key)
for i in range(self.MAX):
element = self.arr[h]
h = (h+1)%self.MAX
if element == None:
continue
if element[0] == key:
return element[1]