Data engineer interview question | Process 100 GB of data in Spark | Number of Executors

MANISH KUMAR

1 year ago

28,434 views



Comments:

ask to stranger 🤝 - 26.09.2023 17:32

Brother, are there remote jobs in the data engineering field too? US remote jobs?

Ravi Kumar - 26.09.2023 01:52

I don't believe this is the correct approach. To process 100 GB of data, the number of blocks created would be 800. We would need more executors to run them in parallel. If we rely on the resources explained in the video, it will take much more time than expected.
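
The 800 figure in the comment above can be reproduced with a rough back-of-the-envelope sketch. It assumes a 128 MB block size (a common default that the comment does not state), and the executor and core counts are hypothetical values chosen only for illustration, not the ones from the video.

```python
import math


def estimate_partitions(data_size_gb: float, block_size_mb: int = 128) -> int:
    """Estimate how many input blocks (initial read tasks) a dataset produces.

    block_size_mb=128 is an assumption; the real value depends on the cluster.
    """
    data_size_mb = data_size_gb * 1024
    return math.ceil(data_size_mb / block_size_mb)


def estimate_waves(num_tasks: int, num_executors: int, cores_per_executor: int) -> int:
    """Estimate how many rounds of tasks run when each core handles one task at a time."""
    total_slots = num_executors * cores_per_executor
    return math.ceil(num_tasks / total_slots)


partitions = estimate_partitions(100)  # 100 GB split into 128 MB blocks
# Hypothetical cluster: 10 executors with 5 cores each.
waves = estimate_waves(partitions, num_executors=10, cores_per_executor=5)
print(partitions, waves)
```

With fewer total cores than blocks, the tasks run in several waves, which is the point the comment makes about needing more parallelism.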

rauldtt - 16.09.2023 10:00

The interviewer asked me about processing PETABYTES of data. Can you explain how to deal with that scenario?

anjibabu makkena - 29.07.2023 03:52

Explain in English.

Pratik j - 14.04.2023 08:27

Mihir is just bluffing and telling generic stories. Manish did a good job by interrupting him. Keep it up.

neel banerjee - 27.03.2023 16:43

You are actually filling the gap. Many thanks, man!

I request you to kindly make this kind of interactive video, especially on the topics below:

1. Repartition with a real-time scenario: how to determine repartition size depending on data size and cluster size

2. Key salting method: a practical, real-time case with a coding example

3. Data serialization in Spark and how it helps with optimization

4. Choosing a file type for different scenarios (Parquet/JSON/ORC)

5. DAG analysis

6. Accumulator, with real-time use cases

7. Cache and persist: when to use which

8. Garbage collection tuning

9. Real-time coding issues faced by data engineers, and debugging

10. Version control system for Databricks notebooks

11. Real-time production implementation of big data projects

12. How to perform unit testing for Databricks notebooks?

Thanks in advance. ❤❤

sandeep soni - 23.02.2023 09:37

Best channel for data engineers 👍👍

Sohel Sayyad - 21.02.2023 15:25

Thank you, Manish Bhai. You understand what matters to aspiring data engineers and what they need to know in depth. I really appreciate this.

M Naveen Vamshi - 16.02.2023 06:53

Thank you very much, Manish, for your guidance; it is really helpful. I am your new subscriber. My query: I am a good Python developer and know intermediate SQL, but I am very new to Spark. I have learnt the Spark basics, but can you suggest a course where I can learn real-time questions like this one on processing 100 GB of data in Spark? Are there any resources on Udemy or elsewhere? I want to change careers from Python developer to data engineer. Thanks in advance.

Girish Nigade - 06.02.2023 09:15

Sir, can one become a data engineer after studying for one year?

Manish 789 - 01.02.2023 16:05

Where can I learn Spark architecture in this much depth? Are there any resources or books you would suggest?

KARAN SINGH RAJPUROHIT - 30.01.2023 10:21

Brother, what is your Instagram or Gmail?

Siddharth Guliyani - 29.01.2023 20:55

This channel's reach is about to catch fire very quickly; it is going to rise very fast. Mark my words.

jay chavhan - 23.01.2023 16:39

How much knowledge of data structures does a data engineer need, and how should one learn them? Please make a video on this topic.

manoj kumar - 23.01.2023 06:44

Awesome

Mranal Jadhav - 22.01.2023 17:00

Thanks, Manish, for this informative session. I already had this question in mind and had been searching for an answer for a few days. Finally, today you made this video. It's like magic. Thanks a lot, man. Please make more videos on questions like this that are asked in interviews.

Rishav - 22.01.2023 16:39

Great video. Can you make a video on what projects a fresher should build for a data engineer role?
