Comments:
Hi Bryan,
Can you please help me understand: if my role is ETL, do I need to learn PySpark, or can ADF do the job of transferring and transforming the data?
very good tutorial
Bryan,
do you mind a random question?
When writing plain Python on a local pandas DataFrame in a Databricks notebook, is that technically still PySpark?
Not sure why that question matters to me, but it kind of bothers my brain not knowing for certain 🙃
If it is PySpark, does that mean even pandas DataFrames get passed to the optimizer, or is that restricted to distributed DataFrames?
Loving the videos, thank you.
Also really love your sign-off; thanks for pulling for us, great person!
Super helpful! Thanks a lot!
Since SQL is native to Spark, is there any benefit to using PySpark over Spark SQL?
Question: if I have a script written using pandas for transformations in a Databricks notebook, would I need to convert all the code to PySpark to realize the benefits, or would it be okay if I only converted the 'inefficient blocks' and used pandas for some of the simpler munging tasks?
Really helped me, thank you so much. Keep sharing your knowledge.
How has your experience been in the solutions space? Is your job more along the lines of a sales-engineering type role? The reason I ask is that I recently turned down a solutions role in my company and chose to stay non-client-facing :)
Thanks for this amazing video. Exactly what I was looking for.
Qapla', brother!
Was really glad when you said 'highly recommend you don't restrict yourself to Python' in a video that deep-dives into Python with PySpark! A really good video.
Excellent! Thank you very much...
Loved this lecture, Bryan! I'm curious: given that the Spark engine optimizes the SQL code, is it a good idea to use Python UDFs for processing at all?
Could not have been showcased more nicely and concisely.
Really helped me understand PySpark as a beginner. Hoping to see videos on real-time and streaming data. Thanks, and keep sharing your wonderful knowledge, Bryan.
The video is really good, but I can't find the Git repository at the path mentioned in the video. Can you please share the path with me?
Sir, is your GitHub link for the notebook posted somewhere?
Really great tutorial... Thank you, Bryan!
Excellent Bryan, thanks!
Really great explanation. Totally worth spending 2-3 hours to watch the video and understand all the concepts in detail. Thanks @Bryan Cafferky
When I saved this video, I never thought I would watch it to the end. But I took my time and watched the whole thing; it took me two days, as I practiced all along. It's totally worth it. Keep sharing your knowledge.
Cheers!
Fantastic explanation Bryan!
Nice tutorial, very well explained. Thanks, Bryan!!
Great!! Question: what is the best way to analyze 35 thousand tables of 98 rows each contained in a single Spark DataFrame? Process each of the 35,000 tables one by one as Spark tables, or convert the entire DataFrame to pandas and work with the tables locally?
I really had to log in just to like and subscribe. Your explanations are awesomely straight to the point, with no time wasted; really excellent.
Very good video; it would be awesome if you could create a similar video just for ML.
Thanks
Hello Bryan! I checked the link for the notebook, but I still don't see the diabetes notebook you are using in the video. Is there any way to get it? I would be grateful! Thank you!
Two excellent Azure Databricks videos, Bryan, and thank you for taking the time to share your knowledge.
Hi Bryan: how do I import unstructured data into DBFS? It always makes us convert it into a table and stores it in /Filestore/tables. Is there any way to load JSON or XML files that cannot be loaded as a table?
Sequel, not S-Q-L... we old-guard fellows should know...
I must say this video is very, very thorough. I searched quite a bit to find the notebook you're using. Would I be able to get it from you somehow?
Hi, how do I mount two Azure Blob Storage containers and copy a file from one mount to another using Python (shutil)? I am not using dbutils, since Databricks is still in preview.
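(Not an answer from the author, but a minimal sketch of the copy step the question describes: once both containers are mounted, the mounts behave like ordinary directories from Python's point of view, so the standard library's shutil can copy between them. The mount points below, /dbfs/mnt/source and /dbfs/mnt/target, are hypothetical examples; mounting itself still has to be done with Databricks tooling or outside this snippet.)

```python
import shutil
from pathlib import Path

# Hypothetical mount points: on Databricks, DBFS mounts typically appear
# under /dbfs/mnt/... when accessed from plain Python file APIs.
SOURCE_MOUNT = Path("/dbfs/mnt/source")
TARGET_MOUNT = Path("/dbfs/mnt/target")

def copy_between_mounts(filename: str,
                        src: Path = SOURCE_MOUNT,
                        dst: Path = TARGET_MOUNT) -> Path:
    """Copy one file from a source mount to a target mount.

    Works on any two directories that look like local paths;
    shutil.copy2 also preserves file metadata where possible.
    """
    dst.mkdir(parents=True, exist_ok=True)  # make sure the target dir exists
    return Path(shutil.copy2(src / filename, dst / filename))
```

Because the function takes the two paths as parameters, the same code can be exercised against plain local directories before pointing it at real mounts.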
Awesome tutorial. Liked it a lot.
Thanks Bryan, great video. There are a couple of issues in the demo, though.
When you do sdf.selectExpr, the output changes: the values in the columns with spaces in their names change.
The same thing happens when you use sdf.filter.sort on the 'blood pressure' column; the values in the blood pressure column become all 0.
Is this something you observed?
Great tutorial as always!
Hi Bryan, great tutorials. They helped me get a lay of the land with Databricks. You mention providing access to your notebooks. Where would those be?