Comparing Kafka Streams, Akka Streams and Spark Streaming: what to use when | Rock the JVM


Rock the JVM

4 years ago

21,202 views


Comments:

Carlos - 15.03.2023 19:17

One thing I missed was STATE: how they compare in terms of managing aggregations. Great video, thank you.

min shi - 03.10.2022 23:33

Hi Daniel,
Normally, how would you host a Scala application to make it a long-running process if you use Kafka Streams?

I know that if I use Spark Streaming, the dedicated cluster will keep it running and listen/react to the stream/data. I don't have a big amount of data.


Kind Regards

δlieř - 09.08.2022 11:07

But I hate JVM-related technology... so, do I have any other choices? Or just suck it up?

Naman Bhayani - 21.07.2022 10:09

Thank you very much Daniel :)

太好了 - 29.01.2022 16:32

Could you please clarify what you mean by fault tolerance in Akka Streams? I am used to working with big data frameworks (Kafka Streams, Spark Streaming and Flink), and they usually execute code on a flock of machines with exceptional horizontal scalability and fault tolerance. I lack information on the Akka Streams side: from your description (best for high-performance streams that are part of the business logic) I would assume that we embed an Akka Streams application into existing ones. That could give us superior vertical scalability (with concurrency backed by actors), but if that's just a single machine, then how on earth can we talk about fault tolerance? I must be missing something obvious :)

ElectricWound - 24.12.2021 02:58

A very nice high-level overview of the differences between the streaming libraries. I was especially looking for a description of when to use Kafka Streams instead of Akka Streams, and this was very helpful.

There was one severe error in your description of Akka Streams, though. They are not "asynchronous by default". Most operators are actually synchronous, and you are able to introduce asynchronous boundaries into streams or invoke asynchronous operations with a given degree of parallelism. Consecutive synchronous operations will be "baked" into a single actor transparently on materialization to minimize message-passing overhead. So you have perfect and concise control over the concurrency of calculations.

And I just cannot fully agree with your position on Akka Streams being especially hard for beginners. Programmers with some Scala experience will quickly relate to the collections-like API and be up and running in no time, especially compared to setting up Kafka or Spark. I think that before anyone approaches streaming libraries at all, they are probably already knee-deep in hard-to-solve concurrency, dependency and performance problems, and have maybe sunk weeks into cracking each problem the hard way. Then, finding Akka Streams, you can finally concentrate on your logic, get all the boilerplate out of the way and write some self-descriptive, concise code that rocks some incredibly complex stuff, nicely modularized in readable code chunks that fit on a single screen. Its discovery for me was like finally coming home.

I think the hardest part is wrapping your head around the concept of materialized values, how to design stream stages with state correctly, and when you need the Graph API at all. My next task is getting my hands dirty with Kafka.
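The distinction drawn here between fused synchronous steps and explicitly asynchronous steps can be illustrated with a rough analogy in plain JDK code. This is not the Akka Streams API: the class `AsyncBoundarySketch`, the method `mapAsyncOrdered` and the two arithmetic stages are hypothetical names chosen for illustration, and the thread pool only limits how many tasks run at once rather than applying real backpressure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class AsyncBoundarySketch {

    // Two synchronous stages composed into one function. They run back to back
    // on the calling thread, loosely like consecutive operators fused together.
    static final Function<Integer, Integer> FUSED =
        ((Function<Integer, Integer>) x -> x + 1).andThen(x -> x * 2);

    // An asynchronous step with a bounded degree of parallelism that still
    // emits results in input order (an analogy only, not a library call).
    static List<Integer> mapAsyncOrdered(List<Integer> input,
                                         int parallelism,
                                         Function<Integer, Integer> slowStep) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (Integer x : input) {
                Integer fused = FUSED.apply(x);                  // synchronous part, caller's thread
                Callable<Integer> task = () -> slowStep.apply(fused);
                pending.add(pool.submit(task));                  // asynchronous part, pool thread
            }
            List<Integer> out = new ArrayList<>();
            for (Future<Integer> f : pending) {
                out.add(f.get());                                // collect in submission order
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(mapAsyncOrdered(List.of(1, 2, 3), 2, x -> x * 10)); // [40, 60, 80]
    }
}
```

The point of the sketch is only that concurrency is opt-in: everything inside `FUSED` stays on one thread, and parallelism appears only where it is requested explicitly.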

Abdulelah Al Jeffery - 10.07.2021 13:08

I really love how you lay out the pros and cons of each streaming API, and in what situation we have to use what. Really great stuff; and I'm glad that I found your channel.
I'd happily buy a membership to learn from your awesome courses.

Cheers pro :)

Marek Iwaniuk - 16.04.2021 18:55

Just wanted to leave a note on how the Reactive Manifesto and Reactive Streams are (not) related to each other. The first one describes 'reactive systems', meaning the whole system, where all of its components cooperate in a resilient, elastic, fault-tolerant and message-driven manner. So it is a specification of how a system should behave as a whole. Reactive Streams, on the other hand, are just a piece of the puzzle in a reactive system. They can also be used separately, outside of a reactive system. The thing is, you can actually write an application which doesn't comply with the requirements of the Reactive Manifesto but still uses and leverages Reactive Streams. 'Reactive' in systems means how the whole system reacts to volume, load, errors etc.; 'reactive' in streams means that you have a flow of data, and you react asynchronously to the events in this flow. In the world of Akka those two terms might get blurred, because the Akka actor system actually enables you to build a reactive system. Nonetheless, I would say that Akka Streams might help you build a reactive system, but they won't make your system resilient, elastic, etc. straight away.
Anyway, you have really good content on this channel, thanks a ton for that!
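As a small illustration of the "reactive in streams" half of this distinction, here is a minimal sketch using the JDK's `java.util.concurrent.Flow` interfaces, which mirror the Reactive Streams interfaces. The class and method names (`ReactiveStreamsSketch`, `OneAtATimeSubscriber`, `run`) are made up for the example. It is a single-process program with no resilience or elasticity at all, so it uses Reactive Streams without being a reactive system in the Manifesto's sense.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

public class ReactiveStreamsSketch {

    // A subscriber that requests one element at a time: demand flows upstream,
    // data flows downstream, and each event is handled asynchronously.
    static final class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);              // signal demand for the first element
        }
        @Override public void onNext(Integer item) {
            received.add(item);        // react to the event
            subscription.request(1);   // only then ask for the next one
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    static List<Integer> run(int count) throws InterruptedException {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) {
                publisher.submit(i);   // blocks if the subscriber's buffer is full
            }
        } // close() signals onComplete once the submitted items are delivered
        if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("timed out waiting for completion");
        }
        return subscriber.received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(5)); // [1, 2, 3, 4, 5]
    }
}
```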

Alexandru Toma - 07.03.2021 23:10

You are the best Scala instructor in the world :D
So glad you're on Udemy too and that you also have courses on your site.
Keep it up, Daniel!

Dr. Dude - 11.01.2021 01:16

Nice, finally I know the difference and when to use what! Well-done video, as always.

dimfatal - 09.12.2020 14:05

Hey Daniel, I'm an absolute beginner and I have a question about the fs2 library, which is also used for some kind of streaming. My question is: could it be an alternative to some of the streaming libraries that you mentioned in this video?

Stanislav G. - 22.11.2020 23:18

Cool. But now (from 2.3) Spark has .trigger(processingTime = "0 seconds") to minimize latency. We can use a 0-second processing-time trigger to indicate that Spark should start each micro-batch as fast as it can, with no delays.

Sergio - 21.11.2020 17:13

+1 for: why Flink is not here?

Chandrashekhar Kotekar - 19.08.2020 12:54

Thanks for this detailed video. Can you please make a similar video that compares Spark Streaming with Apache Flink and Apache Pulsar?

cgmds - 18.08.2020 06:28

Awesome explanation, thank you!!

Zia Uddin - 02.07.2020 00:40

Nice explanation. Could you also include a part on Apache Flink? Apache Flink, I think, also uses Akka under the hood (?), and it also provides good control over the stream through low-level APIs, along with other benefits like those shown for Akka.

Alexander Gold - 27.06.2020 09:23

Good video. However, it would have been nice if you had also included Flink (as you are comparing streaming frameworks); it's generally 20% faster than Kafka Streams and Spark Streaming. Kafka Streams is probably the future, as Kafka's ecosystem is evolving, but syntax-wise Spark and Flink are much more intuitive in Scala.

Luca Savoja - 19.06.2020 16:45

Awesome video as always. I'd love a course (on Udemy, not free!) on Kafka/Kafka Streams. The other ones on Udemy are not as good as yours.

Vitor Mendes - 19.06.2020 05:26

Is there any discount associated with your yearly full-access membership? Here in Brazil things are complicated. The dollar is worth almost 6 times our currency.

Vitor Mendes - 19.06.2020 05:25

I love your videos bro

MEN Phalla - 19.06.2020 04:19

Hello. :-)
