In data engineering, knowing how to explode data helps you handle the challenges that arise where batch systems and streaming systems meet. Ensuring that each message contains a single record avoids an anti-pattern in event-driven systems. It also makes tasks such as filtering easier and lets you pivot data for further analysis and processing.
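To make the "one record per message" idea concrete, here is a minimal plain-Python sketch. The `explode_message` helper and the sample order are hypothetical illustrations, not code from the video or the repo.

```python
from typing import Iterator


def explode_message(message: dict, list_key: str) -> Iterator[dict]:
    """Yield one single-record message per element of message[list_key]."""
    # Fields other than the list are copied onto every output message.
    shared = {k: v for k, v in message.items() if k != list_key}
    for record in message[list_key]:
        yield {**shared, list_key: record}


# Hypothetical batched message: one order carrying several items.
batched = {"order_id": 1, "items": ["apple", "banana"]}

# After exploding, each message carries exactly one item.
single_records = list(explode_message(batched, "items"))
print(single_records)
```

Each output message repeats the shared fields, so a downstream consumer can filter or route on a single item without unpacking a list first.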
@DataSurfer walks through a batch processing example using Pandas, then covers three top use cases for Python stream processing with Kafka and Quix Streams: e-commerce orders, change data capture from databases, and telemetry data from sensors. Check out the GitHub repo for the full source code, and use Docker Compose to try the use cases on your machine.
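For the batch side, a rough sketch of exploding orders with Pandas might look like the following. The column names and sample data are assumptions made for illustration, not taken from the video; `DataFrame.explode` itself is the standard Pandas method.

```python
import pandas as pd

# Hypothetical batch of orders: each row holds a list of items.
orders = pd.DataFrame(
    {
        "order_id": [1, 2],
        "items": [["apple", "banana"], ["cherry"]],
    }
)

# explode() emits one row per list element and repeats the other columns.
exploded = orders.explode("items", ignore_index=True)

# With one item per row, filtering becomes a plain boolean mask.
bananas = exploded[exploded["items"] == "banana"]

print(exploded)
```

Passing `ignore_index=True` gives the result a fresh 0..n-1 index; without it, exploded rows keep the index of the order they came from.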
0:00 — What is explode?
1:06 — Explode theory with e-commerce orders use case
2:24 — Dive into code
3:46 — E-commerce orders use case with Pandas
6:08 — requirements.txt
7:40 — E-commerce use case with Kafka
16:26 — Change data capture use case with Kafka
25:13 — Telemetry use case with Kafka
32:36 — Wrap up