Optimized ETL's with QueryDatabaseTable and PutDatabaseRecord | Apache Nifi | Part 10

Steven Koon

3 years ago

17,884 views

Comments:

@seshaiahambati1798 - 03.11.2022 23:07

Hi, I have a scenario like this: a file is moved from one folder to another, the file is processed (loaded to Snowflake), and then it is archived (moved from the source to an Archive folder). In this flow I use GetFile and PutFile two times; how do I make each step run conditionally, one after the other?

@andreyfilatov3676 - 21.08.2022 21:41

Hi Steven, thank you a lot for what you've shown! One question: how do I fill in the Catalog property correctly?

@WilliamKnak - 25.07.2022 03:09

"Input requirement: This component DOWS NOT ALLOW an incoming relationship". Is it possible to start this processor after another one?

@MADAHAKO - 14.06.2022 13:23

Sir, your videos are amazing!

@enesteymir - 27.04.2022 11:17

Hello Steven, thank you for these videos. I have used NiFi for several months. I can easily handle migrations of tables with 5-20 million rows. Now I am loading a big transaction table that contains 550 million rows. Generally I prefer QueryDatabaseTable > PutDatabaseRecord for small migrations, and GenerateTableFetch > ExecuteSQL > PutDatabaseRecord for medium-sized tables (5-20 million rows). I can't use the first method here because I get heap-size memory errors, even though I tried the Max Rows Per Flow File parameter. For a very big table like 550 million rows, I can use parallel flows where GenerateTableFetch creates different partitions according to a WHERE condition and the ORDER BY or maximum-value column parameters. This method works, but I am trying to find much faster methods. Maybe you can give me advice on handling very large table migrations from PostgreSQL to Vertica without any transformations.
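
For readers following the GenerateTableFetch approach described above, here is a rough sketch of the kind of paged statements it hands to parallel ExecuteSQL instances. The table name, column, and partition size are illustrative assumptions, and the exact SQL GenerateTableFetch emits depends on the configured Database Type adapter.

    -- Illustrative sketch only: with id as the Maximum-value Column and a
    -- Partition Size of 100000, GenerateTableFetch produces one statement per
    -- page, and several ExecuteSQL instances can run those pages in parallel
    -- (exact syntax varies by adapter; this shows generic LIMIT/OFFSET paging).
    SELECT * FROM transactions ORDER BY id LIMIT 100000 OFFSET 0;
    SELECT * FROM transactions ORDER BY id LIMIT 100000 OFFSET 100000;
    SELECT * FROM transactions ORDER BY id LIMIT 100000 OFFSET 200000;
    -- ...and so on; on later runs a WHERE id > <last recorded maximum> clause
    -- restricts the pages to rows added since the previous run.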

@kwamesefah6855 - 01.04.2022 00:14

Can you please put together a video on getting data from MySQL, publishing it to Kafka, consuming it, and then putting it in S3?

@carlosnatividade1814 - 19.12.2021 23:45

Good afternoon. I managed to perform all the steps successfully! Thank you very much. Just one question: is there a way to send only the new records to the data warehouse, without having to send all the records again, so that information is not duplicated?

@toledo3211 - 16.12.2021 21:55

Hello, thanks for the training videos. How do you keep the QueryDatabaseTable processor from reading the old events over and over again?

@chitrangsharma1362 - 07.10.2021 16:02

I'm glad that I've found your video. Sir, would you please let me know how to transform the data? I basically want to do some kind of masking or update before inserting the data into the destination database. Thanks in advance.

@medal16 - 16.09.2021 20:45

Nice explanations, thank you! Could you demonstrate how to do this for updates too? In this case an incremental load works very well, but how can I capture the updates as well? Thanks.

@user-vt7sz4xd5l - 23.03.2021 19:09

Hello! I am trying to configure data selection from Hive (serialization type = JSON) to Oracle. In the PutDatabaseRecord config I need to select a Record Reader; I select JsonPathReader, but the state of the JsonPathReader is Invalid. What could be wrong?

@kasunmathota - 04.03.2021 10:58

Thank you so much, I am using your videos to learn NiFi. I am getting java.sql.SQLDataException: None of the fields in the record map to the columns defined, and I still couldn't fix the issue.

@abdullahaqeeli8074 - 27.01.2021 11:21

Hi Steven,
Thanks a lot for the NiFi videos; they've been VERY helpful. I'm running into a challenge using the QueryDatabaseTable processor. My incremental load relies on two columns, an ID and an update_time column. When I put these two columns in the Maximum-value Columns property, the generated query that polls for changes has a WHERE clause with an AND between the two column conditions, which results in losing some of the changes. However, I would like to change that to an OR. I'd like my WHERE clause to be WHERE ID > {MAX_ID} OR update_time > {MAX_Update_time}. Any idea how to achieve that?
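
To make the two clause shapes concrete, here is a sketch contrasting the AND-combined condition described in the comment with the OR-based condition the commenter is after. The table name and the literal maximum values are illustrative placeholders, not output from an actual flow.

    -- Shape of the clause reported above when both columns are listed in
    -- Maximum-value Columns: a row is only picked up if BOTH values advanced.
    SELECT * FROM source_table
    WHERE id > 120000 AND update_time > '2022-01-01 00:00:00';

    -- Shape the commenter wants instead: a change in EITHER column is enough.
    SELECT * FROM source_table
    WHERE id > 120000 OR update_time > '2022-01-01 00:00:00';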

@rahulkumar-jw1by - 22.10.2020 07:12

Awesome video, I like this tutorial. I have one doubt: if I have multiple tables to read and insert, how can we achieve that?

@samsal073 - 28.09.2020 07:26

Hi Steven,
Thanks for the videos; very helpful information.

@Tokolosk - 24.09.2020 23:54

Oh boy oh boy Steven, I have no words to describe how useful your videos have been. We were quoted $4400/month by a company for an AWS solution that I am now able to do on a single $100/month server with NiFi, with capacity to spare... and a second $100/month server ready for DR. Sure, still lots of things to do and learn, but the numbers speak for themselves!

Your presentation style and pace are great; those 26 minutes went by fast! Looking forward to learning some more! Cheers.
