NLP Data Import part 2 - Document parsing|How to parse pdf files in Python|Document parsing Python

Unfold Data Science

4 years ago

16,859 views

Comments:

@user-yv7fe8jp3t - 18.10.2023 13:34

If the PDF contains 50 pages and I want to parse/extract a particular value from each page, how do I do that? Please help.

@ImranKhan-jn6zh - 20.03.2023 08:04

Hello Aman,

Can you please let me know why you used != -1 in the if condition in the second for loop?

@xxxsamxxx130 - 16.02.2023 05:49

What is the purpose of not equal to -1, sir?
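
On the != -1 question: the video's code is not reproduced in this thread, so this is an assumption, but the check matches the usual pattern with Python's str.find, which returns the index of the first match and -1 when the substring is absent. A minimal sketch with made-up sample lines:

```python
# Hypothetical sample lines; the real ones would come from the parsed PDF text.
lines = ["Invoice Number: INV-001", "Thank you for your business"]

matches = []
for line in lines:
    # str.find returns the index of the first match, or -1 if there is none,
    # so "!= -1" means "this line contains the label".
    if line.find("Invoice Number") != -1:
        matches.append(line)

print(matches)
```

Writing `"Invoice Number" in line` expresses the same test more directly.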

@kishanbeesa4139 - 16.01.2023 17:44

Please make a video on extracting checkboxes from a Word document or PDF.

@AshwiniHindlekar - 01.10.2022 22:59

Hi, thanks, it's great learning. Do you also do freelancing?

@priyadharshinisivakumar4951 - 24.09.2022 17:04

It's very useful. Thank you, Aman.

@mohammedalshami3937 - 23.09.2022 18:24

I am really enjoying your NLP series. Thank you for making it look as simple as this.

@alfredoderodt6519 - 07.08.2022 00:17

Thank you so much! This is great. I have a question though: how would you save this information in JSON format? :D
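
One possible approach, assuming the extracted fields are already collected in a list of dictionaries (the field names and values below are hypothetical), is the standard-library json module:

```python
import json
import os
import tempfile

# Hypothetical extracted records; in practice these come from the parsing loop.
records = [{"invoice_no": "INV-001", "date": "2021-04-21", "amount": "150.00"}]

# Write the records to a JSON file (a temp directory is used here for the demo).
out_path = os.path.join(tempfile.gettempdir(), "invoices_demo.json")
with open(out_path, "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2)

# Read the file back to confirm the round trip.
with open(out_path, encoding="utf-8") as f:
    loaded = json.load(f)
```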

@sahajanayak48 - 03.08.2022 03:27

How can I scrape a Kannada PDF to Unicode in Python?

@vishalgarg8423 - 25.06.2022 18:28

Dear Sir,
Thanks for this video. Is there any way I can enter a word, search for it in thousands of PDFs, and have the PDFs that contain the word open?

@brendensong8000 - 17.06.2022 07:06

Great video! Thank you for sharing!

@alvin3428 - 09.04.2022 14:59

How do I extract specific data from invoices that have different formats? Please help, sir.

@sandipansarkar9211 - 29.01.2022 16:07

finished watching

@raghudharavath2299 - 10.12.2021 13:25

Please do it with pdfminer

@nakshatrasingh446 - 16.09.2021 17:30

Great video, sir. How do I save those values in a CSV file? And my second question: how do I split on a new line rather than on ':'?
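
A minimal sketch covering both questions, assuming the page text is one string of "Label: value" lines (the sample text is made up). str.splitlines breaks the text at line breaks, and csv.DictWriter writes the resulting fields as a row:

```python
import csv
import io

# Hypothetical page text; the real text would come from the PDF reader.
page_text = "Invoice Number: INV-001\nDate: 2021-09-16\nAmount: 150.00"

# splitlines() splits at line breaks instead of at ": ".
lines = page_text.splitlines()

# partition() returns (before, separator, after); the separator is empty
# when the line has no ": ", so such lines are skipped.
row = {}
for line in lines:
    key, sep, value = line.partition(": ")
    if sep:
        row[key] = value

# Write a header and one row; an in-memory buffer stands in for a file here.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
```

To write a real file, replace the buffer with `open("invoices.csv", "w", newline="")`.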

@mujeebullahkhan5201 - 26.05.2021 14:37

Sir, my folder has various file types, like txt, docs, Excel, PDF, etc. What is the solution then? Can you make a separate video for them?

@sandeyche - 05.05.2021 15:33

Could you please suggest what to do when all the invoice formats differ from each other?

@yash422vd - 21.04.2021 16:02

I get an error at this line: invoice_no = file_contents[i].split(': ')[1]
ERROR: IndexError: list index out of range
I replicated the same bill format in Word and saved the files as PDFs, using random values for the invoice number, date, and amount.
Please suggest!
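
A likely cause (an assumption, since the PDFs are not shown) is that some extracted line does not contain ': ' exactly. str.split then returns a one-element list, and index [1] raises IndexError. A sketch of a more forgiving lookup, with a hypothetical helper name and sample lines:

```python
def value_after_colon(line):
    """Return the text after the first colon, or None when there is no colon."""
    _, sep, value = line.partition(":")
    return value.strip() if sep else None


# Hypothetical extracted lines: with a space, without a space, and without a colon.
file_contents = ["Invoice Number: INV-001", "Invoice Number:INV-002", "Total due"]

values = [value_after_colon(line) for line in file_contents]
print(values)
```

Printing each line with repr() before splitting shows what the PDF reader actually returned.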

@prakharupadhyay9465 - 19.04.2021 22:23

for match in self._lang_vars.period_context_re().finditer(text):

TypeError: expected string or bytes-like object
I get this error while performing tokenization.
Please help.

@porudoryu - 13.04.2021 09:01

Still learning Python and your simple teaching style is really helpful.
You got yourself a subscriber sir. Thanks!

@shreygrover3850 - 30.03.2021 17:46

Hi, I am getting this error 'PdfReadWarning: Xref table not zero-indexed. ID numbers for objects will be corrected. [pdf.py:1736]'. Any idea why that's happening?

@kiranvanukuri9382 - 19.03.2021 15:28

Bro, please make a video on how to extract data from docs and PDFs and how to add those entities to a data frame.

@yitao_ - 02.03.2021 17:18

Very good, thank you.

@sandipansarkar9211 - 29.01.2021 14:08

But this is not working in my Google Colab:
import os
dir_Path = 'C://Users//server//Desktop'
os.chdir(dir_Path)
print(dir_Path)
The error I am getting is:
FileNotFoundError Traceback (most recent call last)
<ipython-input-13-13a426d276e1> in <module>()
1 import os
2 dir_Path = 'C://Users//server//Desktop'
----> 3 os.chdir(dir_Path)
4 print(dir_Path)

FileNotFoundError: [Errno 2] No such file or directory: 'C://Users//server//Desktop'

Please guide me
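
Colab notebooks run on a remote Linux machine, so a Windows path from the local PC does not exist there; the files would first need to be uploaded to the Colab session or read from a mounted Google Drive. A small sketch of a guard that checks the directory before changing into it (the helper name is hypothetical):

```python
import os


def change_dir_if_exists(path):
    """Hypothetical helper: chdir only if the directory exists; return True on success."""
    if os.path.isdir(path):
        os.chdir(path)
        return True
    return False


# On Colab this local Windows path is not present, so the call is expected
# to return False there instead of raising FileNotFoundError.
ok = change_dir_if_exists("C://Users//server//Desktop")
print(ok)
```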
