burathar
b41e4be772
| File | Last updated |
|---|---|
| LICENSE.txt | 5 years ago |
| README.md | 5 years ago |
| __init__.py | 5 years ago |
| formatter.py | 5 years ago |
| reader.py | 5 years ago |
| setup.py | 5 years ago |
| writer.py | 5 years ago |
Invoice-Extractor
This is a purpose-built project for extracting tabular data from Dutch KPN Mobile invoices. It relies heavily on the PDFQuery package.
Compatibility
This package has been tested on invoices dated from 2016 to 2019. It will probably also work with older and more recent invoices, but those have not been tested.
Installation
This package is not currently available on PyPI, so it has to be placed in the Python package directory manually. On Windows this is usually C:\Program Files (x86)\Python<version>\Lib\
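If you are unsure where the package directory is on your machine, the interpreter can report it. The snippet below is a minimal sketch using only the standard library; `package_directory` is a hypothetical helper name, not part of this project. It returns the directory where pure-Python third-party packages are normally installed, which is typically a `site-packages` folder (on Windows, a subdirectory of `Lib`).

```python
import sysconfig
from pathlib import Path


def package_directory() -> Path:
    """Return the directory this interpreter uses for pure-Python packages."""
    # "purelib" is the sysconfig name for the pure-Python install location.
    return Path(sysconfig.get_paths()["purelib"])


# Print the location for the interpreter that runs this script.
print(package_directory())
```

Run it with the same Python interpreter you intend to use for the extraction, since each installation has its own package directory.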
Usage
The data can be extracted from a single PDF file or from a directory containing compatible PDF files.
Usage example
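import invoice_extractor
# Replace the placeholders below with your own paths.
pdf_file = "<somefilepath>"
pdf_directory = "<somefilepath>"
output_directory = "<somepath>"
# Extract the data from a single invoice.
invoice_extractor.extract(pdf_file, output_directory)
# Extract the data from every compatible invoice in a directory.
invoice_extractor.extract(pdf_directory, output_directory)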
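Before running an extraction, it can help to check which PDF files a given path would cover. The following is a rough sketch under stated assumptions: `find_pdfs` is a hypothetical helper written for illustration, it is not part of `invoice_extractor`, and it does not necessarily reflect how the package selects files internally. It only looks at the file extension and does not check whether a file is actually a KPN invoice.

```python
from pathlib import Path


def find_pdfs(path):
    """List the PDF files at `path` (hypothetical helper, not part of invoice_extractor).

    A directory yields the PDF files directly inside it (no recursion);
    a single PDF file yields itself; anything else yields an empty list.
    """
    p = Path(path)
    if p.is_dir():
        # Match the extension case-insensitively so "invoice.PDF" is included.
        return sorted(f for f in p.iterdir() if f.is_file() and f.suffix.lower() == ".pdf")
    if p.is_file() and p.suffix.lower() == ".pdf":
        return [p]
    return []
```

Calling `find_pdfs(pdf_directory)` and inspecting the result is a quick way to confirm that the directory contains the files you expect before passing it to `extract`.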