dumppdf(1) [debian man page]

DUMPPDF(1)							  PDFMiner Manual							DUMPPDF(1)

NAME
       dumppdf - dumps internal contents of PDF files

SYNOPSIS
       dumppdf [option...] file...

DESCRIPTION
       dumppdf dumps the internal contents of a PDF file in pseudo-XML format. This program is primarily for debugging purposes, but it's also possible to extract some meaningful contents.

OPTIONS
       -a
           Dump all the objects. By default only the document trailer is printed.

       -i objno[,objno,...]
           Specifies PDF object IDs to display. Comma-separated IDs, or multiple -i options are accepted.

       -p pageno[,pageno,...]
           Specifies the comma-separated list of the page numbers to be extracted. Page numbers start at one. By default, it extracts text from all the pages.

       -r, -b, -t
           Specifies the output format of stream contents. Because the contents of stream objects can be very large, they are omitted when none of the options above is specified.

           With the -r option, the "raw" stream contents are dumped without decompression.

           With the -b option, the decompressed contents are dumped as a binary blob.

           With the -t option, the decompressed contents are dumped in a text format similar to the output of Python's repr().

           When the -r or -b option is given, no stream header is displayed, so that the output can be saved directly to a file.

       -T
           Show the table of contents.

       -P password
           Provides the user password to access PDF contents.

       -d
           Increase the debug level.

EXAMPLES
       Dump all the headers and contents, except stream objects:

           $ dumppdf -a test.pdf

       Dump the table of contents:

           $ dumppdf -T test.pdf

       Extract a JPEG image:

           $ dumppdf -r -i6 test.pdf > image.jpeg

SEE ALSO
       pdf2txt(1)

AUTHORS
       Jakub Wilk <jwilk@debian.org>
           Wrote this manual page for the Debian system.

       Yusuke Shinyama <yusuke@cs.nyu.edu>
           Author of PDFMiner and its original HTML documentation.

dumppdf								    08/24/2011								DUMPPDF(1)
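The -i option of dumppdf takes object IDs as a single comma-separated argument. When the IDs come from a script as separate words, a small helper can build that argument. The following is a minimal sketch, not part of PDFMiner: the function name join_ids, the object IDs 6, 7 and 9, and the file name test.pdf are assumptions made for illustration.

```shell
#!/bin/sh
# join_ids: hypothetical helper (not part of PDFMiner) that joins its
# arguments with commas, the form accepted by "dumppdf -i".
# The body runs in a subshell so the IFS change does not leak out.
join_ids() (
    IFS=,
    printf '%s\n' "$*"
)

# Usage sketch: dump objects 6, 7 and 9 with decompressed streams shown
# in text form (-t). The command only runs when dumppdf is installed and
# the assumed input file test.pdf exists.
if command -v dumppdf >/dev/null 2>&1 && [ -f test.pdf ]; then
    dumppdf -t -i "$(join_ids 6 7 9)" test.pdf
fi
```

Since multiple -i options are also accepted, passing each ID with its own -i would be an equivalent design; joining them keeps the command line short.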

pdfinfo(1)						      General Commands Manual							pdfinfo(1)

NAME
       pdfinfo - Portable Document Format (PDF) document information extractor (version 3.03)

SYNOPSIS
       pdfinfo [options] [PDF-file]

DESCRIPTION
       Pdfinfo prints the contents of the 'Info' dictionary (plus some other useful information) from a Portable Document Format (PDF) file.

       The 'Info' dictionary contains the following values:

              title
              subject
              keywords
              author
              creator
              producer
              creation date
              modification date

       In addition, the following information is printed:

              tagged (yes/no)
              form (AcroForm / XFA / none)
              page count
              encrypted flag (yes/no)
              print and copy permissions (if encrypted)
              page size
              file size
              linearized (yes/no)
              PDF version
              metadata (only if requested)

OPTIONS
       -f number
              Specifies the first page to examine. If multiple pages are requested using the "-f" and "-l" options, the size of each requested page (and, optionally, the bounding boxes for each requested page) is printed. Otherwise, only page one is examined.

       -l number
              Specifies the last page to examine.

       -box   Prints the page box bounding boxes: MediaBox, CropBox, BleedBox, TrimBox, and ArtBox.

       -meta  Prints document-level metadata. (This is the "Metadata" stream from the PDF file's Catalog object.)

       -rawdates
              Prints the raw (undecoded) date strings, directly from the PDF file.

       -enc encoding-name
              Sets the encoding to use for text output. This defaults to "UTF-8".

       -listenc
              Lists the available encodings.

       -opw password
              Specify the owner password for the PDF file. Providing this will bypass all security restrictions.

       -upw password
              Specify the user password for the PDF file.

       -v     Print copyright and version information.

       -h     Print usage information. (-help and --help are equivalent.)

EXIT CODES
       The Xpdf tools use the following exit codes:

       0      No error.

       1      Error opening a PDF file.

       2      Error opening an output file.

       3      Error related to PDF permissions.

       99     Other error.

AUTHOR
       The pdfinfo software and documentation are copyright 1996-2011 Glyph & Cog, LLC.

SEE ALSO
       pdfdetach(1), pdffonts(1), pdfimages(1), pdftocairo(1), pdftohtml(1), pdftoppm(1), pdftops(1), pdftotext(1)

								  15 August 2011							pdfinfo(1)
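The exit codes listed for the Xpdf tools lend themselves to a simple check in scripts that call pdfinfo. The following is a minimal sketch under stated assumptions: the function name describe_exit, the page range 1 to 3, and the file name test.pdf are hypothetical and chosen only for illustration. The messages are the ones given in the EXIT CODES section.

```shell
#!/bin/sh
# describe_exit: hypothetical helper (not part of Xpdf) that maps an
# exit status to the meaning listed under EXIT CODES.
describe_exit() {
    case "$1" in
        0)  echo "No error" ;;
        1)  echo "Error opening a PDF file" ;;
        2)  echo "Error opening an output file" ;;
        3)  echo "Error related to PDF permissions" ;;
        99) echo "Other error" ;;
        *)  echo "Unknown exit code: $1" ;;
    esac
}

# Usage sketch: print page sizes and bounding boxes for pages 1 to 3,
# then report the status. The command only runs when pdfinfo is
# installed and the assumed input file test.pdf exists.
if command -v pdfinfo >/dev/null 2>&1 && [ -f test.pdf ]; then
    pdfinfo -f 1 -l 3 -box test.pdf
    status=$?
    echo "pdfinfo: $(describe_exit "$status")"
fi
```

Saving `$?` in a variable right after the call keeps the status from being overwritten by later commands.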