PDFResurrect

PDFResurrect is a tool aimed at analyzing PDF documents. The PDF format allows for previous document changes to be retained in a more recent version of the document, thereby creating a running history of changes for the document. This tool attempts to extract all previous versions while also producing a summary of changes between versions. This tool can also "scrub" or write data over the original instances of PDF objects that have been modified or deleted, in an effort to disguise information from previous versions that might not be intended for anyone else to read.
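To make the "running history" idea concrete, here is a minimal Python sketch of how versions can be peeled apart. It assumes each saved version of the document ends with its own `%%EOF` token, so every prefix of the file that ends at such a token is treated as one earlier version. The function name `split_versions` is made up for this illustration and is not part of PDFResurrect; the sketch also ignores linearized PDFs, which need special handling.

```python
def split_versions(data: bytes) -> list[bytes]:
    """Return one byte string per %%EOF token found in `data`.

    Each entry is the file content from the start up to and including
    that token, i.e. the document as it looked after that save.
    Assumption: every %%EOF closes exactly one version (not true for
    linearized PDFs).
    """
    marker = b"%%EOF"
    versions = []
    pos = data.find(marker)
    while pos != -1:
        end = pos + len(marker)
        versions.append(data[:end])
        pos = data.find(marker, end)
    return versions


if __name__ == "__main__":
    # Toy "PDF": an original save followed by one appended update.
    original = b"%PDF-1.4\n1 0 obj\n(draft)\nendobj\n%%EOF\n"
    update = b"1 0 obj\n(final)\nendobj\n%%EOF\n"
    for number, version in enumerate(split_versions(original + update), 1):
        print(number, len(version))
```

The older text `(draft)` is still present in the newest version of the toy file, which is exactly the kind of leftover data the scrub feature is meant to overwrite.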

This project is released under the GNU GPLv3 license. So have at it!

Many individuals were originally consulted on this and provided suggestions, including Tele, Remad, Derez, Count, and Sunpuke. Special thanks to Brent, who is not really part of the 757 crew but helped proofread the paper. Thanks guys!


News

July 1, 2013
The code for this project has moved to GitHub.

November 30, 2012
Version 0.12 (bug fix) is now available. The main fix regards how the tool was locating the EOF token that the PDF writers place at the end of different versions of PDFs. Previously, if an EOF token was split across a 256-byte boundary, then Mr. Resurrect would not have found it. The new algorithm might be a tad slower; however, it is more precise.
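For the curious, the boundary problem can be sketched in a few lines of Python. This is not the code used in PDFResurrect (which is not shown here); it is just one common way to avoid the issue, assuming the file is read in fixed-size chunks. The trick is to carry the last few bytes of each chunk over into the next search window so a token straddling the boundary is still seen whole. The names `find_eof_offsets` and `chunk_size` are made up for this example.

```python
import io

TOKEN = b"%%EOF"


def find_eof_offsets(stream, chunk_size=256):
    """Return the file offset of every %%EOF token in a binary stream.

    The stream is read `chunk_size` bytes at a time. The last
    len(TOKEN) - 1 bytes of each search window are kept and prepended
    to the next chunk, so a token split across a chunk boundary is
    still found, and no token is reported twice.
    """
    offsets = []
    tail = b""      # bytes carried over from the previous window
    consumed = 0    # bytes read from the stream before this chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        base = consumed - len(tail)   # file offset of window[0]
        start = window.find(TOKEN)
        while start != -1:
            offsets.append(base + start)
            start = window.find(TOKEN, start + 1)
        consumed += len(chunk)
        tail = window[-(len(TOKEN) - 1):]
    return offsets


if __name__ == "__main__":
    # The first token starts at offset 254, so it straddles the
    # 256-byte boundary between the first and second chunk.
    data = b"A" * 254 + TOKEN + b"B" * 300 + TOKEN
    print(find_eof_offsets(io.BytesIO(data)))
```

A search that looks at each 256-byte chunk in isolation would miss the first token in this example, which is the failure described above.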

May 28, 2012
Version 0.11 (bug fix) has just been released. Yeah, it has been some time since the last update. Anyways, Francois Marier (the Debian maintainer for PDFresurrect) pointed out a bug to me, and also provided some makefile and configure script modifications. Thanks go out to Francois, Valgrind, gcc -pedantic, gdb, and the letter 'GNU'. Ideally, if you are an end user, you shouldn't notice a difference :-) Thanks Francois!

March 21, 2010
Version 0.10 (bug fix) has just been released. The main correction here deals with the "-i" argument. Previously, if the creator information was stored in separate PDF objects, as opposed to in-line, the object reference was returned rather than its contents. Well, that's cool and all, but probably not the desired result :-). The user probably actually wanted the data IN the object!
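To illustrate the difference, here is a rough Python sketch of the two cases. It assumes the creator entry is either an in-line literal string such as `/Creator (Writer)` or an indirect reference such as `/Creator 12 0 R` pointing at an object that holds the string. The function `creator_value` is hypothetical and is not how PDFResurrect is implemented; it also skips hex strings and strings containing escaped or nested parentheses.

```python
import re


def creator_value(data: bytes):
    """Return the Creator string as bytes, or None if it is not found.

    Handles two simplified cases:
      in-line:   /Creator (Some Text)
      indirect:  /Creator 12 0 R   with   12 0 obj (Some Text) endobj
    """
    # Indirect case first: follow the reference to the object it names.
    ref = re.search(rb"/Creator\s+(\d+)\s+(\d+)\s+R", data)
    if ref:
        obj_num, gen_num = ref.group(1), ref.group(2)
        # (?<!\d) keeps "12 0 obj" from matching inside "112 0 obj".
        pattern = (rb"(?<!\d)" + obj_num + rb"\s+" + gen_num
                   + rb"\s+obj\s*\((.*?)\)\s*endobj")
        obj = re.search(pattern, data, re.DOTALL)
        return obj.group(1) if obj else None

    # In-line case: the string sits right after the key.
    inline = re.search(rb"/Creator\s*\((.*?)\)", data, re.DOTALL)
    return inline.group(1) if inline else None


if __name__ == "__main__":
    inline_doc = b"<< /Creator (Writer) /Producer (X) >>"
    indirect_doc = b"12 0 obj\n(OpenOffice)\nendobj\n<< /Creator 12 0 R >>"
    print(creator_value(inline_doc))
    print(creator_value(indirect_doc))
```

Returning only the `12 0 R` part for the second document would be the old behavior described above; following the reference gives the text the user was after.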

November 11, 2009
Version 0.9 (bug fix) has just been released. This is a bug fix release and addresses the gathering of data (within limit) for the Creator metadata at the end of a PDF. The previous version would stop prematurely, or possibly get too much info (in certain cases). Yep, ended that one on a prepositional phrase (was hoping the parentheses would make that smoother :-) Well, you know what to do: snag a copy and get-a-conjuring on some documents!

September 10, 2009
Version 0.8 (bug fix) has just been released.
Special thanks to Francois Marier, the Debian maintainer, for pointing out that a stall was occurring on a particular document.

Version 0.7 has just been released.
This version deals with linearized PDFs and adds the (-i) option to report "creator" information about the document. Creator information in the newer XML metadata stream format is not handled. Special thanks to Remad, who wanted to know more about who produced the document.

May 23, 2009
pdfresurrect has now made it into the Debian project. You can check out the package info here.
There is also a new release out of pdfresurrect (v0.6). Credited to this release is Francois Marier, who wrote the manual page and also added some configure hints and build targets. I took those changes and rolled them in as appropriate. Thanks Francois!

May 19, 2009
It has come to my attention that pdfresurrect has made it to the Fedora project. So 'yum' it up (yum install pdfresurrect).
Also, I had a few minor changes to push for version 0.5, which is now in the wild. The changes address portability, and add some under-the-hood possibilities for dealing with xref streams that are compressed (possible in PDFs of version 1.5+). A message will be displayed if such an xref table is found. Also, some notes on validity and security have been added to the README.
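As a rough illustration of spotting such a document, the Python sketch below checks for a cross-reference stream by looking for an object dictionary containing `/Type /XRef`, which is how the PDF format marks these streams. The function name `has_xref_stream` is made up for this example, and the check is deliberately naive: it does not parse or decompress anything, and it is not the detection code in PDFResurrect.

```python
import re

# A cross-reference stream is a stream object whose dictionary
# contains /Type /XRef. Classic tables use the lowercase keyword
# "xref" instead, which this pattern does not match.
XREF_STREAM = re.compile(rb"/Type\s*/XRef\b")


def has_xref_stream(data: bytes) -> bool:
    """Return True if the raw PDF bytes appear to contain an xref stream."""
    return XREF_STREAM.search(data) is not None


if __name__ == "__main__":
    classic = b"xref\n0 1\n0000000000 65535 f \ntrailer\n<< /Size 1 >>\n"
    modern = b"5 0 obj\n<< /Type /XRef /Size 6 >>\nstream\n"
    print(has_xref_stream(classic), has_xref_stream(modern))
```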

August 10, 2008
Spelling correction release v0.04 is out.
This release corrects a misspelling in the URL for the license location.

August 2, 2008
Initial release of PDFResurrect (v 0.03). Swipe a copy here. And for your reading pleasure, the whitepaper discussing this tool is also available.

Source

The source is available on GitHub.

Downloads

Documents

Faith in the Format: Unintentional Data Hiding in PDFs August 1, 2008

Contributors

  • Matt Davis (enferex): Original writer and maintainer.
  • Francois Marier: Some Makefile target and configure work, and the original man page.
  • Skhisma, Tele, and Jody for working the backend web stuff.
  • Anyone who has provided ideas, such as Remad for suggesting the "info" option "-i" to pull Creator/Producer information from PDFs.

Contact

This project was originally written by, and is still maintained by, Matt Davis (enferex). Questions, suggestions, comments, and coffee dates are welcome. Matt Davis (