Link Trouble With Adobe Reader

Spaces in filenames for files on your web server are a bad idea… No question about that, but sometimes you don’t have control over the files you want to link to.

So let’s assume you create a PDF file that contains a link to another PDF file on a web server, and that second file just happens to have a space (or another special character) in its filename. You plead with the webmaster, but all the begging has no effect: the filename stays the same. So you create your link in Acrobat, test the file with your browser and Acrobat installed, and everything looks fine: Acrobat actually escapes the spaces correctly – it replaces every space with the sequence %20.

This is starting to look good – and easy. But once you test it with Reader XI, things are not looking as bright anymore. All of a sudden, there is more going on than just the replacement of spaces with %20 – we end up with the space character being replaced with %2520.

When we look up what %25 stands for in the list of URL encodings, we see that %25 is the encoding for the percent sign. It looks like the URL got encoded twice: in the first round, the spaces got replaced with %20, and in the second round, the percent sign (the first character of %20) got replaced with %25, so we end up with %2520 for every space in the original URL.
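The double encoding is easy to reproduce outside of Reader. The following is only a minimal sketch of the effect using the standard JavaScript function encodeURIComponent; it is not what Reader does internally, and the filename is a made-up example:

```javascript
// Hypothetical filename containing a space
const original = "my file.pdf";

// First round: the space becomes %20
const once = encodeURIComponent(original);

// Second round: the percent sign of %20 becomes %25
const twice = encodeURIComponent(once);

console.log(once);   // my%20file.pdf
console.log(twice);  // my%2520file.pdf

// A server that decodes the URL once ends up looking for a file
// with a literal "%20" in its name instead of a space
console.log(decodeURIComponent(twice));  // my%20file.pdf
```

A server receiving the doubly encoded URL decodes it only once, so it looks for a file that does not exist.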

This is a bug in Adobe Reader, and unfortunately we need to wait for Adobe to fix it. There is, however, a workaround: when you turn off protected mode, the links start to work. To turn off protected mode, go to Reader’s Preferences and select the “Security (Enhanced)” category when using Reader XI, or the “General” category when using Reader X, then uncheck “Enable Protected Mode at startup”. Restart Reader and give it another try; it should be working now.


The Adobe Certified Expert Exam for Acrobat XI is Available

Have you ever thought about adding the Adobe ACE Certification to your resume?

Adobe published the exams and the exam guides for the new Acrobat XI Adobe Certified Expert (ACE) exams. They come in two flavors: As a brand new certification and as recertification for an existing ACE. You can schedule your test right from these pages – the new certification requires you to take the test at a testing facility, the recertification can be done online from your computer.

I just passed my recertification (my credentials are not yet published, this will take a couple of days), but here is proof that I know this stuff 🙂


Screenshot of my Acrobat ACE recertification results.


Adobe Community Professional

I was very honored when I found out a few days ago that Adobe invited me into their group of Adobe Community Professionals. This is a very small and exclusive club: For the year 2013 there are 258 Adobe users worldwide across all Adobe products that have been awarded this designation, and I am one of them.

This is what Adobe has to say about this program:

The Adobe Community Professionals Program is a community based program made up of Adobe customers who share their product expertise with the world-wide Adobe community. The Adobe Community Professionals’ mission is to provide high caliber peer-to-peer communication educating and improving the product skills of Adobe customers worldwide.

With this designation, Adobe is recognizing my work at Experts-Exchange.com and AcrobatUsers.com, the talks I’ve given about Acrobat and its features, and the content of this web site. This of course means some pressure for me to create more great content 🙂

Here is the list of all current ACPs.

Community professional badge


All About PDF Stamps

Putting “All About PDF Stamps” on the cover puts the bar pretty high when you write a book about PDF Stamps… But Thom Parker clears that bar without any problems! His book “All About PDF Stamps in Acrobat & Paperless Workflows” is both a great introduction to using PDF stamps and a powerful guidebook for implementing complex workflow scenarios.

You may ask if one small feature of Adobe Acrobat, which comes with a ton of different features, really deserves its own book. On the surface the book is about stamps, but its real subject is almost buried in the last part of the title: paperless workflows. Stamps are just a means to implement these workflows.

We’ve all been promised our flying cars, and the paperless office. I am still waiting for that flying car, but the paperless office can be a reality if you use the right tools, and Thom is about to introduce you to one very powerful gadget to add to your virtual tool belt. The key to using PDF stamps in such a workflow is to use not just stamps, but dynamic PDF stamps. I’ve written about them before in my post “More Interactive Dynamic Stamps in Seven Easy Steps”.

The book starts out with a very basic description of what PDF stamps are, before it moves on to describing some workflows that are based on stamps. To make things easier to understand, we first get a description of how a manual process using a static stamp would work, which is then contrasted with how smoothly an automated workflow based on dynamic stamps would run.

This is not just a book for the techies who would (and can) implement such a workflow – even if you don’t know your stamp from a sticky note, the first part of the book contains a lot of useful information that would enable you to come up with ideas about how to convert your paper-based processes into paperless workflows. The examples in this part should give you enough information about what’s possible with Acrobat to get your creative juices flowing. Once you have a plan, you can then hand off the implementation to somebody who cannot wait to get to the more technical parts of the book 😉

There are enough samples in the book – ranging from very basic ones to a fairly script-intensive sample in the appendix – to demonstrate the different techniques outlined in the text.

What I really appreciated is that Thom does talk about what features are documented, and which are undocumented. This allows the reader to make an educated decision about what to use and where to be cautious when the next release of Acrobat comes along.

Along the way, we are learning about a number of things that do not directly have to do with stamps like document flattening, coordinates in PDF files, storing data persistently and a few more things that broaden your JavaScript horizon. This means you are benefitting from the book even if you are not planning on implementing a paperless workflow anytime soon.

I’ve read the book on paper, but I would have loved to have a PDF version of it. There is a Kindle version available, but I spend my days in Acrobat, so having all this knowledge available in PDF and searchable via a PDF catalog would have been even better. Speaking of searching: The book does have an index, so things are easy to find.

The book was written when Acrobat X was the latest version, but everything you find in the book regarding Acrobat X should also work with Acrobat XI.


Installing PDFtk on Mac OS X Mountain Lion

One of my favorite PDF related tools is PDFtk. Some time ago I complained that it was no longer supported, but it seems that development has picked up again, and you can now find a brand new version – one that even comes with an installer for Mac OS X Mountain Lion. At least that’s what the section heading on the PDFtk page led me to believe. However, after digging a little bit deeper, I noticed that there is only one installer for Mac OS X 10.6, 10.7 and 10.8. That’s the reason why you may end up with the following error message when you try to install the package on Mountain Lion:

Screenshot of the error dialog (CoreServicesUIAgent).

There is only an “OK” button, which closes the dialog and stops the installer.

No need to give up at this point: The reason for the error message is Gatekeeper, a new feature in Mountain Lion that limits what kind of applications can be installed on your computer without user interaction. When configured this way, it will not allow programs to run that do not come from trusted sources. The trusted sources are the Mac App Store and applications that were signed with a certificate that identifies a specific developer. The key to this is in the System Preferences – bring up the System Preferences dialog and select “Security & Privacy”.

Screenshot of the “Security & Privacy” settings in System Preferences.

You could modify these settings temporarily, and then return to these fairly safe settings, but there is an easier way around the problem:

Find the PDFtk installer that you downloaded earlier in Finder and control-click or right-click on it.

Screenshot of the Finder context menu for the installer.

Then pick the “Open” option from the menu. This will bring up a slightly different error dialog:

Screenshot of the second error dialog (CoreServicesUIAgent).

There is now an “Open” button in addition to the button that closes the dialog. The message on the dialog also indicates that once we use the “Open” button, this application will always run on this Mac, and will therefore no longer display the first error message. For a program installer that is probably not such a big deal – once the software is installed, we no longer need to run it – but if you ever run into this error with software that is actually installed in your Applications folder, this is the way to make it run every time you double-click it.

So, just click on “Open” and let the installer do its job. After that, you can run PDFtk in a terminal by just typing “pdftk”, or “/usr/local/bin/pdftk” if your search path is not set up to include /usr/local/bin.

Screenshot of a bash terminal window.


Prevent the Save Dialog when Printing to the Adobe PDF Printer

[ UPDATE: For use on a 64bit system, see Kevin Rappold’s comment below. ]

This is from the category “Frequently Asked Questions” – this time, how to programmatically specify an output filename when printing to the Adobe PDF printer. As you may know, the “Adobe PDF” printer allows you to create PDF files from any application on a Windows system that can print. Once you’ve installed Adobe Acrobat, all you need to do is choose to print, select the “Adobe PDF” printer, specify any job options, and print. Voilà, a new PDF file is created – after the application asks you to specify the filename for the new PDF file. That is usually a good thing when you manually initiate the print operation, because then you know where your PDF file gets stored on the computer. However, if you want to programmatically create PDF files from your application (e.g. from an MS Office application using VBA), that step is quite annoying, and can ruin one’s day when trying to process 1000 files.

In the olden days, it was necessary to first print to a PostScript file, and then call Distiller from your program to convert that PostScript file to PDF, but for quite some time now, Adobe has provided a way to specify the PDF filename by setting a registry key. The details can be found in the Acrobat SDK. Here is the link to the page that describes this registry key process.

The registry key HKEY_CURRENT_USER\Software\Adobe\Acrobat Distiller\PrinterJobControl should already exist if you’ve printed to the Adobe PDF printer before. If it does not exist, create it before we go any further (or, just print using the Adobe PDF printer).

The documentation requires us to create a sub key – or a key value pair – where the key is the path to the application that wants to save a PDF file, and the value is the filename for the PDF file. This sounds more complicated than it actually is: To print from the WordPad application to a PDF file without being prompted for the filename, use the following key value pair:

C:\Program Files\Windows NT\Accessories\wordpad.exe = c:\MyPDFFileName.pdf

To do this in the regedit tool, navigate to the PrinterJobControl key and right-click in the pane that shows the key value pairs. Then select New>String Value – this will actually create a new string value, and will open the name up for editing – change it to C:\Program Files\Windows NT\Accessories\wordpad.exe

Once that is done, right-click on the new entry and select “Modify”. In the “Edit String” dialog set the value to c:\MyPDFFileName.pdf and click on OK.

Screenshot of the Registry Editor showing the PrinterJobControl key.

After creating this registry key, print from WordPad and see how the file is printed without prompting for a filename. The other thing that will happen is that the key we just created gets removed. This means that this key needs to be created before every print job that needs to be saved to PDF automatically. You can see this by refreshing the view in the regedit application, but also by printing again from WordPad: This time, it will prompt for a PDF filename.

Sometimes it’s not obvious which application is actually printing – you may be running one application, but in the background it is handing control over to the application that is handling the print process. In this case, you can use the registry editor to see which application is responsible for printing. In the screen shot I’ve attached, you can see an entry called “LastPDFPortFolder – wordpad.exe” – such an entry gets created every time an application prints to the Adobe PDF printer. By clearing out all sub-keys in PrinterJobControl, we can make sure we know which application was last used to print. We won’t get the full path, but just knowing the name of the executable will help to find the application.

I’ve written a small VBScript sample that will take the path to one or more Excel files on the command line, and will print these files to the “Adobe PDF” printer after setting the registry key to specify the PDF output filename.

' Set registry key to control PDF output and print an Excel
' file to PDF
' Karl Heinz Kremer - khk@khk.net - 1/18/2013

Option Explicit

Const HKEY_CURRENT_USER = &H80000001

Dim fso, exl, exlWkbk, oReg
Dim strComputer, strKeyPath, strValueName, strValue
Dim A, arg, ext, folder, basename

' Nothing to do without arguments; quit before Excel gets started
If WScript.Arguments.Count = 0 Then
    WScript.Quit
End If

strComputer = "."
Set oReg = GetObject("winmgmts:{impersonationLevel=impersonate}!\\" _
    & strComputer & "\root\default:StdRegProv")
strKeyPath = "SOFTWARE\Adobe\Acrobat Distiller\PrinterJobControl"

' Just in case, create the PrinterJobControl registry key -
' it should already exist
oReg.CreateKey HKEY_CURRENT_USER, strKeyPath

Set fso = CreateObject("Scripting.FileSystemObject")
Set exl = CreateObject("Excel.Application")
exl.Visible = False

For A = 0 To (WScript.Arguments.Count - 1)
    arg = WScript.Arguments.Item(A)
    ext = LCase(fso.GetExtensionName(arg))

    If (ext = "xls" Or ext = "xlsx") And fso.FileExists(arg) Then
        ' Set the registry value: the name is the path to the printing
        ' application, the data is the PDF output filename
        folder = fso.GetParentFolderName(arg)
        basename = fso.GetBaseName(arg)

        ' Adjust this path to match the installed version of Excel
        strValueName = "C:\Program Files\Microsoft Office\Office14\excel.exe"
        strValue = folder & "\" & basename & ".pdf"
        oReg.SetStringValue HKEY_CURRENT_USER, strKeyPath, strValueName, strValue

        Set exlWkbk = exl.Workbooks.Open(arg)
        exlWkbk.PrintOut , , , , "Adobe PDF"
        ' xlDoNotSaveChanges is not defined in VBScript, so pass False
        exlWkbk.Close False
    End If
Next

exl.Quit
Set exlWkbk = Nothing
Set exl = Nothing
Set fso = Nothing

You can download the script here.

To run the script, provide the full path to an Excel file on the command line:

excelprint.vbs c:\Temp\Test.xlsx

Play around with the script, see if you can implement the registry changes in a different environment (e.g. VBA or a C++ application), and most importantly, have fun!

 


The End is Here – of ADM that is

Adobe just released the Acrobat XI SDK, and as usual, there are a number of changes – new APIs, modified ones, and the final removal of ADM (that is the Adobe Dialog Manager). The release notes are here [PDF].

I’ve talked about this in a blog post before, and given some suggestions about how to deal with the discontinuation of ADM support. Back then, Adobe had removed the header files so that no new plug-ins with ADM support could be compiled, but existing plug-ins that were using ADM were still working. This is no longer true: any plug-in that was compiled with ADM enabled will no longer work with Acrobat XI. I said “with ADM enabled” – that does not necessarily mean that ADM is actually used to generate dialogs; all that’s required for the plug-in to fail to load now is that ADM was compiled into the plug-in. For some versions of the Acrobat SDK this actually means that ADM was not explicitly disabled when the plug-in was compiled.

If your plug-in does not load in Acrobat XI due to ADM, it’s time to address this problem, there is no way around it. It might be quite a bit of work to replace an ADM based GUI, but if you want to extend the life of your plug-in, that’s what needs to be done. Let me know if I can help you with that process.


The Trouble With the XREF Table

Have you ever tried to debug a problem with the XREF table in a PDF file? I’ve been debugging PDF problems for a long time, and every now and then I come across a file that has a corrupt XREF table – either because the PDF generating application did not emit a valid XREF table, or because somebody tried to edit a PDF file and did not update the XREF table. And yes, there are applications that write out corrupt XREF data – either as a result of a bug, or because the developer did not understand the PDF spec.

The most important information about the XREF table (and many who try to create a PDF file in a text editor will stumble over this) is that every entry is exactly 20 bytes long – including the line ending character(s). This means that the content of the line is different depending on the line ending conventions used (e.g. CRLF for Windows vs. LF for Unix or Mac OS). If the file is generated on a Windows system, the line ending will already use up two bytes, so we have 18 bytes left for the actual XREF entry:

– 10 digits for the byte offset
– 1 space between the byte offset and the generation number
– 5 digits for the generation number
– 1 space as delimiter between the generation number and the in-use/free flag
– 1 byte for the in-use/free flag

That makes 18 bytes. For a line that ends with just a LF character, we need to “stuff” the line with a space character after the flag. So, when writing XREF data, make sure that you are indeed writing out 20 bytes per XREF entry.
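The layout above can be turned into a small helper that always produces 20 bytes. This is just a sketch; the function name xrefEntry and its parameters are my own invention for this illustration, and the default line ending is the space plus LF combination described above:

```javascript
// Build one XREF entry: 10-digit offset, space, 5-digit generation,
// space, in-use/free flag, and a two-byte end-of-line marker.
// The marker must be " \n", " \r" or "\r\n" so that the entry
// is exactly 20 bytes long.
function xrefEntry(offset, generation, inUse, eol = " \n") {
  if (eol.length !== 2) {
    throw new Error("end-of-line marker must be exactly two bytes");
  }
  const line =
    String(offset).padStart(10, "0") + " " +
    String(generation).padStart(5, "0") + " " +
    (inUse ? "n" : "f") + eol;
  if (line.length !== 20) {
    throw new Error("XREF entry is not 20 bytes long");
  }
  return line;
}

// Entry for an object that starts at byte offset 317
console.log(JSON.stringify(xrefEntry(317, 0, true)));
```

Passing "\r\n" as the last argument gives the Windows variant, where the line ending already takes up both bytes and no extra space is needed.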

The next problem is what to use as the byte offset for the different entries. There are different ways to determine the byte offset. If you are using vim as your editor, put the following into your .vimrc file and you will get the byte count of the character that’s currently under the cursor:

set laststatus=2
set statusline=%o/%l/%c

The byte count is not the same as the byte offset relative to the beginning of the file: The first character in the document will show a byte count of “1” – it’s the first byte, but this character will have an offset of ‘0’ relative to the beginning of the file (it is the beginning of the file). So, in order to convert the byte count we need to subtract 1 from the value that is being displayed.

For the rest of this document I will use the basic PDF file that gets created via the excellent article series “Make Your Own PDF” – created by the people who brought you JPedal and PDF2HTML5, IDRsolutions.

When I try to open the file in Ghostscript, I end up with an error message:

gs test.pdf
GPL Ghostscript 9.06 (2012-08-08)
Copyright (C) 2012 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
**** Error reading a content stream. The page may be incomplete.
**** Warning: An error occurred while reading an XREF table.
**** The file has been damaged. This may have been caused
**** by a problem while converting or transfering the file.
**** Ghostscript will attempt to recover the data.

Here is a screenshot of vim editing a PDF file. I’ve highlighted three lines: The first one shows where the cursor is positioned – it is on the start of the line that starts object 6. The second highlight represents the XREF entry for object 6. And the last highlight shows the character count up to and including the character under the cursor. This is 318 characters – we need to adjust that by subtracting one to get the byte offset, so object 6 starts at byte offset 317, which is what is found in the XREF table.

Screenshot of vim editing the PDF file.

I assume that other editors can provide similar information. I won’t go into debugging this file any further in the editor, it’s a tedious process, and I have a better solution:

A different approach is to output the PDF file with the byte offsets for every line prepended to the line’s content via a small program. A while ago I wrote such a utility, which I cleaned up a little for this post. You can download the C source code here: print_pdf_offset.c.

Just compile the utility (e.g. via make print_pdf_offset if you have a command line build system installed) and provide the filename of a PDF file on the command line:

print_pdf_offset test.pdf

This will then print the PDF file with the byte offsets (and this time this is the true byte offset, so no adjustments necessary) to stdout. Here is an example of the output:

00000317: 6 0 obj
00000325: <</Length 44>>
00000340: stream
00000347: BT /F1 24 Tf 175 720 Td (Hello World!)Tj ET
00000391: endstream
00000401: endobj
00000408: xref
00000413: 0 7
00000417: 0000000000 65535 f
00000436: 0000000009 00000 n
00000455: 0000000056 00000 n
00000474: 0000000111 00000 n
00000493: 0000000212 00000 n
00000512: 0000000250 00000 n
00000531: 0000000317 00000 n
00000550: trailer <</Size 7/Root 1 0 R>>
00000581: startxref
00000591: 406
00000595: %%EOF

By looking at the XREF table, it’s pretty obvious that there is something wrong with this file: The XREF entries are not 20 bytes long, they are each one byte short. This probably means that the table was created on a system that uses CRLF as line endings, and I just pasted it into an editor on a Mac that only uses LF – hence the missing byte (this is not the real reason, I “broke” the file for demonstration purposes). Also, the startxref information points to byte offset 406, but the XREF table actually starts at location 408. No wonder this file gave me problems when trying to open it with Ghostscript. After fixing these two problems (adding a space after every XREF table entry and changing the 406 to 408) the file loads without any problem.
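Both checks can also be automated. The following is a rough sketch, not the utility mentioned above: the function name checkXref is made up, it assumes a simple file with a single classic XREF table with one subsection (no incremental updates, no XREF streams), and it expects the file content as a string with one character per byte.

```javascript
// Returns a list of problems found in the XREF table of `pdf`.
// `pdf` holds the file content with one character per byte.
function checkXref(pdf) {
  const problems = [];

  // startxref is followed by the byte offset of the xref keyword
  const tail = /startxref\s+(\d+)\s+%%EOF\s*$/.exec(pdf);
  if (!tail) return ["no startxref/%%EOF found at the end of the file"];
  const declared = Number(tail[1]);

  // Find the xref keyword at the start of a line (startxref does not match)
  const actual = pdf.search(/^xref\s/m);
  if (actual < 0) return ["no xref keyword found"];
  if (declared !== actual) {
    problems.push(`startxref says ${declared}, but xref starts at ${actual}`);
  }

  // Subsection header: first object number and number of entries
  const header = /xref\s+(\d+) (\d+)\s+/y;
  header.lastIndex = actual;
  const h = header.exec(pdf);
  if (!h) return problems.concat("cannot read the xref subsection header");
  const first = Number(h[1]);
  const count = Number(h[2]);

  // Read the entries one after the other and measure each one
  const entry = /(\d{10}) (\d{5}) ([nf])( ?\r?\n| ?\r)/y;
  entry.lastIndex = header.lastIndex;
  for (let i = 0; i < count; i++) {
    const num = first + i;
    const e = entry.exec(pdf);
    if (!e) {
      problems.push(`entry for object ${num} is malformed`);
      break;
    }
    if (e[0].length !== 20) {
      problems.push(`entry for object ${num} is ${e[0].length} bytes long, expected 20`);
    }
    // In-use entries must point at the matching "N G obj" line
    if (e[3] === "n" && !pdf.startsWith(`${num} ${Number(e[2])} obj`, Number(e[1]))) {
      problems.push(`object ${num} not found at offset ${Number(e[1])}`);
    }
  }
  return problems;
}
```

In Node.js, the input could be read with fs.readFileSync("test.pdf", "latin1"), which keeps one character per byte. An empty result only means that these particular checks passed, not that the file is valid.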

By the way: The corrupt file loads without any error message in Adobe Acrobat or the Adobe Reader: The only indication that something is wrong is that Acrobat wants to save the file when it’s closed – because it got repaired behind the scenes. Chances are that there is a popup informing the user that the file is being repaired, but with such a small file and a fast computer, it’s gone before the user is able to register it.


Validating Field Contents

One of the questions I get asked again and again is how to validate a field value in an AcroForm with a custom validation script. Adobe provided a lot of infrastructure to do that with just a simple script.

Let’s take a look at how to do that with a text field that is only supposed to have a value of either ‘AAAA’ or ‘BBBB’ (yes, I know that this does not make much sense in a real PDF form). So, if the user enters ‘01234’ we should see an error message that would instruct the user about what type of data is valid for this field.

To start, we create a text field and bring up the properties dialog for the field. Then we select the “Validate” tab to see the validation options:

Screenshot of the “Validate” tab in the field properties dialog.

The default is that the field will not get validated. For numeric fields, there is a convenient way to validate a value range, but we want to select to run a custom validation script. After the “Edit” button is clicked, a new window will open that allows us to edit the new script:

Screenshot of the script editor window with the validation script.

To make things easier to copy and paste, here is the script again:

event.rc = true;
if (event.value != "" && event.value != "AAAA" && event.value != "BBBB")
{
    app.alert("The entered value needs to be either 'AAAA' or 'BBBB'!");
    event.rc = false;
}

This script also includes a check for an empty string, so that the user can wipe out a wrong string and start from scratch.

As I mentioned before, information is passed to the validation function in the event object, and in the code we see that the member ‘value’ is used to communicate the current value of the field. The member ‘rc’ (or return code) is used to communicate back if the validation was successful or not. In the latter case, we set rc to false, and also display an error message.

When you play around with the function, you’ll notice that the validation function is only called when the focus leaves the field, so you have to click outside of the field to actually make that error message pop up. In that case, the previous value of the field is restored, and the user has to enter the data again.

This is not always desired (for more complicated data, it will probably be much easier to take a look, correct that one typo and continue with the rest of the form), so my preference is actually to mark the field so that the user knows which field needs to be corrected, and have the validation script not report a validation error back to the field:

event.rc = true;
if (event.value != "" && event.value != "AAAA" && event.value != "BBBB")
{
    app.alert("The entered value needs to be either 'AAAA' or 'BBBB'!");
    event.target.textColor = color.red;
}
else
{
    event.target.textColor = color.black;
}

Using this method has implications on the form submission process: The form no longer can verify that the data is correct, so the submission function needs to do another round of validation to see if any of the required fields are not correct (one way to do that is to test all relevant fields to see if the text color is using the error color, or we can use global variables to store the validation state).
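Such a second round of validation could look roughly like the following sketch. Instead of comparing text colors, it simply applies the same rule to the field values again. The function names and the field names in the usage note are assumptions made up for this illustration; only getField and the value property come from the Acrobat JavaScript API.

```javascript
// Same rule as in the field validation script:
// empty, "AAAA" or "BBBB" are acceptable
function isAllowedValue(value) {
    return value === "" || value === "AAAA" || value === "BBBB";
}

// Returns the names of all listed fields whose current value breaks
// the rule. `doc` is the document object (`this` in a document script).
function findInvalidFields(doc, fieldNames) {
    var invalid = [];
    for (var i = 0; i < fieldNames.length; i++) {
        var field = doc.getField(fieldNames[i]);
        // getField returns null for unknown names; skip those
        if (field !== null && !isAllowedValue(String(field.value))) {
            invalid.push(fieldNames[i]);
        }
    }
    return invalid;
}
```

In the action of a submit button, something like findInvalidFields(this, ["Text1", "Text2"]) could then be called, and the form would only be submitted if the returned list is empty. For a required field, the empty string would probably have to be rejected as well.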

Another thing I like to do is to display the validation error message on the form in an otherwise hidden field: The problem with our last solution is that if the user saves a partially filled form, and picks it up at a later time, that error message that popped up is long gone, and the only indication that there is something wrong with the form is the modified field color. So, having a text field contain that error message might be a good idea.

There are other ways to highlight the field in question besides changing the text color: the border color or the fill color could be changed instead, or in addition. Just make sure that you are not making the form impossible to read.

To learn more about the event object, take a look at http://livedocs.adobe.com/acrobat_sdk/10/Acrobat10_HTMLHelp/JS_API_AcroJS.88.560.html – make sure to click on the button in the upper left corner to display the navigation pane if it’s not shown automatically.


Nastyware Claiming to be Adobe Reader

I just received an interesting email, claiming to come from Adobe. This is the first time I’ve seen such a sneaky attempt to get me to download something to my computer that could do some pretty nasty stuff to me, my computer or my online reputation.

Take a look at this screen shot:

Screenshot of the email.

The email seems to come from Adobe Systems Incorporated, but when I dug one level deeper, it turned out to be from Adobe@news.protours.de – not an Adobe address I am familiar with. I also clicked on the download links – I use a virtual machine that can easily be reset for such potentially dangerous clicks – and I ended up on a server in the former Soviet Union (.su) – always a good indication that whatever a web site claims is probably not what you are going to get.

So, if you receive such an email, don’t click on the link. There is only one source for safe upgrades of the free Adobe Reader: Adobe’s own site at http://get.adobe.com/reader/
