Thursday, November 24, 2011

Convert PDF to image via PDFBox

Recently I was asked to generate an image from a PDF file.

In this post I'll use the Apache PDFBox project as the PDF-to-image converter.

Convert PDF page into image
I'll show two samples; complete documentation of the possible options and their default values can be found in the PDFBox command-line tools documentation.

// Requires: import org.apache.pdfbox.PDFToImage; (PDFBox 1.x package)
String pdfPath = "/path/to/file.pdf";

// config option 1: convert the whole document to images
String[] args_1 = {
    "-outputPrefix", "my_image_1",
    pdfPath
};

// config option 2: convert only page 1 of the PDF to an image
String[] args_2 = {
    "-startPage", "1",
    "-endPage", "1",
    "-outputPrefix", "my_image_2",
    pdfPath
};

try {
    // writes JPEG files whose names start with "my_image_1"
    PDFToImage.main(args_1);
    // writes a JPEG file whose name starts with "my_image_2"
    PDFToImage.main(args_2);
} catch (Exception e) {
    // logger is assumed to be an existing logger in the enclosing class
    logger.error(e.getMessage(), e);
}
And that's it. As simple as that.
The output image quality is very good, and it also includes text that was created by JavaScript in the PDF.
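
If you build these argument arrays in more than one place, a small helper can keep the flags and the array size in sync. Here is a minimal sketch of that idea. PdfToImageArgs and build are hypothetical names of my own, not part of PDFBox, and the helper only knows the three flags used in the samples above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of PDFBox): builds the String[] that the
// samples above pass to PDFToImage.main.
class PdfToImageArgs {

    /**
     * @param pdfPath      path of the PDF to convert
     * @param outputPrefix prefix for the generated image file names
     * @param startPage    first page to convert (1-based), or null to omit the flag
     * @param endPage      last page to convert (1-based), or null to omit the flag
     */
    static String[] build(String pdfPath, String outputPrefix,
                          Integer startPage, Integer endPage) {
        List<String> args = new ArrayList<>();
        if (startPage != null) {
            args.add("-startPage");
            args.add(String.valueOf(startPage));
        }
        if (endPage != null) {
            args.add("-endPage");
            args.add(String.valueOf(endPage));
        }
        args.add("-outputPrefix");
        args.add(outputPrefix);
        args.add(pdfPath); // the PDF path goes last, as in the samples above
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Same two configurations as in the post, printed instead of executed.
        System.out.println(Arrays.toString(
                build("/path/to/file.pdf", "my_image_1", null, null)));
        System.out.println(Arrays.toString(
                build("/path/to/file.pdf", "my_image_2", 1, 1)));
        // With PDFBox on the classpath, either array could be handed to
        // PDFToImage.main(...) exactly like args_1 and args_2.
    }
}
```

Using a list and converting it at the end means there is no hand-counted array length to get wrong when a flag is added or removed.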

8 comments :

  1. This is a nice article, but I think it would be better if the image were saved to a BufferedImage instead of the local hard disk.

    -Hemant

  2. Where do I specify where the images are saved? Or do I set the Java working directory?

    Replies
    1. By default it is set to the Java working directory. I found that out when I ran a Windows search for the image.

  3. I am running PDFToImage on my Windows box. I have included pdfbox-app-1.8.2.jar, and the text in the output images comes out as garbage. Am I missing any dependent JARs or any settings?

    Thanks
    Suresh K

  4. Try Aspose.PDF for Java for converting PDF to image in Java and vice versa. It's not a free library, but you can try its free trial like I did. After getting perfect results I purchased one of their packages, and it was worth it.

  5. Replies
    1. Well then your use case is different enough not to apply.
