Friday, June 27, 2014

Apache PDFBox

With two kids to manage, I am always looking out for interesting activities to keep them occupied so they don't tear down the house. The elder one is now taking an interest in reading and writing, so I download worksheets for her, print them out and hand them to her. I need to print 20-30 pages for her every day, so I wrote a small program to merge PDFs by directory, name, etc. I had used iText before, but I found PDFBox even easier to use.

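// Imports as packaged in PDFBox 1.8.x
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.util.PDFMergerUtility;

// Merges the given PDFs, in array order, into a single destination file.
public void mergeFiles(File[] filesInFolder, String destinationFile)
        throws COSVisitorException, IOException {
    PDFMergerUtility mergePdf = new PDFMergerUtility();
    for (File file : filesInFolder) {
        mergePdf.addSource(file);
    }
    mergePdf.setDestinationFileName(destinationFile);
    mergePdf.mergeDocuments();
}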

List all files in a folder that have a .pdf extension:

class PDFExtFilter implements FilenameFilter {
    private String ext;

    public PDFExtFilter(String ext) {
        this.ext = ext;
    }

    public boolean accept(File dir, String name) {
        return name.endsWith(ext);
    }
}

private static final String ext = ".pdf";

private File[] getFiles(String folder, PDFExtFilter filter) {
    File dir = new File(folder);
    File[] filesInFolder = dir.listFiles(filter);
    // listFiles returns null when the path is not a readable directory
    return filesInFolder != null ? filesInFolder : new File[0];
}
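
Since I merge by name, the order of the files matters, and File.listFiles makes no promise about the order it returns them in. Here is a minimal sketch of how the listing could be sorted before merging. It assumes Java 8 and uses only the JDK. The names PdfFileLister and listPdfsSortedByName are made up for this example and are not part of my program above. It also matches the extension case-insensitively, which the filter above does not.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

class PdfFileLister {

    // Returns the PDF files in `folder`, sorted by file name (ignoring case).
    // Returns an empty array if `folder` is not a readable directory.
    static File[] listPdfsSortedByName(File folder) {
        // Two-argument lambda, so this resolves to the FilenameFilter overload
        File[] files = folder.listFiles(
                (dir, name) -> name.toLowerCase().endsWith(".pdf"));
        if (files == null) {
            return new File[0];
        }
        Arrays.sort(files,
                Comparator.comparing(File::getName, String.CASE_INSENSITIVE_ORDER));
        return files;
    }

    public static void main(String[] args) {
        // Folder comes from the first argument, or the current directory
        File folder = new File(args.length > 0 ? args[0] : ".");
        for (File f : listPdfsSortedByName(folder)) {
            System.out.println(f.getName());
        }
    }
}
```

The sorted array can then be handed to mergeFiles as it is, so the pages come out in file-name order.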

When files are encrypted, we need to decrypt them first and then ask PDFBox to merge them:

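// Removes encryption, in place, from every encrypted PDF in `files`.
// Only works for files that open with an empty user password.
public void decrypt(String folder, File[] files)
        throws IOException, CryptographyException, COSVisitorException {
    for (File file : files) {
        PDDocument doc = null;
        try {
            doc = PDDocument.load(file);
            if (doc.isEncrypted()) {
                doc.decrypt("");
                doc.setAllSecurityToBeRemoved(true);
                doc.save(file);
            }
        } finally {
            // doc is still null if load() threw, so guard the close
            if (doc != null) {
                doc.close();
            }
        }
    }
}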