Friday, March 22, 2013

freely convert PDFs to PDF/A using ghostscript-9.07

Here is an updated, better way to freely convert PDFs to PDF/A using ghostscript-9.07.


Convert annots.pdf to AnnotsPDFA.pdf

> /home/fausser/ghostscript-9.07/bin/gs -sDEVICE=pdfwrite -q -dNOPAUSE -dBATCH -dNOSAFER -dPDFA -dUseCIEColor -sProcessColorModel=DeviceCMYK -sOutputFile=AnnontsPDFA.pdf annots.pdf
GPL Ghostscript 9.07: Annotation set to non-printing,
 not permitted in PDF/A, reverting to normal PDF output
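
The warning refers to the annotation's /F flags entry, which is a bit field in which bit 3 (value 4) is the Print flag. PDF/A-1 wants Print set and Invisible, Hidden and NoView cleared. The following is only a sketch of that bit arithmetic; the class and method names are made up for illustration and are not part of Ghostscript or PDFBox.

```java
// Annotation flag bits of the /F entry (bit 1 is the lowest bit).
// AnnotationFlags and forPdfA are hypothetical names used only for this sketch.
final class AnnotationFlags {
    static final int INVISIBLE = 1;       // bit 1
    static final int HIDDEN    = 1 << 1;  // bit 2
    static final int PRINT     = 1 << 2;  // bit 3
    static final int NO_VIEW   = 1 << 5;  // bit 6

    private AnnotationFlags() {}

    // Returns the flags with Print set and Invisible, Hidden and NoView cleared;
    // every other bit is left untouched.
    static int forPdfA(int flags) {
        return (flags | PRINT) & ~(INVISIBLE | HIDDEN | NO_VIEW);
    }

    public static void main(String[] args) {
        System.out.println(forPdfA(0));                 // prints 4
        System.out.println(forPdfA(HIDDEN | NO_VIEW));  // prints 4
    }
}
```

An annotation with no flags at all (value 0) is therefore "non-printing" as far as Ghostscript is concerned, which is why untouched link annotations trigger the warning.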


 The annotations' print flag needs to be set first, which can be done with a small Java program. Here is the code listing:
> cat FixPrintFlag.java

//package Utilities;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink;

import java.util.List;

// Sets the print flag on every link annotation of a PDF (PDFBox 1.x API).
public class FixPrintFlag
{
    private FixPrintFlag()
    {
        //utility class, should not be instantiated.
    }

    public static void main( String[] args ) throws Exception
    {
        // Bail out before touching args[1] or the document.
        if( args.length != 2 )
        {
            usage();
            return;
        }
        PDDocument doc = null;
        try
        {
            doc = PDDocument.load( args[0] );
            List allPages = doc.getDocumentCatalog().getAllPages();
            for( int i = 0; i < allPages.size(); i++ )
            {
                PDPage page = (PDPage)allPages.get( i );
                List annotations = page.getAnnotations();
                for( int j = 0; j < annotations.size(); j++ )
                {
                    PDAnnotation annot = (PDAnnotation)annotations.get( j );
                    if( annot instanceof PDAnnotationLink )
                    {
                        PDAnnotationLink link = (PDAnnotationLink)annot;
                        link.setPrinted( true );
                        System.out.println( "setting print flag..." );
                    }
                }
            }
            doc.save( args[1] );
        }
        catch( Exception ex )
        {
            System.err.println( "Error processing pdf: " + ex.getMessage() );
        }
        finally
        {
            // Always release the document, even if loading or saving failed.
            if( doc != null )
            {
                doc.close();
            }
        }
    }

    private static void usage()
    {
        System.err.println( "Usage: java FixPrintFlag <input-pdf> <output-pdf>" );
    }
}

To compile it, use Apache's pdfbox.jar and commons-logging.jar:

javac -cp /home/fausser/Fixflag/pdfbox.jar:/home/fausser/Fixflag/commons-logging.jar FixPrintFlag.java


 
 And run it:

> java -cp /home/fausser/Fixflag/pdfbox.jar:/home/fausser/Fixflag/commons-logging.jar:. FixPrintFlag annots.pdf annots_out.pdf
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
setting print flag...
>
Trying to convert again:
> /home/fausser/ghostscript-9.07/bin/gs -sDEVICE=pdfwrite -q -dNOPAUSE -dBATCH -dNOSAFER -dPDFA -dUseCIEColor -sProcessColorModel=DeviceCMYK -sOutputFile=/home/fausser/Fixflag/AnnontsPDFA.pdf /home/fausser/Fixflag/annots_out.pdf

The result is not a PDF/A yet: it does not verify as one in Adobe Acrobat Pro 11.x. It needs an OutputIntent, which is added by using PDFA_def.ps:

cat /home/fausser/ghostscript-9.07/lib/PDFA_def.ps

%!
% This is a sample prefix file for creating a PDF/A document.
% Feel free to modify entries marked with "Customize".

% This assumes an ICC profile to reside in the file (ISO Coated sb.icc),
% unless the user modifies the corresponding line below.

% Define entries in the document Info dictionary :

/ICCProfile (/home/fausser/eciRGB_v2.icc)   % Customize.
def

[ /Title (Title)                  % Customize.
  /DOCINFO /home/fausser/pdfmark     %not used

% Define an ICC profile :

[/_objdef {icc_PDFA} /type /stream /OBJ pdfmark
[{icc_PDFA} <

> /PUT pdfmark
[{icc_PDFA} ICCProfile (r) file /PUT pdfmark

% Define the output intent dictionary :

[/_objdef {OutputIntent_PDFA} /type /dict /OBJ pdfmark
[{OutputIntent_PDFA} <<
  /Type /OutputIntent             % Must be so (the standard requires).
  /S /GTS_PDFA1                   % Must be so (the standard requires).
  /DestOutputProfile {icc_PDFA}            % Must be so (see above).
  /OutputConditionIdentifier (CGATS TR001)      % Customize
>> /PUT pdfmark
[{Catalog} <</OutputIntents [ {OutputIntent_PDFA} ]>> /PUT pdfmark




The gs command using PDFA_def.ps (the prefix file goes before the input PDF):
> /home/fausser/ghostscript-9.07/bin/gs -sDEVICE=pdfwrite -q -dNOPAUSE -dBATCH -dNOSAFER -dPDFA -dUseCIEColor -sProcessColorModel=DeviceCMYK -sOutputFile=/home/fausser/Fixflag/AnnontsPDFA.pdf /home/fausser/ghostscript-9.07/lib/PDFA_def.ps  /home/fausser/Fixflag/annots_out.pdf
[fausser@sally Fixflag]$
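
If the conversion has to run from a Java program rather than from the shell, the same invocation can be assembled as an argument list and handed to ProcessBuilder. This is only a sketch: the class name GsPdfaCommand and its build method are made up for illustration, the flags are simply the ones used in this post, and all four paths are supplied by the caller.

```java
import java.util.ArrayList;
import java.util.List;

// GsPdfaCommand is a hypothetical helper, not part of Ghostscript.
final class GsPdfaCommand {
    private GsPdfaCommand() {}

    // Builds the gs argument list used in this post.
    static List<String> build(String gsPath, String defPs, String input, String output) {
        List<String> cmd = new ArrayList<String>();
        cmd.add(gsPath);
        cmd.add("-sDEVICE=pdfwrite");
        cmd.add("-q");
        cmd.add("-dNOPAUSE");
        cmd.add("-dBATCH");
        cmd.add("-dNOSAFER");
        cmd.add("-dPDFA");
        cmd.add("-dUseCIEColor");
        cmd.add("-sProcessColorModel=DeviceCMYK");
        cmd.add("-sOutputFile=" + output);
        cmd.add(defPs);   // prefix file first ...
        cmd.add(input);   // ... then the PDF to convert
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 4) {
            System.err.println("Usage: java GsPdfaCommand <gs> <PDFA_def.ps> <input-pdf> <output-pdf>");
            return;
        }
        // inheritIO() lets any Ghostscript warning show up on the console.
        Process p = new ProcessBuilder(build(args[0], args[1], args[2], args[3]))
                .inheritIO()
                .start();
        System.exit(p.waitFor());
    }
}
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.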

Now it verifies as a PDF/A.
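
Short of opening the file in Acrobat, a rough way to see whether the OutputIntent made it into the output is to scan the raw bytes for the relevant names. This sketch assumes the document catalog is written uncompressed, which should hold for PDF 1.4-style output; OutputIntentCheck is a made-up name, and a positive result is in no way a PDF/A validation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// OutputIntentCheck is a hypothetical helper for a quick sanity check only.
final class OutputIntentCheck {
    private OutputIntentCheck() {}

    // True if the raw file mentions both an /OutputIntents entry and the
    // /GTS_PDFA1 subtype. A textual scan, not a PDF parser.
    static boolean mentionsPdfaOutputIntent(Path pdf) throws IOException {
        // ISO-8859-1 maps every byte to one char, so binary streams cannot break decoding.
        String raw = new String(Files.readAllBytes(pdf), StandardCharsets.ISO_8859_1);
        return raw.contains("/OutputIntents") && raw.contains("/GTS_PDFA1");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java OutputIntentCheck <pdf>");
            return;
        }
        System.out.println(mentionsPdfaOutputIntent(Paths.get(args[0])));
    }
}
```

A false result is a quick hint that the PDFA_def.ps prefix file was not picked up; a true result still needs a real validator.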

3 comments:

  1. The most recent version of pdfbox, 2.0.7, won't compile your Java code. I had to revert to the older 1.8.13 version. The error is:

    FixPrintFlag.java:33: error: no suitable method found for load(String)
    doc = PDDocument.load( args[0] );


    1. Even after I make this change, it still doesn't work. And even for files that don't have the Annotation problem, ghostscript doesn't create PDF/A that verifies with verapdf.
