PDF


Introduction

The Portable Document Format or PDF was developed by Adobe Systems in 1993. The main purpose of the effort was to develop a document format that would be independent of the application software, the operating system, and the computer hardware. When creating a new document, one usually uses an editing application such as a word processor, a paint program, a spreadsheet program, etc., all of which support formats that are designed for the editing process. In general, conversion to PDF is viewed as a last step in this document generation process; i.e., the dissemination of a final version of the document. Accordingly, most such editing applications have the capability of easily saving the document or drawing as a PDF file. A desirable characteristic of a final version format such as PDF is a limitation on the ability to change the document. Consequently, the PDF format tends to discourage easy editing once a document is converted to a PDF file.

Edit PDF files

A number of tools are available to the Linux user for editing PDF files. Unfortunately, no one tool is optimal for all PDF documents. Editing results can vary widely with factors such as the version of the PDF format being modified, the rendering engine, the options with which the original PDF file was created, the fonts used to create the document, and the system fonts available. Consequently, the user may have to try more than one approach to achieve the best results. A search in Synaptic lists many such utilities.

The following is a summary of the four most common solutions to PDF editing in Linux. It is recommended that when editing a PDF file, the user try each of these approaches to determine which provides the best rendering for a given document.

Solution 1: PDFedit

PDFedit is a native Linux application based on the xpdf library with a Qt 3.x-based Graphical User Interface (GUI) that is currently being ported to Qt 4. It is not a word processing-type editor as such, but rather is designed to allow direct editing access to the raw PDF file code. As a result, it is oriented toward the advanced user with knowledge of PDF file code constructs, and it supports extensive user-customized scripting based on the ECMAScript scripting language. For most users, this means there is a fairly steep learning curve to become proficient in the more advanced features. However, there are some basic GUI functions already installed for the casual user that are useful for small edits such as limited text changes or filling out form fields. PDFedit is not in the repos, but can be installed from the project download page. For more information, visit the PDFedit Wiki.

Solution 2: LibreOffice with PDF Import

LibreOffice (LO) is a complete open source office suite originally based on StarOffice. It comprises Writer (word processor), Draw (drawing program), Calc (spreadsheet program), Base (database program), and Impress (presentation editor). The LO suite is installed in antiX 15 and MX Linux by default. To edit PDF files with LO, install libreoffice-pdfimport from the repos. You can then open a PDF document like any other document, and it will open in LO Draw.

Solution 3: GIMP

Particularly useful when dealing with image PDFs such as page scans, maps, etc., GIMP will “Open” and “Export” PDF files. The default resolution is 100 dpi, which is OK for on-screen use, but for later printing you should consider a higher value such as 300 dpi.

Importing multipage PDFs is supported either as layers or as distinct images. While you can’t directly export a multipage PDF, you can export the layers as an .mng file (MNG, Multiple-image Network Graphics, is a multi-image relative of PNG) and use the ImageMagick convert command to convert that to a multipage PDF.
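The conversion step can be sketched as follows. This is only a minimal example, assuming ImageMagick is installed with MNG support and that the layers were exported from GIMP to a file such as pages.mng; the mng_to_pdf helper and the file names are illustrative and are not part of GIMP or ImageMagick.

```shell
#!/bin/sh
# Convert a GIMP-exported MNG file ($1) into a multipage PDF ($2).
# Assumption: ImageMagick's "convert" is on the PATH and can read MNG.
mng_to_pdf() {
    # Refuse to run if the input file does not exist.
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    # Each image in the MNG becomes one page of the PDF.
    convert "$1" "$2"
}

# Example (hypothetical file names):
#   mng_to_pdf pages.mng pages.pdf
```

On newer ImageMagick releases the same conversion may need to be invoked through the magick command instead of convert.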

Solution 4: Inkscape with PDF Tool Kit

Inkscape is an open source vector-based drawing program (similar to Illustrator or CorelDraw) that loads and saves a subset of the SVG (Scalable Vector Graphics) file format and can be installed using Synaptic.

Method A: Open PDF File Directly with Inkscape

PDF files can be opened directly with Inkscape, but only one page at a time. So, if you are working with a single-page document, you just open it, make the edits, and save the file back to PDF format. However, if you try to load a multi-page PDF document, Inkscape will ask which page of the document you want to open. You will need to open and save each page of the document as a separate PDF file, leaving you with an individual file for each page. To re-assemble these individual files into a single document, you will need to use PDF Toolkit (pdftk). Pdftk, which can be installed using Synaptic, is an open source set of tools for manipulating PDF documents. In this case, we use pdftk to re-assemble the pages of the document by entering the following command in a terminal…

  pdftk page1.pdf page2.pdf page3.pdf cat output merged_document.pdf

…where page1.pdf, page2.pdf, etc. are the names of the individual page files and merged_document.pdf is the name of the re-assembled output file.
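For documents with many pages, typing every file name is tedious, and a plain shell glob would place page10.pdf before page2.pdf. The following is a minimal sketch of one way around this, assuming the page files are named page1.pdf, page2.pdf, and so on, that the names contain no spaces, and that GNU sort (with its -V option) is available. The list_pages and merge_pages helpers are illustrative names, not part of pdftk.

```shell
#!/bin/sh
# List the page files in directory $1 in natural numeric order,
# so that page2.pdf comes before page10.pdf.
list_pages() {
    ls "$1"/page*.pdf | sort -V
}

# Merge all page files found in directory $1 into the output file $2.
merge_pages() {
    # Word splitting of the list is intentional: one file name per word.
    pdftk $(list_pages "$1") cat output "$2"
}

# Example (hypothetical paths):
#   merge_pages . merged_document.pdf
```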

Method B: Convert PDF File to SVG Before Opening

Sometimes a PDF document opened directly with Inkscape will not be rendered correctly. In such a case, you should try converting the document to SVG before opening it with Inkscape. To do this, you will need to install pdf2svg using Synaptic. Pdf2svg is a small command line utility that can split up a PDF document into individual SVG page files. To do this, you enter the following command sequence into the console command line interface…

   pdf2svg input.pdf output_page%d.svg all

…where input.pdf is the PDF document and output_page is the base name for the page files. The %d is replaced by the page number in each file name. Once converted, the files can be edited with Inkscape, saved as PDF, and re-assembled with pdftk.

Reduce PDF file size

Method 1

Open the PDF with qpdfview, choose Print (CTRL-P), change the printer Name to Print to File (PDF), and specify the desired output file name and path.

Method 2

Use the gs (Ghostscript) command below in a terminal, replacing output.pdf and input.pdf with your filenames. The -dPDFSETTINGS= value can be set to any of the following presets (from lowest to highest resolution):

  • /screen
  • /ebook
  • /printer
  • /prepress

  gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -sOutputFile=output.pdf input.pdf
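Because the best preset depends on the document, it can help to run all four and compare the resulting file sizes. The following is a rough sketch of that idea, assuming Ghostscript is installed; the helper names preset_output_name and compare_presets are made up for this example.

```shell
#!/bin/sh
# Build an output name such as report-ebook.pdf from report.pdf and a preset.
preset_output_name() {
    printf '%s-%s.pdf\n' "${1%.pdf}" "$2"
}

# Compress the PDF given as $1 once per preset and print each result's size.
compare_presets() {
    for preset in screen ebook printer prepress; do
        out=$(preset_output_name "$1" "$preset")
        gs -dNOPAUSE -dBATCH -dQUIET -sDEVICE=pdfwrite \
           -dCompatibilityLevel=1.4 -dPDFSETTINGS=/"$preset" \
           -sOutputFile="$out" "$1" || return 1
        printf '%s\t%s bytes\n' "$out" "$(wc -c < "$out")"
    done
}

# Example (hypothetical file name):
#   compare_presets input.pdf
```

The original file is left untouched; each preset writes its own copy next to it, so the versions can be compared visually before one is kept.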

Insert/remove, rearrange, and merge pages

A very handy graphical application called PDFShuffler is available in the repos that enables easy management of pages and documents.

PDF forms

Create

You can use the Form Controls in LibreOffice to create a PDF form that users can fill out and then print or save. Click View > Toolbars > Form Controls to pull up the controls (text box, list, etc.), then click on the control you want to use and draw a box in your document. When done, click File > Export as PDF, where the option to create a PDF form will already be checked. See this extensive discussion for details.

Read

Sometimes the data filled into a PDF form cannot be seen or printed. One tool that usually solves this is PDF-XChange Viewer, which runs under Wine.

Insert comments

The default reader qpdfview has no way of making comments that other PDF readers can see, nor does the Adobe Reader for Linux. Two good options are available for doing this:

  • PDF-XChange Viewer is an application for Windows XP and up, and runs very well under Wine. It inserts comments and has many options.
  • Xournal is a native Linux application that is available from the repos. It is designed for note-taking, and can be used with a PDF.

Web to PDF

Many modern browsers, including the default Iceweasel, can install an extension called Web2PDF. Such extensions use a free online service with the same name. Note that the layout of the particular webpage you want to convert may affect the readability of the resulting PDF.

Scan to PDF

A very handy app for scanning to PDF, gscan2pdf, is installed by default in MX-14 and MX-15; it is also good as a general scanning application. See usage and tips in the main Wiki entry.

Encryption

To encrypt or decrypt a PDF file, install qpdf. It is a CLI program with extensive capabilities, which can be listed by opening a terminal and typing

qpdf --help

Details on decryption can be found in this document.
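As a starting point, the two most common operations might look like the sketch below. It assumes a qpdf version that supports 256-bit keys; the encrypt_pdf and decrypt_pdf wrappers and their argument order are invented for this example, while the qpdf options themselves (--encrypt, --decrypt, --password) are described in qpdf --help.

```shell
#!/bin/sh
# Encrypt a PDF. Usage: encrypt_pdf INPUT OUTPUT USER_PASSWORD OWNER_PASSWORD
encrypt_pdf() {
    [ "$#" -eq 4 ] || { echo "usage: encrypt_pdf IN OUT USER_PW OWNER_PW" >&2; return 2; }
    # 256 is the key length in bits; "--" ends the encryption options.
    qpdf --encrypt "$3" "$4" 256 -- "$1" "$2"
}

# Remove encryption from a PDF. Usage: decrypt_pdf INPUT OUTPUT PASSWORD
decrypt_pdf() {
    [ "$#" -eq 3 ] || { echo "usage: decrypt_pdf IN OUT PASSWORD" >&2; return 2; }
    qpdf --password="$3" --decrypt "$1" "$2"
}

# Examples (hypothetical file names and passwords):
#   encrypt_pdf input.pdf locked.pdf userpw ownerpw
#   decrypt_pdf locked.pdf unlocked.pdf userpw
```

Note that passwords given on the command line may be visible to other users of the same machine and may be stored in the shell history.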


v. 20170313
