I’m working with a bunch of PDF files, some of which have been scanned at a bit of an angle. Adobe Acrobat allows me to rotate PDF files by 90 or 180 degrees. But is there a way to rotate a PDF just a few degrees – just enough to make it straighter?
I could perhaps take a screenshot, open it in Photoshop and rotate it, then somehow convert the Photoshop file to a PDF. However, that seems like a really clumsy process.
7 Answers
I had this problem at one time. I don’t know how many pages you have.
What I did was print the pages that were off, use a paper cutter to square them up, and rescan them. Hope this helps.
And yes, I’ve tried to find some type of program to fix this, and I still have not found one.
For complete pages, PDF only supports /Rotate values that are multiples of 90 degrees, because that is (of course) simple. What you need to do is rotate the contents, not the page, so you need to use something which can remake the PDF file for you.

You could use either Ghostscript or MuPDF to do this. Either will require some coding:

Using Ghostscript, you would need to define a BeginPage procedure which rotates the content by a small amount and moves the origin of the content slightly as well (because the rotation rotates around the origin, which is at the bottom left, not the centre).

Here is a short utility script for rotating pages (written in Perl). It converts each page of the input PDF to a PDF XObject Form, rotates the form, then outputs the rotated page.
You’ll need to ensure the PDF::API2 and Getopt::Long modules are installed from CPAN.

By default the script rotates 3 degrees anticlockwise; this is configurable via the --rotate option.

There are also -x, -y and --scale options to allow fine adjustments of the positioning and scale of the output pages.

This question has also been asked on unix.stackexchange.com.
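The origin shift mentioned in the Ghostscript answer above can be worked out numerically. This is only a sketch: the function name centre_offset is made up for illustration, and it assumes page sizes in points and an anticlockwise angle in degrees. It prints the translation needed so that a rotation about the bottom-left origin behaves like a rotation about the page centre.

```shell
# centre_offset WIDTH HEIGHT DEGREES  (hypothetical helper, not part of any tool)
# Rotating about the centre c is the same as rotating about the origin and then
# shifting by t = c - R(theta) * c, where R is the usual anticlockwise rotation.
centre_offset() {
  awk -v w="$1" -v h="$2" -v deg="$3" 'BEGIN {
    pi = atan2(0, -1)
    t  = deg * pi / 180          # angle in radians
    cx = w / 2; cy = h / 2       # page centre
    tx = cx * (1 - cos(t)) + cy * sin(t)
    ty = cy * (1 - cos(t)) - cx * sin(t)
    printf "%.2f %.2f\n", tx, ty
  }'
}

# US Letter (612 x 792 pt) rotated 3 degrees anticlockwise.
# In PostScript terms the result would be used as: tx ty translate 3 rotate
centre_offset 612 792 3
```

For small angles the shift is roughly the half-diagonal of the page times the angle in radians, which is why even a 3 degree rotation needs a visible correction.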
Another option is using LaTeX:
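The original listing is not reproduced here, so the following is only a guess at what such a file might contain. It assumes the pdfpages package, which as far as I know passes angle through to \includegraphics; a negative angle rotates clockwise. The file names are the ones used in this answer.

```shell
# Write a minimal rotated.tex (assumed content, not the answerer's original file).
cat > rotated.tex <<'EOF'
\documentclass{article}
\usepackage{pdfpages}
\begin{document}
% Negative angles rotate clockwise; 1.5 degrees matches the example in the text.
\includepdf[pages=-,angle=-1.5]{odd-scan.pdf}
\end{document}
EOF

# Only compile when pdflatex and the scan are actually present.
if command -v pdflatex >/dev/null 2>&1 && [ -f odd-scan.pdf ]; then
  pdflatex rotated.tex
fi
```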
In this case, I have the file odd-scan.pdf (a slightly rotated one-page scan) in the same folder as the LaTeX file rotated.tex with the content above, and then I run pdflatex rotated.tex. The output is a file rotated.pdf with the PDF rotated by 1.5 degrees clockwise.

(I assume a *nix-style environment. On Windows, you can follow these instructions in Cygwin, although I think you might have to build MuPDF from source there as it doesn’t appear to be in the Cygwin repos. If you don’t want to do that and you’re okay with rasterizing the PDF, ImageMagick is in the Cygwin repos and can do the whole job if needed; see below.)
MuPDF’s mutool utility can do this. Say you have a PDF file rotate_me.pdf and you want a version of it rotated by 20° clockwise written to a file rotated.pdf:
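The command itself did not survive here, so this is a sketch of how it could look. It relies on mutool draw’s -R option (clockwise rotation in degrees) and -o for the output file; the wrapper function rotate_pdf is a made-up name for illustration.

```shell
# rotate_pdf ANGLE INPUT.pdf OUTPUT.pdf  (hypothetical wrapper around mutool draw)
# -R rotates clockwise by ANGLE degrees; the output format is taken from the
# extension of the -o file name.
rotate_pdf() {
  mutool draw -R "$1" -o "$3" "$2"
}

# Only run when the example input from the text is actually present.
if [ -f rotate_me.pdf ]; then
  rotate_pdf 20 rotate_me.pdf rotated.pdf
fi
```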
(mutool draw docs)

You can also rasterize the PDF using mutool convert, work with the image files, and then create a new PDF from them (this assumes rotate_me.pdf has between a hundred and a thousand pages; edit the %3d to your liking):
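A possible form of that rasterization step, as a sketch. The output name page-%3d.png and the 300 DPI resolution are assumptions of mine; the %3d placeholder is the one the text refers to, and pdf_to_pngs is a made-up wrapper name.

```shell
# pdf_to_pngs INPUT.pdf PATTERN  (hypothetical wrapper around mutool convert)
# PATTERN contains a %d-style placeholder that mutool fills with the page number.
# -O resolution=300 renders at 300 DPI (an assumed value; pick what suits the scan).
pdf_to_pngs() {
  mutool convert -O resolution=300 -o "$2" "$1"
}

# Only run when the example input from the text is actually present.
if [ -f rotate_me.pdf ]; then
  pdf_to_pngs rotate_me.pdf 'page-%3d.png'
fi
```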
(mutool convert docs)

Once you’ve done whatever else you need to do to the image files and you’re ready to turn them back into a PDF, you can use ImageMagick:
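As a sketch of that last step, assuming the page images are PNGs named page-*.png (my assumption) and that ImageMagick 7’s magick convert is available. The function name images_to_pdf and the output name combined.pdf are made up for illustration.

```shell
# images_to_pdf OUTPUT.pdf IMAGE...  (hypothetical wrapper around magick convert)
# Each input image becomes one page, in the order given.
images_to_pdf() {
  out=$1
  shift
  magick convert "$@" "$out"
}

# The shell expands the glob in sorted order, so padded page numbers keep
# the pages in sequence. Only run when matching images exist.
if ls page-*.png >/dev/null 2>&1; then
  images_to_pdf combined.pdf page-*.png
fi
```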
(If you get an error saying the security policy for PDFs doesn’t permit this, you may need to edit /etc/ImageMagick-7/policy.xml and comment out or remove the <policy domain="coder" rights="none" pattern="PDF" /> line. Be aware of this Ghostscript pre-v9.24 vulnerability which that security policy may be intended to mitigate. If you’re working with files you made yourself, you should be safe here, but you may want to re-enable this policy afterwards depending on your needs and environment. If you’re not working with files you made yourself, especially PDFs, be careful, whether you have a pre-v9.24 Ghostscript installed or not. PDF as a format is very complex and offers many different places to squirrel away maliciousness, and practically speaking you can never be 100% confident that the software you’re using to work with it is perfectly hardened.)

ImageMagick can also rasterize PDFs on its own, although it’s a bit more complicated. For example:
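Something along these lines, as a sketch rather than the exact original command. The 300 DPI density and the wrapper name rasterize_and_rotate are my assumptions; -density has to come before the input file to affect how the PDF is rasterized, and -rotate takes degrees (positive should be clockwise in ImageMagick).

```shell
# rasterize_and_rotate ANGLE INPUT.pdf OUTPUT.pdf  (hypothetical wrapper)
# -density before the input sets the rasterization DPI; -rotate then turns
# each rasterized page by ANGLE degrees before it is written to the output PDF.
rasterize_and_rotate() {
  magick convert -density 300 "$2" -rotate "$1" "$3"
}

# Only run when the example input from the text is actually present.
if [ -f rotate_me.pdf ]; then
  rasterize_and_rotate 20 rotate_me.pdf rotated.pdf
fi
```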
This might look similar to the mutool draw command, but the difference is that ImageMagick will rasterize the input PDF and then use the resulting images to make the output PDF, so you can use all the regular ImageMagick transformations with this command.

Anyway, -density is for DPI. It will default to 72 DPI if you don’t pass that argument, which is likely to not look very good. Also, ImageMagick doesn’t seem to be quite as smart as MuPDF about margins and things like that as far as PDFs go, so you may need to do more work with it than this to get reasonable output for your use case. If you do have access to both MuPDF and ImageMagick, I think doing the rasterization with MuPDF and then doing further work on the resulting images with ImageMagick tends to give the nicest results with the least work, but of course that may or may not be practical for you.

(magick convert docs)

Rasterization has obvious disadvantages if your PDF is vector-based: increased file size, fixed resolution, loss of flexibility, etc. Also, even if your PDF is already storing raster graphics, you may lose text data or the like from it in the conversion. If the PDF is really horrible, though, sometimes this is the least painful approach. Once you’ve cleaned it up, you can OCR it with Tesseract if needed, often with superior results to whatever may have been done before you arrived.
This can be done with cpdf:

cpdf -rotate-contents 5 in.pdf -o out.pdf

(Rotates the contents around the centre of the page by five degrees.)
On Linux (and maybe on Windows via Cygwin?), unpaper has a content deskew function that uses edge detection to straighten content. The PDF has to be converted to an image file first, such as PPM. NOTE: If the PDF is multi-page, this conversion will generate one output file per page.
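One way to do that conversion, as a sketch, is pdftoppm from poppler-utils. That tool choice, the 300 DPI value, the in.pdf name and the page prefix are all my assumptions, and pdf_to_ppm is a made-up wrapper name.

```shell
# pdf_to_ppm INPUT.pdf PREFIX  (hypothetical wrapper around pdftoppm)
# Renders every page at 300 DPI to PREFIX-<page number>.ppm.
pdf_to_ppm() {
  pdftoppm -r 300 "$1" "$2"
}

# Only run when the input file is actually present.
if [ -f in.pdf ]; then
  pdf_to_ppm in.pdf page
fi
```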
Then use unpaper with the default settings:
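For example, a small loop over the page images. This assumes they are named page-*.ppm in the current directory (my assumption); deskew_all is a made-up helper name, and unpaper is called with just an input and an output file so that its defaults apply.

```shell
# deskew_all PREFIX  (hypothetical helper)
# Runs unpaper with default settings on every PREFIX-*.ppm in the current
# directory and writes the result to deskewed-PREFIX-*.ppm.
deskew_all() {
  for f in "$1"-*.ppm; do
    [ -e "$f" ] || continue   # nothing matched: skip the unexpanded pattern
    unpaper "$f" "deskewed-$f"
  done
}

deskew_all page
```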
If the content is quite heavily skewed, then arguments like --deskew-scan-range may help; see the docs and man page.

And lastly, convert back to PDF. Here is an example:
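A sketch using ImageMagick, which is my choice of tool here and not necessarily what the answer used. It assumes the deskewed pages are named deskewed-page-*.ppm and were rendered at 300 DPI. Passing -density 300 before the inputs should keep the PDF page size in line with that resolution, but check the result. The name deskewed_to_pdf is made up.

```shell
# deskewed_to_pdf OUTPUT.pdf IMAGE...  (hypothetical wrapper around magick convert)
# -density 300 tags the raster images as 300 DPI so the PDF pages come out at
# the physical size of the original scan (assumed to match the rendering DPI).
deskewed_to_pdf() {
  out=$1
  shift
  magick convert -density 300 "$@" "$out"
}

# Only run when deskewed page images exist.
if ls deskewed-page-*.ppm >/dev/null 2>&1; then
  deskewed_to_pdf out.pdf deskewed-page-*.ppm
fi
```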