Random Wisdom

Tag: file processing

Create links with absolute paths in Linux

by on Jan.16, 2010, under How To ..., Linux, Software

The default behaviour of the linking command (ln) is a little strange under certain circumstances. Since it stores the literal value of the target in the link, a symbolic link created with a relative target path breaks whenever the link is placed in a different directory. Consider the following:

$ ln -s targetfile ../src/targetfile_link

Without a doubt, ‘targetfile_link’ will be a broken symlink since it links to a target that it assumes is in the same directory:

$ cd ../src && ls -l targetfile_link
lrwxrwxrwx 1 mafgani mafgani 5 2010-01-16 18:19 targetfile_link -> targetfile

This is quite unfortunate since it clearly clashes with the way that the linking mechanism should work intuitively.

The solution is to force ln into automatically prepending the absolute path to the target files. This can be achieved by using a simple shell script that acts as a wrapper for the real linking command:


#!/bin/bash

# Step through the supplied arguments and prepend the absolute
# path to targets that exist
for ARG in "$@"; do
  if [ -e "$ARG" ]; then
    LNARGS="${LNARGS} ${PWD}/${ARG}";
  else
    LNARGS="${LNARGS} ${ARG}";
  fi
done

# Execute the actual link command with the modified args
exec /bin/ln ${LNARGS};

There are two known caveats:

  • The link is ‘sub-optimal’ if created from within the destination directory (the absolute path contains ‘../’s). It will still work, however.
  • The links will always be absolute. If that is undesirable, save the script as ‘absln’ or something other than ‘ln’.

Using ‘absln’ instead of ‘ln’ in the previously described scenario now produces a working symlink:

$ absln -s targetfile ../src/targetfile_link
$ cd ../src/ && ls -l targetfile_link
lrwxrwxrwx 1 mafgani mafgani 16 2010-01-16 19:13 targetfile_link -> /tmp/files/targetfile
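A possible refinement that addresses the first caveat is sketched below. It is not the script from this post: the function name absln and the cd/pwd trick for collapsing ‘../’ components are assumptions made for illustration, and the argument list is rebuilt in place so that names containing spaces survive.

```shell
#!/bin/sh
# absln: hypothetical variant of the wrapper (a sketch, not the original script).
absln() {
  n=$#                        # number of original arguments
  for ARG in "$@"; do
    if [ -e "$ARG" ]; then
      # cd into the containing directory and ask the shell for its path;
      # this yields an absolute path without any '../' components.
      ARG="$(cd "$(dirname -- "$ARG")" && pwd)/$(basename -- "$ARG")"
    fi
    set -- "$@" "$ARG"        # append the (possibly rewritten) argument
  done
  shift "$n"                  # drop the originals, keep the rewritten copies
  command ln "$@"
}
```

With this function loaded, running absln -s ../files/targetfile targetfile_link from inside the destination directory should produce a link whose target contains no ‘../’.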

Graphics format conversion

by on Dec.09, 2009, under LaTeX, Linux, Software

Up until now I have been using the ‘convert’ tool that comes with ImageMagick to switch between image formats, mainly for creating EPS files from JPG/PNG (raster format) files for use with LaTeX. Then I came across sam2p.

It is a light-weight utility that does one thing only and does it well: convert between image formats. I’ve been using it for a while now and find that it can greatly reduce file sizes with a minimal drop in quality. I’ve even used it to process existing EPS files just to get the reduction in file size. Best of all, it is multi-platform: executables are available for both Windows and Linux on the project homepage.

Goodbye convert and hello sam2p!


Printing multi-page duplex documents

by on Oct.18, 2009, under How To ..., Linux, Software

The psnup tool can be used to place multiple pages on each sheet of a document. E.g., the following command places two pages from the input file into each sheet of the output:

$ psnup -l -2 input.ps output.ps

While psnup is excellent for quick “N-up” conversion jobs, it doesn’t provide much control over the layout. The pstops utility, on the other hand, allows fine-grained scale, rotation and placement settings for each page that goes into a sheet of the output. The command syntax is a bit more complicated on account of the page specification strings that must now be provided. The following example shows a typical command needed to prepare a document for duplex printing with two pages on each side of a sheet:

$ pstops -pa4 \
  '4:0L@0.8(21cm,-1cm)+1L@0.8(21cm,12.55cm),2R@0.8(0,29.85cm)+3R@0.8(0,16.25cm)' \
  input.ps output.ps

The command is best understood by referring to the relevant section from the manpage:

       Pstops rearranges pages from a  PostScript  document,  creating  a  new
       PostScript  file.   The  input  PostScript file should follow the Adobe
       Document Structuring Conventions.  Pstops can  be  used  to  perform  a
       large  number  of  arbitrary  re-arrangements  of  Documents, including
       arranging for printing 2-up, 4-up, booklets, reversing, selecting front
       or back sides of documents, scaling, etc.

       pagespecs follow the syntax:

              pagespecs   = [modulo:]specs

              specs       = spec[+specs][,specs]

              spec        = [-]pageno[L][R][U][@scale][(xoff,yoff)]

       modulo is the number of pages in each block. The value of modulo should
       be greater than 0; the default value is 1.  specs are the page specifi-
       cations  for  the  pages in each block. The value of the pageno in each
       spec should be between 0 (for the first page in the block) and modulo-1
       (for  the  last page in each block) inclusive.  The optional dimensions
       xoff and yoff shift the page by the specified amount.   xoff  and  yoff
       are  in  PostScript’s points, but may be followed by the units cm or in
       to convert to centimetres or inches, or the flag w or h to specify as a
       multiple  of  the width or height.  The optional parameters L, R, and U
       rotate the page left, right, or upside-down.  The optional scale param-
       eter  scales the page by the fraction specified.  If the optional minus
       sign is specified, the page is relative to the  end  of  the  document,
       instead of the start.

       If  page  specs  are  separated  by + the pages will be merged into one
       page; if they are separated by , they will be  on  separate  pages.   If
       there  is only one page specification, with pageno zero, the pageno may
       be omitted.

       The shift, rotation, and scaling are performed in that order regardless
       of which order they appear on the command line.
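Read against that syntax, the duplex pagespec used earlier splits into two halves. The sketch below only reassembles the same string from named pieces; the variable names FRONT, BACK and SPEC are made up for illustration, and the pstops call is skipped unless the tool and an input.ps file are actually present.

```shell
# Block size: every 4 input pages become one output sheet (two sides).
MODULO=4

# First side: pages 0 and 1 of each block, rotated left (L), scaled to 0.8,
# shifted by the given offsets, and merged onto one page with '+'.
FRONT='0L@0.8(21cm,-1cm)+1L@0.8(21cm,12.55cm)'

# Second side: pages 2 and 3, rotated right (R), merged the same way.
BACK='2R@0.8(0,29.85cm)+3R@0.8(0,16.25cm)'

# The ',' between the two halves puts them on separate output pages.
SPEC="${MODULO}:${FRONT},${BACK}"
echo "$SPEC"

if command -v pstops >/dev/null 2>&1 && [ -f input.ps ]; then
  pstops -pa4 "$SPEC" input.ps output.ps
fi
```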

Embedding fonts in a PDF document

by on Oct.03, 2008, under How To ..., LaTeX, Linux, Software

It is often a good idea (or a requirement) to embed the fonts used in a PDF document. This is easily accomplished using ps2pdf during the final stage of conversion of a document from PS to PDF:

$ ps2pdf -sPAPERSIZE=a4 -dPDFSETTINGS=/printer -dCompatibilityLevel=1.3 \
         -dMaxSubsetPct=100 -dSubsetFonts=true -dEmbedAllFonts=true \
         'input_file.ps' 'output_file.pdf'

An explanation of the command options can be found in the Ps2pdf.htm file in the Ghostscript documentation.
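One way to check the result is the pdffonts tool (shipped with poppler and xpdf), which lists the fonts in a PDF along with an ‘emb’ flag. The helper below is only a sketch: the name fonts_embedded is made up, and it assumes the column layout "emb sub uni object ID" with two header lines, so that the emb flag is the fifth field from the right.

```shell
# fonts_embedded: hypothetical helper; reads `pdffonts` output on stdin and
# succeeds only if every listed font has "yes" in the emb column.
# Assumption: two header lines, and emb is the fifth field from the right.
fonts_embedded() {
  awk 'NR > 2 && NF >= 5 && $(NF-4) != "yes" { bad = 1 } END { exit bad + 0 }'
}

# Only run the check if the tool and the converted file are present.
if command -v pdffonts >/dev/null 2>&1 && [ -f output_file.pdf ]; then
  pdffonts output_file.pdf | fonts_embedded && echo "all fonts embedded"
fi
```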



Processing files using ‘find’

by on Mar.26, 2008, under How To ..., Linux, Software

In its most basic form, find is often used to locate files that are subsequently piped through a complex set of commands for processing. However, this particular method is easily broken by files that contain spaces in their names.

This is where the ‘-exec’ option provided by find comes in handy. From the man-page:

-exec command ;
       Execute  command;  true  if 0 status is returned.  All following
       arguments to find are taken to be arguments to the command until
       an  argument  consisting of ‘;’ is encountered.  The string ‘{}’
       is replaced by the current file name being processed  everywhere
       it occurs in the arguments to the command, not just in arguments
       where it is alone, as in some versions of find.  Both  of  these
       constructions might need to be escaped (with a ‘\’) or quoted to
       protect them from expansion by the shell.  See the EXAMPLES sec-
       tion  for examples of the use of the ‘-exec’ option.  The speci-
       fied command is run once for each matched file.  The command  is
       executed  in  the  starting  directory.    There are unavoidable
       security problems surrounding  use  of  the  -exec  option;  you
       should use the -execdir option instead.

An example that recursively touches all *.log files from the current directory would be:

$ find . -name \*.log -exec touch {} \;
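Two closely related forms are sketched below, wrapped in functions with made-up names (touch_logs and touch_logs_piped) so they can be tried on any directory. The first uses the ‘{} +’ terminator, which passes many file names to a single command invocation; the second keeps the pipe but separates names with NUL bytes, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do).

```shell
# touch_logs: hypothetical helper that touches every *.log file below directory $1.
touch_logs() {
  # '{} +' hands the matched names to touch in batches, each name as its
  # own argument, so spaces in file names are harmless.
  find "$1" -name '*.log' -exec touch {} +
}

# touch_logs_piped: the piped form, made safe with NUL-separated names.
touch_logs_piped() {
  find "$1" -name '*.log' -print0 | xargs -0 touch
}
```

For example, touch_logs . covers the same files as the command above.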

Mass conversion of images

by on May.07, 2007, under How To ..., LaTeX, Linux, Software

The following “one-liner” can be used to mass convert a given image format into another using the convert (part of ImageMagick) and basename tools:

$ for A in *."$SRC_TYPE"; do convert "$A" "$(basename "$A" ".$SRC_TYPE").$DST_TYPE"; done

where $SRC_TYPE is the file suffix of the original images (e.g. png) and $DST_TYPE is the file suffix of the type desired (e.g. eps).
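The same loop can also be written as a small function. This is a sketch with a hypothetical name (convert_all); it uses the shell’s suffix-stripping expansion in place of basename and skips the case where no file matches the pattern.

```shell
# convert_all: hypothetical helper; $1 is the source suffix, $2 the desired suffix.
convert_all() {
  for A in *."$1"; do
    [ -e "$A" ] || continue       # the pattern matched nothing
    convert "$A" "${A%.$1}.$2"    # strip the old suffix, append the new one
  done
}
```

Calling convert_all png eps in a directory of PNG files would then produce one EPS file per image.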


Cropping a PDF Document

by on Jul.31, 2006, under How To ..., LaTeX, Software

Easily accomplished using pdftops:

$ pdftops -paperw WIDTH \
             -paperh HEIGHT \
             -noshrink -expand document.pdf && ps2pdf document.ps

WIDTH and HEIGHT are in points (1/72 inch) and specify the dimensions of the region to be cropped.
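Since figure sizes are usually known in centimetres rather than points, a small conversion helper can be handy. The function name cm_to_pt and the 10 cm by 5 cm region are made up for illustration; the result is rounded to whole points.

```shell
# cm_to_pt: hypothetical helper converting centimetres to whole PostScript
# points (1 pt = 1/72 inch, 1 inch = 2.54 cm).
cm_to_pt() {
  awk -v cm="$1" 'BEGIN { printf "%.0f\n", cm / 2.54 * 72 }'
}

WIDTH=$(cm_to_pt 10)    # a 10 cm wide crop region
HEIGHT=$(cm_to_pt 5)    # a 5 cm tall crop region
echo "$WIDTH x $HEIGHT"
```

As a sanity check, cm_to_pt 21 and cm_to_pt 29.7 give the familiar A4 dimensions in points.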

Content is extracted from the center of the page. This technique is especially useful as a workaround for using psfrag with pdfLaTeX:

  • Save EPS figure with TAGS
  • Create a very simple tex document that simply includes the figure (centered) with psfrag replacements and run latex -> dvips -> ps2pdf
  • Follow the step above to crop out the figure.

The cropped-out figure will have the TAGS replaced and be in PDF format, ready to be used with pdfLaTeX!

UPDATE [16 July 2009] It looks like pdfcrop might actually be a better option:

$ pdfcrop --help
PDFCROP 1.5, 2004/06/24 - Copyright (c) 2002, 2004 by Heiko Oberdiek.
Syntax:   pdfcrop [options] <input[.pdf]> [output file]
Function: Margins are calculated and removed for each page in the file.
Options:                                                    (defaults:)
  --help              print usage
  --(no)verbose       verbose printing                      (false)
  --(no)debug         debug informations                    (false)
  --gscmd <name>      call of ghostscript                   (gs)
  --pdftexcmd <name>  call of pdfTeX                        (pdftex)
  --margins "<left> <top> <right> <bottom>"                 (0 0 0 0)
                      add extra margins, unit is bp. If only one number is
                      given, then it is used for all margins, in the case
                      of two numbers they are also used for right and bottom.
  --(no)clip          clipping support, if margins are set  (false)
  --(no)hires         using `%%HiResBoundingBox'            (false)
                      instead of `%%BoundingBox'
  --papersize <foo>   parameter for gs's -sPAPERSIZE=<foo>,
                      use only with older gs versions <7.32 ()
Examples:
  pdfcrop --margins 10 input.pdf output.pdf
  pdfcrop --margins '5 10 5 20' --clip input.pdf output.pdf

The tool comes as a part of the ‘tetex’ package.


Prosper & PDF Output

by on Feb.24, 2006, under How To ..., LaTeX, Software

Prosper cannot be used with PDFTeX and hence PDF files must be obtained via the DVI -> PS -> PDF route. The default ps2pdf conversion, however, generates a PDF with pages that are a bit too narrow. This is easily remedied by specifying the size of the output desired to the ps2pdf program:


Here ‘x’ is the width and ‘y’ is the height, both in 1/72″ units (PostScript points). So, for an approximately A4-sized output, ‘x’=595 and ‘y’=842.
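A sketch of such a command follows. It assumes Ghostscript’s -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS switches, which ps2pdf passes through to gs, and hypothetical file names slides.ps and slides.pdf; it may not be the exact invocation originally intended here.

```shell
# Page size in PostScript points (1/72 inch); 595 x 842 is roughly A4.
x=595
y=842

# Only run the conversion if ps2pdf and the (hypothetical) input file exist.
if command -v ps2pdf >/dev/null 2>&1 && [ -f slides.ps ]; then
  ps2pdf -dDEVICEWIDTHPOINTS="$x" -dDEVICEHEIGHTPOINTS="$y" slides.ps slides.pdf
fi
```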


PSfrag for EPS Graphics Text Manipulation

by on Jan.17, 2006, under How To ..., LaTeX, Software

There’s a nice package called psfrag that allows you to insert LaTeX constructs into EPS figures. This is especially useful with EPS files saved from MATLAB plots. The way it works is by replacing a given tag in the text of the EPS file with the LaTeX construct.

E.g. label the x-axis of the plot as XLABEL and save the plot as an EPS file. Then, when you include that file, put a \psfrag{XLABEL}{...} command before the \includegraphics call, with the desired LaTeX construct as the second argument.


The most obvious disadvantage is that it only works with EPS figures, so no pdfLaTeX. To compile a document to PDF, you’ll need to go the old latex -> dvips -> ps2pdf way.

More details can be found on CTAN.
