The tiffsplit program from the libtiff library does a good job of splitting multi-page TIFF files into single pages. It doesn’t seem to cope as well, though, with TIFFs saved with ‘old-style’ JPEG encoding.

It’ll produce files, and the correct number of them too, but most image viewers still won’t open them. Each one is really a JPEG file with a TIFF header stuck on the front.

Assuming you have a TIFF file made up entirely of old-style JPEG encoded images, you can loop through the files that tiffsplit has exported and remove the first eight bytes from each to give valid JPEG files.

The Python function below writes everything after the first eight bytes of a given file into a new file with the ‘.jpg’ extension, then deletes the original.

import os


def remove_header(path):
    """
    Write the contents of the given file after the first 8 bytes to
    a new file, then delete the original.
    """
    # Same name as the input, with the extension swapped for '.jpg'.
    new_path = '%s.jpg' % os.path.splitext(path)[0]
    with open(path, 'rb') as input_file:
        with open(new_path, 'wb') as output_file:
            # Skip the 8-byte TIFF header and copy the rest unchanged.
            input_file.seek(8)
            output_file.write(input_file.read())
    os.remove(path)
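
To run this over everything tiffsplit wrote out, something like the sketch below might do. It’s only a sketch and makes a few assumptions of its own. The name strip_headers, the directory argument and the ‘*.tif’ pattern are mine, so adjust the pattern to match however your split files are named. It also adds a safety check that isn’t in the function above: a file is only touched if the bytes after the first eight start with the JPEG start-of-image marker (FF D8), so anything that isn’t a JPEG behind a TIFF header is left alone.

```python
import glob
import os

JPEG_SOI = b'\xff\xd8'  # every JPEG stream begins with this marker


def strip_headers(directory, pattern='*.tif'):
    """
    Strip the first 8 bytes from each file in `directory` matching
    `pattern`, but only when a JPEG stream starts right after them.
    Returns the list of '.jpg' paths that were written.
    """
    written = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        with open(path, 'rb') as input_file:
            input_file.seek(8)
            data = input_file.read()
        if not data.startswith(JPEG_SOI):
            # Assumption: no JPEG marker here means this isn't one of
            # the old-style JPEG pages, so leave the file as it is.
            continue
        new_path = '%s.jpg' % os.path.splitext(path)[0]
        with open(new_path, 'wb') as output_file:
            output_file.write(data)
        os.remove(path)
        written.append(new_path)
    return written
```

Calling strip_headers('.') from the directory holding the split files would convert the ones that pass the check and return their new paths. Files that get skipped stay where they are, so it should be easy to see if the eight-byte assumption doesn’t hold for a particular TIFF.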
