
Reading header tags is slow, especially for files with many frames #28


Open
DylanMuir opened this issue Oct 25, 2017 · 3 comments

Comments

@DylanMuir
Owner

The tiffread31_header function is slow to iterate over the IFDs in large files. imfinfo is fast, but reads all tags.

@DylanMuir self-assigned this Oct 25, 2017
@DylanMuir changed the title from "Reading header tags is slower than imfinfo" to "Reading header tags is slow, especially for files with many frames" Oct 25, 2017
@DylanMuir
Owner Author

DylanMuir commented Oct 25, 2017

imfinfo is fast because it uses the undocumented Matlab mex file matlab.io.internal.imagesci.tifftagsread. The calling syntax appears to be:

function vsInfoStructure = ...tifftagsread(strFilename, nBytesOffset, nIFDsToSkip, nNumIFDsToRead);

tifftagsread is 10x faster than the matlab version of tiffread31_header, but reads all tags. This is undesirable for ScanImage TIFF files, since the headers contain large duplicated Software and Artist tags (and others?).

@DylanMuir
Owner Author

Suggestion: write an accelerated mex version that reads only the necessary tags from the TIFF file.
Hassles:

  • We can't know in advance how many IFDs are in the file, so we need to handle re-allocation or chaining of data. Return cell arrays, and allocate cells with chunks of data as necessary?
  • Returning structures from mex is kind of a pain. Return a cell array for each tag, and convert to a structure in Matlab?
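
For illustration, the selective read and the grow-as-needed allocation could be sketched in plain MATLAB before committing to mex. This is only a sketch under stated assumptions: `read_selected_tags`, `wantedTagIDs` and `decode` are hypothetical names that are not part of TIFFStack, only classic (non-BigTIFF) files are handled, only single-valued SHORT/LONG tags are kept, and there is no guard against a cyclic IFD chain. Whether it approaches the speed of tifftagsread is untested.

```matlab
function tags = read_selected_tags(filename, wantedTagIDs)
% READ_SELECTED_TAGS  Hypothetical sketch, not part of TIFFStack.
% Walks the IFD chain of a classic (non-BigTIFF) file and keeps only
% single-valued SHORT/LONG tags listed in wantedTagIDs.
% Returns an nIFDs-by-numel(wantedTagIDs) matrix, NaN where a tag is absent.

   fid = fopen(filename, 'r');
   assert(fid > 0, 'Could not open %s', filename);
   cleanup = onCleanup(@() fclose(fid));

   % Byte-order mark: 'II' is little-endian, 'MM' is big-endian
   bom = fread(fid, 2, 'uint8=>char')';
   if strcmp(bom, 'II')
      fmt = 'l';
   elseif strcmp(bom, 'MM')
      fmt = 'b';
   else
      error('Not a TIFF file: %s', filename);
   end
   [~, ~, machineEndian] = computer;
   needSwap = ~strcmpi(machineEndian, fmt);

   magic = fread(fid, 1, 'uint16', 0, fmt);
   assert(magic == 42, 'Only classic TIFF is handled in this sketch');
   ifdOffset = fread(fid, 1, 'uint32', 0, fmt);

   tags = nan(256, numel(wantedTagIDs));   % initial chunk of rows
   nIFDs = 0;

   while ifdOffset ~= 0
      fseek(fid, ifdOffset, 'bof');
      nEntries = fread(fid, 1, 'uint16', 0, fmt);
      raw = fread(fid, [12, nEntries], '*uint8');   % one 12-byte column per entry
      ifdOffset = fread(fid, 1, 'uint32', 0, fmt);  % offset of next IFD, 0 at end

      ids      = decode(raw(1:2, :),  'uint16', needSwap);
      types    = decode(raw(3:4, :),  'uint16', needSwap);
      counts   = decode(raw(5:8, :),  'uint32', needSwap);
      values   = decode(raw(9:12, :), 'uint32', needSwap);
      shortVal = decode(raw(9:10, :), 'uint16', needSwap);
      values(types == 3) = shortVal(types == 3);    % SHORT is left-justified

      % Values of other types or longer counts live elsewhere in the file
      % and are deliberately never read
      inline = (types == 3 | types == 4) & counts == 1;
      [wanted, col] = ismember(ids, wantedTagIDs);
      keep = wanted & inline;

      nIFDs = nIFDs + 1;
      if nIFDs > size(tags, 1)
         tags = [tags; nan(size(tags))];            % grow by doubling
      end
      tags(nIFDs, col(keep)) = values(keep);
   end

   tags = tags(1:nIFDs, :);                         % trim unused rows
end

function v = decode(bytes, type, needSwap)
% Reinterpret each column of bytes as one integer, returned as double
   v = typecast(reshape(bytes, 1, []), type);
   if needSwap
      v = swapbytes(v);
   end
   v = double(v);
end
```

A call such as `tags = read_selected_tags('stack.tif', [256 257 258])` (file name hypothetical) would return width, height and bit depth per frame. Large out-of-line values such as the Software and Artist strings are never touched, because only the 12-byte entries themselves are read. Returning a plain matrix also sidesteps the structure-building hassle. A mex version could follow the same traversal.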

@ehennestad

ehennestad commented Apr 1, 2022

Hi, I have some ideas/questions about this.

  1. Sometimes we might know how many images/IFDs are in a TIFF file. What about an option for passing that information to the TIFFStack on creation? That could partially solve the reallocation problem you describe above.

  2. Sometimes all the images in a TIFF stack are uniform. Is it then necessary to read through all the headers? Say I know the number of images in the TIFFStack I want to open, and I know that all the images have the same format: is there a reason not to just read the header of the first directory and use that information for all the remaining directories?
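
As a rough illustration of both ideas, here is a minimal sketch using MATLAB's documented `Tiff` class. It assumes the caller supplies the frame count and that every frame matches the first one. `read_uniform_stack` and `nFrames` are hypothetical names, not an existing TIFFStack option, and single-channel frames are assumed. Note that each IFD stores the offset of the next one, so the directories are still visited in order. What is avoided is a separate pass to count them and any per-frame tag parsing in MATLAB.

```matlab
function stack = read_uniform_stack(filename, nFrames)
% READ_UNIFORM_STACK  Hypothetical sketch: the caller supplies the frame
% count and promises that every frame matches the first one.

   t = Tiff(filename, 'r');
   cleanup = onCleanup(@() t.close());

   % Only the first directory is inspected for size and data type
   firstFrame = t.read();
   assert(ismatrix(firstFrame), 'Sketch assumes single-channel frames');

   % A known frame count lets us allocate once, with no re-allocation
   stack = zeros([size(firstFrame), nFrames], 'like', firstFrame);
   stack(:, :, 1) = firstFrame;

   for k = 2:nFrames
      t.nextDirectory();          % sequential step to the next IFD
      stack(:, :, k) = t.read();  % errors if the frame size differs
   end
end
```

If `nFrames` overstates the real count, `nextDirectory` raises an error at the end of the file, so a wrong hint fails loudly rather than silently.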
