Skip to content

WIP: Attempt to provide a proxy for gzip file #454

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 36 commits into from

Conversation

bpinsard
Copy link
Contributor

@bpinsard bpinsard commented Sep 5, 2012

My hard drive is full but some of our codes/interfaces uses matlab/spm.
So I coded an FileProxy extension to Node that uncompress the .nii.gzip files before running the interfaces, then compress the output nii files to nii.gz (though only if the output_name is specified).
It maybe not the best design, I have been thinking after that Node instances could be added a list of file preprocessor/postprocessors object to handle certain types of files.
I push it not to be included but to discuss about solution to handle spm .nii.gz files problem and maybe other interfaces I/O that I am not aware of.
What do you think?

bpinsard added 13 commits August 31, 2012 16:26
* master: (34 commits)
  Added a test for SPM bool conversion.
  Call super for all SPM _format_args()
  call super
  Make all Booleans to be formated as 1,0 in SPM
  updated changelog
  fix: added test for deepcopy bug
  fix: added back deepcopy
  fix: thanks to @rkern we now have a passing test with dynamic traits retaining their type through pickling
  fix: test that fails when traitedspec is not pickled properly
  updated changelog
  Added a default to maintain backwards compatibility
  utils.Merge option for not flattening the output list
  utils.Merge option for not flattening the output list
  utils.Merge option for not flattening the output list
  added mention about the new interface to the changelog
  Added tests stub files
  Make check-before-commit
  Significantly faster smoothing in tessellation_tutorial with MeshFix, CFF output option
  Interface for MeshFix by Marco Attene, Mirko Windhoff, Axel Thielscher
  Fixed deprecated config option.
  ...
@chrisgorgo
Copy link
Member

This is something that we should finally address. The general testcase is on the fly conversion of files into compatible formats (not only nii -> nii.gz, but also nrrd to nii maybe even dicom to nii). There are some use cases we need to address:

  1. could we make it implicit? For example we know that SPM interfaces are not compatible with nii.gz. Therefore if ones connects .nii file to an SPM Interface it should be converted. MAybe we should put compatible filetypes in traits?
  2. File conversion can be made by many different external pieces of software as well as python libraries. Maybe we should make some flexible framework for those converters?

@bpinsard
Copy link
Contributor Author

I completely agree there will be other cases of file format incompatibility and we need to find a as generic as possible way to handle this without having to add nodes to do the pipeline. I tried to handle this at the Node level but there could be information in the interface traits, it is more of a quick hack to solve my full disk problem.

@chrisgorgo
Copy link
Member

The issue is that hard disk space and format incompatibilities are sometimes not the same problem. For example - imagine this pipeline:

nii.gz -> A(FSL) -> B(FSL) -> C(SPM) -> D(SPM) -> E(FSL)

if you only care about data format conversion nii.gz will be converted to .nii only once (B->C). Since FSL can read .nii there is no need to do anything at the D->E step.

On the other side if you want to save space you would like to compress outputs of SPM each time.

@chrisgorgo
Copy link
Member

Ok men. I don;t want to hold you back with this any longer. Just please add some code example in the docstrings (like a super simple SPM/FSL pipeline) and a mention in the changelog and we are ready to merge.

bpinsard added 2 commits October 4, 2012 10:17
* master: (52 commits)
  fix: update getmask to dilate while binarizing
  sty: rest formatting
  Skipping dynamic traits tests for now.
  Fixed failing doctest.
  sty: fixed white space and removed traits fix
  Reverted the traits "fix" workaround.
  Removed redundant example, renamed the antsRegistration template.
  Put interpolation argument back in. Renamed the input/outputspec.
  Added changelog information.
  Revert "fix: added test for deepcopy bug"
  reverted traits change2
  reverted traits change
  ENH: Future proof the API for new template building
  Fixed input/outputspec names.
  Refactoring.
  Added default values.
  Simplification and refactoring of the first example.
  We can provide a default value for an output - no need to have it mandatory.
  Removed unecessary imports.
  Renamed the example to conform with naming conventions.
  ...
@bpinsard
Copy link
Contributor Author

bpinsard commented Oct 4, 2012

Yes I left this a bit aside as to my eyes it seems more a quick hack to solve my disk space problem, maybe it should not be merge now.
There would be better way to do the job, maybe being able to register multiple proxies for a single node to handle multiple file format changes.
The on-the-fly conversion with interfaces specifying accepted file formats would be great and then having a collection of standard file-proxies that can be extended with new ones for exotic cases.
Btw, is there any kind of roadmap for nipype, the "0.4 brainstorm" is a bit old now and there are multiple ideas showing up for enhancements.

bpinsard added 10 commits December 7, 2012 10:21
* master: (138 commits)
  fix: config updated before checking for deprecated options
  put back the n_procs docs
  Udated docs with n_procs for MultiProc
  fix: adding atlases
  fix: removing recommends
  fix: restore fsl
  fix: remove fsl for now
  fix: variable name
  fix: use neurodebian sources
  fix: add missing nibabel
  fix: install nibabel in python dir
  fix: remove easy_install suffix
  fix: testing sklearn travis fix
  added cmdline example
  pep8
  Fixed doctests.
  Fixed missing comma
  Added back SlicerCommandLine proxy for backward compatiblity
  Regenerated Slicer classes.
  fixed data path
  ...
* master: (76 commits)
  fixed paths
  Back to dev
  0.7 release
  Code cleanup.
  doc: updated inputs help to indicate FSL and AFNI inaccuracy when using bounds_by_brainmask
  Fix nipy#493
  Fixes nipy#459
  fixed doctests
  changelog
  fix: ants workflows setup
  BF: Fixed several issues with GroupAndStack interface
  DOC: Various clean up an improvements to docstrings.
  DOC: Added example to LookupMeta docstring
  ENH: Allow meta_keys input to LookupMeta be a mapping.
  minor fixes
  deprecated version should be a string, added AFNITraitedSpec for backward compatibility
  docs
  PEP8
  name_source can be a list, make "infolder" a deprecated name
  Removed unnecessary outputsspecs
  ...
* master: (47 commits)
  fixed applymask output
  fixed test
  Fixed some details for the PR nipy#530
  Removed accidentally added files (nipy#530)
  Updated changelog
  Updated dMRI and fMRI pre-processing
  Added motion correction pipeline
  Modified FSL FLIRT interface to provide log support
  Polished code
  buildtemplateparallel.sh - issue with input files
  Update legacy.py - buildtemplateparallel.sh
  Minor fixes
  Added MBIS to sub-package configuration list
  Standardized import to add all utilities at once
  First implementation for MBIS
  New EPI dewarp workflow (nipy#525)
  New EPI dewarping using a workflow
  api: force users to explicitly define whether to sort files
  BF: Fix SplitNifti to generate unique filenames for outputs.
  added mask input for bias field correction
  ...
* master: (54 commits)
  constrain write_which to length 2
  BF: Fix issue where unicode paths were not recognized as files.
  changelog
  Doctests fix
  Fixed tests.
  in fact masking wasn't working, HistogramRegistration takes mask as numpy not nibabel
  allow similarity computation without mask
  Added [xor] metadata to afni.AutoTcorrelate
  added mask_source option to afni.AutoTcorrelate
  added mask_source option to afni.AutoTcorrelate
  Fixed error in _rotate_bvec that was adding an extra entry to the output rotated_bvecs file and offsetting the correction by one volume.
  fix fugue for not necessarly wanted output
  add Bandpass to init
  homogenize name function accross interfaces
  fixing for new gen_filename autopep8
  autopep8
  few interfaces added, pep8
  add min version for bbr options
  pep8
  doctest
  ...
* master: (133 commits)
  fix: added svg neurodebian logo
  fix: notebook links to latest version
  enh: updated tutorial
  fix: broken test to reflect interface changes.
  Slight correction to previous PR
  Fix SPM interfaces
  fix: mark's last name
  fix: updated tutorial to include download of data
  enh: added tutorial notebook
  fix: reverted fix while keeping changed meta tag.
  each time I use an afni interface I have not been using for a long time I spend 4 hours fixing some bugs that goes down to base interface specs......
  fix tcorrmap for use of source_name, depends on recent fix cline_bug branch
  where does source_name come from? should fix a lot of gen_file in afni other than out_file
  fixe name_source traits
  fix the TCorrMap histogram options
  doc: update information on output options for CommandLine interfaces
  tst: added doctest ellipses to fix afni tests
  tst: added missing doctest file maps.nii
  fix: changed input name for 3dallineate
  fix: clean up fsl model tests.
  ...
bpinsard added 4 commits July 26, 2013 10:50
* master: (64 commits)
  PEP8
  Added human order sorting for FLAMEO outputs, fixed abspath for Cluster outputs.
  added back the fstat outputs to fsl flameo interface
  enh: replace id extraction with regular expression
  fix: check only the last line in the sge qsub stdout
  FIX broken MapNode doctest
  Added XOR condition between apply_isoxfm and apply_xfm
  fix imports
  Solve Merge issue
  Solve merge issue
  PEP8
  Add missing option apply_isoxfm to FSL Flirt wrapper
  Merge and clean up FSLGLM into GLM.
  added FSLGLM
  doc: fixed building with Sphinx 1.2
  Updating CHANGES
  base_dir in Workflow
  Updating pipeline docstrings
  Reversing the order of the name and iterfield args in MapNode
  Also making iterfield positional in MapNode
  ...
* master: (193 commits)
  fix: version property
  doc: added pointer to ants example
  fix: removed cma to freesurfer mapping, updated docs and pointed to atlas/template downloads
  fix: remove np deprecation decorator and moved imports to run_interface.
  fix: explicitly test for nipy version and remove imports to specific interfaces closes nipy#648
  Added FuzzyOverlap to changelog
  fix: set slice_times as seconds outside workflow generating function
  fix: replaced plugin_args with proper arguments
  enh: replace realignment routine with newer Nipype interface for Nipy's newer class
  enh: added SpaceTimeRealign from nipy 0.4.dev
  api: always run datasink and let copyfiles decide if things need to be copied or not. in case inputs are directories, this will allow refreshing outputs.
  fix: moved imports for dicominfo outside of the function
  fix: use ceiling of slice thickness for erosion purposes
  fix: add imports to the functions
  enh: compressing more into functions
  enh: replaced erosion with slicethickness and added epimask support for bbregister
  fix: updated ants example to use new parameters from ants
  tst: fixed ants doctests to indicate smoothing sigmas
  enh: updated antsApplyTransforms to accept input image type
  api: changed smoothing sigmas to accept float in antsRegistration
  ...
* master: (29 commits)
  add DicomImport to spm init.py
  fix merge conflict in CHANGES
  add interface for spm's dicom import
  spm wants a two-item array here
  Fence matplotlib import with try block
  Get backend configurable from node config
  Set matplotlib backend from config in pyscript
  Attempt to set backend in local matplotlib
  fix: matplotlib import to inside function
  added changelog information
  enh: update internal prov to latest version of external library
  added hash_file protocol and dummy test file
  Remove inheritance_diagram.py.
  fix: afni url in afni package
  added more documentation
  Added dummy files
  fixed inputs for Retroicor
  adding retrocicor to __init__.py
  addition of 3dretroicor interface, very basic version
  enh: allow naming a workflow
  ...
* master: (293 commits)
  fix: cleaned up file removal to pay attention to related files such as img/hdr/mat, BRIK/HEAD
  fix: switch to allatonce mode to prevent forking (closes nipy#705)
  fixed version checking
  sty: white spaces
  enh: added checkspecs to makefiel
  ref: removing unneeded tests
  ref: FILMGLS needs its own test because of api changes in FSL 5.0.5
  fix: update test generator to take manual edits into account
  tst: added autogenerated input and output spec tests
  enh: modified checkspecs to write output spec tests
  enh: modified checkspecs to write input spec tests
  sty: remove commented interface
  fix: remove deprecation warning.
  doc: include travis and coveralls badges
  sty: pep8 and metadata fixes
  enh: added coveralls support
  Allow ArtifactDetect to correctly handle AFNI motion correction parameters generated with -dfile or -1Dfile
  fix: updated doctest, pep8 and unit test
  doc: updated doctest to use literal string
  fix: display cmdline string explicitly
  ...
@oesteban
Copy link
Contributor

I find this proxy very interesting. Not only for the compression management itself, it would be a great advance for those interfaces in python (I'm thinking of dipy right now), that handle very large nifti files. If they are not compressed, then they are not fully loaded into memory if it's not absolutely necessary.

@codecov-io
Copy link

codecov-io commented Dec 15, 2016

Codecov Report

Merging #454 into master will decrease coverage by 1.57%.
The diff coverage is 20%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #454      +/-   ##
==========================================
- Coverage   72.16%   70.58%   -1.58%     
==========================================
  Files        1144     1029     -115     
  Lines       57606    50721    -6885     
  Branches     8251     7366     -885     
==========================================
- Hits        41572    35802    -5770     
+ Misses      14733    13805     -928     
+ Partials     1301     1114     -187
Flag Coverage Δ
#smoketests 52.37% <ø> (-19.79%) ⬇️
#unittests 70.34% <20%> (+0.54%) ⬆️
Impacted Files Coverage Δ
nipype/pipeline/file_proxy.py 20% <20%> (ø)
nipype/interfaces/freesurfer/tests/test_utils.py 14.03% <0%> (-82.84%) ⬇️
nipype/workflows/smri/freesurfer/ba_maps.py 11.11% <0%> (-79.17%) ⬇️
nipype/interfaces/freesurfer/tests/test_model.py 25% <0%> (-75%) ⬇️
nipype/workflows/rsfmri/fsl/resting.py 14.75% <0%> (-70.73%) ⬇️
...ype/interfaces/freesurfer/tests/test_preprocess.py 18.96% <0%> (-70.65%) ⬇️
nipype/workflows/smri/freesurfer/autorecon2.py 2.3% <0%> (-68.96%) ⬇️
nipype/workflows/smri/freesurfer/autorecon3.py 2.09% <0%> (-67.01%) ⬇️
nipype/workflows/smri/freesurfer/bem.py 38.46% <0%> (-61.54%) ⬇️
nipype/workflows/smri/freesurfer/recon.py 8.57% <0%> (-57.15%) ⬇️
... and 946 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2ee584b...2f6a944. Read the comment docs.

@effigies
Copy link
Member

This pull request is "orphaned," which means it has been deemed to be abandoned by its original author. Orphaned pull requests have not been rejected, and we hope that if a user sees one that will meet their needs with a little work, that they will fork it and open a new pull request (or, in the case of the original author, reopen the original PR).

We ask that all adopted PRs be updated to merge or rebase the current master. If you would like to adopt a PR and need help getting started, any of a number of contributors will be happy to help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants