
When running ocean models you can end up with TBs of data that you might not quite be ready to delete, so here is my archiving procedure for the output binary/NetCDF files.

  1. Time average the NetCDF files:

```bash
#!/bin/bash
# Average each output file over its record (time) dimension, overwriting it in place
for file in folder_pattern/fpat
do
  ncra -O "${file}" "${file}"
done
```
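A quick sanity check, assuming the netCDF command-line tools are installed, is to dump a file's header and confirm the record (time) dimension has collapsed to length 1 (the filename below is just a placeholder):

```bash
# Print only the header of one averaged file; replace the name with one of yours
ncdump -h averaged_example.nc
```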
  2. Remove excess standard output. I only want one tile of info:

```bash
#!/bin/bash
# Keep one STDOUT file per run, renamed to "output", and delete the rest
for file in folder_pattern/fname
do
  mv -i "${file}" "${file/fname/output}"
done
rm -f folder_pattern/fpat
```

This takes your file STDOUT0001 (fname), renames it to output, and then removes the rest, with wildcards used to set the folder and file patterns. The rename relies on bash's pattern substitution, sketched below.
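Here is a small standalone illustration of the `${file/pattern/replacement}` substitution used above, with a made-up path:

```bash
file=run01/STDOUT0001                 # made-up example path
echo "${file/STDOUT0001/output}"      # prints run01/output
```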

  3. Now you have directories with time-averaged outputs and only one STDOUT per run. Let's tar and compress them:

```bash
#!/bin/bash
tar -cjvf spinup.tar.bz2 folder_pattern
```

You can either use a folder pattern or make a spinup directory containing any README or input files, for example as below.
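A minimal sketch of that second option; the input file name here is hypothetical and should be replaced with whatever your run actually uses:

```bash
# Gather run metadata into one directory before archiving (file names are placeholders)
mkdir -p spinup
cp README data_namelist spinup/
```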

NOTE: bzip2 compresses more than gzip, so building a .tar.bz2 is slower than building a .tar.gz.
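If speed matters more than archive size, the gzip equivalent is a reasonable alternative (folder_pattern is the same placeholder as above):

```bash
#!/bin/bash
# Faster to create than bzip2, at the cost of a somewhat larger archive
tar -czvf spinup.tar.gz folder_pattern
```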

  4. Extract your data if you need it:

```bash
tar -xvf spinup.tar.bz2
```

Or count the number of files without extracting anything:

```bash
tar -jtvf spinup.tar.bz2 | wc -l
```
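You can also pull a single file back out without unpacking the whole archive; the member path below is hypothetical and would need to match a path listed by the command above:

```bash
# Extract one named member from the bzip2 archive (hypothetical path)
tar -xjvf spinup.tar.bz2 folder_pattern/README
```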