[MBDyn-users] Refactoring of MBDyn's NetCDF interface

Pierangelo Masarati masarati at aero.polimi.it
Wed May 18 10:15:38 CEST 2011


On 05/18/2011 09:53 AM, Rix, Patrick wrote:
>
> @Pierangelo:
> Hmm.. if you've had such good experiences, it seems that (file I/O) performance
> depends more on the operating system and the specific (plot) utility than on absolute
> file size.
> Your observations and the performance tests made by Jens during the abs2rel
> discussion seem to prove that Windows file IO is significantly slower than on Linux.

...

> I think for now I'll skip the idea of a split database and see how far we get
> with a single-file database. If for our wind turbine models (working on Windows) we run
> at some point into (post-processing) performance problems due to file sizes, splitting
> the database can also be done in a post-processing step. The idea was just to have the
> possibility to skip one post-processing step.
>
> @Marco:
> ..I think my Flex annotation was confusing. Our Flex5 standard format does not use NetCDF.
> It's a format of its own using (16-bit) integers for data compression (with some loss of precision)
> generated in a post-processing step (Flex5 uses floats for its raw time series output).
> I agree with you that e.g. a 75MB (16-bit int Flex5) data file is not that big (the files could become much
> bigger) but one can notice a significant increase in data access time (e.g. during plotting) and
> I expect that beyond some size this will be the case for NetCDF files, too.
> In general the NetCDF format has no explicit limit from version 4. For version 3.x I read something
> about a 4GB limit for large files when using a so-called 'classic' NetCDF format.

Likely 3.x used 32-bit unsigned integers as keys.

> So for normal use cases file size should not be a problem for the nc data format - but I discovered
> that some of the 3rd party open source tools seem to have problems with large files > 2GB.
> (BTW: with the NetCDF format the same sort of (16-bit integer) data packing can be done when using
> 'short' as a variable's data type in conjunction with the variable attributes 'scale_factor' and 'add_offset',
> with  scale_factor=(Max - Min)/65535  and  add_offset=0.5*(Min + Max)
> then it is:   unpacked_data = ( packed_data * scale_factor ) + add_offset .
> I found an example for Matlab illustrating this <http://jisao.washington.edu/data/matlab_netcdf.html>)

If size were an issue, by abstracting the interface to NetCDF a little 
bit (in order to allow floats) we could use the same approach.  I would 
keep this confined in the (yet to be implemented) MBDyn2NetCDF layer.
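
For illustration only, a minimal Python sketch of the 16-bit packing quoted
above. The function names are hypothetical and belong neither to MBDyn nor to
the NetCDF library; only the scale_factor, add_offset and unpacking formulas
come from the quoted message. The rounding and the clamp to the signed 'short'
range are my assumptions: with add_offset at the midpoint, Min and Max map to
about -32767.5 and +32767.5, so the upper end needs clamping.

```python
def packing_params(vmin, vmax):
    """Return (scale_factor, add_offset) as given in the quoted message."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    scale_factor = (vmax - vmin) / 65535
    add_offset = 0.5 * (vmin + vmax)
    return scale_factor, add_offset


def pack(value, scale_factor, add_offset):
    """Map a float to a signed 16-bit integer (assumed rounding and clamping)."""
    packed = round((value - add_offset) / scale_factor)
    # Clamp so that the extremes still fit into a signed 'short'.
    return max(-32768, min(32767, packed))


def unpack(packed, scale_factor, add_offset):
    """Inverse mapping: unpacked_data = packed_data * scale_factor + add_offset."""
    return packed * scale_factor + add_offset
```

For example, packing_params(-10.0, 30.0) gives add_offset = 10.0, and a value
packed and unpacked with these parameters comes back with an error of roughly
half a quantization step (scale_factor / 2), which is the loss of precision
mentioned above.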

Cheers, p.

