[MBDyn-users] Refactoring of MBDyn's NetCDF interface

Rix, Patrick Patrick.Rix at repower.de
Tue May 17 14:33:03 CEST 2011


In general I prefer keeping all data in a single file, too.
Up to a size of some 10MB this seems to be a good idea, but
for larger models and/or longer simulation times the database would become
extremely large - so large that post-processing & visualization might
slow down significantly or could even fail. If one runs into such problems it
would be good to be able to reduce file sizes by simply splitting the
database into several output files.
For our Flex5 wind turbine simulations we have had good experiences
with writing all data into one file - up to a size of 50-75MB per database.
Beyond that, working with such large files becomes torture,
and we already use a compressed 16-bit integer format as standard output, so
when using floats or doubles for the nc files a size of >100MB will be reached
quite fast:
e.g. a wind turbine model with 4*beam3 for the tower and 3 blades made
of 5*beam3 gives 53 nodes. Simulating 600sec at a 50msec output rate
results in 12000 time steps. Using floats for the results gives
12000[steps] * 4[byte] * 53[nodes] * 18[columns] = 45,792,000 bytes
(about 43.7MB) just for the .mov data. (For offshore wind turbines the
simulations would be even longer due to the low frequency of the waves.)
With the other files (.act, .ine, .jnt, etc.) this amount would be roughly
doubled (or tripled when having .aer output, too).
As you can see, a critical size would already be reached with a standard
(wind turbine) simulation, without having an extraordinary model.
Now each time a single time series is accessed (e.g. plotted) from
the database, a (plot) utility would have to read 100-150MB just to get
some 48kB of data of interest.
Being able to choose between either a single or a split database,
depending on the size of the problem/output, would offer the most
flexibility, wouldn't it?
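
The size estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this mail; the function name and the idea of parameterizing the bytes per value are assumptions made for illustration, not part of MBDyn.

```python
def output_size_bytes(steps, nodes, columns, bytes_per_value):
    """Raw size of a dense table of results, ignoring any file overhead."""
    return steps * nodes * columns * bytes_per_value

# Numbers taken from the example above.
steps = 600_000 // 50          # 600 s of simulation, one output every 50 ms
nodes = 53
columns = 18                   # columns per node in the .mov output

mov_bytes = output_size_bytes(steps, nodes, columns, 4)      # floats
mov_bytes_double = output_size_bytes(steps, nodes, columns, 8)  # doubles
series_bytes = output_size_bytes(steps, 1, 1, 4)             # one time series

print(f"steps: {steps}")
print(f".mov as float:  {mov_bytes / 2**20:.1f} MB")
print(f".mov as double: {mov_bytes_double / 2**20:.1f} MB")
print(f"single series:  {series_bytes / 1000:.0f} kB")
```

The "MB" figure quoted above corresponds to dividing by 2**20; in units of 10**6 bytes the same .mov data is about 45.8MB.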

Best regards,

-----Original Message-----
From: masarati at aero.polimi.it [mailto:masarati at aero.polimi.it]
Sent: Tuesday, 17 May 2011 10:46
To: Rix, Patrick
Cc: mbdyn-users at mbdyn.org
Subject: Re: [MBDyn-users] Refactoring of MBDyn's NetCDF interface

> Hi Pierangelo,
> dear MBDyn users,
> I'd like to rework / enhance MBDyn's NetCDF output and wanted to make
> suggestions for modifications / improvements of the current
> implementation.
> And I'd like to encourage other users who are interested in the binary
> output to bring in their ideas, too.
> The list below contains some points which I felt to be important - if I
> overlooked or forgot something important, please add it.
> List of suggested modifications
> ======================
> 1.) adding a parameter / switch for toggling between a MONOLITHIC or
> a split (per-element-type) output:
> this way the user can decide whether he wants to have one (big) single .nc
> file with the output of all nodes, joints, beams, forces, etc. in it
> (as is currently the case) or whether he prefers to have several files
> with output sorted by element type, just as it is right now
> for the text output: for each text file there would be a corresponding
> binary .nc file, e.g. in addition to 'output.mov' we would have
> an 'output.mov.nc'.
> Having one single file is very handy, and for small models, short time
> series or less output this might be preferable, but for large models
> with a lot of nodes/elems and long time series the file can quickly become
> impractically large even in binary format.

I think one of the key points of moving to binary was the possibility to
store all data in a *single* database, thus solving consistency of data
location.  This would be magnified by the possibility to store an echo of
the model and the simulation data in the same db.  Right now, the only way
to efficiently handle multiple executions consists in placing results in
separate folders.  My understanding of using a database for output is that
it could handle this more efficiently than a filesystem (unchecked,
though).  Unless there is a proven significant penalty I'd stick with a
monolithic file.

> 2.) adding a switch to toggle the output data type between DOUBLE / FLOAT
> to enable the user to adapt the output to his needs, depending on whether
> precision or reduced disk space has higher priority.
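
As a rough illustration of the trade-off in point 2.), the following sketch round-trips a value through a 4-byte float using only the Python standard library and compares it with the 8-byte original. The sample value is hypothetical, and nothing here reflects how MBDyn itself writes its output.

```python
import struct

def to_float32(x):
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Storage cost per value for the two candidate output types.
float_bytes = struct.calcsize("<f")
double_bytes = struct.calcsize("<d")

value = 123.456789012345       # hypothetical result value
stored = to_float32(value)
rel_err = abs(stored - value) / abs(value)

print(f"float: {float_bytes} bytes, double: {double_bytes} bytes")
print(f"relative error after storing as float: {rel_err:.2e}")
```

For values in the normal float range the relative rounding error stays below 2**-24 (about 6e-8), i.e. roughly seven significant decimal digits survive, at half the disk space of doubles.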


> [ 3.) ?OPTIONAL?: thinking about a general philosophy for storing time
> independent data with informal / descriptive character to avoid problems
> in Blender ]
> The problems in Blender are caused by the NetCDF-IO routines of the SciPy
> package (which are included in the MBDyn NetCDF interface for Blender) -
> so there is no error in MBDyn's nc output.

I need to investigate this, it's not entirely clear to me, yet.

> But on the other hand one could discuss whether it makes more sense to
> separate the model topology information from the time series data.
> I think it would improve the overview to have a small file with the
> model-specific information (total number of nodes & elements and their
> MBDyn types, references between nodes & elems, etc.)
> Model topology information could be written into a binary nc file, too,
> but I think it might be better written in XML(?) for displaying the
> relations between items in a tree view.
> Maybe someone else has a better idea of how the model topology could
> best be visualized?

See above.  I'd stick with a single database.  Duplicating information
storage media and format can only lead to further misalignment,
duplication of effort and so on.  If you need to further obfuscate
information by rendering it in XML, you can write a simple NetCDF to XML
converter, if it does not exist yet.  The connectivity structure of a
model is simple enough that it can be simply written using
cross-references between entities, and reconstructed using simple logic
whatever the tool and the format.
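
A minimal sketch of this idea, using only the Python standard library: connectivity is stored as plain cross-references (element label to node labels), inverted with simple logic, and rendered as XML on demand. The table layout, the labels and the XML tag names are all hypothetical; they are not MBDyn's actual NetCDF schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical cross-reference table: element label -> (type, node labels).
elements = {
    1001: ("beam3", [1, 2, 3]),
    1002: ("beam3", [3, 4, 5]),
    2001: ("joint", [1]),
}

def node_to_elements(elements):
    """Invert the element -> nodes references into node -> elements."""
    index = {}
    for label, (_etype, nodes) in elements.items():
        for node in nodes:
            index.setdefault(node, []).append(label)
    return index

def to_xml(elements):
    """Render the same cross-references as a small XML tree."""
    root = ET.Element("model")
    for label, (etype, nodes) in sorted(elements.items()):
        elem = ET.SubElement(root, "element", label=str(label), type=etype)
        for node in nodes:
            ET.SubElement(elem, "node", label=str(node))
    return root

index = node_to_elements(elements)
xml_text = ET.tostring(to_xml(elements), encoding="unicode")

print(index[3])      # elements attached to node 3
print(xml_text)
```

The XML here is derived from the cross-references rather than stored alongside them, so there is still only one source of connectivity data.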

> 4.) completion of the list of elements with NetCDF-output capabilities


Cheers, p.
