HDF5 file locking errors

Posted: Wed Nov 20, 2019 1:41 pm
by joelongjiamian
I'm making this note here in case anyone else has been running into the same issue.

I recently began running into nondeterministic GYRE errors after upgrading to MESA r12115 (and also MESASDK 20190830). After some investigation, this seems to be related to a new file locking feature in HDF5 1.10.x and later. In particular, when GYRE is set to write output files (even ASCII text files) hosted on some networked filesystems, it (nondeterministically) complains about not being able to lock the output files, and then exits without writing any output. I've also seen this happen (also nondeterministically, but less often) on local filesystems.

For now, I have been using an environment variable flag disabling this file locking feature as a workaround (see https://support.nesi.org.nz/hc/en-gb/ar ... le-locking).
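
For anyone who wants to apply the workaround from a driver script, here is a minimal sketch. It assumes the flag in question is HDF5's documented HDF5_USE_FILE_LOCKING environment variable (honoured by HDF5 1.10.1 and later), and that GYRE is launched as an executable taking an inlist path. The names gyre_exe, inlist and run_gyre are hypothetical placeholders, not part of GYRE or MESA.

```python
import os
import subprocess


def env_without_hdf5_locking(base_env=None):
    """Return a copy of the environment with HDF5 file locking disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: HDF5_USE_FILE_LOCKING=FALSE is the flag the workaround refers to.
    env["HDF5_USE_FILE_LOCKING"] = "FALSE"
    return env


def run_gyre(inlist, gyre_exe="gyre"):
    """Run GYRE on one inlist with file locking disabled (hypothetical wrapper)."""
    # Only the child process sees the modified environment.
    return subprocess.run(
        [gyre_exe, inlist],
        env=env_without_hdf5_locking(),
        check=True,
    )
```

The variable is set only for the child process, so other HDF5 programs started from the same shell keep their default locking behaviour.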

Re: HDF5 file locking errors

Posted: Wed Nov 20, 2019 1:42 pm
by rhtownsend
Hi Joel --

What version of GYRE are you using?

cheers,

Rich

Re: HDF5 file locking errors

Posted: Thu Nov 21, 2019 8:44 am
by joelongjiamian
Hi Rich,

I'm running GYRE 5.2 (which came with MESA).

So far I've observed this behaviour mostly on NFS mounts on my local HPC cluster, and also occasionally on department machines, but not on my laptop (although that might be because I don't run GYRE on my laptop very often). All of these run some flavour of Linux (RHEL on the HPC cluster, Scientific Linux on the department machines, Arch on my own laptop).
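
To check whether a given output directory accepts advisory locks at all, a rough probe along these lines may help. It is a sketch only: it assumes HDF5 relies on flock()-style locks on POSIX systems, and the function name can_lock_in is made up for illustration. Because the failures are nondeterministic, a successful probe does not rule the problem out.

```python
import fcntl
import tempfile


def can_lock_in(directory):
    """Return True if an exclusive, non-blocking flock() succeeds in directory."""
    # The temporary file is created inside the directory under test and
    # deleted automatically when the with-block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        try:
            fcntl.flock(tmp.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Typical for filesystems where locking is unsupported or failing.
            return False
        fcntl.flock(tmp.fileno(), fcntl.LOCK_UN)
        return True
```

Calling can_lock_in on the GYRE output directory from inside a batch job and again from an interactive shell on the same node would show whether the two environments differ.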

Re: HDF5 file locking errors

Posted: Mon Nov 25, 2019 10:42 am
by rhtownsend
Thanks for the info. Earlier releases of GYRE had a problem performing HDF5 input/output on multi-core architectures; however, this was fixed in 5.2 and so is different from the issues you're encountering.

I'm not sure what can be done to fix this, but I'll give it some thought. For the time being, I suggest that you continue to use the environment variable workaround.

cheers,

Rich

Re: HDF5 file locking errors

Posted: Fri Feb 05, 2021 4:43 pm
by TSteindl
Hi,

I am running GYRE version 6.01 on a cluster with SGE scheduling and ran into the same issue. It only happens when using the traditional approximation of rotation (TAR) and when GYRE is executed within an SGE job. Running GYRE by hand works fine, as does running it without the TAR. The workaround with the environment variable works for me too, but I figured I'd post this reply to maybe help pinpoint the issue.

Best,
Thomas

Re: HDF5 file locking errors

Posted: Sun Feb 07, 2021 1:42 pm
by rhtownsend
Thanks for the info, Thomas. My suspicion is that the problem lies with the HDF5 library that ships with the SDK, rather than with GYRE. So, I'm not sure there's much that can be done at my end.

cheers,

Rich