HDF5 file locking errors

joelongjiamian
Posts: 11
Joined: Wed Nov 20, 2019 10:39 am

HDF5 file locking errors

Post by joelongjiamian » Wed Nov 20, 2019 1:41 pm

I'm making this note here in case anyone else has been running into the same issue.

I recently began running into nondeterministic GYRE errors after upgrading to MESA r12115 (and MESASDK 20190830). After some investigation, this seems to be related to the file locking feature introduced in HDF5 1.10.x. In particular, when GYRE is set to write output files (even ASCII text files) to some networked filesystems, it intermittently complains that it cannot lock the output files, and then exits without writing any output. I've also seen this happen on local filesystems, again intermittently but less often.

For now, I have been using an environment variable flag disabling this file locking feature as a workaround (see https://support.nesi.org.nz/hc/en-gb/ar ... le-locking).
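
A minimal sketch of the workaround, for anyone who launches GYRE from a script. It assumes the variable in question is HDF5's HDF5_USE_FILE_LOCKING (set to FALSE), and that GYRE is started as an executable followed by an inlist file; the function names, the "gyre" executable name, and the inlist path are illustrative placeholders, not anything taken from the linked page.

```python
import os
import subprocess


def env_without_hdf5_locking(base_env=None):
    """Return a copy of the environment with HDF5 file locking disabled.

    Assumption: the HDF5 library honours HDF5_USE_FILE_LOCKING=FALSE.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HDF5_USE_FILE_LOCKING"] = "FALSE"
    return env


def run_gyre(inlist_path, gyre_exe="gyre"):
    """Run GYRE on an inlist with locking disabled (hypothetical helper).

    `gyre_exe` is a placeholder; point it at your own GYRE binary.
    """
    return subprocess.run(
        [gyre_exe, inlist_path],
        env=env_without_hdf5_locking(),
        check=True,  # raise CalledProcessError if GYRE exits with an error
    )
```

Passing the variable through `env=` keeps the change local to the GYRE process, rather than altering the environment of the whole shell or job.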

rhtownsend
Site Admin
Posts: 397
Joined: Sun Mar 31, 2013 4:22 pm

Re: HDF5 file locking errors

Post by rhtownsend » Wed Nov 20, 2019 1:42 pm

Hi Joel --

What version of GYRE are you using?

cheers,

Rich

joelongjiamian
Posts: 11
Joined: Wed Nov 20, 2019 10:39 am

Re: HDF5 file locking errors

Post by joelongjiamian » Thu Nov 21, 2019 8:44 am

Hi Rich,

I'm running GYRE 5.2 (which came with MESA).

So far I've observed this behaviour mostly on NFS mounts on my local HPC cluster, and occasionally on department machines, but not on my laptop (although that might be because I don't run GYRE on my laptop very often). All of these run some flavour of Linux (RHEL on the HPC cluster, Scientific Linux on the department machines, Arch on my own laptop).
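
One rough way to check whether a given output directory supports locking at all is to try taking a lock on a scratch file there. The sketch below assumes that the HDF5 locking relies on a POSIX advisory lock such as flock(); that is an assumption, as is the helper name `can_flock`. Because the failures described here are intermittent, a single successful probe does not rule the problem out.

```python
import fcntl
import os
import tempfile


def can_flock(directory):
    """Return True if an exclusive, non-blocking flock() succeeds on a
    scratch file created in `directory`, and False if it raises OSError."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    finally:
        # Always clean up the scratch file, whatever the outcome.
        os.close(fd)
        os.remove(path)
```

Calling `can_flock` repeatedly on the NFS-mounted output directory, and comparing with a local directory, may help show whether the filesystem is rejecting lock requests.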

rhtownsend
Site Admin
Posts: 397
Joined: Sun Mar 31, 2013 4:22 pm

Re: HDF5 file locking errors

Post by rhtownsend » Mon Nov 25, 2019 10:42 am

Thanks for the info. Earlier releases of GYRE had a problem performing HDF5 input/output on multi-core architectures; however, this was fixed in 5.2 and so is different from the issues you're encountering.

I'm not sure what can be done to fix this, but I'll give it some thought. For the time being, I suggest that you continue to use the environment variable workaround.

cheers,

Rich

TSteindl
Posts: 1
Joined: Fri Feb 05, 2021 3:31 am

Re: HDF5 file locking errors

Post by TSteindl » Fri Feb 05, 2021 4:43 pm

Hi,

I am running GYRE version 6.01 on a cluster with SGE scheduling and ran into the same issue. It only happens when using the traditional approximation of rotation (TAR) and when GYRE is executed within an SGE job. Running GYRE by hand works fine, as does running it without the TAR. The workaround with the environment variable works for me too, but I figured I'd post this reply to maybe help pinpoint the issue.

Best,
Thomas

rhtownsend
Site Admin
Posts: 397
Joined: Sun Mar 31, 2013 4:22 pm

Re: HDF5 file locking errors

Post by rhtownsend » Sun Feb 07, 2021 1:42 pm

Thanks for the info, Thomas. My suspicion is that the problem lies with the HDF5 library that ships with the SDK, rather than with GYRE. So, I'm not sure there's much that can be done at my end.

cheers,

Rich
