Hi Ludovic, ludo@gnu.org (Ludovic Courtès) writes: > Mark H Weaver skribis: > >> ludo@gnu.org (Ludovic Courtès) writes: > > [...] > >>> I’m thinking we could perhaps add a compiler barrier before ‘vp->fp = new_fp’ >>> statements, but in practice it’s not necessary here (x86_64, gcc 7). >>> >>> Thoughts? > > I just pushed the patch as 23af45e248e8e2bec99c712842bf24d6661abbe2. > >> Indeed, we need something to ensure that the compiler won't reorder >> these writes. My recommendation would be to use the 'volatile' >> qualifier in the declarations of both the 'fp' field of 'struct scm_vm' >> and the stack element modified by SCM_FRAME_SET_DYNAMIC_LINK. >> >> Maybe something like this: >> >> diff --git a/libguile/frames.h b/libguile/frames.h >> index ef2db3df5..8554f886b 100644 >> --- a/libguile/frames.h >> +++ b/libguile/frames.h >> @@ -89,6 +89,7 @@ >> union scm_vm_stack_element >> { >> scm_t_uintptr as_uint; >> + volatile scm_t_uintptr as_volatile_uint; > > [...] > >> +#define SCM_FRAME_SET_DYNAMIC_LINK(fp, dl) ((fp)[1].as_volatile_uint = ((dl) - (fp))) > > I’m not sure this is exactly what we need. > > What I had in mind, to make sure the vp->fp assignment really happens > after the SCM_FRAME_SET_DYNAMIC_LINK, was to do something like this: > > SCM_FRAME_SET_RETURN_ADDRESS (new_fp, …); > SCM_FRAME_SET_DYNAMIC_LINK (new_fp, …); > > asm volatile ("" : : : "memory"); > > vp->fp = new_fp; > > I think that more accurately reflects what we want, WDYT? > > It’s GCC-specific though (I don’t think Clang implements ‘asm’ in a > compatible way, does it?) and I suppose we’d have to ignore the non-GCC > case. Right, the problem is that the "asm volatile" solution is not portable. To my mind, it's fine to optionally use GCC extensions for improved performance, but I think we should ensure that Guile works reliably when compiled with any conforming C compiler. What problem do you see with my proposed portable solution? If I understand C99 section 5.1.2.3 paragraph 2 correctly, compilers are not permitted to reorder accesses to volatile objects as long as there is a sequence point between them. I should say that I'm not confident that _either_ of these proposed solutions will adequately address all of the possible problems that could occur when GC is performed on VM threads stopped at arbitrary points in their execution. Mark