Skip to content

[flang] Add EXECUTE_COMMAND_LINE runtime and lowering intrinsics implementation #74077

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 40 commits into from
Jan 10, 2024

Conversation

yiwu0b11
Copy link
Contributor

@yiwu0b11 yiwu0b11 commented Dec 1, 2023

This patch add support of intrinsics Fortran 2008 EXECUTE_COMMAND_LINE.
The patch contains both the lowering and the runtime code and works on
both Windows and Linux. The patch contains a list of commits, to convey
the authorship and the history of changes. Some implementation specifics
or status has been added to flang/docs/Intrinsics.md.

I have provided a summary of the usage and the options required for the
EXECUTE_COMMAND_LINE intrinsic. The intrinsic supports both a synchronous
(by default) and an asynchronous option.

System Mode Implemention
Linux Sync std::system()
Windows Sync std::system()
Linux Async fork()
Windows Async CreateProcess

Support for the SYSTEM GNU extension will be added in a separate PR.

Co-authored with @jeffhammond

jeffhammond and others added 22 commits October 18, 2023 15:12
Signed-off-by: Jeff Hammond <[email protected]>
exitstat: If sync, assigned processor-dependent exit status. Otherwise unchanged.
cmdstast: Assigned 0 as specifed in standard, if error then overwrite.
If a condition occurs that would assign a nonzero value to CMDSTAT but
the CMDSTAT variable is not present, error termination is initiated.
Work and test on Linux, both sync and async mode.
Sync mode: termination will terminate directly
Async mode: will only terminate the child/async process, no effect on parent
Standard: If a condition occurs that would assign a nonzero value to CMDSTAT,
but the CMDSTAT variable is not present, error termination is initiated.
@yiwu0b11 yiwu0b11 marked this pull request as ready for review December 1, 2023 14:32
@yiwu0b11
Copy link
Contributor Author

yiwu0b11 commented Dec 1, 2023

Implementation specifics or status on flang/docs/Intrinsics.md

Standard Intrinsics: EXECUTE_COMMAND_LINE

Usage and Info

  • Standard: Fortran 2008 and later, specified in 16.9.73

  • Class: Subroutine

  • Syntax: CALL EXECUTE_COMMAND_LINE(COMMAND [, WAIT, EXITSTAT, CMDSTAT, CMDMSG ])

  • Arguments:

    Argument Description
    COMMAND Shall be a default CHARACTER scalar.
    WAIT (Optional) Shall be a default LOGICAL scalar.
    EXITSTAT (Optional) Shall be an INTEGER of the default kind.
    CMDSTAT (Optional) Shall be an INTEGER of the default kind.
    CMDMSG (Optional) Shall be a CHARACTER scalar of the default kind.

Implementation Specifics

  • COMMAND:

    • Must be preset.
  • WAIT:

    • If set to false, the command is executed asynchronously. If not preset or set to false, it is executed synchronously.
    • Sync: achieved by passing command into std::system on all systems.
    • Async: achieved by calling a fork() on POSIX-compatible systems, or CreateProcess() on Windows.
  • CMDSTAT:

    • -2: No error condition occurs, but WAIT is present with the value false, and the processor does not support asynchronous execution.
    • -1: The processor does not support command line execution.
    • + (positive value): An error condition occurs.
      • 1: Fork Error, where pid_t < 0, would only occur on POSIX-compatible systems.
      • 2: Execution Error, a command exits with status -1.
      • 3: Invalid Command Error, determined by the exit code depending on the system.
        • On Windows, if the exit code is 1.
        • On POSIX-compatible systems, if the exit code is 127 or 126.
      • 4: Signal error, either it is stopped or killed by signal, would only occur on POSIX-compatible systems.
    • 0: Otherwise.
  • CMDMSG:

    • If an error condition occurs, it is assigned an explanatory message. Otherwise, it remains unchanged.
    • If a condition occurs that would assign a nonzero value to CMDSTAT but the CMDSTAT variable is not present, error termination is initiated.
      • On POSIX-compatible systems, this applies to both synchronous and asynchronous error termination. When the execution mode is set to async with error termination, the child process (async process) will be terminated with no effect on the parent process (continues).
      • On Windows, this only applies to synchronous error termination.

@klausler
Copy link
Contributor

klausler commented Dec 1, 2023

Given that you have implementations using fork() and its Windows analog, why have synchronous implementations with system()? Can't the synchronous path just use the asynchronous implementation with synchronization?

@yiwu0b11
Copy link
Contributor Author

yiwu0b11 commented Dec 1, 2023

Given that you have implementations using fork() and its Windows analog, why have synchronous implementations with system()? Can't the synchronous path just use the asynchronous implementation with synchronization?

Yes it is do-able. I think that would reduce readability, my current structure is

if(sync){
    system(cmd);
}else{
  if (Windows){
      CreateProcess(cmd);
  }else {
      fork();
      system(cmd)
}

If wait in async, making it sync:

if (Windows){
    CreateProcess(cmd);
    if(sync){
        wait()
    }else{
        no_wait()
    }
}else{
    fork()
    system(cmd)
    if(sync){
        wait_pid()
    }else{
        no_wait()
    }
}

On a second thought, I might not able to use system in Linux under this structure, I probably need to use execl.

@yiwu0b11 yiwu0b11 force-pushed the execute_command_line branch from 85f44a3 to 51c9670 Compare December 21, 2023 19:48
@yiwu0b11 yiwu0b11 requested a review from klausler December 27, 2023 14:48
@yiwu0b11 yiwu0b11 requested a review from klausler January 4, 2024 13:51
@yiwu0b11
Copy link
Contributor Author

yiwu0b11 commented Jan 8, 2024

Hi, @jeanPerier welcome back! I've made quite a few changes, mainly on memory allocate and deallocate, test improvements, and have moved some commonly used descriptor functions into tools.cpp to re-use some code instead of creating duplicates of them. Could you review the recent changes? Thanks in advance!

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @yi-wu-arm, LGTM

@yiwu0b11 yiwu0b11 merged commit e2b896a into llvm:main Jan 10, 2024
FreeMemory((void *)wcmd);
#else
// terminated children do not become zombies
signal(SIGCHLD, SIG_IGN);
Copy link
Contributor Author

@yiwu0b11 yiwu0b11 Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signal(SIGCHLD, SIG_IGN);
This actually cause a crash on async execution. Once an async execute_command_line was called, all execute_command_line after that will result in std::system returns a status code of -1, so all execute_command_line will exit with: Execution error with system status code: -1
Comment out this line would be a temporary fix.
I will come up with a real fix asap.

Copy link
Contributor Author

@yiwu0b11 yiwu0b11 Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reproducer: aarch64, linux, clang++

#include <unistd.h>
#include <signal.h>
#include <iostream>
void execute_command_line(const char *cmd, bool isWait, int &exitstatus){
    if(isWait){
        exitstatus=std::system(cmd);
    }else{
        signal(SIGCHLD, SIG_IGN);
        pid_t pid=fork();
        if(pid<0){
            exitstatus = -999;
        }else if(pid==0){
            int status=std::system(cmd);
            exitstatus=status;
            exit(status);
        }
    }
}

int main() {
    int exitstatus=404;
    execute_command_line("InvalidCommand", false, exitstatus);
    std::cout << "exitstatus async: " << exitstatus<<std::endl;
    execute_command_line("InvalidCommand", true, exitstatus);
    std::cout << "exitstatus sync: " << exitstatus<<std::endl;

    return 0;
}

console printout:

exitstatus async: 404
sh: 1: InvalidCommand: not found
exitstatus sync: -1
sh: 1: InvalidCommand: not found

console printout when signal(SIGCHLD, SIG_IGN); is commented out.

exitstatus async:404
sh: 1: InvalidCommand: not found
exitstatus sync:32512
sh: 1: InvalidCommand: not found

*This is not a very good representation as the value of exitstatus is not stored.

Copy link
Contributor Author

@yiwu0b11 yiwu0b11 Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or alternatively do what gfortran is doing: do not change the value of exitstatus, cmdmsg, and set cmdstatus to 0, no mater what happened.

program test_associated
  integer :: exitstatus, cmdstatus
  character(len=35) :: cmdmsg
  exitstatus=404
  cmdstatus=303
  cmdmsg="Does async execution change cmdmsg?"

  call execute_command_line ("InvalidCommand", .false., exitstatus, cmdstatus, cmdmsg)
  call sleep(1)

  print *, "Exist status was: ", exitstatus
  print *, "Command status was: ", cmdstatus
  print *, "Error message is: ", trim(cmdmsg)

  end

console output when isWait=false (async)

sh: 1: InvalidCommand: not found
Exist status was:          404
Command status was:            0
Error message is: Does async execution change cmdmsg?

console output when isWait=true (sync)

sh: 1: InvalidCommand: not found
Exist status was:          127
Command status was:            3
Error message is: Invalid command line

(This approach doesn't follow the standard well...) The standard had only one special case for async execution: NOT to change the value of exitstats. Other value would behave as what the standard defined.
Note that call sleep() is not available on flang-new, so on llvm-test-suite: https://github.com/llvm/llvm-test-suite/blob/9ca97f5027150f7e507e5ab4c56f38a29fb3c696/Fortran/gfortran/regression/execute_command_line_1.f90#L16, it would fail to compile as well,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having discussed this offline, it sounds like std::system is missbehaving because SIGCHLD is ignored in the child (inherited from the parent). To work around this I think you can reset the signal handler in the child process after the fork before calling std::system.

Copy link
Contributor Author

@yiwu0b11 yiwu0b11 Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that resetting the signal doesn't work well. put signal(SIGCHLD, SIG_DFL) will still cause system return -1. However, using a non-blocking waitpid to get the exit code of child process will clear up the zombie. (I didn't know there is a non-blocking wait).

void sigchld_handler(int signo) {
    int status;
    waitpid(-1, &status, WNOHANG); // -1 means any child, WNOHANG means non-blocking
}
.......
signal(SIGCHLD, sigchld_handler);
pid_t pid{fork()};
if (pid < 0) {
  error;
} else if (pid == 0) {
  std::system(cmd);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me 👍

Out of interest, what did you try to reset the handler for SIGCHLD?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opps, sorry I type it wrong in the comment, I mean signal(SIGCHLD, SIG_DFL)

Comment updated.

int exitStatusVal{status};
if (exitStatusVal == 1) {
#else
int exitStatusVal{WEXITSTATUS(status)};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linux defines WEXITSTATUS in stdlib.h (and therefore cstdlib), but at least FreeBSD and NetBSD require including sys/wait.h.

It should be harmless to always include sys/wait.h, so I'd suggest doing that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #77675 for this.

Copy link
Contributor Author

@yiwu0b11 yiwu0b11 Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted! Thanks for the fix.

tblah pushed a commit that referenced this pull request Jan 12, 2024
Linux defines WEXITSTATUS in stdlib.h, but at least FreeBSD and NetBSD
only define it in sys/wait.h. Include this header unconditionally, since
it is required on the BSDs and should be harmless on other platforms.

Fixes FreeBSD build after #74077.
@psteinfeld
Copy link
Contributor

See issue #77975.

@psteinfeld
Copy link
Contributor

See issue #77984

@psteinfeld
Copy link
Contributor

See issue #77979.

justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
…ementation (llvm#74077)

This patch add support of intrinsics Fortran 2008 EXECUTE_COMMAND_LINE.
The patch contains both the lowering and the runtime code and works on
both Windows and Linux. The patch contains a list of commits, to convey
the authorship and the history of changes. Some implementation specifics
or status has been added to `flang/docs/Intrinsics.md`.

I have provided a summary of the usage and the options required for the
`EXECUTE_COMMAND_LINE intrinsic`. The intrinsic supports both a
synchronous
(by default) and an asynchronous option.

| System  | Mode  | Implemention              |
|---------|-------|---------------------------|
| Linux   | Sync  | std::system()             |
| Windows | Sync  | std::system()             |
| Linux   | Async | fork()  |
| Windows | Async | CreateProcess             |

Support for the SYSTEM GNU extension will be added in a separate PR.

Co-authored with @jeffhammond

---------

Signed-off-by: Jeff Hammond <[email protected]>
Co-authored-by: Jeff Hammond <[email protected]>
Co-authored-by: Yi Wu <[email protected]>
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
Linux defines WEXITSTATUS in stdlib.h, but at least FreeBSD and NetBSD
only define it in sys/wait.h. Include this header unconditionally, since
it is required on the BSDs and should be harmless on other platforms.

Fixes FreeBSD build after llvm#74077.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants