
[bash,zsh] Improve the AWK compatibility #4412

Merged
merged 4 commits into from
Jun 7, 2025

Conversation

akinomyoga
Contributor

@akinomyoga akinomyoga commented Jun 5, 2025

This PR adds some workarounds for bugs and quirks of awk implementations. Details are described in the respective commit messages.

Shell function shared by four files

To implement this, I added a function __fzf_exec_awk, but I needed to copy this function to all of {completion,key-bindings}.{zsh,bash} (just like the existing __fzf_defaults). However, this might not be ideal. When modifying the function in the future, one needs to update all four files at the same time. Also, the number of copied lines becomes large. Is there an existing policy about how to handle this situation? If you want to introduce a mechanism to automatically update the copied part of the files, or a mechanism to generate the final form of the files from templates, I'm happy to implement it.

edit: Ref. #4407 (comment)

macOS awk is a variant of nawk, but it contains a unique patch for
UTF-8 support.  However, this patch causes a problem.  If the input
contains any non-UTF-8 data, macOS awk stops processing and does not
do anything, instead of ignoring the unrecognized data and continuing
the processing.  The contents of the ssh configuration and /etc/hosts
are not under the control of fzf, so we cannot fix the input when
those files contain non-UTF-8 data.  To work around this behavior,
one can set the locale to LC_ALL=C to treat the input data as a plain
8-bit encoding.

Solaris awk at /usr/bin/awk is meant for backward compatibility with
the ancient 1977 implementation of awk in the original UNIX.  It lacks
many features of POSIX awk.  To use a standard-conforming version in
Solaris, one needs to explicitly use /usr/xpg4/bin/awk.
@junegunn
Owner

junegunn commented Jun 6, 2025

> Is there an existing policy about how to process this situation?

No, there isn't one currently. These days, we recommend that users use the output of fzf --bash or fzf --zsh, but the script files in this repo should be self-contained and complete, as some packages include them directly and expect them to be individually usable OOTB.

> Also, the size of copied lines becomes large.

Yeah, that's unfortunate, especially because I like what you did here. But most of the lines are comments, so I don't think it's too bad. Anyway, I don't recall getting a bug report about awk compatibility in the past few years unlike in your case, so the number of users who would benefit from this is probably quite small? How was your experience?

@akinomyoga
Contributor Author

> Yeah, that's unfortunate, especially because I like what you did here. But most of the lines are comments, so I don't think it's too bad.

I'm not sure if you like it, but I added commit 9d8feb6 to maintain the common shell functions (i.e., __fzf_defaults and __fzf_exec_awk) in shell/common.sh. The executable shell script shell/update-common.sh can be used to update the embedded code in shell/{completion,key-bindings}.{bash,zsh} with the contents of shell/common.sh. In this process, comment lines are removed so that the actual code embedded in shell/{completion,key-bindings}.{bash,zsh} stays compact. The code comments describing the implementation are kept in the source shell/common.sh. I can adjust the details of the implementation. Or if it's not to your taste, I can drop the commit.
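
As a rough illustration of the embedding step described above, the following sketch copies the non-comment lines of one file into a marked region of another. It is not the actual shell/update-common.sh: the function name sketch_embed and the marker strings are assumptions made for this example, and the comment filter is deliberately simple (it drops only blank lines and lines whose first non-blank character is #).

```sh
# Hypothetical sketch; marker strings and names are illustrative only.
# $1: file holding the common functions, $2: script to update in place.
sketch_embed() {
  awk '
    # First file: collect lines that are neither blank nor comment-only.
    NR == FNR { if ($0 !~ /^[ \t]*(#|$)/) body = body $0 "\n"; next }
    # Second file: replace whatever sits between the two marker lines.
    $0 == "#----BEGIN SKETCH COMMON" { print; printf "%s", body; skip = 1; next }
    $0 == "#----END SKETCH COMMON" { skip = 0 }
    !skip
  ' "$1" "$2" > "$2.tmp" && mv "$2.tmp" "$2"
}
```

A call such as sketch_embed common.sh completion.bash would rewrite the marked region of the second file. The character class [ \t] is used instead of [[:space:]] because some older awk implementations do not support POSIX character classes.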

> Anyway, I don't recall getting a bug report about awk compatibility in the past few years unlike in your case, so the number of users who would benefit from this is probably quite small? How was your experience?

Maybe the situation is different from the Fzf project, but in my project (ble.sh), it is usually difficult to identify from the initial report that the problem is related to AWK compatibility. First, it is hard to associate the problem with an AWK script inside ble.sh just by looking at the symptom because the processing in ble.sh involves thousands of lines of code (mostly Bash script), and the reporters and I wouldn't usually suspect AWK compatibility first. Second, the problem is usually only reproducible in the user's environment, and I cannot reproduce it because it is caused by a very specific version of a specific implementation of awk. For those reasons, AWK compatibility issues are not usually reported as being related to awk. In this sense, it is possible that some of the open issues in Fzf are actually related to AWK compatibility, but I'm not sure because the situations of Fzf and ble.sh seem quite different.

Another point is whether we really need to take care of the bugs and quirks of external tools. Ideally, they should be fixed on the upstream awk side. However, users seem to hold a stereotype that a project written in shell scripts has tons of bugs, while programs written in C, Go, Rust, etc. are trustworthy and have no bugs. So when a user hits a bug with a combination of ble.sh (which is implemented in Bash scripts) and another tool, the user comes to ble.sh and reports the problem as a ble.sh bug. In my experience, about 1/2 of the bug reports coming to ble.sh finally turn out to be caused by bugs in external tools. I don't want to see repeated complaints about ble.sh due to external bugs, so I add workarounds. However, the situation can be different in Fzf. Fzf is written in Go, and it is a widely known and accepted tool, so I think the situation is the opposite. When a compatibility issue happens with Fzf, the user would report it to a more appropriate place closer to the upstream of the bug, such as the awk project or the OS distribution (where the problem occurs only because of the awk version they distribute). In that case, there may not be a reason to take care of these bugs on the Fzf side.

Nevertheless, some AWK compatibility issues exist in awk implementations that are widely used in major distributions, such as the old mawk distributed by Ubuntu 18.04 LTS and macOS awk (which hasn't been fixed). So even if we believe the problem should be fixed on the awk side, in practice we have to take measures on the downstream side.

  • We need to wait until 2030 for the end of the extended support for Ubuntu 18.04 LTS.
  • I think the quirk of macOS awk is unlikely to be fixed because we don't have an effective way to report the problem to the developers at Apple Inc. who are responsible for macOS awk. We reported it to Apple, but there has been no response, probably because it hasn't been properly triaged among the massive number of reports about macOS from random users. Apple provides only one channel for reporting all kinds of problems related to macOS, and I guess they only handle issues reported by many users (not developers). There is no way to report a specific problem to a dedicated channel.
  • On the other hand, we can probably ignore Solaris awk. Technically, the end of the extended support for Solaris 11.4 is declared to be 2037, but Oracle has already stopped developing new versions of Solaris, and I think Solaris is now only used in old systems in companies. If you want, I can drop the commit for the Solaris-awk workaround.

@junegunn
Owner

junegunn commented Jun 7, 2025

> I'm not sure if you like it, but I added commit 9d8feb6 to maintain the common shell functions

It's a nice touch, thanks. And I really appreciate the detailed explanation.

Owner

@junegunn junegunn left a comment

Everything looks good. Thanks. Is there anything else you want to address?

@akinomyoga
Contributor Author

Thanks! I checked it once again. If you like it, I'm fine with merging the PR in its current state.

@junegunn junegunn merged commit bfa287b into junegunn:master Jun 7, 2025
5 checks passed
@junegunn
Owner

junegunn commented Jun 7, 2025

Merged, thanks!

@akinomyoga akinomyoga deleted the bash-awk-compat branch June 7, 2025 16:11