Skip to content

fix: default en_US.UTF-8 if lang has problems #1687

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 7, 2025
Merged

Conversation

oneirocosm
Copy link
Member

Mac's defaults read -g AppleLocale command can return locales that don't have a language in /usr/share/locale. This prevents those locales from using a language with UTF-8 support. This will use en_US.UTF-8 as a default to cover those cases if there are problems. As per usual, if LANG is already set, that value will be used instead.

Mac's `defaults read -g AppleLocale` command can return locales that
don't have a language in /usr/share/locale. This prevents those locales
from using a language with UTF-8 support. This will use en_US.UTF-8 as a
default to cover those cases if there are problems. As per usual, if
LANG is already set, that value will be used instead.
Copy link
Contributor

coderabbitai bot commented Jan 6, 2025

Walkthrough

The changes in the wavebase.go file introduce a new function listValidLangs, which executes the locale -a command to obtain a list of valid locale languages. The determineLang function has been modified to include a default language variable, defaultLang, set to "en_US.UTF-8". The error handling within determineLang has been updated to log a message when the command to read the Apple locale fails, indicating that the default language will be used. The function now checks if the preferred language is valid against the list obtained from listValidLangs. If the preferred language is not valid, it logs a message and returns the default language. Additionally, the utilfn package has been added to the imports to utilize the ContainsStr function for validating the preferred language.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
pkg/wavebase/wavebase.go (2)

204-213: Consider caching and filtering results.

Although this function returns valid locales by calling locale -a, it might be beneficial to:

  1. Cache the results to avoid multiple external calls if used repeatedly.
  2. Filter out empty lines or malformed entries to ensure only valid locales are returned.

219-226: Add a check for an existing LANG environment variable.

The PR objectives mention preserving the user’s LANG setting if it is already defined. However, when defaults read -g AppleLocale fails, the logic directly falls back to "en_US.UTF-8". You might want to first check if LANG is set before forcing this fallback.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b51ff83 and 761fbb1.

📒 Files selected for processing (1)
  • pkg/wavebase/wavebase.go (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build for TestDriver.ai
  • GitHub Check: Analyze (go)
🔇 Additional comments (2)
pkg/wavebase/wavebase.go (2)

20-21: No issues with the new import.

The import for utilfn is necessary for ContainsStr. This looks fine as is.


230-238: Confirm consistent fallback logic across OSes.

The code checks if preferredLang is in the valid locale list and falls back to en_US.UTF-8 otherwise. This is good for macOS, but Windows and other systems might need equivalent checks. Ensuring a consistent approach across all platforms can prevent locale mismatches.

You can verify usage across OS detection points with the following script:

✅ Verification successful

Fallback logic is consistent across platforms

The code implements a uniform approach for locale validation and fallback across all supported platforms:

  • macOS: Uses AppleLocale with UTF-8 validation
  • Windows: Uses Get-Culture with UTF-8 validation
  • All platforms: Fall back to 'en_US.UTF-8' when validation fails
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Searching for platform-specific code paths in wavebase.go
rg -A 5 'runtime.GOOS'

Length of output: 4855

The windows LANG code used the incorrect runtime.GOOS name for go, so it
never actually ran. That said, international characters still seem to
work fine on windows, so there is no harm in removing this entirely.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
pkg/wavebase/wavebase.go (3)

204-213: Consider caching the valid languages list.

Since the available locales rarely change during runtime, consider caching the results using a similar pattern to osLangOnce to avoid repeated system calls.

+var validLangsOnce = &sync.Once{}
+var validLangs []string
+
 func listValidLangs(ctx context.Context) []string {
+    validLangsOnce.Do(func() {
     out, err := exec.CommandContext(ctx, "locale", "-a").CombinedOutput()
     if err != nil {
         log.Printf("error running 'locale -a': %s\n", err)
-        return []string{}
+        validLangs = []string{}
+        return
     }
-    return strings.Split(string(out), "\n")
+    // Filter out empty strings from the split result
+    allLangs := strings.Split(string(out), "\n")
+    validLangs = make([]string, 0, len(allLangs))
+    for _, lang := range allLangs {
+        if lang != "" {
+            validLangs = append(validLangs, lang)
+        }
+    }
+    })
+    return validLangs
 }

211-211: Enhance the Windows compatibility comment.

The current comment could be more explicit about Windows compatibility.

-// this command doesn't work on windows
+// Note: The 'locale -a' command is not available on Windows systems

230-238: Consider handling edge cases in language processing.

The current implementation might benefit from additional validation:

  1. Check if the truncated output is empty
  2. Validate the UTF-8 suffix format
-        preferredLang := strings.TrimSpace(truncOut) + ".UTF-8"
+        truncOut = strings.TrimSpace(truncOut)
+        if truncOut == "" {
+            log.Printf("empty locale detected, will use default %s\n", defaultLang)
+            return defaultLang
+        }
+        
+        // Ensure proper format for UTF-8 suffix
+        preferredLang := truncOut
+        if !strings.Contains(preferredLang, ".") {
+            preferredLang += ".UTF-8"
+        }
 
-        validLangs := listValidLangs(ctx)
         if !utilfn.ContainsStr(validLangs, preferredLang) {
-            log.Printf("unable to use desired lang %s, will use default 'en_US.UTF-8'\n", preferredLang)
+            log.Printf("desired language %s is not available, will use %s\n", preferredLang, defaultLang)
             return defaultLang
         }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 761fbb1 and 979126c.

📒 Files selected for processing (1)
  • pkg/wavebase/wavebase.go (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build for TestDriver.ai
🔇 Additional comments (2)
pkg/wavebase/wavebase.go (2)

20-21: LGTM!

The addition of the utilfn package is appropriate for the new language validation functionality.


204-238: Overall implementation aligns with PR objectives.

The changes successfully implement the fallback to en_US.UTF-8 when problems are detected with the specified locale. The implementation includes proper validation against available locales and handles macOS-specific locale detection as intended.

The suggested improvements above would make the implementation more robust, but the current implementation satisfies the core requirements.

Comment on lines +219 to +226
defaultLang := "en_US.UTF-8"
ctx, cancelFn := context.WithTimeout(context.Background(), 2*time.Second)
defer cancelFn()
if runtime.GOOS == "darwin" {
out, err := exec.CommandContext(ctx, "defaults", "read", "-g", "AppleLocale").CombinedOutput()
if err != nil {
log.Printf("error executing 'defaults read -g AppleLocale': %v\n", err)
return ""
log.Printf("error executing 'defaults read -g AppleLocale', will use default 'en_US.UTF-8': %v\n", err)
return defaultLang
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve error handling and validation.

A few suggestions to enhance the robustness of the language determination:

  1. The error message could be more informative by including the actual error output.
  2. The default language should also be validated against available locales.
 func determineLang() string {
     defaultLang := "en_US.UTF-8"
     ctx, cancelFn := context.WithTimeout(context.Background(), 2*time.Second)
     defer cancelFn()
+    validLangs := listValidLangs(ctx)
+    
+    // Validate that default language is available
+    if !utilfn.ContainsStr(validLangs, defaultLang) {
+        log.Printf("warning: default language %s is not available, falling back to C.UTF-8\n", defaultLang)
+        defaultLang = "C.UTF-8"
+    }
+
     if runtime.GOOS == "darwin" {
         out, err := exec.CommandContext(ctx, "defaults", "read", "-g", "AppleLocale").CombinedOutput()
         if err != nil {
-            log.Printf("error executing 'defaults read -g AppleLocale', will use default 'en_US.UTF-8': %v\n", err)
+            log.Printf("error executing 'defaults read -g AppleLocale', will use %s: %v (output: %s)\n",
+                defaultLang, err, string(out))
             return defaultLang
         }

Committable suggestion skipped: line range outside the PR's diff.

@oneirocosm oneirocosm merged commit cee698b into main Jan 7, 2025
8 checks passed
@oneirocosm oneirocosm deleted the sylvie/default-lang branch January 7, 2025 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant