github-actions[bot] edited this page Jun 15, 2025 · 1 revision

This document was generated from 'src/documentation/print-linter-wiki.ts' on 2025-06-15, 17:37:59 UTC presenting an overview of flowR's linter (v2.2.15, using R v4.5.0). Please do not edit this file/wiki page directly.

This page describes the flowR linter, a tool that uses flowR's dataflow analysis to find common issues in R scripts. Currently, the linter is available through the linter query. For example:

$ docker run -it --rm eagleoutice/flowr # or npm run flowr 
flowR repl using flowR v2.2.15, R v4.5.0 (r-shell engine)
R> :query @linter "read.csv(\"/root/x.txt\")"
Query: linter (2 ms)
   ╰ deprecated-functions:
       ╰ Metadata: {"totalRelevant":1,"totalNotDeprecated":1,"searchTimeMs":0,"processTimeMs":0}
   ╰ file-path-validity:
       ╰ definitely:
           ╰ Path `/root/x.txt` at 1.1-23
       ╰ Metadata: {"totalReads":1,"totalUnknown":0,"totalWritesBeforeAlways":0,"totalValid":0,"searchTimeMs":1,"processTimeMs":0}
   ╰ absolute-file-paths:
       ╰ definitely:
           ╰ Path `/root/x.txt` at 1.1-23
       ╰ Metadata: {"totalConsidered":1,"totalUnknown":0,"searchTimeMs":1,"processTimeMs":0}
   ╰ unused-definitions:
       ╰ Metadata: {"totalConsidered":0,"searchTimeMs":0,"processTimeMs":0}
All queries together required ≈2 ms (1ms accuracy, total 7 ms)

The linter analyzes the code and returns any issues it finds. The same analysis, issued as a JSON query and formatted more nicely, looks as follows:

[ { "type": "linter" } ]

Results (prettified and summarized):

Query: linter (8 ms)
   ╰ deprecated-functions:
       ╰ Metadata: {"totalRelevant":1,"totalNotDeprecated":1,"searchTimeMs":1,"processTimeMs":0}
   ╰ file-path-validity:
       ╰ definitely:
           ╰ Path /root/x.txt at 1.1-23
       ╰ Metadata: {"totalReads":1,"totalUnknown":0,"totalWritesBeforeAlways":0,"totalValid":0,"searchTimeMs":3,"processTimeMs":1}
   ╰ absolute-file-paths:
       ╰ definitely:
           ╰ Path /root/x.txt at 1.1-23
       ╰ Metadata: {"totalConsidered":1,"totalUnknown":0,"searchTimeMs":2,"processTimeMs":0}
   ╰ unused-definitions:
       ╰ Metadata: {"totalConsidered":0,"searchTimeMs":0,"processTimeMs":0}
All queries together required ≈8 ms (1ms accuracy, total 23 ms)

Show Detailed Results as Json

The analysis required 22.8 ms (including parsing and normalization and the query) within the generation environment.

In general, the JSON contains the Ids of the nodes in question as they are present in the normalized AST or the dataflow graph of flowR. Please consult the Interface wiki page for more information on how to get those.

{
  "linter": {
    "results": {
      "deprecated-functions": {
        "results": [],
        ".meta": {
          "totalRelevant": 1,
          "totalNotDeprecated": 1,
          "searchTimeMs": 1,
          "processTimeMs": 0
        }
      },
      "file-path-validity": {
        "results": [
          {
            "range": [
              1,
              1,
              1,
              23
            ],
            "filePath": "/root/x.txt",
            "certainty": "definitely"
          }
        ],
        ".meta": {
          "totalReads": 1,
          "totalUnknown": 0,
          "totalWritesBeforeAlways": 0,
          "totalValid": 0,
          "searchTimeMs": 3,
          "processTimeMs": 1
        }
      },
      "absolute-file-paths": {
        "results": [
          {
            "certainty": "definitely",
            "filePath": "/root/x.txt",
            "range": [
              1,
              1,
              1,
              23
            ]
          }
        ],
        ".meta": {
          "totalConsidered": 1,
          "totalUnknown": 0,
          "searchTimeMs": 2,
          "processTimeMs": 0
        }
      },
      "unused-definitions": {
        "results": [],
        ".meta": {
          "totalConsidered": 0,
          "searchTimeMs": 0,
          "processTimeMs": 0
        }
      }
    },
    ".meta": {
      "timing": 8
    }
  },
  ".meta": {
    "timing": 8
  }
}
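To illustrate how the detailed JSON relates to the summarized view, the following is a minimal sketch that walks a result object of the shape shown above and prints one line per finding. The type names (`Finding`, `RuleResult`, `LinterResponse`) and the helpers `formatRange` and `summarize` are hypothetical and only inferred from the JSON on this page; they are not taken from flowR's type definitions. The rendering of ranges that span several lines is an assumption, as no such range appears here.

```typescript
// Shapes inferred from the JSON above (assumption, not flowR's own types).
type Range = [number, number, number, number]; // [startLine, startColumn, endLine, endColumn]

interface Finding {
  certainty: string;
  range: Range;
  filePath?: string;
  variableName?: string;
}

interface RuleResult {
  results: Finding[];
  ".meta": Record<string, number>;
}

interface LinterResponse {
  linter: { results: Record<string, RuleResult> };
}

// Renders a range the way the summary does: "1.1-23", or "3.1" for a single position.
function formatRange([sl, sc, el, ec]: Range): string {
  if (sl === el && sc === ec) return `${sl}.${sc}`;
  if (sl === el) return `${sl}.${sc}-${ec}`;
  return `${sl}.${sc}-${el}.${ec}`; // assumption: multi-line ranges are not shown on this page
}

// Produces one line per finding, in the order the rules appear in the response.
function summarize(response: LinterResponse): string[] {
  const out: string[] = [];
  for (const rule of Object.keys(response.linter.results)) {
    for (const finding of response.linter.results[rule].results) {
      const subject = finding.filePath ?? finding.variableName ?? "(unknown)";
      out.push(`${rule}: ${finding.certainty} ${subject} at ${formatRange(finding.range)}`);
    }
  }
  return out;
}

// Trimmed copy of the JSON shown above.
const response: LinterResponse = {
  linter: {
    results: {
      "deprecated-functions": { results: [], ".meta": { totalRelevant: 1 } },
      "file-path-validity": {
        results: [{ range: [1, 1, 1, 23], filePath: "/root/x.txt", certainty: "definitely" }],
        ".meta": { totalReads: 1 },
      },
      "absolute-file-paths": {
        results: [{ certainty: "definitely", filePath: "/root/x.txt", range: [1, 1, 1, 23] }],
        ".meta": { totalConsidered: 1 },
      },
    },
  },
};

const summary = summarize(response);
console.log(summary.join("\n"));
```

Rules without findings, such as deprecated-functions in this run, simply contribute no lines.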

Tags

We use tags to categorize linting rules. The following tags are available:

| Tag | Description |
|---|---|
| bug | This rule is used to detect bugs in the code. Everything that affects the semantics of the code, such as incorrect function calls, wrong arguments, etc., is to be considered a bug. Otherwise, it may be a smell or a style issue. (rule: file-path-validity) |
| deprecated | This signals the use of deprecated functions or features. (rule: deprecated-functions) |
| documentation | This rule is used to detect issues that are related to the documentation of the code. For example, missing or misleading comments. (rules: none) |
| experimental | This marks rules that are currently considered experimental; it does not mean that they detect experimental code. (rules: none) |
| performance | This rule is used to detect issues that are related to the performance of the code. For example, inefficient algorithms, unnecessary computations, or unoptimized data structures. (rules: none) |
| robustness | This rule is used to detect issues that are related to the portability of the code. For example, platform-specific code, or code that relies on specific R versions or packages. (rules: file-path-validity and absolute-file-paths) |
| rver3 | The rule is specific to R version 3.x. (rules: none) |
| rver4 | The rule is specific to R version 4.x. (rules: none) |
| readability | This rule is used to detect issues that are related to the readability of the code. For example, complex expressions, long lines, or inconsistent formatting. (rule: unused-definitions) |
| reproducibility | This rule is used to detect issues that are related to the reproducibility of the code. For example, missing or incorrect random seeds, or missing data. (rules: deprecated-functions, file-path-validity, and absolute-file-paths) |
| security | This rule is used to detect security-critical issues. For example, missing input validation. (rules: none) |
| shiny | This rule is used to detect issues that are related to the shiny framework. (rules: none) |
| smell | This rule is used to detect issues that do not directly affect the semantics of the code, but are still considered bad practice. (rules: deprecated-functions, absolute-file-paths, and unused-definitions) |
| style | This rule is used to detect issues that are related to the style of the code. For example, inconsistent naming conventions, or missing or incorrect formatting. (rules: none) |
| usability | This rule is used to detect issues that are related to the (re-)usability of the code. For example, missing or incorrect error handling, or missing or incorrect user interface elements. (rule: deprecated-functions) |
| quickfix | This rule may provide quickfixes to automatically fix the issues it detects. (rules: absolute-file-paths and unused-definitions) |
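As a small illustration of how the tags can be put to use, the sketch below copies the tag assignments listed above into a lookup table and builds a linter query that contains only the rules carrying a given tag. The names `ruleTags`, `rulesWithTag`, and `queryForTag` are hypothetical helpers for this example and not part of flowR; the query shape follows the rule-specific queries shown on this page.

```typescript
// Tag assignments as listed on this page (v2.2.15).
const ruleTags: Record<string, string[]> = {
  "absolute-file-paths": ["smell", "quickfix", "reproducibility", "robustness"],
  "deprecated-functions": ["smell", "deprecated", "reproducibility", "usability"],
  "file-path-validity": ["bug", "reproducibility", "robustness"],
  "unused-definitions": ["smell", "quickfix", "readability"],
};

// Returns the names of all rules that carry the given tag.
function rulesWithTag(tag: string): string[] {
  return Object.keys(ruleTags).filter((rule) => (ruleTags[rule] ?? []).indexOf(tag) !== -1);
}

// Builds a linter query restricted to the rules with the given tag,
// using an empty configuration object for each rule.
function queryForTag(tag: string) {
  return [{ type: "linter", rules: rulesWithTag(tag).map((rule) => ({ name: rule, config: {} })) }];
}

console.log(JSON.stringify(queryForTag("quickfix")));
```

Because the table is copied by hand, it has to be updated whenever rules or tags change.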

Linting Rules

The following linting rules are available:

Absolute Paths (absolute-file-paths)

smell quickfix reproducibility robustness
Checks whether file paths are absolute.

Configuration

Linting rules can be configured by passing a configuration object to the linter query as shown in the example below. The absolute-file-paths rule accepts the following configuration options:

  • absolutePathRegex
    Extend the built-in absolute path recognition with additional regexes
  • additionalPathFunctions
    The set of functions that should additionally be considered as using a file path. Entries in this array use the FunctionInfo format from the dependencies query.
  • include
    Include paths that are built by functions, e.g., file.path()
  • useAsWd
    Which path should be considered the origin for relative paths. This is only relevant for quickfixes. Future versions may take calls such as setwd into account.
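To make the idea of regex-based recognition more concrete, here is a minimal sketch of how absolute paths could be detected and how user-supplied patterns could extend the defaults, in the spirit of absolutePathRegex. The two default patterns and the function `isAbsolutePath` are assumptions made for this example; flowR's built-in recognition may use different patterns.

```typescript
// Illustrative defaults only (assumption); not flowR's actual patterns.
const defaultPatterns: RegExp[] = [
  /^\//,             // Unix-style root, e.g. /root/x.txt
  /^[A-Za-z]:[\\/]/, // Windows drive letter, e.g. C:/Users/me
];

// Returns true if any default or additional pattern matches the path.
function isAbsolutePath(path: string, additional: RegExp[] = []): boolean {
  return defaultPatterns.concat(additional).some((pattern) => pattern.test(path));
}

console.log(isAbsolutePath("/root/x.txt"));            // true
console.log(isAbsolutePath("data/x.csv"));             // false
console.log(isAbsolutePath("~/x.csv", [/^~\//]));      // true, due to the additional pattern
```

Passing extra patterns as a parameter keeps the defaults untouched, which mirrors the "extend" wording of the option.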

Example

read.csv("C:/Users/me/Documents/My R Scripts/Reproducible.csv")

The linting query can be used to run this rule on the above example:

[ { "type": "linter", "rules": [ { "name": "absolute-file-paths", "config": {} } ] } ]

Results (prettified and summarized):

Query: linter (1 ms)
   ╰ absolute-file-paths:
       ╰ definitely:
           ╰ Path C:/Users/me/Documents/My R Scripts/Reproducible.csv at 2.1-63
       ╰ Metadata: {"totalConsidered":1,"totalUnknown":0,"searchTimeMs":1,"processTimeMs":0}
All queries together required ≈1 ms (1ms accuracy, total 3 ms)

Show Detailed Results as Json

The analysis required 3.3 ms (including parsing and normalization and the query) within the generation environment.

In general, the JSON contains the Ids of the nodes in question as they are present in the normalized AST or the dataflow graph of flowR. Please consult the Interface wiki page for more information on how to get those.

{
  "linter": {
    "results": {
      "absolute-file-paths": {
        "results": [
          {
            "certainty": "definitely",
            "filePath": "C:/Users/me/Documents/My R Scripts/Reproducible.csv",
            "range": [
              2,
              1,
              2,
              63
            ]
          }
        ],
        ".meta": {
          "totalConsidered": 1,
          "totalUnknown": 0,
          "searchTimeMs": 1,
          "processTimeMs": 0
        }
      }
    },
    ".meta": {
      "timing": 1
    }
  },
  ".meta": {
    "timing": 1
  }
}

Deprecated Functions (deprecated-functions)

smell deprecated reproducibility usability
Marks deprecated functions that should not be used anymore.

Configuration

Linting rules can be configured by passing a configuration object to the linter query as shown in the example below. The deprecated-functions rule accepts the following configuration options:

Example

first <- data.frame(x = c(1, 2, 3), y = c(1, 2, 3))
second <- data.frame(x = c(1, 3, 2), y = c(1, 3, 2))
dplyr::all_equal(first, second)

The linting query can be used to run this rule on the above example:

[ { "type": "linter", "rules": [ { "name": "deprecated-functions", "config": {} } ] } ]

Results (prettified and summarized):

Query: linter (0 ms)
   ╰ deprecated-functions:
       ╰ Metadata: {"totalRelevant":9,"totalNotDeprecated":9,"searchTimeMs":0,"processTimeMs":0}
All queries together required ≈0 ms (1ms accuracy, total 9 ms)

Show Detailed Results as Json

The analysis required 8.6 ms (including parsing and normalization and the query) within the generation environment.

In general, the JSON contains the Ids of the nodes in question as they are present in the normalized AST or the dataflow graph of flowR. Please consult the Interface wiki page for more information on how to get those.

{
  "linter": {
    "results": {
      "deprecated-functions": {
        "results": [],
        ".meta": {
          "totalRelevant": 9,
          "totalNotDeprecated": 9,
          "searchTimeMs": 0,
          "processTimeMs": 0
        }
      }
    },
    ".meta": {
      "timing": 0
    }
  },
  ".meta": {
    "timing": 0
  }
}

File Path Validity (file-path-validity)

bug reproducibility robustness
Checks whether file paths used in read and write operations are valid and point to existing files.

Configuration

Linting rules can be configured by passing a configuration object to the linter query as shown in the example below. The file-path-validity rule accepts the following configuration options:

  • additionalReadFunctions
    The set of functions that should additionally be considered as reading a file path. Entries in this array use the FunctionInfo format from the dependencies query.
  • additionalWriteFunctions
    The set of functions that should additionally be considered as writing to a file path. Entries in this array use the FunctionInfo format from the dependencies query.
  • includeUnknown
    Whether unknown file paths should be included as linting results.
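As a sketch of how such a configuration object might be attached to a query, the following builds a query for the file-path-validity rule with includeUnknown enabled. The interface `FilePathValidityConfig` and the function `filePathValidityQuery` are hypothetical; the boolean type of includeUnknown is an assumption based on its description, and the FunctionInfo entries are left untyped because their format is not shown on this page.

```typescript
// Hypothetical typing of the options listed above (assumption).
interface FilePathValidityConfig {
  additionalReadFunctions?: unknown[];  // FunctionInfo entries; format not shown on this page
  additionalWriteFunctions?: unknown[]; // FunctionInfo entries; format not shown on this page
  includeUnknown?: boolean;             // assumed to be a boolean flag
}

// Wraps the configuration in the query shape used by the examples on this page.
function filePathValidityQuery(config: FilePathValidityConfig) {
  return [{ type: "linter", rules: [{ name: "file-path-validity", config }] }];
}

const queryJson = JSON.stringify(filePathValidityQuery({ includeUnknown: true }));
console.log(queryJson);
```

Options that are left out are simply absent from the serialized query, so the rule can fall back to its defaults.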

Example

my_data <- read.csv("C:/Users/me/Documents/My R Scripts/Reproducible.csv")

The linting query can be used to run this rule on the above example:

[ { "type": "linter", "rules": [ { "name": "file-path-validity", "config": {} } ] } ]

Results (prettified and summarized):

Query: linter (1 ms)
   ╰ file-path-validity:
       ╰ definitely:
           ╰ Path C:/Users/me/Documents/My R Scripts/Reproducible.csv at 2.12-74
       ╰ Metadata: {"totalReads":1,"totalUnknown":0,"totalWritesBeforeAlways":0,"totalValid":0,"searchTimeMs":0,"processTimeMs":1}
All queries together required ≈1 ms (1ms accuracy, total 3 ms)

Show Detailed Results as Json

The analysis required 2.9 ms (including parsing and normalization and the query) within the generation environment.

In general, the JSON contains the Ids of the nodes in question as they are present in the normalized AST or the dataflow graph of flowR. Please consult the Interface wiki page for more information on how to get those.

{
  "linter": {
    "results": {
      "file-path-validity": {
        "results": [
          {
            "range": [
              2,
              12,
              2,
              74
            ],
            "filePath": "C:/Users/me/Documents/My R Scripts/Reproducible.csv",
            "certainty": "definitely"
          }
        ],
        ".meta": {
          "totalReads": 1,
          "totalUnknown": 0,
          "totalWritesBeforeAlways": 0,
          "totalValid": 0,
          "searchTimeMs": 0,
          "processTimeMs": 1
        }
      }
    },
    ".meta": {
      "timing": 1
    }
  },
  ".meta": {
    "timing": 1
  }
}

Unused Definitions (unused-definitions)

smell quickfix readability
Checks for unused definitions.

Configuration

Linting rules can be configured by passing a configuration object to the linter query as shown in the example below. The unused-definitions rule accepts the following configuration options:

  • includeFunctionDefinitions
    Whether to include (potentially anonymous) function definitions in the search (e.g., should we report uncalled anonymous functions?).

Example

x <- 42
y <- 3
print(x)

The linting query can be used to run this rule on the above example:

[ { "type": "linter", "rules": [ { "name": "unused-definitions", "config": {} } ] } ]

Results (prettified and summarized):

Query: linter (0 ms)
   ╰ unused-definitions:
       ╰ maybe:
           ╰ Definition of y at 3.1 (quick fix available)
       ╰ Metadata: {"totalConsidered":2,"searchTimeMs":0,"processTimeMs":0}
All queries together required ≈1 ms (1ms accuracy, total 3 ms)

Show Detailed Results as Json

The analysis required 3.4 ms (including parsing and normalization and the query) within the generation environment.

In general, the JSON contains the Ids of the nodes in question as they are present in the normalized AST or the dataflow graph of flowR. Please consult the Interface wiki page for more information on how to get those.

{
  "linter": {
    "results": {
      "unused-definitions": {
        "results": [
          {
            "certainty": "maybe",
            "variableName": "y",
            "range": [
              3,
              1,
              3,
              1
            ],
            "quickFix": [
              {
                "type": "remove",
                "range": [
                  3,
                  1,
                  3,
                  6
                ],
                "description": "Remove unused definition of `y`"
              }
            ]
          }
        ],
        ".meta": {
          "totalConsidered": 2,
          "searchTimeMs": 0,
          "processTimeMs": 0
        }
      }
    },
    ".meta": {
      "timing": 0
    }
  },
  ".meta": {
    "timing": 1
  }
}
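The quickFix entry above can be applied to the source text directly. The sketch below shows one way to do this for a "remove" fix, reading the range as [startLine, startColumn, endLine, endColumn] with 1-based, inclusive columns, which matches the six characters of `y <- 3`. The type `RemoveFix` and the function `applyRemove` are hypothetical and not flowR's own quickfix machinery. Since the reported positions start at line 2, the analyzed text is assumed to begin with an empty line.

```typescript
type Range = [number, number, number, number]; // [startLine, startColumn, endLine, endColumn]

interface RemoveFix {
  type: "remove";
  range: Range;
  description: string;
}

// Removes the characters covered by a single-line range (1-based, inclusive columns).
function applyRemove(source: string, fix: RemoveFix): string {
  const [startLine, startCol, endLine, endCol] = fix.range;
  if (startLine !== endLine) {
    throw new Error("this sketch only handles single-line ranges");
  }
  const lines = source.split("\n");
  const line = lines[startLine - 1];
  if (line === undefined) {
    throw new Error("range is outside of the source");
  }
  lines[startLine - 1] = line.slice(0, startCol - 1) + line.slice(endCol);
  return lines.join("\n");
}

// Assumption: the example is analyzed with a leading empty line, so `y <- 3` is on line 3.
const code = "\nx <- 42\ny <- 3\nprint(x)";
const fix: RemoveFix = {
  type: "remove",
  range: [3, 1, 3, 6],
  description: "Remove unused definition of `y`",
};

const patched = applyRemove(code, fix);
console.log(patched);
```

The fix only removes the reported characters, so the now empty line stays in place; a caller may want to drop it in a separate step.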