Lint lambdas nested inside loops? #2685

Open
IndrajeetPatil opened this issue Nov 10, 2024 · 4 comments

Comments

@IndrajeetPatil
Collaborator

The variant with a lambda (anonymous function) inside the loop is slower, and far less memory-efficient, because a new function object is created on every iteration. In the benchmark below it allocates 11.9MB versus 98.5KB for the variant that defines the function once outside the loop. The performance gains compound as the computations in the function get more costly.

numbers <- 1:1e6
results <- numeric(length(numbers))

# example with `for()` loop, but the same logic applies to a `while()` loop
bench::mark(
  "lambda inside loop" = {
    for (i in seq_along(numbers)) {
      results[i] <- (function(x) x^2)(numbers[i])
    }
  },
  "function outside loop" = {
    quad_func <- function(x) x^2
    for (i in seq_along(numbers)) {
      results[i] <- quad_func(numbers[i])
    }
  },
  time_unit = "ms"
)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 2 × 6
#>   expression              min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>            <dbl>  <dbl>     <dbl> <bch:byt>    <dbl>
#> 1 lambda inside loop     183.   187.      5.39    11.9MB     43.1
#> 2 function outside loop  171.   173.      5.80    98.5KB     36.7

Created on 2024-11-10 with reprex v2.1.1

Session info

sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.2 (2024-10-31)
#>  os       macOS Sequoia 15.1
#>  system   aarch64, darwin20
#>  hostname MacBookAir.fritz.box
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Berlin
#>  date     2024-11-10
#>  pandoc   3.5 @ /usr/local/bin/ (via rmarkdown)
#>  quarto   1.6.33 @ /usr/local/bin/quarto
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  bench         1.1.3      2023-05-04 [2] RSPM (R 4.4.0)
#>  cli           3.6.3      2024-06-21 [1] CRAN (R 4.4.0)
#>  digest        0.6.37     2024-08-19 [1] CRAN (R 4.4.1)
#>  evaluate      1.0.1      2024-10-10 [1] CRAN (R 4.4.1)
#>  fansi         1.0.6      2023-12-08 [1] CRAN (R 4.4.0)
#>  fastmap       1.2.0      2024-05-15 [1] CRAN (R 4.4.0)
#>  fs            1.6.5      2024-10-30 [1] CRAN (R 4.4.1)
#>  glue          1.8.0      2024-09-30 [1] CRAN (R 4.4.1)
#>  htmltools     0.5.8.1    2024-04-04 [1] CRAN (R 4.4.0)
#>  knitr         1.49       2024-11-08 [1] CRAN (R 4.4.2)
#>  lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.4.0)
#>  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.4.0)
#>  pillar        1.9.0      2023-03-22 [1] CRAN (R 4.4.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.4.0)
#>  profmem       0.6.0      2020-12-13 [2] RSPM (R 4.4.0)
#>  reprex        2.1.1      2024-07-06 [2] CRAN (R 4.4.1)
#>  rlang         1.1.4      2024-06-04 [1] CRAN (R 4.4.0)
#>  rmarkdown     2.29       2024-11-04 [1] CRAN (R 4.4.1)
#>  rstudioapi    0.17.1     2024-10-22 [1] CRAN (R 4.4.1)
#>  sessioninfo   1.2.2.9000 2024-11-09 [1] Github (r-lib/sessioninfo@37c81af)
#>  tibble        3.2.1      2023-03-20 [1] CRAN (R 4.4.0)
#>  utf8          1.2.4      2023-10-22 [1] CRAN (R 4.4.0)
#>  vctrs         0.6.5      2023-12-01 [1] CRAN (R 4.4.0)
#>  withr         3.0.2      2024-10-28 [1] CRAN (R 4.4.1)
#>  xfun          0.49       2024-10-31 [1] CRAN (R 4.4.1)
#>  yaml          2.3.10     2024-07-26 [1] CRAN (R 4.4.1)
#> 
#>  [1] /Users/indrajeetpatil/Library/R/arm64/4.4/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
@MichaelChirico
Collaborator

MichaelChirico commented Nov 10, 2024 via email

@AshesITR
Collaborator

Since you can redefine almost anything in R with sufficiently mean code, the only lambda you can be sure about is the identity lambda \(x) x.
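To make the point concrete, here is a small illustration (the function name `squares_with_mean_code` is made up for this example). The lambda looks loop-invariant, but because `^` is rebound inside the loop, its result changes on every iteration:

```r
# Hypothetical example of "mean" code: `^` is redefined inside the loop,
# so the innocent-looking lambda depends on loop state after all.
squares_with_mean_code <- function() {
  out <- numeric(2)
  for (i in 1:2) {
    `^` <- function(a, b) a + b + i      # shadows base `^` in this frame
    out[i] <- (function(x) x^2)(10)      # resolves `^` to the version above
  }
  out
}

squares_with_mean_code()
#> [1] 13 14
```

A check that only looks at the lambda's arguments would miss this; one that also tracks function symbols such as `^` would see the dependency.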

@MichaelChirico
Collaborator

I think very few of our linters survive the meanest possible code 😂

For the proposed linter, I mainly wonder how much effort is needed to avoid the most common types of false positive.

I assume running a proposed linter against r-devel will find the most typical kinds of false positive.

@AshesITR
Collaborator

AshesITR commented Dec 3, 2024

I like the idea of using r-devel as a test suite.

Rough implementation idea:

  1. find all lambdas defined inside loops.
  2. enumerate symbols used inside these lambdas which are not defined within the lambda (via assignments) or provided as arguments.
  3. enumerate symbols defined inside the containing loop.
  4. lint iff (2) and (3) are disjoint.

Maybe codetools::checkUsage can be leveraged to do the heavy lifting.
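The four steps above could be sketched roughly as follows. This is only an illustration on parsed R code, not lintr's implementation (which works on XML parse trees): all helper names (`call_head`, `walk`, `find_outer_loops`, `analyze_loop`, `find_hoistable_lambdas`) are invented here, and it uses `codetools::findGlobals()` rather than `codetools::checkUsage()` for step 2. It is deliberately conservative: only outermost loops are analysed, and every variable on the left-hand side of an assignment counts as loop-defined.

```r
# Name of the function being called, or "" if `e` is not a simple call.
call_head <- function(e) {
  if (is.call(e) && is.symbol(e[[1]])) as.character(e[[1]]) else ""
}

# Apply `visit` to every call nested in `e`, skipping empty arguments
# such as the second index in x[1, ].
walk <- function(e, visit) {
  if (!is.call(e)) return(invisible(NULL))
  visit(e)
  for (i in seq_along(e)) {
    if (!identical(e[[i]], quote(expr = ))) walk(e[[i]], visit)
  }
  invisible(NULL)
}

# Outermost for/while/repeat calls found in `e`.
find_outer_loops <- function(e) {
  if (!is.call(e)) return(list())
  if (call_head(e) %in% c("for", "while", "repeat")) return(list(e))
  out <- list()
  for (i in seq_along(e)) {
    if (!identical(e[[i]], quote(expr = ))) out <- c(out, find_outer_loops(e[[i]]))
  }
  out
}

analyze_loop <- function(loop) {
  # (3) symbols defined inside the loop: the `for` variable plus assignment targets
  defined <- if (call_head(loop) == "for") as.character(loop[[2]]) else character()
  lambdas <- list()
  walk(loop, function(e) {
    fn <- call_head(e)
    if (fn %in% c("<-", "<<-", "=")) defined <<- c(defined, all.vars(e[[2]]))
    # (1) lambdas defined inside the loop
    if (fn == "function") lambdas[[length(lambdas) + 1L]] <<- e
  })
  hoistable <- Filter(function(lam) {
    # (2) symbols used by the lambda that are neither arguments nor locals;
    # evaluating a `function` expression only creates the closure
    free <- codetools::findGlobals(eval(lam, baseenv()), merge = TRUE)
    # (4) flag the lambda iff (2) and (3) are disjoint
    !any(free %in% defined)
  }, lambdas)
  vapply(hoistable, function(lam) paste(deparse(lam), collapse = " "), character(1))
}

find_hoistable_lambdas <- function(code) {
  loops <- list()
  for (top in as.list(parse(text = code, keep.source = FALSE))) {
    loops <- c(loops, find_outer_loops(top))
  }
  as.character(unlist(lapply(loops, analyze_loop)))
}

code <- '
for (i in seq_along(numbers)) {
  results[i] <- (function(x) x^2)(numbers[i])
}
for (i in 1:3) {
  scale <- i * 2
  out[i] <- (function(x) x * scale)(i)
}
'
hits <- find_hoistable_lambdas(code)
```

Here `hits` should contain only the first lambda: its one free symbol, `^`, is not assigned in the loop, whereas the second lambda uses `scale`, which is. Dynamic redefinitions (for example via `assign()`) are not detected by this kind of static check.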
