Cannot perform a series of crossings using reduce: <tibble> must not be duplicated. #992

Closed
idavydov opened this issue Jul 7, 2020 · 5 comments · Fixed by #1230
Labels
bug an unexpected problem or unintended behavior grids #️⃣ expanding, nesting, crossing, ...

Comments

@idavydov

idavydov commented Jul 7, 2020

I'm trying to combine a list of tibbles with crossing() via reduce(), but I get an unexpected error: (<tibble> must not be duplicated).

library(tidyverse)
a <- list(tibble(a=1:2), tibble(b=2:4), tibble(c=5:6))
crossing(crossing(a[[1]], a[[2]]), a[[3]])
#> # A tibble: 12 x 3
#>        a     b     c
#>    <int> <int> <int>
#>  1     1     2     5
#>  2     1     2     6
#>  3     1     3     5
#>  4     1     3     6
#>  5     1     4     5
#>  6     1     4     6
#>  7     2     2     5
#>  8     2     2     6
#>  9     2     3     5
#> 10     2     3     6
#> 11     2     4     5
#> 12     2     4     6
crossing(a[[1]], a[[2]], a[[3]])
#> # A tibble: 12 x 3
#>        a     b     c
#>    <int> <int> <int>
#>  1     1     2     5
#>  2     1     2     6
#>  3     1     3     5
#>  4     1     3     6
#>  5     1     4     5
#>  6     1     4     6
#>  7     2     2     5
#>  8     2     2     6
#>  9     2     3     5
#> 10     2     3     6
#> 11     2     4     5
#> 12     2     4     6
do.call(crossing, a)
#> Error: Column names `<tibble>` and `<tibble>` must not be duplicated.
reduce(a, crossing)
#> Error: Column name `<tibble>` must not be duplicated.

Created on 2020-07-07 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.1 (2019-07-05)
#>  os       CentOS Linux 7 (Core)       
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Zurich               
#>  date     2020-07-07                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date       lib source           
#>  assertthat    0.2.1   2019-03-21 [2] CRAN (R 3.6.1)   
#>  backports     1.1.8   2020-06-17 [2] CRAN (R 3.6.1)   
#>  blob          1.2.1   2020-01-20 [2] CRAN (R 3.6.1)   
#>  broom         0.5.6   2020-04-20 [2] CRAN (R 3.6.1)   
#>  callr         3.4.3   2020-03-28 [2] CRAN (R 3.6.1)   
#>  cellranger    1.1.0   2016-07-27 [2] CRAN (R 3.6.1)   
#>  cli           2.0.2   2020-02-28 [2] CRAN (R 3.6.1)   
#>  colorspace    1.4-2   2019-11-14 [2] R-Forge (R 3.6.1)
#>  crayon        1.3.4   2017-09-16 [2] CRAN (R 3.6.1)   
#>  DBI           1.1.0   2019-12-15 [2] CRAN (R 3.6.1)   
#>  dbplyr        1.4.4   2020-05-27 [2] CRAN (R 3.6.1)   
#>  desc          1.2.0   2018-05-01 [2] CRAN (R 3.6.1)   
#>  devtools      2.3.0   2020-04-10 [2] CRAN (R 3.6.1)   
#>  digest        0.6.25  2020-02-23 [2] CRAN (R 3.6.1)   
#>  dplyr       * 1.0.0   2020-05-29 [2] CRAN (R 3.6.1)   
#>  ellipsis      0.3.1   2020-05-15 [2] CRAN (R 3.6.1)   
#>  evaluate      0.14    2019-05-28 [2] CRAN (R 3.6.1)   
#>  fansi         0.4.1   2020-01-08 [2] CRAN (R 3.6.1)   
#>  forcats     * 0.5.0   2020-03-01 [2] CRAN (R 3.6.1)   
#>  fs            1.4.2   2020-06-30 [2] CRAN (R 3.6.1)   
#>  generics      0.0.2   2018-11-29 [2] CRAN (R 3.6.1)   
#>  ggplot2     * 3.3.2   2020-06-19 [2] CRAN (R 3.6.1)   
#>  glue          1.4.1   2020-05-13 [2] CRAN (R 3.6.1)   
#>  gtable        0.3.0   2019-03-25 [2] CRAN (R 3.6.1)   
#>  haven         2.3.1   2020-06-01 [2] CRAN (R 3.6.1)   
#>  highr         0.8     2019-03-20 [2] CRAN (R 3.6.1)   
#>  hms           0.5.3   2020-01-08 [2] CRAN (R 3.6.1)   
#>  htmltools     0.5.0   2020-06-16 [2] CRAN (R 3.6.1)   
#>  httr          1.4.1   2019-08-05 [2] CRAN (R 3.6.1)   
#>  jsonlite      1.7.0   2020-06-25 [2] CRAN (R 3.6.1)   
#>  knitr         1.29    2020-06-23 [2] CRAN (R 3.6.1)   
#>  lattice       0.20-41 2020-04-02 [2] CRAN (R 3.6.1)   
#>  lifecycle     0.2.0   2020-03-06 [2] CRAN (R 3.6.1)   
#>  lubridate     1.7.9   2020-06-08 [2] CRAN (R 3.6.1)   
#>  magrittr      1.5     2014-11-22 [2] CRAN (R 3.6.1)   
#>  memoise       1.1.0   2017-04-21 [2] CRAN (R 3.6.1)   
#>  modelr        0.1.8   2020-05-19 [2] CRAN (R 3.6.1)   
#>  munsell       0.5.0   2018-06-12 [2] CRAN (R 3.6.1)   
#>  nlme          3.1-148 2020-05-24 [2] CRAN (R 3.6.1)   
#>  pillar        1.4.4   2020-05-05 [2] CRAN (R 3.6.1)   
#>  pkgbuild      1.0.8   2020-05-07 [2] CRAN (R 3.6.1)   
#>  pkgconfig     2.0.3   2019-09-22 [2] CRAN (R 3.6.1)   
#>  pkgload       1.1.0   2020-05-29 [2] CRAN (R 3.6.1)   
#>  prettyunits   1.1.1   2020-01-24 [2] CRAN (R 3.6.1)   
#>  processx      3.4.2   2020-02-09 [2] CRAN (R 3.6.1)   
#>  ps            1.3.3   2020-05-08 [2] CRAN (R 3.6.1)   
#>  purrr       * 0.3.4   2020-04-17 [2] CRAN (R 3.6.1)   
#>  R6            2.4.1   2019-11-12 [2] CRAN (R 3.6.1)   
#>  Rcpp          1.0.4.6 2020-04-09 [2] CRAN (R 3.6.1)   
#>  readr       * 1.3.1   2018-12-21 [2] CRAN (R 3.6.1)   
#>  readxl        1.3.1   2019-03-13 [2] CRAN (R 3.6.1)   
#>  remotes       2.1.1   2020-02-15 [2] CRAN (R 3.6.1)   
#>  reprex        0.3.0   2019-05-16 [2] CRAN (R 3.6.1)   
#>  rlang         0.4.6   2020-05-02 [2] CRAN (R 3.6.1)   
#>  rmarkdown     2.3     2020-06-18 [2] CRAN (R 3.6.1)   
#>  rprojroot     1.3-2   2018-01-03 [2] CRAN (R 3.6.1)   
#>  rvest         0.3.5   2019-11-08 [2] CRAN (R 3.6.1)   
#>  scales        1.1.1   2020-05-11 [2] CRAN (R 3.6.1)   
#>  sessioninfo   1.1.1   2018-11-05 [2] CRAN (R 3.6.1)   
#>  stringi       1.4.6   2020-02-17 [2] CRAN (R 3.6.1)   
#>  stringr     * 1.4.0   2019-02-10 [2] CRAN (R 3.6.1)   
#>  testthat      2.3.2   2020-03-02 [2] CRAN (R 3.6.1)   
#>  tibble      * 3.0.1   2020-04-20 [2] CRAN (R 3.6.1)   
#>  tidyr       * 1.1.0   2020-05-20 [2] CRAN (R 3.6.1)   
#>  tidyselect    1.1.0   2020-05-11 [2] CRAN (R 3.6.1)   
#>  tidyverse   * 1.3.0   2019-11-21 [2] CRAN (R 3.6.1)   
#>  usethis       1.6.1   2020-04-29 [2] CRAN (R 3.6.1)   
#>  utf8          1.1.4   2018-05-24 [2] CRAN (R 3.6.1)   
#>  vctrs         0.3.1   2020-06-05 [2] CRAN (R 3.6.1)   
#>  withr         2.2.0   2020-04-20 [2] CRAN (R 3.6.1)   
#>  xfun          0.15    2020-06-21 [2] CRAN (R 3.6.1)   
#>  xml2          1.3.2   2020-04-23 [2] CRAN (R 3.6.1)   
#>  yaml          2.2.1   2020-02-01 [2] CRAN (R 3.6.1)   
#> 
#> [1] home
#> [2] xxx
#> [3] xxx
@hadley
Member

hadley commented Aug 28, 2020

Somewhat more minimal reprex:

library(tidyr)
x <- list(tibble(a=1:2), tibble(b=2:4))
purrr::reduce(x, crossing)
#> Error: Column name `<tibble>` must not be duplicated.

Created on 2020-08-28 by the reprex package (v0.3.0.9001)

But weirdly the problem goes away if you insert an anonymous function:

library(tidyr)
x <- list(tibble(a=1:2), tibble(b=2:4))
purrr::reduce(x, function(x, y) {crossing(x, y)})
#> # A tibble: 6 x 2
#>       a     b
#>   <int> <int>
#> 1     1     2
#> 2     1     3
#> 3     1     4
#> 4     2     2
#> 5     2     3
#> 6     2     4

Created on 2020-08-28 by the reprex package (v0.3.0.9001)

Any ideas @lionel-?

@lionel-
Member

lionel- commented Aug 29, 2020

Column name <tibble> must not be duplicated.

It looks like crossing() receives its arguments already evaluated, so it can't label them from the argument expressions. Instead, every argument gets the same default label, as controlled by as_label().

I think the arguments are forced because of: https://github.com/tidyverse/purrr/blob/5de5ad293b817d8a6baea32cc4415487d71da955/R/reduce.R#L160

One solution might be to repair the names in crossing().
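
To illustrate the mechanism described above, here is a minimal base R sketch. It does not use tidyr or purrr, and `label_args()` is a hypothetical helper written only for this example; it is not how crossing() is implemented. It labels each argument by deparsing the expression it was supplied as, which is roughly the kind of labelling that breaks once the arguments arrive already evaluated.

```r
# Hypothetical helper (not part of tidyr or purrr): labels each argument
# by deparsing the expression it was supplied as.
label_args <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  vapply(exprs, function(e) deparse(e)[1], character(1))
}

x <- 1:2
y <- 2:4

# Called directly, the arguments are still symbols, so the labels are distinct.
direct <- label_args(x, y)

# do.call() builds the call from values that are already evaluated, so only
# the values themselves are left to deparse.
forced <- do.call(label_args, list(x, y))

# Wrapping the call gives the inner function symbols to label again, which
# parallels the anonymous-function workaround shown earlier in this thread.
wrapped <- do.call(function(x, y) label_args(x, y), list(x, y))

print(direct)
print(forced)
print(wrapped)
```

In this sketch the direct and wrapped calls produce labels taken from the symbols, while the forced call does not. When the labelling function falls back to a generic type label for evaluated values, as the `<tibble>` in the error message suggests, every argument ends up with the same name and the uniqueness check fails.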

@twest820

I think I've hit this as well, though in my case it's only a single call to crossing().

Error: Column name `tibble(...)` must not be duplicated.
Use .name_repair to specify repair.
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
x
+-<error/tibble_error_column_names_must_be_unique>
| Column name `tibble(...)` must not be duplicated.
| Use .name_repair to specify repair.
\-<error/vctrs_error_names_must_be_unique>
  Names must be unique.
Backtrace:
  1. `%>%`(...)
  3. tidyr::crossing(...)
  4. tidyr::expand_grid(!!!cols, .name_repair = .name_repair)
  6. tibble:::as_tibble.list(out, .name_repair = .name_repair)
  7. tibble:::lst_to_tibble(x, .rows, .name_repair, col_lengths(x))
  8. tibble:::set_repaired_names(x, repair_hint = TRUE, .name_repair)
 10. tibble:::repaired_names(...)
 13. vctrs::vec_as_names(...)
 15. vctrs:::validate_unique(names = names, arg = arg)
 16. vctrs:::stop_names_must_be_unique(names, arg)
 17. vctrs:::stop_names(...)
 18. vctrs:::stop_vctrs(class = c(class, "vctrs_error_names"), ...)
Run `rlang::last_trace()` to see the full context.

Some investigation in search of a minimal reprex shows sensitivity to column naming and tibble content, which I find curious.

crossing(tibble(name000001 = 1, name000002 = 93, name000003 = 16, name000004 = 65, name000005 = 24),
         tibble(name000006 = c(6, 7, 8), name000007Efficie = c(7, 8, 9))) # works
crossing(tibble(name000001 = 1, name000002 = 93, name000003 = 16, name000004 = 65, name000005 = 24),
         tibble(name000006 = c(6, 7, 8), name000007Efficien = c(7, 8, 9))) # fails due to one additional character in name of second tibble's second column 
crossing(tibble(name000001 = 1, name000002 = 93, name000003 = 16, name000004 = 65, name000005 = 24),
         tibble(name000006 = 6, name000007Efficien = 7)) # extra character works with different data

@DavisVaughan
Member

@eutwt that actually looks very broken? It should be like:

library(tidyr)

x <- tibble(name000001 = 1, name000002 = 93, name000003 = 16, name000004 = 65, name000005 = 24)
y <- tibble(name000006 = c(6, 7, 8), name000007Efficien = c(7, 8, 9))

crossing(x, y)
#> # A tibble: 3 × 7
#>   name000001 name000002 name000003 name000004 name000005 name000006
#>        <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
#> 1          1         93         16         65         24          6
#> 2          1         93         16         65         24          7
#> 3          1         93         16         65         24          8
#> # … with 1 more variable: name000007Efficien <dbl>

Created on 2021-08-26 by the reprex package (v2.0.0.9000)

@eutwt

eutwt commented Aug 26, 2021

@DavisVaughan Yes, thank you, and sorry about that. I noticed after posting that it was incorrect. I've deleted the erroneous comment.

DavisVaughan added the bug and rectangling labels on Oct 29, 2021
DavisVaughan added the grids label and removed the rectangling label on Nov 9, 2021