Type checking with converters= on read_excel #849

Open
@oliverbeagley-pgg

Describe the bug
Pyright reports errors when converters=... is passed to read_excel, while mypy accepts the same code.

To Reproduce

  1. Provide a minimal runnable pandas example that is not properly checked by the stubs.
from functools import partial

import pandas as pd

df_1 = pd.read_excel(
    "foo.csv",
    converters={"field_1": partial(pd.to_datetime, errors="coerce")},
)
df_2 = pd.read_excel(
    "foo.csv",
    sheet_name="sheet",
    converters={"field_1": partial(pd.to_datetime, errors="coerce")},
)

reveal_type(df_1)
reveal_type(df_2)
  2. Indicate which type checker you are using (mypy or pyright): Both
  3. Show the error message received from that type checker while checking your example.

mypy:

$ mypy t.py
t.py:15: note: Revealed type is "pandas.core.frame.DataFrame"
t.py:16: note: Revealed type is "pandas.core.frame.DataFrame"
Success: no issues found in 1 source file

pyright:

/home/user/t.py
  /home/user/t.py:7:16 - error: Argument of type "dict[str, partial[Timestamp]]" cannot be assigned to parameter "converters" of type "dict[int | str, (object) -> object] | None" in function "read_excel"
    Type "dict[str, partial[Timestamp]]" cannot be assigned to type "dict[int | str, (object) -> object] | None"
      "dict[str, partial[Timestamp]]" is incompatible with "dict[int | str, (object) -> object]"
        Type parameter "_KT@dict" is invariant, but "str" is not the same as "int | str"
        Type parameter "_VT@dict" is invariant, but "partial[Timestamp]" is not the same as "(object) -> object"
      "dict[str, partial[Timestamp]]" is incompatible with "None" (reportGeneralTypeIssues)
  /home/user/t.py:9:8 - error: No overloads for "read_excel" match the provided arguments (reportGeneralTypeIssues)
  /home/user/t.py:12:16 - error: Argument of type "dict[str, partial[Timestamp]]" cannot be assigned to parameter "converters" of type "dict[int | str, (object) -> object] | None" in function "read_excel"
    Type "dict[str, partial[Timestamp]]" cannot be assigned to type "dict[int | str, (object) -> object] | None"
      "dict[str, partial[Timestamp]]" is incompatible with "dict[int | str, (object) -> object]"
        Type parameter "_KT@dict" is invariant, but "str" is not the same as "int | str"
        Type parameter "_VT@dict" is invariant, but "partial[Timestamp]" is not the same as "(object) -> object"
      "dict[str, partial[Timestamp]]" is incompatible with "None" (reportGeneralTypeIssues)
  /home/user/t.py:15:13 - information: Type of "df_1" is "DataFrame"
  /home/user/t.py:16:13 - information: Type of "df_2" is "Unknown"

Please complete the following information:

  • WSL - ubuntu 22.04
  • python version 3.10.12
  • version of type checker: pyright==1.1.345, mypy==1.8.0
  • version of installed pandas-stubs: pandas_stubs==1.4.4.220919 (the problem also occurs with the latest pandas_stubs release for the most recent pandas version)

Additional context
Not sure whether the problem lies in the converters type or in read_excel itself. Using a lambda instead of partial makes both type checkers reject the call, even though read_csv has no issue with either form, i.e.:

converters={"field": lambda s: pd.to_datetime(s, errors="coerce")}
#vs
converters={"field": partial(pd.to_datetime, errors="coerce")}
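The invariance notes in the pyright output can be illustrated without pandas. This is a minimal sketch only: `Converters`, `parse_number` and `strict_reader` are hypothetical stand-ins (not pandas or pandas-stubs names), with the alias copied from the parameter type quoted in the error message. Whether declaring the variable with that type also resolves the read_excel case is an assumption that is not verified here.

```python
from __future__ import annotations

from collections.abc import Callable
from functools import partial
from typing import Union

# Hypothetical alias copying the parameter type quoted in the pyright error.
Converters = dict[Union[int, str], Callable[[object], object]]


def parse_number(value: object, default: float = 0.0) -> float:
    """Hypothetical converter standing in for pd.to_datetime."""
    try:
        return float(str(value))
    except ValueError:
        return default


def strict_reader(converters: Converters | None = None) -> list[str]:
    """Hypothetical stand-in for a function annotated with a plain dict type."""
    return [str(key) for key in (converters or {})]


# Inferred on its own, this variable has type dict[str, partial[float]].
inferred = {"field_1": partial(parse_number, default=-1.0)}

# dict is invariant in both type parameters, so a checker is expected to
# reject dict[str, ...] where dict[int | str, ...] is declared, even though
# the call works at runtime.
strict_reader(inferred)

# Declaring the variable with the parameter's exact type should avoid the
# invariance complaint in this stand-in example.
declared: Converters = {"field_1": partial(parse_number, default=-1.0)}
strict_reader(declared)
```

Both calls behave identically at runtime; the difference is only in what the type checker infers for the dict before it reaches the function.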
