[BUG] pandera model wrongly detected as pydantic and pytkdocs tries to read non existent attributes #148

Open
camold opened this issue Jul 10, 2023 · 7 comments

Comments

camold commented Jul 10, 2023

First of all, thanks for developing pytkdocs!

I do not use pytkdocs directly, but rather mkdocs and mkdocstrings which call pytkdocs.
Here is an example for reproduction.

import pandera as pa
from pandera.typing import DataFrame
from pandera.typing import Series

class Foo(pa.DataFrameModel):
    """
    Some description
    """
    bar: Series[int]

cause_error = DataFrame[Foo]({"bar": [1,2,3]})

Without any code that instantiates the pandera models, it runs just fine. But as soon as I use the models somewhere to build precomputed objects, pytkdocs fails with the following error:

ERROR    -  mkdocstrings: 'tuple' object has no attribute 'required'
            Traceback (most recent call last):
              File "/usr/local/lib/python3.10/dist-packages/pytkdocs/cli.py", line 205, in main
                output = json.dumps(process_json(line))
              File "/usr/local/lib/python3.10/dist-packages/pytkdocs/cli.py", line 114, in process_json
                return process_config(json.loads(json_input))
              File "/usr/local/lib/python3.10/dist-packages/pytkdocs/cli.py", line 91, in process_config
                obj = loader.get_object_documentation(path, members)
              File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 358, in get_object_documentation
                root_object = self.get_module_documentation(leaf, members)
              File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 426, in get_module_documentation
                root_object.add_child(self.get_class_documentation(child_node))
              File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 544, in get_class_documentation
                self.add_fields(
              File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 612, in add_fields
                root_object.add_child(add_method(child_node))
              File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 712, in get_pydantic_field_documentation
                if prop.required:
            AttributeError: 'tuple' object has no attribute 'required'

It seems these pandera models are detected as pydantic, but they do not have the same attributes.
For proper pydantic classes we have this:

from pydantic import BaseModel
class Test(BaseModel):
    i: int
Test.__fields__["i"].required
# yields True

For pandera models we seem to have this:

import pandera as pa
from pandera.typing import Series
from pandera.typing import DataFrame

class Foo(pa.DataFrameModel):
    bar: Series[int]

Foo.__fields__
# is {}
foo = DataFrame[Foo]({"bar": [1,2,3]})
# after instantiation the entries exist, which is probably why pytkdocs fails
Foo.__fields__["bar"]
# (<pandera.typing.common.AnnotationInfo at 0x7f...>, <pandera.api.pandas.model_components.FieldInfo("bar") object at 0x7f1...>)
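
To make the mismatch concrete, here is a minimal sketch of the kind of guard that would avoid the traceback above. It assumes the only problem is that the values in `__fields__` are not always pydantic field objects. The helper `has_pydantic_field_shape` and the stand-in class `StubModelField` are hypothetical names used for illustration; they are not part of pytkdocs, pydantic, or pandera.

```python
class StubModelField:
    """Stand-in for a pydantic v1 field object, which exposes `required`."""

    def __init__(self, required: bool) -> None:
        self.required = required


def has_pydantic_field_shape(value: object) -> bool:
    """Return True only if `value` exposes the `required` attribute
    that the traceback shows being read."""
    return hasattr(value, "required")


# Stand-in for the (AnnotationInfo, FieldInfo) tuple that pandera stores.
pandera_like_entry = (object(), object())
pydantic_like_entry = StubModelField(required=True)

print(has_pydantic_field_shape(pydantic_like_entry))
print(has_pydantic_field_shape(pandera_like_entry))
```

A check like this only tests the shape of the value; whether pytkdocs should skip such classes or detect pydantic models differently is a separate design question.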

pytkdocs 0.16.1, python 3.10.7, Linux
pydantic 1.10.11 with pydantic_core 2.1.2 (pandera imposes a restriction of pydantic <2)
pandera 0.15.2

Member

pawamoy commented Jul 11, 2023

Hello, thanks for the report.

Do you use the legacy handler by necessity? Out of curiosity, is there something preventing you from using the new handler?

Author

camold commented Jul 16, 2023

Hi @pawamoy, thanks for pointing out that we are using a legacy handler. I guess there was a phase where our docs were not supported yet, so I kept working with mkdocstrings[python-legacy].
After upgrading to griffe, things also break. I get runtime errors (IndexError: list out of range) inside griffe without any information (readable to me, at least) about what is wrong. I guess I will need to cook up a minimal example and post it as an issue for griffe :(

Member

pawamoy commented Jul 17, 2023

It would be great if you could report the issues you get. If your repo is public, I can also use it to investigate (this way you don't need to create a minimal example).

I bet the index errors come from how we parse the Returns section in docstrings. Try indenting continuation lines once more:

Returns:
    A long description
    of the return value.
    Blah blah blah.

->

Returns:
    A long description
        of the return value.
        Blah blah blah.

Unless you're not using Google docstrings?
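
As an illustration of the indentation suggested above, here is a small function whose Google-style Returns section indents its continuation line one extra level. The function `mean` is a hypothetical example, not taken from the reporter's code.

```python
def mean(values: list[float]) -> float:
    """Compute the arithmetic mean of a list of numbers.

    Args:
        values: Non-empty list of numbers.

    Returns:
        The sum of the values divided by
            their count, as a float.
    """
    return sum(values) / len(values)


print(mean([1.0, 2.0, 3.0]))
```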

Author

camold commented Jul 18, 2023

That worked indeed. At least there are no runtime errors anymore.
I did notice a change though from pytkdocs to griffe. Before, if I had submodules, pytkdocs would list them in the documentation. Now it really only lists the main module, and not even any objects that I import from submodules.
So I guess I will have to add pages individually for these submodules or get the automatic reference creation to work (is this still the best approach: https://mkdocstrings.github.io/recipes/ ?)

Member

pawamoy commented Jul 18, 2023

Still the best approach, yes. And you can use show_submodules: true to render every submodule (see https://mkdocstrings.github.io/python/usage/configuration/members/#show_submodules). We changed the default from true to false between the legacy and new handler.

Author

camold commented Jul 18, 2023

Great. Thanks for pointing that out. It worked just fine.
However, submodules that contain a function with the same name as the submodule are not processed (they don't show up in the documentation).
So e.g. if you had package.foo as a submodule with a function foo in it, the entire package.foo submodule would be excluded.
The same happens if your package is itself called foo and has a foo submodule (foo/foo.py) in it: the package is not rendered.

Member

pawamoy commented Jul 18, 2023

Yes, these are known issues and we plan to alleviate them. Note that wildcard imports make the situation worse; I recommend avoiding them.
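
For readers unfamiliar with this kind of name clash, the sketch below shows the plain Python runtime behaviour when a package imports a function that shares its submodule's name. It is an assumption, not something confirmed in this thread, that the same clash is what trips up the documentation tooling. The package name `demo_pkg` and its contents are hypothetical.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create demo_pkg/foo.py defining foo(), re-exported by demo_pkg/__init__.py."""
    pkg = root / "demo_pkg"
    pkg.mkdir()
    (pkg / "foo.py").write_text("def foo():\n    return 'function'\n")
    (pkg / "__init__.py").write_text("from .foo import foo\n")


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demo_pkg = importlib.import_module("demo_pkg")
        # The attribute `foo` on the package is now the function ...
        shadowed = demo_pkg.foo
        # ... while the submodule is only reachable through sys.modules.
        submodule = sys.modules["demo_pkg.foo"]
    finally:
        sys.path.remove(tmp)

print(shadowed())
print(isinstance(submodule, types.ModuleType))
```

After the import, the name `demo_pkg.foo` refers to the function rather than the submodule, so any tool that resolves members by name has two candidates for the same path.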
