Skip to content

serialization of columns added into the definition of the table #1715

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

matteocacciola
Copy link

@matteocacciola matteocacciola commented Apr 12, 2025


Important

Adds column serialization to DataframeSerializer.serialize() and updates tests and documentation accordingly.

  • Behavior:
    • DataframeSerializer.serialize() in dataframe_serializer.py now includes column serialization in the table definition.
    • is_expression_valid() in semantic_layer_schema.py now returns Optional[str] to handle None values.
  • Testing:
    • Updates test_dataframe_serializer.py to test new column serialization format.
    • Adds new test commands make test_core, make test_extensions, and make test-coverage in CONTRIBUTING.md.

This description was created by Ellipsis for 2d97886. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 2d97886 in 43 seconds

More details
  • Looked at 107 lines of code in 4 files
  • Skipped 1 files when reviewing.
  • Skipped posting 10 drafted comments based on config settings.
1. CONTRIBUTING.md:65
  • Draft comment:
    Updated test commands (make test_all, test_core, test_extensions, test-coverage) are clearly documented. Ensure future updates maintain consistent naming conventions throughout.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
2. pandasai/data_loader/semantic_layer_schema.py:64
  • Draft comment:
    In the is_expression_valid validator, adding an explicit check for None is good. Consider if empty strings should be allowed or handled similarly.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
3. pandasai/helpers/dataframe_serializer.py:31
  • Draft comment:
    Serialization now includes a 'columns' attribute with JSON output. This implementation is clear; ensure that any future changes to Column model are reflected in the serializer tests.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
4. tests/unit_tests/helpers/test_dataframe_serializer.py:9
  • Draft comment:
    Test expectations updated with the new columns JSON attribute. Confirm the nested double quotes are rendered as expected in different environments.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
5. CONTRIBUTING.md:65
  • Draft comment:
    Good update of test commands. Ensure that any command-line updates in documentation are synced with the makefile targets.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
6. pandasai/data_loader/semantic_layer_schema.py:64
  • Draft comment:
    The validator now returns None if expression is None. Consider updating the docstring (or adding a comment) to clarify that a missing expression bypasses SQL parsing.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
7. pandasai/helpers/dataframe_serializer.py:31
  • Draft comment:
    New logic adds a 'columns' attribute with JSON-dumped column definitions. Verify that HTML attribute escaping is robust if column descriptions may include quotes or special characters.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
8. tests/unit_tests/helpers/test_dataframe_serializer.py:8
  • Draft comment:
    The expected output embeds JSON inside an HTML attribute. This literal comparison may be fragile if key ordering changes. Consider parsing the JSON portion and comparing as objects to make tests more robust.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
9. CONTRIBUTING.md:56
  • Draft comment:
    Typo found: On line 56, "We usee codespell..." should be corrected to "We use codespell...".
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
10. pandasai/helpers/dataframe_serializer.py:8
  • Draft comment:
    Minor naming inconsistency: The class is named 'DataframeSerializer' (with a lowercase 'f'), whereas the type hints and commonly used naming conventions (e.g., DataFrame) suggest it might be more consistent to name it 'DataFrameSerializer'. This is a trivial cosmetic issue.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_9zWS9G4iItB46wo8


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Columns not added during the serialization of the DataFrame
1 participant