Description
- Package Name: azure.ai.formrecognizer
- Package Version: 3.3.3
- Operating System: macOS
- Python Version: 3.12
Describe the bug
The words returned by DocumentLine.get_words() seem to store their polygons as a flat list of floats rather than as a list of Point objects, even though the type annotations say Point.
from typing import reveal_type
from azure.ai.formrecognizer import DocumentLine

line: DocumentLine  # a line taken from an analysis result
for w in line.get_words():
    for p in w.polygon:
        reveal_type(p)  # this reveals as `Point` but it turns out to be a `float` at runtime
This is in contrast to iterating over words from the words field in a DocumentPage, where the polygons are indeed sequences of Point objects.
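As a possible stopgap until this is fixed, here is a minimal sketch of a helper that accepts either shape. It assumes the flat list is laid out as [x0, y0, x1, y1, ...], which I have not confirmed against the service response. The Point class is a local stand-in so the snippet runs without the SDK, and normalize_polygon is a hypothetical name, not part of azure.ai.formrecognizer.

```python
from typing import NamedTuple, Sequence, Union


class Point(NamedTuple):
    """Local stand-in for the SDK's Point type (assumption: it exposes x and y)."""
    x: float
    y: float


def normalize_polygon(polygon: Sequence[Union[float, Point]]) -> list[Point]:
    """Return the polygon as a list of Point, whichever shape it arrived in.

    Assumes a flat polygon is ordered as [x0, y0, x1, y1, ...].
    """
    if not polygon:
        return []
    # Flat case: every element is a bare number, so pair them up.
    if all(isinstance(v, (int, float)) for v in polygon):
        if len(polygon) % 2 != 0:
            raise ValueError("flat polygon must contain an even number of coordinates")
        return [Point(x, y) for x, y in zip(polygon[0::2], polygon[1::2])]
    # Otherwise assume the elements already expose .x and .y.
    return [Point(p.x, p.y) for p in polygon]
```

With this, normalize_polygon(w.polygon) could be used on words from both DocumentLine.get_words() and DocumentPage.words without caring which shape comes back.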
To Reproduce
Steps to reproduce the behavior:
- Use a prebuilt-read model on PDF bytes to run OCR
- Examine the words in a document line in the result
Tangentially related to #39031