pytorchcocotools.mask

Functions

iou — Compute intersection over union between masks.
merge — Compute union or intersection of encoded masks.
frPyObjects — Convert (list of) polygon, bbox, or uncompressed RLE to encoded RLE mask.
encode — Encode binary masks using RLE.
decode — Decode binary masks encoded via RLE.
area — Compute area of encoded masks.
toBbox — Get bounding boxes surrounding encoded masks.

iou(dt: IoUObject, gt: IoUObject, pyiscrowd: Bools, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → Tensor

Compute intersection over union between masks.

Note

The standard intersection over union (IoU) of a ground truth (gt) and a detected (dt) object is:

    iou(gt, dt) = area(intersect(gt, dt)) / area(union(gt, dt))

For "crowd" regions, a modified criterion is used. If a gt object is marked as "iscrowd", a dt may match any subregion of the gt. The subregion gt' of the crowd gt that best matches the dt is gt' = intersect(dt, gt). Since by definition union(gt', dt) = dt, the IoU for crowd gt regions becomes:

    iou(gt, dt, iscrowd) = iou(gt', dt) = area(intersect(gt, dt)) / area(dt)
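The two formulas can be illustrated on small binary masks. The following is a minimal pure-Python sketch, not the library's implementation: the helper `mask_iou` and the nested-list mask representation are hypothetical and chosen only to keep the example self-contained.

```python
def mask_iou(dt, gt, iscrowd=False):
    """IoU of two equally sized binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not part of pytorchcocotools.
    """
    inter = 0
    area_dt = 0
    area_gt = 0
    for row_dt, row_gt in zip(dt, gt):
        for d, g in zip(row_dt, row_gt):
            inter += 1 if (d and g) else 0
            area_dt += 1 if d else 0
            area_gt += 1 if g else 0
    # Crowd ground truth: normalize by the detection area only.
    # Otherwise: normalize by the area of the union.
    denom = area_dt if iscrowd else area_dt + area_gt - inter
    return inter / denom if denom else 0.0


dt = [[1, 1, 0],
      [1, 1, 0]]
gt = [[0, 1, 1],
      [0, 1, 1]]

iou_standard = mask_iou(dt, gt)             # 2 / 6
iou_crowd = mask_iou(dt, gt, iscrowd=True)  # 2 / 4
```

With a crowd ground truth the denominator shrinks from the union (6 pixels) to the detection area (4 pixels), so the same overlap yields a higher score.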

Parameters

dt : IoUObject — The detected objects.
gt : IoUObject — The ground truth objects.
pyiscrowd : Bools — A list of booleans indicating whether the ground truth objects are crowds.
device : TorchDevice | None — The desired device of the result.
requires_grad : bool | None — Whether the result requires gradients.

Returns

Tensor — The intersection over union between the detected and ground truth objects.

merge(rleObjs: RleObjs, intersect: bool = False, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → RleObj

Compute union or intersection of encoded masks.

Parameters

rleObjs : RleObjs — The masks to merge.
intersect : bool — Whether to compute the intersection instead of the union.
device : TorchDevice | None — The desired device of the result.
requires_grad : bool | None — Whether the result requires gradients.

Returns

RleObj — The merged mask.
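What union and intersection mean here can be shown on decoded binary masks. This is a rough sketch under the assumption that masks are plain nested lists of 0/1. The helper `merge_masks` is hypothetical and works on pixels, whereas `merge` itself takes RLE-encoded input.

```python
def merge_masks(masks, intersect=False):
    """Pixel-wise union (default) or intersection of equally sized binary masks.

    Hypothetical helper for illustration; not part of pytorchcocotools.
    """
    combine = all if intersect else any
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [1 if combine(m[y][x] for m in masks) else 0 for x in range(width)]
        for y in range(height)
    ]


a = [[1, 1, 0]]
b = [[0, 1, 1]]

union = merge_masks([a, b])
intersection = merge_masks([a, b], intersect=True)
```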

frPyObjects(pyobj: PyObj, h: int, w: int, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → RleObjs

Convert (list of) polygon, bbox, or uncompressed RLE to encoded RLE mask.

Parameters

pyobj : PyObj — The object to convert.
h : int — The height of the mask.
w : int — The width of the mask.
device : TorchDevice | None — The desired device of the result.
requires_grad : bool | None — Whether the result requires gradients.

Returns

RleObjs — The encoded mask.

encode(bimask: Annotated[tv.Mask, 'H W N'] | Annotated[tv.Mask, 'N H W'] | Annotated[tv.Mask, 'H W'], channel_last: bool = True, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → RleObjs

Encode binary masks using RLE.

Note

Expects channel-last input (H, W, N) unless channel_last is False.

Warning

This function differs from the original implementation and always returns a list of encoded masks.

Parameters

bimask : Annotated[tv.Mask, 'H W N'] | Annotated[tv.Mask, 'N H W'] | Annotated[tv.Mask, 'H W'] — The binary mask to encode.
channel_last : bool — Whether the mask is in channel-last order (H, W, N). If False, the mask must be in (N, H, W) order.
device : TorchDevice | None — The desired device of the result.
requires_grad : bool | None — Whether the result requires gradients.

Returns

RleObjs — The encoded mask.

Raises

ValueError
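To make the idea of run-length encoding concrete, here is a minimal sketch of uncompressed RLE for a single mask. It assumes the convention of the original COCO mask API: pixels are traversed column by column, and the run list always starts with a (possibly zero-length) run of background pixels. The helper `rle_encode` and its dictionary layout are illustrative only; how pytorchcocotools stores its counts internally is not stated here.

```python
def rle_encode(mask):
    """Uncompressed, column-major RLE of one binary mask (nested lists of 0/1).

    Hypothetical helper for illustration; not part of pytorchcocotools.
    """
    height, width = len(mask), len(mask[0])
    counts = []
    previous = 0  # runs alternate 0s, 1s, 0s, ... and start with 0s
    run = 0
    for x in range(width):       # column-major traversal
        for y in range(height):
            value = 1 if mask[y][x] else 0
            if value != previous:
                counts.append(run)
                run = 0
                previous = value
            run += 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


encoded = rle_encode([[0, 1],
                      [1, 1]])
encoded_leading_one = rle_encode([[1, 0]])
```

A mask whose first pixel is foreground gets a leading run of length 0, so that even-indexed runs always describe background and odd-indexed runs describe foreground.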

decode(rleObjs: RleObj | RleObjs, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → Annotated[tv.Mask, 'H W N']

Decode binary masks encoded via RLE.

Note

Returns output in channel-last order (H, W, N).

Warning

This function differs from the original implementation and always returns a Tensor batch of decoded masks.

Parameters

rleObjs : RleObj | RleObjs — The encoded masks.
device : TorchDevice | None — The desired device of the result.
requires_grad : bool | None — Whether the result requires gradients.

Returns

Annotated[tv.Mask, 'H W N'] — The decoded mask.
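Decoding reverses the run-length encoding. The sketch below assumes an uncompressed, column-major run list that starts with a run of zeros, as in the original COCO mask API. The helper `rle_decode` and the dictionary layout are hypothetical and return a single nested-list mask rather than a Tensor batch.

```python
def rle_decode(rle):
    """Expand an uncompressed, column-major RLE into a nested-list binary mask.

    Hypothetical helper for illustration; not part of pytorchcocotools.
    """
    height, width = rle["size"]
    flat = []
    value = 0  # the first run always describes background pixels
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    # flat is column-major: pixel (y, x) lives at index x * height + y
    return [[flat[x * height + y] for x in range(width)] for y in range(height)]


decoded = rle_decode({"size": [2, 2], "counts": [1, 3]})
decoded_leading_one = rle_decode({"size": [1, 2], "counts": [0, 1, 1]})
```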

area(rleObjs: RleObj | RleObjs, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → list[int]

Compute area of encoded masks.

Warning

This function differs from the original implementation and always returns a list of areas.

Parameters

rleObjs : RleObj | RleObjs — The encoded masks.
device : TorchDevice | None — The desired device of the result.
requires_grad : bool | None — Whether the result requires gradients.

Returns

list[int] — The areas of the masks.
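With a run list that alternates background and foreground runs and starts with background (the convention of the original COCO mask API, assumed here), the area can be read off without decoding: it is the sum of the odd-indexed runs. The helper `rle_area` and the dictionary layout are hypothetical.

```python
def rle_area(rle):
    """Foreground pixel count of an uncompressed RLE that starts with a zero run.

    Hypothetical helper for illustration; not part of pytorchcocotools.
    """
    # Runs at indices 1, 3, 5, ... are the foreground runs.
    return sum(rle["counts"][1::2])


area_a = rle_area({"size": [2, 2], "counts": [1, 3]})
area_b = rle_area({"size": [1, 2], "counts": [0, 1, 1]})
```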

toBbox(rleObjs: RleObj | RleObjs, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → tv.BoundingBoxes

Get bounding boxes surrounding encoded masks.

Warning

This function differs from the original implementation and always returns a Tensor batch of bounding boxes.

Parameters

rleObjs : RleObj | RleObjs — The encoded masks.
device : TorchDevice | None — The desired device of the bounding boxes.
requires_grad : bool | None — Whether the bounding boxes require gradients.

Returns

tv.BoundingBoxes — The bounding boxes.
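As an illustration of what "bounding box surrounding a mask" means, the sketch below computes the tightest box around the foreground pixels of a decoded mask. It assumes the [x, y, width, height] layout used by the original COCO API. The box format of the returned tv.BoundingBoxes is not stated here, and the helper `mask_to_bbox` is hypothetical.

```python
def mask_to_bbox(mask):
    """Tightest [x, y, width, height] box around the foreground of a binary mask.

    Hypothetical helper for illustration; not part of pytorchcocotools.
    """
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) for v in row if v]
    if not xs:
        # Empty mask: return a degenerate box.
        return [0, 0, 0, 0]
    x0, y0 = min(xs), min(ys)
    return [x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1]


bbox = mask_to_bbox([[0, 0, 0, 0],
                     [0, 1, 1, 0],
                     [0, 1, 0, 0]])
```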