
pytorchcocotools.internal.mask_api

package pytorchcocotools.internal.mask_api

Functions

  • bbIou Compute intersection over union between bounding boxes.

  • bbNms Compute non-maximum suppression between bounding boxes.

  • rleArea Compute area of encoded masks.

  • rleDecode Decode binary masks encoded via RLE.

  • rleEncode Encode binary masks using RLE.

  • rleFrBbox Convert bounding boxes to encoded masks.

  • rleFrPoly Convert polygon to encoded mask.

  • rleFrString Convert from compressed string representation of encoded mask.

  • rleIou Compute intersection over union between masks.

  • rleMerge Compute union or intersection of encoded masks.

  • rleNms Compute non-maximum suppression between masks.

  • rleToBbox Get bounding boxes surrounding encoded masks.

  • rleToString Get compressed string representation of encoded mask.

bbIou(dt: tv.BoundingBoxes, gt: tv.BoundingBoxes, iscrowd: list[bool]) → Tensor

Compute intersection over union between bounding boxes.

Parameters

  • dt : tv.BoundingBoxes Detection bounding boxes (shape: [m, 4]).

  • gt : tv.BoundingBoxes Ground truth bounding boxes (shape: [n, 4]).

  • iscrowd : list[bool] Flags indicating, for each ground truth bounding box, whether it is a crowd region.

Returns

  • Tensor IoU values for each detection and ground truth pair (shape: [m, n]).
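As a rough illustration of the crowd handling, the sketch below computes IoU in plain Python rather than on tv.BoundingBoxes. It assumes boxes in [x, y, w, h] format and the reference COCO convention that a crowd ground truth uses the detection area as the denominator. The name `bb_iou_sketch` is hypothetical and not part of this package.

```python
def bb_iou_sketch(dt, gt, iscrowd):
    """Return an m x n nested list of IoU values for [x, y, w, h] boxes."""
    ious = []
    for dx, dy, dw, dh in dt:
        d_area = dw * dh
        row = []
        for (gx, gy, gw, gh), crowd in zip(gt, iscrowd):
            # Width and height of the overlap rectangle (may be <= 0).
            iw = min(dx + dw, gx + gw) - max(dx, gx)
            ih = min(dy + dh, gy + gh) - max(dy, gy)
            if iw <= 0 or ih <= 0:
                row.append(0.0)
                continue
            inter = iw * ih
            # Assumption: for crowd regions the denominator is the detection area.
            union = d_area if crowd else d_area + gw * gh - inter
            row.append(inter / union)
        ious.append(row)
    return ious


ious = bb_iou_sketch([[0, 0, 2, 2]], [[1, 1, 2, 2], [0, 0, 4, 4]], [False, True])
```

With the crowd flag set, a detection lying entirely inside the ground truth region scores 1.0 regardless of how large that region is.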

bbNms(dt: tv.BoundingBoxes, thr: float) → list[bool]

Compute non-maximum suppression between bounding boxes.

Uses torchvision.ops.nms for vectorized NMS.

Parameters

  • dt : tv.BoundingBoxes The detected bounding boxes (shape: [n, 4]).

  • thr : float The IoU threshold for non-maximum suppression.

Returns

  • list[bool] A list of bools indicating which boxes to keep.
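To make the shape of the result concrete, here is a plain-Python greedy NMS sketch. It is not the torchvision.ops.nms path mentioned above. It assumes [x, y, w, h] boxes that are already ordered by priority, as in the reference COCO API, and the helper names are hypothetical.

```python
def iou_xywh(a, b):
    """IoU of two [x, y, w, h] boxes."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)


def bb_nms_sketch(boxes, thr):
    """Return one bool per box; False marks a suppressed box."""
    keep = [True] * len(boxes)
    for i in range(len(boxes)):
        if not keep[i]:
            continue
        # A kept box suppresses every later box that overlaps it too much.
        for j in range(i + 1, len(boxes)):
            if keep[j] and iou_xywh(boxes[i], boxes[j]) > thr:
                keep[j] = False
    return keep


keep = bb_nms_sketch([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 5, 5]], 0.5)
```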

rleArea(rles: RLEs, *, device: TorchDevice | None = None, requires_grad: bool = False) → list[int]

Compute area of encoded masks.

Parameters

  • rles : RLEs The run length encoded masks.

  • device : TorchDevice | None The desired device for the computation.

  • requires_grad : bool Whether the tensors involved require gradients.

Returns

  • list[int] A list of areas of the encoded masks.
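A minimal sketch of the idea, assuming the usual COCO layout in which run counts alternate background, foreground, background, and so on. It works on plain lists of counts rather than this package's RLEs type, and `rle_area_sketch` is a hypothetical name.

```python
def rle_area_sketch(counts):
    """Sum the foreground runs, which sit at the odd indices of the counts."""
    return sum(counts[1::2])


areas = [rle_area_sketch(c) for c in ([6, 1, 40, 4, 5, 4, 5, 4, 21], [1, 2, 3])]
```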

rleDecode(rles: RLEs, *, device: TorchDevice | None = None, requires_grad: bool = False) → Annotated[tv.Mask, 'H W N']

Decode binary masks encoded via RLE.

Uses vectorized repeat_interleave to decode all masks without Python loops over individual RLE segments.

Parameters

  • rles : RLEs The run length encoded masks to decode.

  • device : TorchDevice | None The desired device of the decoded masks.

  • requires_grad : bool Whether the decoded masks require gradients.

Returns

  • Annotated[tv.Mask, 'H W N'] The decoded binary masks in H×W×N format.
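The repeat_interleave idea can be sketched for a single mask as follows. This is an illustration under assumptions, not the package's code: it takes a plain list of counts, assumes column-major pixel order as in the COCO format, and `rle_decode_sketch` is a hypothetical name.

```python
import torch


def rle_decode_sketch(counts, h, w):
    """Decode one column-major RLE into an H x W uint8 mask."""
    runs = torch.tensor(counts, dtype=torch.int64)
    # Run values alternate 0, 1, 0, 1, ... starting with background.
    values = torch.arange(len(counts)) % 2
    # Expand every run to its pixels in one call, without a Python loop.
    flat = torch.repeat_interleave(values, runs)
    # Pixels are stored column by column, so reshape to (W, H) and transpose.
    return flat.reshape(w, h).T.to(torch.uint8)


mask = rle_decode_sketch([1, 2, 3], h=3, w=2)
```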

rleEncode(mask: Annotated[tv.Mask, 'N H W'], *, device: TorchDevice | None = None, requires_grad: bool = False) → RLEs

Encode binary masks using RLE.

Uses vectorized transition detection and per-mask grouping via pre-sorted indices to avoid repeated boolean scans.

Parameters

  • mask : Annotated[tv.Mask, 'N H W'] The binary masks to encode.

  • device : TorchDevice | None The desired device of the encoded masks.

  • requires_grad : bool Whether the encoded masks require gradients.

Returns

  • RLEs Run length encoded masks.
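Transition detection can be illustrated on a single mask. The sketch below is a simplified single-mask version and omits the per-mask grouping described above. It assumes column-major pixel order and returns a plain list of counts, and `rle_encode_sketch` is a hypothetical name.

```python
import torch


def rle_encode_sketch(mask):
    """Encode one H x W 0/1 mask as a list of run counts (background first)."""
    flat = mask.T.reshape(-1).to(torch.int64)  # column-major pixel order
    n = flat.numel()
    # Positions where the value differs from the previous pixel start a new run.
    change = torch.nonzero(flat[1:] != flat[:-1]).flatten() + 1
    bounds = torch.cat([torch.tensor([0]), change, torch.tensor([n])])
    counts = torch.diff(bounds).tolist()
    # Counts start with a background run, so prepend 0 if the mask starts with 1.
    if n > 0 and flat[0] == 1:
        counts = [0] + counts
    return counts


counts = rle_encode_sketch(torch.tensor([[0, 0], [1, 0], [1, 0]]))
```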

rleFrBbox(bb: tv.BoundingBoxes, *, device: TorchDevice | None = None, requires_grad: bool = False) → RLEs

Convert bounding boxes to encoded masks.

Parameters

  • bb : tv.BoundingBoxes The bounding boxes.

  • device : TorchDevice | None The desired device of the encoded masks.

  • requires_grad : bool Whether the encoded masks require gradients.

Returns

  • RLEs The RLE encoded masks.

rleFrPoly(xy: Polygon, *, device: TorchDevice | None = None, requires_grad: bool = False) → RLE

Convert polygon to encoded mask.

Parameters

  • xy : Polygon The polygon vertices.

  • device : TorchDevice | None The desired device of the encoded mask.

  • requires_grad : bool Whether the encoded mask requires gradients.

Returns

  • RLE The RLE encoded mask.

rleFrString(s: bytes, h: int, w: int, *, device: TorchDevice | None = None, requires_grad: bool = False) → RLE

Convert from compressed string representation of encoded mask.

Parameters

  • s : bytes Byte string of run length encoded mask.

  • h : int Height of the encoded mask.

  • w : int Width of the encoded mask.

  • device : TorchDevice | None The desired device of the encoded mask.

  • requires_grad : bool Whether the encoded mask requires gradients.

Returns

  • RLE The RLE encoded mask.

rleIou(dt: RLEs, gt: RLEs, iscrowd: list[bool]) → Tensor

Compute intersection over union between masks.

Uses the two-pointer RLE merge algorithm with pure Python ints to avoid per-iteration tensor allocation overhead.

Parameters

  • dt : RLEs The RLE encoded detection masks.

  • gt : RLEs The RLE encoded ground truth masks.

  • iscrowd : list[bool] The crowd label for each ground truth mask.

Returns

  • Tensor The intersection over union between the masks.
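The two-pointer walk can be sketched for one pair of masks as below. This is an illustration under assumptions rather than the package's implementation: both inputs are non-empty plain lists of counts covering the same number of pixels, the crowd case uses the detection area as the denominator (the reference COCO convention), and `rle_iou_pair` is a hypothetical name.

```python
def rle_iou_pair(dt, gt, crowd=False):
    """IoU of two RLE count lists that cover the same number of pixels."""
    inter = union = 0
    i = j = 0
    ra, rb = dt[0], gt[0]  # pixels left in the current run of each mask
    va = vb = False        # value of the current runs (first run is background)
    while True:
        # Move to the next run whenever the current one is used up.
        while ra == 0 and i + 1 < len(dt):
            i += 1
            ra = dt[i]
            va = not va
        while rb == 0 and j + 1 < len(gt):
            j += 1
            rb = gt[j]
            vb = not vb
        if ra == 0 or rb == 0:
            break
        step = min(ra, rb)  # both runs are constant over this many pixels
        if va or vb:
            union += step
        if va and vb:
            inter += step
        ra -= step
        rb -= step
    if inter == 0:
        return 0.0
    if crowd:
        union = sum(dt[1::2])  # assumption: detection area as denominator
    return inter / union


iou = rle_iou_pair([1, 2, 3], [2, 2, 2])
iou_crowd = rle_iou_pair([1, 2, 3], [2, 2, 2], crowd=True)
```

Each iteration consumes at least one whole run, so the loop runs at most len(dt) + len(gt) times, independent of the number of pixels.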

rleMerge(rles: RLEs, intersect: bool, *, device: TorchDevice | None = None, requires_grad: bool = False) → RLE

Compute union or intersection of encoded masks.

Uses a two-pointer merge with pure Python ints to avoid per-iteration tensor allocation overhead. All input RLEs are merged pairwise.

Parameters

  • rles : RLEs The masks to merge.

  • intersect : bool Whether to compute the intersection.

  • device : TorchDevice | None The desired device of the merged mask.

  • requires_grad : bool Whether the merged mask requires gradients.

Returns

  • RLE The merged mask.
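A possible shape for the pairwise merge, sketched on plain lists of counts. The inputs are assumed to be non-empty and to cover the same number of pixels, and both function names are hypothetical.

```python
from functools import reduce


def rle_merge_pair(a, b, intersect):
    """Merge two RLE count lists into their union or intersection."""
    out = [0]  # out[-1] is the run being built; even indices are background
    i = j = 0
    ra, rb = a[0], b[0]
    va = vb = False
    while True:
        while ra == 0 and i + 1 < len(a):
            i += 1
            ra = a[i]
            va = not va
        while rb == 0 and j + 1 < len(b):
            j += 1
            rb = b[j]
            vb = not vb
        if ra == 0 or rb == 0:
            break
        step = min(ra, rb)
        value = (va and vb) if intersect else (va or vb)
        current = len(out) % 2 == 0  # value of the run at out[-1]
        if value != current:
            out.append(0)  # the merged value changed, so start a new run
        out[-1] += step
        ra -= step
        rb -= step
    return out


def rle_merge_sketch(rles, intersect):
    """Fold the pairwise merge over all inputs, left to right."""
    return reduce(lambda x, y: rle_merge_pair(x, y, intersect), rles)


union = rle_merge_sketch([[1, 2, 3], [2, 2, 2], [0, 1, 5]], intersect=False)
inter = rle_merge_sketch([[1, 2, 3], [2, 2, 2]], intersect=True)
```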

rleNms(dt: RLEs, n: int, thr: float) → list[bool]

Compute non-maximum suppression between masks.

Parameters

  • dt : RLEs The detected masks.

  • n : int The number of detected masks.

  • thr : float The IoU threshold for non-maximum suppression.

Returns

  • list[bool] A list of bools indicating which masks to keep.

rleToBbox(rles: RLEs, *, device: TorchDevice | None = None, requires_grad: bool = False) → tv.BoundingBoxes

Get bounding boxes surrounding encoded masks.

Batched: pads all RLE count vectors to the same length, then computes all bounding boxes in a single set of vectorized tensor operations.

Parameters

  • rles : RLEs The RLE encoded masks.

  • device : TorchDevice | None The desired device of the bounding boxes.

  • requires_grad : bool Whether the bounding boxes require gradients.

Returns

  • tv.BoundingBoxes The bounding boxes in [x, y, w, h] format.
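The geometry behind the box computation can be shown with a scalar, single-mask sketch. It is not the padded, batched tensor version described above. It assumes column-major pixel order and a plain list of counts, and `rle_to_bbox_sketch` is a hypothetical name.

```python
def rle_to_bbox_sketch(counts, h):
    """Return [x, y, w, h] for one column-major RLE, or zeros if it is empty."""
    cols, rows = [], []
    pos = 0
    for k, run in enumerate(counts):
        start, pos = pos, pos + run
        if k % 2 == 0 or run == 0:
            continue  # skip background runs and empty foreground runs
        x0, y0 = divmod(start, h)    # column and row of the first pixel
        x1, y1 = divmod(pos - 1, h)  # column and row of the last pixel
        if x1 > x0:
            # A run that wraps into another column touches the bottom of one
            # column and the top of the next, so it spans the full height.
            y0, y1 = 0, h - 1
        cols += [x0, x1]
        rows += [y0, y1]
    if not cols:
        return [0, 0, 0, 0]
    return [min(cols), min(rows), max(cols) - min(cols) + 1, max(rows) - min(rows) + 1]


box = rle_to_bbox_sketch([6, 1, 40, 4, 5, 4, 5, 4, 21], h=9)
```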

rleToString(rle: RLE, *, device: TorchDevice | None = None, requires_grad: bool = False) → bytes

Get compressed string representation of encoded mask.

Parameters

  • rle : RLE The run length encoded mask.

  • device : TorchDevice | None The desired device for the computation.

  • requires_grad : bool Whether the tensors involved require gradients.

Note

Similar to LEB128, but using 6 bits per character and ASCII characters 48-111.

Returns

  • bytes Byte string of run length encoded mask.
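The scheme in the note can be sketched in plain Python. The sketch follows the reference COCO C implementation as an assumption about the format: each count from the fourth onward is stored as a difference from the count two positions earlier, and each character carries 5 payload bits plus a continuation bit. The function names are hypothetical, and the decoder corresponds to what rleFrString reverses.

```python
def counts_to_string(counts):
    """Compress RLE counts into the 6-bits-per-character byte string."""
    out = bytearray()
    for i, c in enumerate(counts):
        x = c - counts[i - 2] if i > 2 else c  # delta against two runs back
        more = True
        while more:
            ch = x & 0x1F  # low 5 bits
            x >>= 5        # arithmetic shift keeps the sign
            # Stop once the remaining bits are pure sign extension.
            more = (x != -1) if (ch & 0x10) else (x != 0)
            if more:
                ch |= 0x20  # continuation bit
            out.append(ch + 48)
    return bytes(out)


def string_to_counts(s):
    """Invert counts_to_string."""
    counts = []
    p = 0
    while p < len(s):
        x, k, more = 0, 0, True
        while more:
            c = s[p] - 48
            x |= (c & 0x1F) << (5 * k)
            more = bool(c & 0x20)
            p += 1
            k += 1
            if not more and (c & 0x10):
                x |= -1 << (5 * k)  # sign-extend a negative delta
        if len(counts) > 2:
            x += counts[-2]
        counts.append(x)
    return counts


encoded = counts_to_string([6, 1, 40, 4, 5, 4, 5, 4, 21])
decoded = string_to_counts(encoded)
```

The largest possible character value is 0x3F + 48 = 111, which matches the upper end of the range given in the note.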