pytorchcocotools.internal.mask_api
Functions
-
bbIou — Compute intersection over union between bounding boxes.
-
bbNms — Compute non-maximum suppression between bounding boxes.
-
rleArea — Compute area of encoded masks.
-
rleDecode — Decode binary masks encoded via RLE.
-
rleEncode — Encode binary masks using RLE.
-
rleFrBbox — Convert bounding boxes to encoded masks.
-
rleFrPoly — Convert polygon to encoded mask.
-
rleFrString — Convert from compressed string representation of encoded mask.
-
rleIou — Compute intersection over union between masks.
-
rleMerge — Compute union or intersection of encoded masks.
-
rleNms — Compute non-maximum suppression between encoded masks.
-
rleToBbox — Get bounding boxes surrounding encoded masks.
-
rleToString — Get compressed string representation of encoded mask.
bbIou(dt: tv.BoundingBoxes, gt: tv.BoundingBoxes, iscrowd: list[bool]) → Tensor
Compute intersection over union between bounding boxes.
Parameters
-
dt : tv.BoundingBoxes — Detection bounding boxes (shape: [m, 4]).
-
gt : tv.BoundingBoxes — Ground truth bounding boxes (shape: [n, 4]).
-
iscrowd : list[bool] — List indicating if a ground truth bounding box is a crowd.
Returns
-
Tensor — IoU values for each detection and ground truth pair (shape: [m, n]).
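The crowd handling is easiest to see in a small example. The following is a minimal pure-Python sketch, not the package's implementation: it assumes boxes in COCO [x, y, w, h] format and follows the convention of the reference COCO mask API, where a crowd ground truth uses the detection area alone as the denominator. The name `bb_iou_xywh` is hypothetical, and plain lists stand in for `tv.BoundingBoxes` and `Tensor`.

```python
def bb_iou_xywh(dt, gt, iscrowd):
    """Return an m x n nested list of IoU values for [x, y, w, h] boxes."""
    out = [[0.0] * len(gt) for _ in dt]
    for d, (dx, dy, dw, dh) in enumerate(dt):
        d_area = dw * dh
        for g, (gx, gy, gw, gh) in enumerate(gt):
            # Width and height of the overlap rectangle; non-positive means no overlap.
            w = min(dx + dw, gx + gw) - max(dx, gx)
            h = min(dy + dh, gy + gh) - max(dy, gy)
            if w <= 0 or h <= 0:
                continue
            inter = w * h
            # Assumed crowd convention: divide by the detection area only.
            union = d_area if iscrowd[g] else d_area + gw * gh - inter
            out[d][g] = inter / union
    return out


dt = [[0, 0, 2, 2]]
gt = [[1, 1, 2, 2], [0, 0, 4, 4]]
iou = bb_iou_xywh(dt, gt, iscrowd=[False, True])
```

Here the first pair overlaps in a single unit square, and the second ground truth is a crowd region that fully contains the detection.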
bbNms(dt: tv.BoundingBoxes, thr: float) → list[bool]
Compute non-maximum suppression between bounding boxes.
Parameters
-
dt : tv.BoundingBoxes — The detected bounding boxes (shape: [n, 4]).
-
thr : float — The IoU threshold for non-maximum suppression.
Returns
-
list[bool] — Flags indicating which bounding boxes to keep.
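As an illustration of greedy non-maximum suppression, here is a minimal pure-Python sketch. It assumes [x, y, w, h] boxes that are already sorted by descending score, and that a box is suppressed when its IoU with an earlier kept box exceeds `thr`, as in the reference COCO mask API. The names `iou_xywh` and `greedy_nms` are hypothetical and are not part of this package.

```python
def iou_xywh(a, b):
    """IoU of two [x, y, w, h] boxes."""
    w = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    h = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)


def greedy_nms(boxes, thr):
    """Return one keep flag per box; boxes are assumed sorted by score."""
    keep = [True] * len(boxes)
    for i in range(len(boxes)):
        if not keep[i]:
            continue
        # Every later box that overlaps a kept box too strongly is dropped.
        for j in range(i + 1, len(boxes)):
            if keep[j] and iou_xywh(boxes[i], boxes[j]) > thr:
                keep[j] = False
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 5, 5]]
keep = greedy_nms(boxes, thr=0.5)
```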
rleArea(rles: RLEs, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → list[int]
Compute area of encoded masks.
Parameters
-
rles : RLEs — The run length encoded masks.
-
device : TorchDevice | None — The desired device of the computed areas.
-
requires_grad : bool | None — Whether the computed areas require gradients.
Returns
-
list[int] — A list of areas of the encoded masks.
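A minimal sketch of the area computation, assuming the COCO run-length convention in which counts alternate between runs of 0s and 1s and always start with a run of 0s. Under that assumption the area is the sum of the counts at odd positions. The helper name `rle_area` and the plain-list input are hypothetical stand-ins for the package's `RLEs` type.

```python
def rle_area(counts):
    """Sum the foreground runs, which sit at odd indices."""
    return sum(counts[1::2])


counts = [2, 3, 1, 4]  # 2 zeros, 3 ones, 1 zero, 4 ones
area = rle_area(counts)
```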
rleDecode(rles: RLEs, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → Annotated[tv.Mask, H W N]
Decode binary masks encoded via RLE.
Parameters
-
rles : RLEs — The run length encoded masks to decode.
-
device : TorchDevice | None — The desired device of the decoded masks.
-
requires_grad : bool | None — Whether the decoded masks require gradients.
Returns
-
Annotated[tv.Mask, H W N] — The decoded binary masks (shape: [H, W, N]).
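To make the decoding step concrete, here is a minimal pure-Python sketch for a single mask. It assumes the COCO convention: counts alternate between runs of 0s and 1s starting with 0s, and pixels are stored in column-major order. The function name `rle_decode` is hypothetical, and nested lists stand in for `tv.Mask`.

```python
def rle_decode(counts, h, w):
    """Decode one run-length encoded mask into h rows of w pixels."""
    flat = []
    value = 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value  # runs alternate between background and foreground
    # Column-major layout: pixel (row r, column c) is stored at index c * h + r.
    return [[flat[c * h + r] for c in range(w)] for r in range(h)]


mask = rle_decode([1, 2, 3], h=2, w=3)
```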
rleEncode(mask: tv.Mask, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → RLEs
Encode binary masks using RLE.
Parameters
-
mask : tv.Mask — The binary masks to encode.
-
device : TorchDevice | None — The desired device of the encoded masks.
-
requires_grad : bool | None — Whether the encoded masks require gradients.
Returns
-
RLEs — The RLE encoded masks.
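A minimal pure-Python sketch of run-length encoding for one binary mask, under the assumed COCO convention: pixels are visited in column-major order and the first count always refers to 0s, so a mask that starts with a foreground pixel begins with a count of 0. The name `rle_encode` is hypothetical, and a nested list stands in for `tv.Mask`.

```python
def rle_encode(mask):
    """Encode a binary mask (list of rows) as alternating run lengths."""
    h, w = len(mask), len(mask[0])
    counts = []
    prev, run = 0, 0
    for c in range(w):          # column-major traversal
        for r in range(h):
            v = mask[r][c]
            if v != prev:
                counts.append(run)  # close the current run
                run, prev = 0, v
            run += 1
    counts.append(run)
    return counts


counts = rle_encode([[0, 1, 0], [1, 0, 0]])
```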
rleFrBbox(bb: tv.BoundingBoxes, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → RLEs
Convert bounding boxes to encoded masks.
Parameters
-
bb : tv.BoundingBoxes — The bounding boxes.
-
device : TorchDevice | None — The desired device of the encoded masks.
-
requires_grad : bool | None — Whether the encoded masks require gradients.
Returns
-
RLEs — The RLE encoded masks.
rleFrPoly(xy: Polygon, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → RLE
Convert polygon to encoded mask.
Parameters
-
xy : Polygon — The polygon vertices.
-
device : TorchDevice | None — The desired device of the encoded mask.
-
requires_grad : bool | None — Whether the encoded mask requires gradients.
Returns
-
RLE — The RLE encoded mask.
rleFrString(s: bytes, h: int, w: int, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → RLE
Convert from compressed string representation of encoded mask.
Parameters
-
s : bytes — Byte string of run length encoded mask.
-
h : int — Height of the encoded mask.
-
w : int — Width of the encoded mask.
-
device : TorchDevice | None — The desired device of the encoded mask.
-
requires_grad : bool | None — Whether the encoded mask requires gradients.
Returns
-
RLE — The RLE encoded mask.
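The compressed string format can be sketched in pure Python. This is a port of the scheme used by the reference COCO mask API, given here under the assumption that this package uses the same format: each character carries 5 data bits plus a continuation bit, offset by 48, and every count after the third is stored as a difference from the count two positions earlier. The name `rle_from_string` is hypothetical, and the sketch returns only the counts rather than an `RLE` object.

```python
def rle_from_string(s):
    """Decode a compressed byte string into a list of run lengths."""
    counts = []
    p = 0
    while p < len(s):
        x, k, more = 0, 0, True
        while more:
            c = s[p] - 48
            x |= (c & 0x1F) << (5 * k)   # low 5 bits carry data
            more = bool(c & 0x20)        # bit 5 flags a continuation
            p += 1
            k += 1
            if not more and (c & 0x10):
                x |= -1 << (5 * k)       # sign-extend negative values
        if len(counts) > 2:
            x += counts[-2]              # undo the delta encoding
        counts.append(x)
    return counts


decoded = rle_from_string(b"53X1Ol15")
```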
rleIou(dt: RLEs, gt: RLEs, iscrowd: list[bool]) → Tensor
Compute intersection over union between masks.
Parameters
-
dt : RLEs — The RLE encoded detection masks.
-
gt : RLEs — The RLE encoded ground truth masks.
-
iscrowd : list[bool] — The crowd label for each ground truth mask.
Returns
-
Tensor — The intersection over union between the masks.
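The definition can be illustrated with a minimal sketch for one detection and one ground truth mask. It expands the runs into pixels for clarity, which an efficient implementation would avoid, and it assumes the reference COCO convention that a crowd ground truth uses the detection area as the denominator. The names `expand` and `rle_iou` are hypothetical, and both masks are assumed to cover the same image size.

```python
def expand(counts):
    """Expand alternating run lengths (starting with 0s) into a flat pixel list."""
    flat, value = [], 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    return flat


def rle_iou(dt_counts, gt_counts, crowd):
    """IoU of two masks of equal size given as run lengths."""
    d, g = expand(dt_counts), expand(gt_counts)
    inter = sum(a & b for a, b in zip(d, g))
    # Assumed crowd convention: divide by the detection area only.
    union = sum(d) if crowd else sum(a | b for a, b in zip(d, g))
    return inter / union if union else 0.0


dt_counts = [0, 4, 2]  # pixels: 1 1 1 1 0 0
gt_counts = [2, 4]     # pixels: 0 0 1 1 1 1
iou_plain = rle_iou(dt_counts, gt_counts, crowd=False)
iou_crowd = rle_iou(dt_counts, gt_counts, crowd=True)
```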
rleMerge(rles: RLEs, intersect: bool, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → RLE
Compute union or intersection of encoded masks.
Parameters
-
rles : RLEs — The masks to merge.
-
intersect : bool — Whether to compute the intersection.
-
device : TorchDevice | None — The desired device of the merged mask.
-
requires_grad : bool | None — Whether the merged mask requires gradients.
Returns
-
RLE — The merged mask.
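The effect of the `intersect` flag can be shown with a minimal sketch that works on plain run-length lists. It expands each mask to pixels, combines them with a per-pixel OR (union) or AND (intersection), and re-encodes the result. This is for illustration only; it assumes COCO-style counts starting with a run of 0s, and the names `expand`, `compress`, and `rle_merge` are hypothetical.

```python
def expand(counts):
    """Expand alternating run lengths (starting with 0s) into a flat pixel list."""
    flat, value = [], 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    return flat


def compress(flat):
    """Re-encode a flat pixel list as alternating run lengths."""
    counts, prev, run = [], 0, 0
    for v in flat:
        if v != prev:
            counts.append(run)
            run, prev = 0, v
        run += 1
    counts.append(run)
    return counts


def rle_merge(all_counts, intersect):
    """Merge masks of equal size by per-pixel AND (intersect) or OR (union)."""
    flats = [expand(c) for c in all_counts]
    combine = all if intersect else any
    return compress([int(combine(px)) for px in zip(*flats)])


a = [0, 4, 2]  # pixels: 1 1 1 1 0 0
b = [2, 4]     # pixels: 0 0 1 1 1 1
union = rle_merge([a, b], intersect=False)
intersection = rle_merge([a, b], intersect=True)
```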
rleNms(dt: RLEs, n: int, thr: float) → list[bool]
Compute non-maximum suppression between encoded masks.
Parameters
-
dt : RLEs — The detected masks.
-
n : int — The number of detected masks.
-
thr : float — The IoU threshold for non-maximum suppression.
Returns
-
list[bool] — Flags indicating which masks to keep.
rleToBbox(rles: RLEs, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → tv.BoundingBoxes
Get bounding boxes surrounding encoded masks.
Parameters
-
rles : RLEs — The RLE encoded masks.
-
device : TorchDevice | None — The desired device of the bounding boxes.
-
requires_grad : bool | None — Whether the bounding boxes require gradients.
Returns
-
tv.BoundingBoxes — The bounding boxes in the format [x, y, w, h].
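A minimal sketch of how a bounding box can be recovered from a single run-length encoded mask. It assumes column-major pixel order and counts that start with a run of 0s, and it visits every foreground pixel for clarity, where an efficient implementation would only inspect run boundaries. The name `rle_to_bbox` is hypothetical, and an empty mask is assumed to map to an all-zero box.

```python
def rle_to_bbox(counts, h):
    """Return [x, y, w, h] of the tightest box around the foreground pixels."""
    xs, ys = [], []
    pos, value = 0, 0
    for run in counts:
        if value:
            for p in range(pos, pos + run):
                xs.append(p // h)  # column index
                ys.append(p % h)   # row index
        pos += run
        value = 1 - value
    if not xs:
        return [0, 0, 0, 0]
    x0, y0 = min(xs), min(ys)
    return [x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1]


bbox = rle_to_bbox([1, 2, 3], h=2)
```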
rleToString(rle: RLE, *, device: TorchDevice | None = None, requires_grad: bool | None = None) → bytes
Get compressed string representation of encoded mask.
Parameters
-
rle : RLE — The run length encoded mask.
-
device : TorchDevice | None — The desired device of the tensors used during encoding.
-
requires_grad : bool | None — Whether the tensors used during encoding require gradients.
Note
The encoding is similar to LEB128 but uses 6 bits per character and ASCII characters 48-111.
Returns
-
bytes — Byte string of run length encoded mask.
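The encoding described in the note can be sketched in pure Python. This is a port of the scheme in the reference COCO mask API, assuming this package produces the same format: every count after the third is first replaced by its difference from the count two positions earlier, and each value is then emitted 5 bits at a time with a continuation bit, offset by 48. The name `rle_to_string` is hypothetical, and a plain list of counts stands in for the `RLE` type.

```python
def rle_to_string(counts):
    """Compress a list of run lengths into a byte string."""
    out = bytearray()
    for i, x in enumerate(counts):
        if i > 2:
            x -= counts[i - 2]  # delta against the previous run of the same value
        more = True
        while more:
            c = x & 0x1F        # take the low 5 bits
            x >>= 5             # arithmetic shift keeps the sign
            # Stop once the remaining bits are pure sign extension.
            more = (x != -1) if (c & 0x10) else (x != 0)
            if more:
                c |= 0x20       # set the continuation bit
            out.append(c + 48)
    return bytes(out)


encoded = rle_to_string([5, 3, 40, 2, 100, 7])
```

Because each character holds 6 bits and is offset by 48, every byte of the output falls in the range 48-111.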