Easy-Bbox documentation

bbox - A Python library for manipulating bounding boxes in various coordinate formats.

This library provides the Bbox class and utility functions for working with bounding boxes in several coordinate formats (Pascal VOC, COCO, YOLO, etc.). It supports transformations, geometric operations, and format conversions.

Classes:

Bbox: A class to represent a bounding box.

Functions:

nms: Perform Non-Maximum Suppression on a list of bounding boxes.

bbox_intersection: Returns the intersection of a sequence of Bboxes.

bbox_union: Returns the union of a sequence of Bboxes.

class easy_bbox.Bbox(**data)[source]

Bases: BaseModel

A class to represent a Bbox (inherits from Pydantic BaseModel).

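The Bbox is stored in Pascal VOC format: the top-left and bottom-right corners, as (left, top, right, bottom), with the origin at the top-left of the image (PIL coordinate system). A valid Bbox therefore has left <= right and top <= bottom.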

The bottom and right edges are excluded from the Bbox. This keeps integer-valued Bboxes compatible with array slicing and PIL image cropping.
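To illustrate why excluding the right and bottom edges is convenient, the sketch below crops a small nested-list "image" with ordinary Python slices. It does not use the library: image and crop are hypothetical names introduced for illustration only.

```python
# Illustrative only: a 4x6 "image" stored as rows of pixel values.
image = [[10 * row + col for col in range(6)] for row in range(4)]


def crop(img, left, top, right, bottom):
    """Crop with right/bottom excluded, exactly like Python slice bounds."""
    return [row[left:right] for row in img[top:bottom]]


patch = crop(image, left=1, top=0, right=4, bottom=2)

# The patch has (bottom - top) rows and (right - left) columns,
# with no "+ 1" correction needed.
print(len(patch), len(patch[0]))
```

With this convention the width and height of an integer box are simply right - left and bottom - top.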

left

The left coordinate of the bounding box.

Type:

float

top

The top coordinate of the bounding box.

Type:

float

right

The right coordinate of the bounding box.

Type:

float

bottom

The bottom coordinate of the bounding box.

Type:

float

property area: float

The area of the Bbox

property aspect_ratio: float

The aspect ratio of the Bbox (width over height).

bottom: float

property center: Tuple[float, float]

The center of the Bbox in (x, y) format.

check_bbox_validity()[source]

Checks that the Bbox is valid.

Return type:

Self

clip_to_img(img_w, img_h)[source]

Returns a Bbox clipped to the image dimensions.

Remember that the bottom and right edges are excluded from the Bbox, so Bbox(left=-10, top=-20, right=100, bottom=120).clip_to_img(img_w=32, img_h=64) returns Bbox(left=0, top=0, right=32, bottom=64).

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The clipped Bbox.

Return type:

Bbox
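The clipping rule can be sketched on plain (left, top, right, bottom) tuples. This is a minimal illustration, not the library's implementation; clip_box is a hypothetical helper, and it assumes every coordinate is clamped to [0, img_w] or [0, img_h].

```python
def clip_box(box, img_w, img_h):
    """Clamp each coordinate of (left, top, right, bottom) to the image."""
    left, top, right, bottom = box
    return (
        min(max(left, 0), img_w),
        min(max(top, 0), img_h),
        # right/bottom may equal img_w/img_h because those edges are excluded.
        min(max(right, 0), img_w),
        min(max(bottom, 0), img_h),
    )


clipped = clip_box((-10, -20, 100, 120), img_w=32, img_h=64)
print(clipped)
```

The input values are the same as in the example above, and the result is (0, 0, 32, 64).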

contains(other)[source]

Checks if the other Bbox is contained by this one.

Parameters:

other (Bbox) – The other bounding box.

Returns:

True if this bounding box contains the other one.

Return type:

bool

contains_point(x, y)[source]

Checks if a point is inside the bounding box.

Parameters:
  • x (float) – The x-coordinate of the point.

  • y (float) – The y-coordinate of the point.

Returns:

True if the point is inside the bounding box, False otherwise.

Return type:

bool

distance_to_bbox(other, dist=DistanceMetric.L2)[source]

Calculates the distance between the edges of this Bbox and another Bbox. Distance is zero if the boxes overlap or touch.

Parameters:
  • other (Bbox) – The other bounding box.

  • dist (DistanceMetric) – The distance metric to use. Defaults to DistanceMetric.L2.

Returns:

The distance between the two bounding boxes.

Return type:

float

Raises:

ValueError – If a wrong distance metric is provided.
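As a rough sketch of the edge-to-edge distance under the L2 metric, the gap along each axis can be computed separately and then combined. box_gap_l2 is a hypothetical helper operating on (left, top, right, bottom) tuples; it is not the library's code and covers only the L2 case.

```python
import math


def box_gap_l2(a, b):
    """Euclidean distance between the edges of two boxes (0 if they touch)."""
    # Gap along x: positive only when one box lies entirely left of the other.
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    # Gap along y: positive only when one box lies entirely above the other.
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return math.hypot(dx, dy)


print(box_gap_l2((0, 0, 10, 10), (13, 14, 20, 20)))
```

Here the boxes are separated by 3 horizontally and 4 vertically, so the distance is 5.0.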

distance_to_point(x, y, dist=DistanceMetric.L2)[source]

Calculates the distance from the bounding box to a point.

Parameters:
  • x (float) – The x-coordinate of the point.

  • y (float) – The y-coordinate of the point.

  • dist (DistanceMetric) – The distance metric to use. Defaults to DistanceMetric.L2.

Returns:

The distance from the bounding box to the point.

Return type:

float

Raises:

ValueError – If a wrong distance metric is provided.
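One common way to compute a point-to-box distance is to clamp the point to the box and measure the distance to the clamped point. The sketch below assumes the L2 metric and treats all four edges as part of the box for distance purposes; point_to_box_l2 is a hypothetical helper, and the library may handle the excluded edges differently.

```python
import math


def point_to_box_l2(box, x, y):
    """Euclidean distance from (x, y) to the nearest point of the box."""
    left, top, right, bottom = box
    # Clamping gives the closest point of the rectangle to (x, y).
    nearest_x = min(max(x, left), right)
    nearest_y = min(max(y, top), bottom)
    return math.hypot(x - nearest_x, y - nearest_y)


print(point_to_box_l2((0, 0, 10, 10), 13, 14))
```

A point inside the rectangle clamps to itself, so its distance is zero.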

expand(left=0, top=0, right=0, bottom=0)[source]

Returns a Bbox expanded by the specified padding on each side.

Parameters:
  • left (float, optional) – The amount to expand the left side of the bounding box by. Defaults to 0.

  • top (float, optional) – The amount to expand the top side of the bounding box by. Defaults to 0.

  • right (float, optional) – The amount to expand the right side of the bounding box by. Defaults to 0.

  • bottom (float, optional) – The amount to expand the bottom side of the bounding box by. Defaults to 0.

Returns:

The expanded Bbox instance.

Return type:

Bbox

expand_uniform(padding)[source]

Returns a Bbox expanded by the specified padding on all sides.

Parameters:

padding (float) – The amount to expand the bounding box by.

Returns:

The expanded Bbox instance.

Return type:

Bbox

classmethod from_coco(tlwh)

Initializes the bounding box from top-left and width-height coordinates.

Parameters:

tlwh (Sequence[float]) – A sequence containing the top-left and width-height coordinates of the bounding box in the format (left, top, width, height).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. width < 0 or height < 0).

Example

>>> bbox = Bbox.from_tlwh((10, 20, 20, 30))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 50

classmethod from_coco's alias target is documented below; classmethod from_cwh(cwh)[source]

Initializes the bounding box from center and width-height coordinates.

Parameters:

cwh (Sequence[float]) – A sequence containing the center and width-height coordinates of the bounding box in the format (center_x, center_y, width, height).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. width < 0 or height < 0).

Example

>>> bbox = Bbox.from_cwh((20, 35, 20, 30))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 50
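The arithmetic behind the width-height constructors can be sketched with plain tuples. The helpers cwh_to_tlbr and tlwh_to_tlbr are hypothetical names used for illustration; they mirror the documented formats but are not the library's implementation.

```python
def tlwh_to_tlbr(tlwh):
    """(left, top, width, height) -> (left, top, right, bottom)."""
    left, top, width, height = tlwh
    if width < 0 or height < 0:
        raise ValueError("width and height must be non-negative")
    return (left, top, left + width, top + height)


def cwh_to_tlbr(cwh):
    """(center_x, center_y, width, height) -> (left, top, right, bottom)."""
    cx, cy, width, height = cwh
    if width < 0 or height < 0:
        raise ValueError("width and height must be non-negative")
    # Each edge sits half a width (or height) away from the center.
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)


print(tlwh_to_tlbr((10, 20, 20, 30)))
print(cwh_to_tlbr((20, 35, 20, 30)))
```

Both calls describe the same box as the documented examples: left 10, top 20, right 30, bottom 50.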

classmethod from_list(tlbr)

Initializes the bounding box from top-left and bottom-right coordinates.

Parameters:

tlbr (Sequence[float]) – A sequence containing the top-left and bottom-right coordinates of the bounding box in the format (left, top, right, bottom).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. left > right or top > bottom).

Example

>>> bbox = Bbox.from_tlbr((10, 20, 30, 40))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 40

classmethod from_pascal_voc(tlbr)

Initializes the bounding box from top-left and bottom-right coordinates.

Parameters:

tlbr (Sequence[float]) – A sequence containing the top-left and bottom-right coordinates of the bounding box in the format (left, top, right, bottom).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. left > right or top > bottom).

Example

>>> bbox = Bbox.from_tlbr((10, 20, 30, 40))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 40

classmethod from_tlbr(tlbr)[source]

Initializes the bounding box from top-left and bottom-right coordinates.

Parameters:

tlbr (Sequence[float]) – A sequence containing the top-left and bottom-right coordinates of the bounding box in the format (left, top, right, bottom).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. left > right or top > bottom).

Example

>>> bbox = Bbox.from_tlbr((10, 20, 30, 40))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 40

classmethod from_tlwh(tlwh)[source]

Initializes the bounding box from top-left and width-height coordinates.

Parameters:

tlwh (Sequence[float]) – A sequence containing the top-left and width-height coordinates of the bounding box in the format (left, top, width, height).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. width < 0 or height < 0).

Example

>>> bbox = Bbox.from_tlwh((10, 20, 20, 30))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 50

classmethod from_xyxy(tlbr)

Initializes the bounding box from top-left and bottom-right coordinates.

Parameters:

tlbr (Sequence[float]) – A sequence containing the top-left and bottom-right coordinates of the bounding box in the format (left, top, right, bottom).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. left > right or top > bottom).

Example

>>> bbox = Bbox.from_tlbr((10, 20, 30, 40))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 40

property height

The height of the Bbox.

intersection(other)[source]

Calculates the intersection with another Bbox. If the resulting Bbox is not valid (i.e. left > right or top > bottom), returns None.

Parameters:

other (Bbox) – The other bounding box to calculate the intersection with.

Returns:

The intersection of the two bounding boxes if valid.

Return type:

Optional[Bbox]
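The intersection rule described above can be sketched on (left, top, right, bottom) tuples. intersect is a hypothetical helper, not the library's code; it applies the validity test exactly as stated (left > right or top > bottom), which means boxes that merely touch yield a zero-area box rather than None.

```python
def intersect(a, b):
    """Intersection of two boxes, or None when the result is invalid."""
    left = max(a[0], b[0])
    top = max(a[1], b[1])
    right = min(a[2], b[2])
    bottom = min(a[3], b[3])
    if left > right or top > bottom:
        return None  # the boxes are separated along at least one axis
    return (left, top, right, bottom)


print(intersect((0, 0, 10, 10), (5, 5, 20, 20)))
print(intersect((0, 0, 10, 10), (20, 20, 30, 30)))
```

The first call gives (5, 5, 10, 10); the second gives None because the boxes are disjoint.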

iou(other)[source]

Calculates the Intersection over Union (IoU) with another bounding box.

Parameters:

other (Bbox) – The other Bbox.

Returns:

The IoU between the two bounding boxes.

Return type:

float
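For reference, IoU is the intersection area divided by the union area, where the union area is area(A) + area(B) - area(A ∩ B). The sketch below uses hypothetical helpers box_area and box_iou on plain tuples, and assumes an IoU of 0.0 when both boxes have zero area; the library's handling of that edge case is not documented here.

```python
def box_area(box):
    """Area of a (left, top, right, bottom) box."""
    return max(box[2] - box[0], 0) * max(box[3] - box[1], 0)


def box_iou(a, b):
    """Intersection over Union of two boxes."""
    inter_w = max(min(a[2], b[2]) - max(a[0], b[0]), 0)
    inter_h = max(min(a[3], b[3]) - max(a[1], b[1]), 0)
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    # Assumption: degenerate inputs (zero union area) give 0.0.
    return inter / union if union > 0 else 0.0


print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

In this example the overlap is 50 and the union is 150, so the IoU is one third.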

is_inside(other)[source]

Checks if this Bbox is contained by the other one.

Parameters:

other (Bbox) – The other bounding box.

Returns:

True if this bounding box is contained by the other one.

Return type:

bool

left: float

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

overlaps(other)[source]

Checks if the current bounding box overlaps with another bounding box.

Two bboxes are considered as overlapping if they intersect with a non-zero area.

Parameters:

other (Bbox) – The other bounding box to check for overlap.

Returns:

True if the bounding boxes overlap, False otherwise.

Return type:

bool
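The "non-zero area" condition means that boxes sharing only an edge or a corner do not overlap. A minimal sketch on plain tuples, where boxes_overlap is a hypothetical helper rather than the library's implementation:

```python
def boxes_overlap(a, b):
    """True when the boxes intersect with strictly positive area."""
    # Strict inequalities reject boxes that only touch along an edge.
    overlap_x = min(a[2], b[2]) > max(a[0], b[0])
    overlap_y = min(a[3], b[3]) > max(a[1], b[1])
    return overlap_x and overlap_y


print(boxes_overlap((0, 0, 10, 10), (5, 5, 20, 20)))   # share a 5x5 region
print(boxes_overlap((0, 0, 10, 10), (10, 0, 20, 10)))  # share only an edge
```

The first pair overlaps; the second pair only touches, so the check is False.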

pad_to_aspect_ratio(target_ratio)[source]

Returns a Bbox padded to achieve the target aspect ratio (width over height).

Parameters:

target_ratio (float) – The target aspect ratio.

Returns:

A Bbox instance padded to the correct ratio.

Return type:

Bbox

Raises:

ValueError – If target_ratio is <= 0.
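One plausible way to pad a box to a target aspect ratio is to grow only the dimension that is too small, splitting the padding evenly between the two sides. The sketch below makes that assumption explicitly; pad_to_ratio is a hypothetical helper, and the library may distribute the padding differently.

```python
def pad_to_ratio(box, target_ratio):
    """Pad (left, top, right, bottom) so that width / height == target_ratio."""
    if target_ratio <= 0:
        raise ValueError("target_ratio must be > 0")
    left, top, right, bottom = box
    width, height = right - left, bottom - top
    if width < target_ratio * height:
        # Too narrow: widen symmetrically (assumed), keep the height.
        pad = (target_ratio * height - width) / 2
        return (left - pad, top, right + pad, bottom)
    # Too wide (or already correct): grow the height symmetrically (assumed).
    pad = (width / target_ratio - height) / 2
    return (left, top - pad, right, bottom + pad)


print(pad_to_ratio((0, 0, 10, 20), 1))
print(pad_to_ratio((0, 0, 20, 10), 1))
```

A target ratio of 1 corresponds to padding the box to a square.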

pad_to_square()[source]

Returns a Bbox padded to make it square.

Return type:

Bbox

right: float

scale(scale_factor)[source]

Returns a Bbox scaled by the specified factor. The Bbox is scaled about its center.

Parameters:

scale_factor (float) – The factor to scale the bounding box by. Width and height will be scaled by this factor.

Returns:

The scaled Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the scale is strictly negative.

scale_area(scale_factor)[source]

Returns a scaled Bbox such that new area / old area == scale_factor. The Bbox is scaled about its center.

Parameters:

scale_factor (float) – The factor to scale the area of the bounding box by. Width and height are each scaled by the square root of this factor.

Returns:

The scaled Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the scale is strictly negative.
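The relationship between scale and scale_area can be sketched as follows: scaling the side lengths by s multiplies the area by s squared, so scaling the area by a factor f means scaling the sides by sqrt(f). The helpers scale_box and scale_box_area are hypothetical and operate on plain tuples; they are not the library's code.

```python
import math


def scale_box(box, factor):
    """Scale width and height by `factor`, keeping the center fixed."""
    if factor < 0:
        raise ValueError("factor must be non-negative")
    left, top, right, bottom = box
    cx, cy = (left + right) / 2, (top + bottom) / 2
    half_w = (right - left) * factor / 2
    half_h = (bottom - top) * factor / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def scale_box_area(box, factor):
    """Scale the area by `factor` by scaling each side by sqrt(factor)."""
    if factor < 0:
        raise ValueError("factor must be non-negative")
    return scale_box(box, math.sqrt(factor))


print(scale_box((0, 0, 10, 20), 2))
print(scale_box_area((0, 0, 10, 20), 4))
```

Both calls return the same box, since quadrupling the area doubles each side.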

shift(horizontal_shift=0, vertical_shift=0)[source]

Returns a Bbox shifted by the specified horizontal and vertical amounts.

Parameters:
  • horizontal_shift (float, optional) – The amount to shift the bounding box horizontally. Defaults to 0.

  • vertical_shift (float, optional) – The amount to shift the bounding box vertically. Defaults to 0.

Returns:

The shifted Bbox instance.

Return type:

Bbox

to_albu(img_w, img_h)

Returns the bounding box coordinates in Top-Left, Bottom-Right format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The bounding box coordinates (x_min, y_min, x_max, y_max).

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

All the returned values are NORMALIZED based on the image dimensions.

Return type:

Tuple[float, float, float, float]

to_coco()

Returns the bounding box coordinates in Top-Left, Width-Height format.

Returns:

The bounding box coordinates (x_min, y_min, width, height).

x_min and y_min are coordinates of the top-left corner of the bounding box.

Return type:

Tuple[float, float, float, float]

to_cwh()[source]

Returns the bounding box coordinates in Center, Width-Height format.

Returns:

The bounding box coordinates (x_center, y_center, width, height).

Return type:

Tuple[float, float, float, float]

to_int_tuple(rounding_method=RoundingMethod.ROUND)[source]

Returns the bounding box coordinates in Top-Left, Bottom-Right format, with values rounded to int.

Parameters:

rounding_method (RoundingMethod) – The rounding method to use. Defaults to RoundingMethod.ROUND.

Returns:

The coordinates as integers.

Return type:

Tuple[int, int, int, int]

Raises:

ValueError – If a wrong rounding method is provided.

to_list()[source]

Returns the bounding box coordinates in Top-Left, Bottom-Right format.

Returns:

The bounding box coordinates [x_min, y_min, x_max, y_max].

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

Return type:

List[float]

to_norm_cwh(img_w, img_h)[source]

Returns the bounding box coordinates in Center, Width-Height format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The NORMALIZED bounding box coordinates (x_center, y_center, width, height).

Return type:

Tuple[float, float, float, float]
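Normalization divides x-values by the image width and y-values by the image height, so the results are fractions of the image size. A minimal sketch on plain tuples, where normalized_cwh is a hypothetical helper and not the library's implementation:

```python
def normalized_cwh(box, img_w, img_h):
    """(left, top, right, bottom) in pixels -> normalized (cx, cy, w, h)."""
    left, top, right, bottom = box
    return (
        (left + right) / 2 / img_w,   # x_center as a fraction of the width
        (top + bottom) / 2 / img_h,   # y_center as a fraction of the height
        (right - left) / img_w,       # width as a fraction of the width
        (bottom - top) / img_h,       # height as a fraction of the height
    )


cx, cy, w, h = normalized_cwh((10, 20, 30, 50), img_w=100, img_h=200)
print(cx, cy, w, h)
```

For a box that lies inside the image, all four values fall between 0 and 1.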

to_norm_tlbr(img_w, img_h)[source]

Returns the bounding box coordinates in Top-Left, Bottom-Right format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The bounding box coordinates (x_min, y_min, x_max, y_max).

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

All the returned values are NORMALIZED based on the image dimensions.

Return type:

Tuple[float, float, float, float]

to_norm_tlwh(img_w, img_h)[source]

Returns the bounding box coordinates in Top-Left, Width-Height format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The bounding box coordinates (x_min, y_min, width, height).

x_min and y_min are the coordinates of the top-left corner of the bounding box.

All the returned values are NORMALIZED based on the image dimensions.

Return type:

Tuple[float, float, float, float]

to_pascal_voc()

Returns the bounding box coordinates in Top-Left, Bottom-Right format.

Returns:

The bounding box coordinates (x_min, y_min, x_max, y_max).

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

Return type:

Tuple[float, float, float, float]

to_polygon()[source]

Returns the bounding box corners as points.

Return type:

Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float], Tuple[float, float]]

Returns:

The corner coordinates in (x, y) format, in the order top-left, top-right, bottom-right, bottom-left.

to_tlbr()[source]

Returns the bounding box coordinates in Top-Left, Bottom-Right format.

Returns:

The bounding box coordinates (x_min, y_min, x_max, y_max).

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

Return type:

Tuple[float, float, float, float]

to_tlwh()[source]

Returns the bounding box coordinates in Top-Left, Width-Height format.

Returns:

The bounding box coordinates (x_min, y_min, width, height).

x_min and y_min are coordinates of the top-left corner of the bounding box.

Return type:

Tuple[float, float, float, float]

to_xyxy()

Returns the bounding box coordinates in Top-Left, Bottom-Right format.

Returns:

The bounding box coordinates (x_min, y_min, x_max, y_max).

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

Return type:

Tuple[float, float, float, float]

to_yolo(img_w, img_h)

Returns the bounding box coordinates in Center, Width-Height format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The NORMALIZED bounding box coordinates (x_center, y_center, width, height).

Return type:

Tuple[float, float, float, float]

top: float

union(other)[source]

Calculates the smallest Bbox that encloses both this Bbox and the other one.

Parameters:

other (Bbox) – The other bounding box to calculate the union with.

Returns:

The smallest enclosing Bbox.

Return type:

Bbox

property width: float

The width of the Bbox.

easy_bbox.bbox_intersection(bboxes)[source]

Calculate the intersection of a list of bounding boxes.

Parameters:

bboxes (Sequence[Bbox]) – A sequence of bounding boxes.

Returns:

The bounding box that represents the intersection of all input bounding boxes. If the resulting bounding box is not valid, returns None.

Return type:

Optional[Bbox]

Raises:

ValueError – If the input list of bounding boxes is empty.

easy_bbox.bbox_union(bboxes)[source]

Calculate the union of a list of bounding boxes.

Parameters:

bboxes (Sequence[Bbox]) – A sequence of bounding boxes.

Returns:

The bounding box that represents the union of all input bounding boxes.

Return type:

Bbox

Raises:

ValueError – If the input list of bounding boxes is empty.
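The two sequence functions above generalize the pairwise operations: the intersection takes the innermost edges and the union takes the outermost edges. The sketch below uses hypothetical helpers intersect_all and union_all on plain (left, top, right, bottom) tuples; it is an illustration under those assumptions, not the library's implementation.

```python
def intersect_all(boxes):
    """Intersection of all boxes, or None if the result is invalid."""
    if not boxes:
        raise ValueError("boxes must not be empty")
    left = max(b[0] for b in boxes)
    top = max(b[1] for b in boxes)
    right = min(b[2] for b in boxes)
    bottom = min(b[3] for b in boxes)
    if left > right or top > bottom:
        return None
    return (left, top, right, bottom)


def union_all(boxes):
    """Smallest box enclosing all input boxes."""
    if not boxes:
        raise ValueError("boxes must not be empty")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (6, 2, 8, 30)]
print(intersect_all(boxes))
print(union_all(boxes))
```

Note that the "union" is an enclosing box, so it can cover area that belongs to none of the inputs.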

easy_bbox.nms(bboxes, scores, iou_threshold=0.5)[source]

Perform Non-Maximum Suppression on a list of bounding boxes.

Parameters:
  • bboxes (List[Bbox]) – List of bounding boxes.

  • scores (List[float]) – List of confidence scores for each bounding box.

  • iou_threshold (float, optional) – IoU threshold for suppression. Defaults to 0.5.

Returns:

List of selected bounding boxes and their scores.

Return type:

List[Tuple[Bbox, float]]

Raises:

ValueError – If the lengths of bboxes and scores do not match.
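For readers unfamiliar with the technique, the classic greedy form of Non-Maximum Suppression can be sketched as below. This is a generic illustration on plain tuples, not the library's implementation: greedy_nms and pair_iou are hypothetical names, and the choice to suppress a box only when its IoU with a kept box is strictly greater than the threshold is an assumption.

```python
def pair_iou(a, b):
    """IoU of two (left, top, right, bottom) boxes."""
    inter_w = max(min(a[2], b[2]) - max(a[0], b[0]), 0)
    inter_h = max(min(a[3], b[3]) - max(a[1], b[1]), 0)
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping those that overlap a kept box."""
    if len(boxes) != len(scores):
        raise ValueError("boxes and scores must have the same length")
    # Visit candidates from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Assumption: suppress only when IoU is strictly above the threshold.
        if all(pair_iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return [(boxes[i], scores[i]) for i in kept]


candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
selected = greedy_nms(candidates, [0.9, 0.8, 0.7])
print(selected)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the distant third box is kept.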