Easy-Bbox documentation

bbox - A Python library for manipulating bounding boxes in various coordinate formats.

This library provides the Bbox class and utility functions for manipulating bounding boxes in various coordinate formats (Pascal VOC, COCO, YOLO, etc.). It supports transformations, geometric operations, and conversions between formats.

Classes:

Bbox: A class to represent a bounding box.

Functions:

nms: Perform Non-Maximum Suppression on a list of bounding boxes.

class easy_bbox.Bbox(**data)[source]

Bases: BaseModel

A class to represent a Bbox (inherits from Pydantic BaseModel).

The Bbox is stored in Pascal VOC format: top-left and bottom-right corners, with the origin at the top-left of the image (PIL coordinate system), so top < bottom.

The bottom and right edges are considered excluded from the Bbox for compatibility with array slicing and PIL image cropping features (in case of Int Bboxes).
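To illustrate why excluding the right and bottom edges lines up with array slicing, here is a minimal sketch in plain Python. It does not use the library: `crop_rows` is a hypothetical helper, and the box is a plain (left, top, right, bottom) tuple of integers.

```python
def crop_rows(image, box):
    """Crop a row-major image (a list of rows) with a half-open integer box."""
    left, top, right, bottom = box
    # Slices exclude their end index, just like the right/bottom edges of the box.
    return [row[left:right] for row in image[top:bottom]]


# A 4x5 image whose pixel value encodes its (row, column) position.
image = [[10 * r + c for c in range(5)] for r in range(4)]
patch = crop_rows(image, (1, 1, 4, 3))

# The patch is (right - left) wide and (bottom - top) high.
print(len(patch[0]), len(patch))  # 3 2
print(patch)  # [[11, 12, 13], [21, 22, 23]]
```

With this convention the width is simply right - left and the height is bottom - top, with no off-by-one correction.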

left

The left coordinate of the bounding box.

Type:

float

top

The top coordinate of the bounding box.

Type:

float

right

The right coordinate of the bounding box.

Type:

float

bottom

The bottom coordinate of the bounding box.

Type:

float

property area: float

The area of the Bbox.

property aspect_ratio: float

The aspect ratio of the Bbox (width over height).

bottom: float
property center: Tuple[float, float]

The center of the Bbox in (x, y) format.

check_passwords_match()[source]
Return type:

Self

clip_to_img(img_w, img_h)[source]

Returns the Bbox clipped to the image dimensions.

Remember that the bottom and right edges are inclusive, so Bbox(left=-10, top=-20, right=100, bottom=120).clip_to_img(img_w=32, img_h=64) returns Bbox(left=0, top=0, right=31, bottom=63).

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The clipped Bbox.

Return type:

Bbox
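The clipping rule can be sketched without the library as follows. `clip_box` is a hypothetical helper on plain (left, top, right, bottom) tuples. Clamping the right and bottom edges to img_w - 1 and img_h - 1 is an assumption inferred from the documented example above, not taken from the library source.

```python
def clip_box(box, img_w, img_h):
    """Clamp a (left, top, right, bottom) box to an img_w x img_h image."""
    left, top, right, bottom = box
    return (
        max(left, 0),
        max(top, 0),
        # Assumption: the last valid coordinate is img_w - 1 / img_h - 1,
        # which reproduces the example given in the documentation.
        min(right, img_w - 1),
        min(bottom, img_h - 1),
    )


print(clip_box((-10, -20, 100, 120), img_w=32, img_h=64))  # (0, 0, 31, 63)
```

This sketch does not handle a box that lies entirely outside the image; how the library treats that case is not stated here.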

contains_point(x, y)[source]

Checks if a point is inside the bounding box.

Parameters:
  • x (float) – The x-coordinate of the point.

  • y (float) – The y-coordinate of the point.

Returns:

True if the point is inside the bounding box, False otherwise.

Return type:

bool

distance_to_point(x, y)[source]

Calculates the distance from the bounding box to a point.

Parameters:
  • x (float) – The x-coordinate of the point.

  • y (float) – The y-coordinate of the point.

Returns:

The distance from the bounding box to the point.

Return type:

float
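The metric is not specified above. A common choice, assumed here, is the Euclidean distance from the point to the nearest point of the box, which is zero when the point lies inside. `distance_to_box` is a hypothetical standalone helper, not the library's implementation.

```python
import math


def distance_to_box(box, x, y):
    """Euclidean distance from (x, y) to a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    # Per-axis gap between the point and the box; 0 when the point is within the span.
    dx = max(left - x, 0.0, x - right)
    dy = max(top - y, 0.0, y - bottom)
    return math.hypot(dx, dy)


box = (10, 20, 30, 40)
print(distance_to_box(box, 20, 30))  # 0.0 (inside the box)
print(distance_to_box(box, 0, 30))   # 10.0 (directly left of the box)
print(distance_to_box(box, 33, 44))  # 5.0 (diagonal from the bottom-right corner)
```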

expand(left=0, top=0, right=0, bottom=0)[source]

Returns a Bbox expanded by the specified padding on each side.

Parameters:
  • left (float, optional) – The amount to expand the left side of the bounding box by. Defaults to 0.

  • top (float, optional) – The amount to expand the top side of the bounding box by. Defaults to 0.

  • right (float, optional) – The amount to expand the right side of the bounding box by. Defaults to 0.

  • bottom (float, optional) – The amount to expand the bottom side of the bounding box by. Defaults to 0.

Returns:

The expanded Bbox instance.

Return type:

Bbox

expand_uniform(padding)[source]

Returns a Bbox expanded by the specified padding on all sides.

Parameters:

padding (float) – The amount to expand the bounding box by.

Returns:

The expanded Bbox instance.

Return type:

Bbox

classmethod from_coco(tlwh)

Initializes the bounding box from top-left and width-height coordinates.

Parameters:

tlwh (Sequence[float]) – A sequence containing the top-left and width-height coordinates of the bounding box in the format (left, top, width, height).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. width < 0 or height < 0).

Example

>>> bbox = Bbox.from_coco((10, 20, 20, 30))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 50
classmethod from_cwh(cwh)[source]

Initializes the bounding box from center and width-height coordinates.

Parameters:

cwh (Sequence[float]) – A sequence containing the center and width-height coordinates of the bounding box in the format (center_x, center_y, width, height).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. width < 0 or height < 0).

Example

>>> bbox = Bbox.from_cwh((20, 35, 20, 30))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 50
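The arithmetic behind the example above can be sketched without the library. `cwh_to_tlbr` is a hypothetical helper that mirrors the documented validation (length 4, non-negative width and height); it returns a plain tuple rather than a Bbox.

```python
def cwh_to_tlbr(cwh):
    """Convert (center_x, center_y, width, height) to (left, top, right, bottom)."""
    if len(cwh) != 4:
        raise ValueError("expected (center_x, center_y, width, height)")
    cx, cy, w, h = cwh
    if w < 0 or h < 0:
        raise ValueError("width and height must be non-negative")
    # Each edge sits half a side length away from the center.
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


print(cwh_to_tlbr((20, 35, 20, 30)))  # (10.0, 20.0, 30.0, 50.0)
```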
classmethod from_list(tlbr)

Initializes the bounding box from top-left and bottom-right coordinates.

Parameters:

tlbr (Sequence[float]) – A sequence containing the top-left and bottom-right coordinates of the bounding box in the format (left, top, right, bottom).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. left > right or top > bottom).

Example

>>> bbox = Bbox.from_list((10, 20, 30, 40))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 40
classmethod from_pascal_voc(tlbr)

Initializes the bounding box from top-left and bottom-right coordinates.

Parameters:

tlbr (Sequence[float]) – A sequence containing the top-left and bottom-right coordinates of the bounding box in the format (left, top, right, bottom).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. left > right or top > bottom).

Example

>>> bbox = Bbox.from_pascal_voc((10, 20, 30, 40))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 40
classmethod from_tlbr(tlbr)[source]

Initializes the bounding box from top-left and bottom-right coordinates.

Parameters:

tlbr (Sequence[float]) – A sequence containing the top-left and bottom-right coordinates of the bounding box in the format (left, top, right, bottom).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. left > right or top > bottom).

Example

>>> bbox = Bbox.from_tlbr((10, 20, 30, 40))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 40
classmethod from_tlwh(tlwh)[source]

Initializes the bounding box from top-left and width-height coordinates.

Parameters:

tlwh (Sequence[float]) – A sequence containing the top-left and width-height coordinates of the bounding box in the format (left, top, width, height).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. width < 0 or height < 0).

Example

>>> bbox = Bbox.from_tlwh((10, 20, 20, 30))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 50
classmethod from_xyxy(tlbr)

Initializes the bounding box from top-left and bottom-right coordinates.

Parameters:

tlbr (Sequence[float]) – A sequence containing the top-left and bottom-right coordinates of the bounding box in the format (left, top, right, bottom).

Returns:

The Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the length of the sequence is not 4, or if the Bbox is not valid (i.e. left > right or top > bottom).

Example

>>> bbox = Bbox.from_xyxy((10, 20, 30, 40))
>>> print(bbox.left, bbox.top, bbox.right, bbox.bottom)
10 20 30 40
property height

The height of the Bbox.

intersection(other)[source]

Calculates the intersection with another Bbox. If the resulting Bbox is not valid (i.e. left > right or top > bottom), returns None.

Parameters:

other (Bbox) – The other bounding box to calculate the intersection with.

Returns:

The intersection of the two bounding boxes if valid.

Return type:

Optional[Bbox]
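The validity rule described above can be sketched on plain (left, top, right, bottom) tuples. `intersect` is a hypothetical helper, not the library's code. Note that under the stated rule, two boxes that only share an edge still yield a valid zero-area box.

```python
def intersect(a, b):
    """Intersection of two (left, top, right, bottom) boxes, or None if invalid."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    # Same validity test as in the description: left > right or top > bottom.
    if left > right or top > bottom:
        return None
    return (left, top, right, bottom)


print(intersect((0, 0, 10, 10), (5, 5, 15, 15)))    # (5, 5, 10, 10)
print(intersect((0, 0, 10, 10), (20, 20, 30, 30)))  # None
print(intersect((0, 0, 10, 10), (10, 0, 20, 10)))   # (10, 0, 10, 10)
```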

iou(other)[source]

Calculates the Intersection over Union (IoU) with another bounding box.

Parameters:

other (Bbox) – The other Bbox.

Returns:

The IoU between the two bounding boxes.

Return type:

float
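For reference, IoU is the intersection area divided by the area of the set union (sum of both areas minus the intersection). The sketch below is a standalone illustration on plain tuples. `box_iou` is a hypothetical name, and returning 0.0 when both boxes have zero area is an assumption of this sketch.

```python
def box_iou(a, b):
    """Intersection over Union of two (left, top, right, bottom) boxes."""
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Assumption: degenerate (zero-area) pairs give an IoU of 0.
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes overlapping on a 5x5 patch: 25 / (100 + 100 - 25).
print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))
```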

left: float
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

overlaps(other)[source]

Checks if the current bounding box overlaps with another bounding box.

Two bboxes are considered as overlapping if they intersect with a non-zero area.

Parameters:

other (Bbox) – The other bounding box to check for overlap.

Returns:

True if the bounding boxes overlap, False otherwise.

Return type:

bool

pad_to_aspect_ratio(target_ratio)[source]

Returns a Bbox padded to reach the target aspect ratio (width over height).

Parameters:

target_ratio (float) – The target aspect ratio.

Returns:

A Bbox instance padded to the correct ratio.

Return type:

Bbox

Raises:

ValueError – If target_ratio is <= 0.
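One way to pad to a target ratio is to grow only the deficient dimension and split the padding evenly on both sides, so the center is preserved. Whether the library distributes the padding this way is an assumption. `pad_to_ratio` is a hypothetical helper on plain tuples.

```python
def pad_to_ratio(box, target_ratio):
    """Pad a (left, top, right, bottom) box so that width / height == target_ratio."""
    if target_ratio <= 0:
        raise ValueError("target_ratio must be strictly positive")
    left, top, right, bottom = box
    width, height = right - left, bottom - top
    # Only one of the two dimensions grows; the other already satisfies the ratio.
    new_width = max(width, height * target_ratio)
    new_height = max(height, width / target_ratio)
    # Assumption: padding is split evenly so the center stays fixed.
    pad_x = (new_width - width) / 2
    pad_y = (new_height - height) / 2
    return (left - pad_x, top - pad_y, right + pad_x, bottom + pad_y)


print(pad_to_ratio((0, 0, 10, 20), 1.0))   # (-5.0, 0.0, 15.0, 20.0)
print(pad_to_ratio((0, 0, 10, 20), 0.25))  # (0.0, -10.0, 10.0, 30.0)
```

With a target ratio of 1.0 this reduces to padding the box to a square.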

pad_to_square()[source]

Returns a Bbox padded to make it square.

Return type:

Bbox

right: float
scale(scale_factor)[source]

Returns a Bbox scaled by the specified factor. The scaling is applied about the center of the Bbox.

Parameters:

scale_factor (float) – The factor to scale the bounding box by. Width and height will be scaled by this factor.

Returns:

The scaled Bbox instance.

Return type:

Bbox

Raises:

ValueError – If the scale is strictly negative.
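Scaling about the center keeps the center fixed and multiplies the width and height by the factor. The sketch below illustrates this on plain tuples. `scale_box` is a hypothetical helper, not the library's implementation.

```python
def scale_box(box, scale_factor):
    """Scale a (left, top, right, bottom) box about its center."""
    if scale_factor < 0:
        raise ValueError("scale_factor must not be negative")
    left, top, right, bottom = box
    center_x, center_y = (left + right) / 2, (top + bottom) / 2
    # Half of the scaled side lengths, measured from the fixed center.
    half_w = (right - left) * scale_factor / 2
    half_h = (bottom - top) * scale_factor / 2
    return (center_x - half_w, center_y - half_h, center_x + half_w, center_y + half_h)


print(scale_box((10, 20, 30, 40), 2))    # (0.0, 10.0, 40.0, 50.0)
print(scale_box((10, 20, 30, 40), 0.5))  # (15.0, 25.0, 25.0, 35.0)
```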

shift(horizontal_shift=0, vertical_shift=0)[source]

Returns a Bbox shifted by the specified horizontal and vertical amounts.

Parameters:
  • horizontal_shift (float, optional) – The amount to shift the bounding box horizontally. Defaults to 0.

  • vertical_shift (float, optional) – The amount to shift the bounding box vertically. Defaults to 0.

Returns:

The shifted Bbox instance.

Return type:

Bbox

to_albu(img_w, img_h)

Returns the bounding box coordinates in Top-Left, Bottom-Right format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The bounding box coordinates [x_min, y_min, x_max, y_max].

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box. All the returned values are NORMALIZED based on the image dimensions.

Return type:

List[float]

to_coco()

Returns the bounding box coordinates in Top-Left, Width-Height format.

Returns:

The bounding box coordinates [x_min, y_min, width, height].

x_min and y_min are coordinates of the top-left corner of the bounding box.

Return type:

List[float]

to_cwh()[source]

Returns the bounding box coordinates in Center, Width-Height format.

Returns:

The bounding box coordinates [x_center, y_center, width, height].

Return type:

List[float]

to_list()

Returns the bounding box coordinates in Top-Left, Bottom-Right format.

Returns:

The bounding box coordinates [x_min, y_min, x_max, y_max].

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

Return type:

List[float]

to_norm_cwh(img_w, img_h)[source]

Returns the bounding box coordinates in Center, Width-Height format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The NORMALIZED bounding box coordinates [x_center, y_center, width, height].

Return type:

List[float]
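Normalization divides x-values by the image width and y-values by the image height, so that all four values fall in [0, 1] for a box inside the image. A minimal standalone sketch follows, with `norm_cwh` as a hypothetical helper on plain tuples.

```python
def norm_cwh(box, img_w, img_h):
    """Normalized [x_center, y_center, width, height] of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return [
        (left + right) / 2 / img_w,   # x_center as a fraction of the image width
        (top + bottom) / 2 / img_h,   # y_center as a fraction of the image height
        (right - left) / img_w,       # width as a fraction of the image width
        (bottom - top) / img_h,       # height as a fraction of the image height
    ]


print(norm_cwh((10, 20, 30, 50), img_w=100, img_h=200))  # [0.2, 0.175, 0.2, 0.15]
```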

to_norm_tlbr(img_w, img_h)[source]

Returns the bounding box coordinates in Top-Left, Bottom-Right format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The bounding box coordinates [x_min, y_min, x_max, y_max].

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box. All the returned values are NORMALIZED based on the image dimensions.

Return type:

List[float]

to_norm_tlwh(img_w, img_h)[source]

Returns the bounding box coordinates in Top-Left, Width-Height format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The bounding box coordinates [x_min, y_min, width, height].

x_min and y_min are the coordinates of the top-left corner of the bounding box. All the returned values are NORMALIZED based on the image dimensions.

Return type:

List[float]

to_pascal_voc()

Returns the bounding box coordinates in Top-Left, Bottom-Right format.

Returns:

The bounding box coordinates [x_min, y_min, x_max, y_max].

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

Return type:

List[float]

to_polygon()[source]

Returns the bounding box corners as points.

Returns:

The corner coordinates in (x, y) format. The order is top-left, top-right, bottom-right, bottom-left.

Return type:

List[Tuple[float, float]]
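The documented corner order (clockwise on screen, starting at the top-left) can be sketched on a plain tuple. `box_corners` is a hypothetical helper.

```python
def box_corners(box):
    """Corners of a (left, top, right, bottom) box as (x, y) points."""
    left, top, right, bottom = box
    # Order: top-left, top-right, bottom-right, bottom-left.
    return [(left, top), (right, top), (right, bottom), (left, bottom)]


print(box_corners((10, 20, 30, 40)))  # [(10, 20), (30, 20), (30, 40), (10, 40)]
```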

to_tlbr()[source]

Returns the bounding box coordinates in Top-Left, Bottom-Right format.

Returns:

The bounding box coordinates [x_min, y_min, x_max, y_max].

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

Return type:

List[float]

to_tlwh()[source]

Returns the bounding box coordinates in Top-Left, Width-Height format.

Returns:

The bounding box coordinates [x_min, y_min, width, height].

x_min and y_min are coordinates of the top-left corner of the bounding box.

Return type:

List[float]

to_xyxy()

Returns the bounding box coordinates in Top-Left, Bottom-Right format.

Returns:

The bounding box coordinates [x_min, y_min, x_max, y_max].

x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

Return type:

List[float]

to_yolo(img_w, img_h)

Returns the bounding box coordinates in Center, Width-Height format, normalized based on the image dimensions.

Parameters:
  • img_w (int) – The image width in pixels.

  • img_h (int) – The image height in pixels.

Returns:

The NORMALIZED bounding box coordinates [x_center, y_center, width, height].

Return type:

List[float]

top: float
union(other)[source]

Calculates the minimal Bbox that encloses both this one and the other.

Parameters:

other (Bbox) – The other bounding box to calculate the union with.

Returns:

The minimal enclosing Bbox.

Return type:

Bbox
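The minimal enclosing box takes the smallest left and top and the largest right and bottom of the two boxes. A standalone sketch on plain tuples follows, with `enclosing_box` as a hypothetical helper.

```python
def enclosing_box(a, b):
    """Smallest (left, top, right, bottom) box containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


print(enclosing_box((0, 0, 10, 10), (5, 5, 15, 15)))  # (0, 0, 15, 15)
print(enclosing_box((0, 0, 1, 1), (10, 10, 11, 11)))  # (0, 0, 11, 11)
```

As the second call shows, the result is a bounding rectangle and not a set union: for disjoint boxes its area can be far larger than the sum of the two areas.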

property width: float

The width of the Bbox.

easy_bbox.nms(bboxes, scores, iou_threshold=0.5)[source]

Perform Non-Maximum Suppression on a list of bounding boxes.

Parameters:
  • bboxes (List[Bbox]) – List of bounding boxes.

  • scores (List[float]) – List of confidence scores for each bounding box.

  • iou_threshold (float, optional) – IoU threshold for suppression. Defaults to 0.5.

Returns:

List of selected bounding boxes and their scores.

Return type:

List[Tuple[Bbox, float]]

Raises:

ValueError – If the lengths of bboxes and scores do not match.
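For readers unfamiliar with the technique, greedy Non-Maximum Suppression can be sketched as follows: visit boxes in decreasing score order and keep a box only if its IoU with every already-kept box does not exceed the threshold. This is a standalone illustration on plain tuples. `greedy_nms` and `_iou` are hypothetical names, and whether the library compares with > or >= at the threshold is not stated above.

```python
def _iou(a, b):
    """IoU of two (left, top, right, bottom) boxes; 0.0 for degenerate pairs."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return the kept (box, score) pairs, highest score first."""
    if len(boxes) != len(scores):
        raise ValueError("boxes and scores must have the same length")
    # Indices sorted by decreasing confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Assumption: a box is suppressed when its IoU with a kept box exceeds the threshold.
        if all(_iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return [(boxes[i], scores[i]) for i in kept]


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
# The second box overlaps the first with IoU 81/119 (about 0.68), so it is suppressed.
print(greedy_nms(boxes, scores))  # [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.7)]
```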