The Perception team is responsible for translating the boat's surroundings, position, and orientation into information useful for decision making. This work spans both computer vision and sensing. Members working on computer vision research neural networks, build and train an object detection model, annotate and augment data, and integrate the model with our ZED 2i camera. Members working on the sensor suite integrate our GNSS, compass, temperature, and leakage sensors.
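On the sensor side, a GNSS receiver typically reports position as standard NMEA 0183 sentences. As a minimal sketch of that integration step, the helper below parses a GGA sentence into decimal-degree latitude and longitude; the function name and the sample sentence are illustrative, not taken from our actual driver code.

```python
# Illustrative sketch: parse a GNSS NMEA "GGA" sentence into decimal degrees.
# Field positions follow the standard NMEA 0183 GGA layout.

def parse_gga(sentence: str) -> tuple[float, float]:
    """Return (latitude, longitude) in decimal degrees from a GGA sentence."""
    fields = sentence.split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")

    def to_decimal(raw: str, hemisphere: str) -> float:
        # NMEA packs degrees and minutes together: ddmm.mmmm / dddmm.mmmm
        dot = raw.index(".")
        degrees = float(raw[: dot - 2])
        minutes = float(raw[dot - 2 :])
        value = degrees + minutes / 60.0
        return -value if hemisphere in ("S", "W") else value

    lat = to_decimal(fields[2], fields[3])
    lon = to_decimal(fields[4], fields[5])
    return lat, lon


# Example sentence (a commonly cited sample fix):
lat, lon = parse_gga(
    "$GPGGA,123519,4217.038,N,07105.620,W,1,08,0.9,5.4,M,46.9,M,,*47"
)
```

In practice this parsing is usually handled by a library or a ROS driver rather than written by hand, but the sketch shows what the raw data looks like before it becomes a position estimate.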


Future goals: build our CV model from scratch, and incorporate LiDAR and other more advanced sensors.


Dataset Creation

We used Roboflow to create a custom dataset. We uploaded videos of the buoys taken in different environments and from a variety of angles. Each video is parsed into images at a frame rate we choose. The images are then annotated in Roboflow by manually drawing bounding boxes around the objects we want to label (red, blue, green, and yellow buoys). These labels are used to train the model.

Model Architecture

YOLOv5 is a family of real-time object detection models available in five sizes (nano, small, medium, large, and xlarge). We compared the performance of these variants, which trade accuracy against training and inference time: larger models detect more reliably but take longer to train and run.
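The size differences can be made concrete with parameter counts. The figures below are the approximate counts Ultralytics publishes for the 640-pixel YOLOv5 v6.x models; the selection helper is purely illustrative of how a compute budget narrows the choice.

```python
# Approximate parameter counts (millions) for the YOLOv5 family, per the
# Ultralytics release notes for the 640-px v6.x models. Larger variants are
# more accurate but slower to train and run.
YOLOV5_PARAMS_M = {
    "yolov5n": 1.9,
    "yolov5s": 7.2,
    "yolov5m": 21.2,
    "yolov5l": 46.5,
    "yolov5x": 86.7,
}


def pick_variant(max_params_m: float) -> str:
    """Return the largest variant whose parameter count fits the budget."""
    fitting = [(params, name) for name, params in YOLOV5_PARAMS_M.items()
               if params <= max_params_m]
    if not fitting:
        raise ValueError("no variant fits the budget")
    return max(fitting)[1]
```

For example, with a roughly 25M-parameter budget this would select yolov5m, while an embedded target might force yolov5n.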

Concepts and Techniques Used

  • ZED SDK & Python API
  • Python 3
  • PyTorch
  • Google Colab
  • OpenCV
  • CUDA
  • Apex