Model Zoo
Ready-to-use, open-source models
Mobile Object Localizer
A class-agnostic mobile object detector.
Resolution
192x192x3
Task type
detection
FPS
41.02
YoloV6 Nano
Real-time object detection with YoloV6n pre-trained on the COCO dataset.
Resolution
416x416x3
Task type
detection
FPS
/
MegaDepth
Estimate depth from an RGB image.
Resolution
256x192x3
Task type
monocular_depth_estimation
FPS
4.8
YOLOP
You Only Look Once for Panoptic Driving Perception.
Resolution
320x320x3
Task type
detection
FPS
11.4
YoloV6 Tiny
Real-time object detection with YoloV6t pre-trained on the COCO dataset.
Resolution
416x416x3
Task type
detection
FPS
/
YoloV6 Nano
Real-time object detection with YoloV6n pre-trained on the COCO dataset.
Resolution
640x640x3
Task type
detection
FPS
/
Coneslayer YoloV7 Tiny
A lightweight neural network for rapid detection of orange traffic cones.
Resolution
416x416x3
Task type
detection
FPS
30.0
YoloV7 Tiny
Real-time object detection with YoloV7-tiny pre-trained on the COCO dataset.
Resolution
416x416x3
Task type
detection
FPS
/
YoloV5 Nano
Real-time object detection with YoloV5n pre-trained on the COCO dataset.
Resolution
416x416x3
Task type
detection
FPS
/
FastDepth
Estimate depth from RGB images using FastDepth from MIT.
Resolution
320x256x3
Task type
monocular_depth_estimation
FPS
40.32
YoloV4-tiny
Real-time object detection with YoloV4-tiny pre-trained on the COCO dataset.
Resolution
416x416x3
Task type
detection
FPS
39.8
Mask R-CNN
Instance segmentation with Mask R-CNN pre-trained on the COCO dataset.
Resolution
300x300x3
Task type
instance_segmentation
FPS
3.11
YoloV7
Real-time object detection with YoloV7 pre-trained on the COCO dataset.
Resolution
416x416x3
Task type
detection
FPS
/
FastDepth
Estimate depth from RGB images using FastDepth from MIT.
Resolution
640x480x3
Task type
monocular_depth_estimation
FPS
???
person-reidentification-retail-0031
A person reidentification model for general scenarios. It takes a whole-body image as input and outputs an embedding vector; a pair of images is matched by the cosine distance between their embeddings. The model is based on the RMNet backbone, which was developed for fast inference. A single reidentification head on the 1/16-scale feature map outputs an embedding vector of 256 floats.
Resolution
48x96x3
Task type
named_entity_recognition
FPS
60
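The person-reidentification entry above says that a pair of images is matched by the cosine distance between their 256-float embeddings. A minimal sketch of that comparison in plain Python follows. The function names and the 0.3 threshold are illustrative assumptions, not values taken from the model's documentation.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return 1.0 - dot / (norm_a * norm_b)


def same_person(a: Sequence[float], b: Sequence[float],
                threshold: float = 0.3) -> bool:
    """Decide whether two embeddings likely show the same person.

    The threshold is a hypothetical default; it would need tuning on real data.
    """
    return cosine_distance(a, b) < threshold
```

A distance near 0 means the two embeddings point in almost the same direction, while a distance near 1 means they are close to orthogonal.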
YoloV3
Object detection with YoloV3 pre-trained on the COCO dataset.
Resolution
416x416x3
Task type
detection
FPS
4.16
WeChat's QR code detection model
WeChat's QR code detection model
Resolution
384x384x3
Task type
detection
FPS
0
Face-mask-detection
Face mask detection model, trained with the notebook available in the depthai-ml-training repository.
Resolution
300x300x3
Task type
detection
FPS
/
YoloV4
Object detection with YoloV4 pre-trained on the COCO dataset.
Resolution
608x608x3
Task type
detection
FPS
1.31
DM-Count
Count dense or sparse crowds using density maps.
Resolution
960x540x3
Task type
feature_extraction
FPS
0.22
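The DM-Count entry above describes counting crowds through density maps. In density-map counting, the estimated number of people is generally obtained by summing every cell of the predicted map. The sketch below illustrates that step on a toy map. The function name and the sample values are hypothetical, and any output scaling specific to this model is not covered.

```python
from typing import Sequence


def crowd_count(density_map: Sequence[Sequence[float]]) -> float:
    """Sum all cells of a 2D density map to get the estimated head count."""
    return sum(sum(row) for row in density_map)


# Toy 3x4 density map with made-up values; a real map would come from the model.
toy_map = [
    [0.0, 0.25, 0.25, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.25, 0.25, 0.0],
]

estimated_people = crowd_count(toy_map)
```

Because the count is a sum of fractional densities, it is usually rounded only when displayed.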
Deeplab-V3+ person segmentation model
Deeplab-V3+ person segmentation model
Resolution
256x256x3
Task type
semantic_segmentation
FPS
0
Deeplab-V3+ person segmentation model
Deeplab-V3+ person segmentation model
Resolution
513x513x3
Task type
semantic_segmentation
FPS
0
SBD-mask classification
Face mask classification model
Resolution
224x224x3
Task type
classification
FPS
0
YuNet face detection model
YuNet face detection model
Resolution
160x120x3
Task type
detection
FPS
0
Face recognition MobileFaceNet ArcFace
Deep face recognition network with a MobileFaceNet backbone and ArcFace loss <https://arxiv.org/abs/1801.07698>.
Resolution
112x112x3
Task type
face_recognition
FPS
0
Facial landmarks 68 detection
Detect 68 facial landmarks.
Resolution
160x160x3
Task type
head_pose_estimation
FPS
0
Depth MobileNetV2
Estimate depth from an RGB image.
Resolution
320x240x3
Task type
monocular_depth_estimation
FPS
20.03
Image Quality Assessment Classification
Image quality assessment from an RGB image using EdgeSegNet-Classifier.
Resolution
256x256x3
Task type
classification
FPS
13.65
GhostNet
Image classification with GhostNet pretrained on ImageNet.
Resolution
256x320x3
Task type
classification
FPS
53.13
Depth MobileNetV2
Depth estimation for a given input image.
Resolution
640x480x3
Task type
monocular_depth_estimation
FPS
/
SC-Depth
Depth estimation from an RGB image using the SC-Depth model.
Resolution
512x256x3
Task type
monocular_depth_estimation
FPS
12.89
EAST
Detect text in images using the EAST model.
Resolution
256x256x3
Task type
detection
FPS
22.5
PaDiM-Wood
PaDiM anomaly detection model trained to detect anomalies in wood.
Resolution
256x256x3
Task type
detection
FPS
9.62
MediaPipe Facemesh - 468 facial landmarks
MediaPipe Facemesh model that provides 468 facial landmarks.
Resolution
192x192x3
Task type
object_attributes
FPS
/
MicroNet-M0
Image classification with MicroNet-M0 pretrained on ImageNet.
Resolution
224x224x3
Task type
classification
FPS
34.36
MobileNetV2
Feature extractor with MobileNetV2 pretrained on ImageNet.
Resolution
224x224x3
Task type
feature_extraction
FPS
47.27
ShuffleNetV2
Image classification with ShuffleNetV2 pretrained on ImageNet.
Resolution
224x224x3
Task type
classification
FPS
151.72
MediaPipe's Palm detection model
MediaPipe's Palm detection model
Resolution
128x128x3
Task type
detection
FPS
0
HR-Depth
Depth estimation from an RGB image using the HR-Depth model.
Resolution
256x192x3
Task type
monocular_depth_estimation
FPS
4.8
InceptionV4
Image classification with InceptionV4 pretrained on ImageNet.
Resolution
299x299x3
Task type
classification
FPS
8.02
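Each entry in this listing gives a resolution string such as 416x416x3 and an FPS value that is sometimes a non-numeric placeholder (`/` or `???`). The sketch below shows one way to turn those fields into typed values and to convert FPS into a per-frame time. All function names are hypothetical, and two assumptions are made that the listing does not confirm: placeholders mean "not measured", and an FPS of 0 carries no usable timing.

```python
from typing import Optional, Tuple

# Assumption: these strings stand for "no measurement available".
MISSING_FPS = {"/", "???"}


def parse_resolution(text: str) -> Tuple[int, int, int]:
    """Split a string like '416x416x3' into three integers, in listed order."""
    parts = text.strip().lower().split("x")
    if len(parts) != 3:
        raise ValueError(f"expected three 'x'-separated numbers, got {text!r}")
    first, second, channels = (int(p) for p in parts)
    return first, second, channels


def parse_fps(text: str) -> Optional[float]:
    """Return the FPS as a float, or None for a placeholder value."""
    text = text.strip()
    if text in MISSING_FPS:
        return None
    return float(text)


def frame_time_ms(fps: Optional[float]) -> Optional[float]:
    """Convert FPS to milliseconds per frame; None if FPS is missing or <= 0."""
    if fps is None or fps <= 0:
        return None
    return 1000.0 / fps


def input_elements(resolution: Tuple[int, int, int]) -> int:
    """Number of values in one input tensor of the given resolution."""
    first, second, channels = resolution
    return first * second * channels
```

For example, `input_elements(parse_resolution("192x192x3"))` gives the number of values a 192x192x3 model consumes per frame, and `frame_time_ms(parse_fps("/"))` yields `None` rather than raising.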