Intel/dpt-hybrid-midas

Model Details: DPT-Hybrid

Dense Prediction Transformer (DPT) model trained on 1.4 million images for monocular depth estimation.
It was introduced in the paper Vision Transformers for Dense Prediction by Ranftl et al. (2021) and first released in this repository.
DPT uses the Vision Transformer (ViT) as its backbone and adds a neck and a head on top for monocular depth estimation.

This repository hosts the “hybrid” version of the model described in the paper. DPT-Hybrid differs from DPT by using ViT-hybrid as its backbone and taking some activations from the backbone.

This model card was written jointly by the Hugging Face team and Intel.

Model Detail	Description
Model Authors – Company	Intel
Date	December 22, 2022
Version	1
Type	Computer Vision – Monocular Depth Estimation
Paper or Other Resources	Vision Transformers for Dense Prediction and GitHub Repo
License	Apache 2.0
Questions or Comments	Community Tab and Intel Developers Discord

Intended Use	Description
Primary intended uses	You can use the raw model for zero-shot monocular depth estimation. See the model hub to look for fine-tuned versions on a task that interests you.
Primary intended users	Anyone doing monocular depth estimation
Out-of-scope uses	This model in most cases will need to be fine-tuned for your particular task. The model should not be used to intentionally create hostile or alienating environments for people.

How to use

Here is how to use this model for zero-shot depth estimation on an image:

from PIL import Image
import numpy as np
import requests
import torch
# note: newer transformers releases provide DPTImageProcessor as the replacement for DPTFeatureExtractor
from transformers import DPTForDepthEstimation, DPTFeatureExtractor

# load the pretrained model and its matching preprocessor
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-hybrid-midas", low_cpu_mem_usage=True)
feature_extractor = DPTFeatureExtractor.from_pretrained("Intel/dpt-hybrid-midas")

# download a sample image from the COCO validation set
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# prepare image for the model
inputs = feature_extractor(images=image, return_tensors="pt")

# run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    predicted_depth = outputs.predicted_depth

# interpolate to original size; PIL reports (width, height), torch expects (height, width)
prediction = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],
    mode="bicubic",
    align_corners=False,
)

# visualize the prediction as an 8-bit grayscale image
output = prediction.squeeze().cpu().numpy()
formatted = (output * 255 / np.max(output)).astype("uint8")
depth = Image.fromarray(formatted)
depth.show()

For more code examples, see the documentation.
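
The visualization step in the example scales the depth map by its maximum only. Bicubic interpolation can in principle overshoot and produce values slightly outside the original range, so a min-max rescale with clipping may be more robust before casting to 8 bits. The following is a minimal sketch under that assumption; `depth_to_uint8` is a hypothetical helper name and is not part of transformers.

```python
import numpy as np


def depth_to_uint8(depth: np.ndarray) -> np.ndarray:
    """Min-max rescale a relative depth map to 0..255 (hypothetical helper)."""
    depth = np.asarray(depth, dtype=np.float64)
    lo, hi = depth.min(), depth.max()
    if hi - lo < 1e-12:
        # constant map: avoid dividing by zero and return an all-black image
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - lo) / (hi - lo)  # values now lie in [0, 1]
    # round to the nearest integer, clip defensively, then cast
    return np.clip(np.rint(scaled * 255.0), 0, 255).astype(np.uint8)
```

With the example's variables, `Image.fromarray(depth_to_uint8(output))` could stand in for the `formatted` computation. Note that min-max scaling changes the mapping compared with dividing by the maximum alone, since the smallest value is always mapped to 0.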

Factors	Description
Groups	Multiple datasets compiled together
Instrumentation	–
Environment	Inference completed on Intel Xeon Platinum 8280 CPU @ 2.70GHz with 8 physical cores and an NVIDIA RTX 2080 GPU.
Card Prompts	Model deployment on alternate hardware and software will change model performance

Metrics	Description
Model performance measures	Zero-shot Transfer
Decision thresholds	–
Approaches to uncertainty and variability	–
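
The quantitative analysis in this card reports AbsRel and δ>1.25 for several benchmarks. As a rough illustration, the sketch below implements the commonly used definitions of these two metrics in plain Python over flat sequences of per-pixel depths. The function names are hypothetical, and the sketch leaves out any alignment of predictions to ground truth that an evaluation protocol may apply before scoring, as well as the WHDR metric.

```python
from typing import Sequence


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean absolute relative error: mean(|pred - gt| / gt) over valid pixels."""
    # pixels with non-positive ground truth are treated as invalid and skipped
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def delta_error_percent(
    pred: Sequence[float], gt: Sequence[float], threshold: float = 1.25
) -> float:
    """Percentage of valid pixels whose ratio max(pred/gt, gt/pred) exceeds threshold."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    # a non-positive prediction is counted as exceeding the threshold
    bad = sum(1 for p, g in pairs if p <= 0 or max(p / g, g / p) > threshold)
    return 100.0 * bad / len(pairs)
```

Both functions return lower values for better predictions, which matches the "lower is better" convention of the results table.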

Training and Evaluation Data	Description
Datasets	The dataset is called MIX 6, and contains around 1.4M images. The model was initialized with ImageNet-pretrained weights.
Motivation	To build a robust monocular depth prediction network
Preprocessing	“We resize the image such that the longer side is 384 pixels and train on random square crops of size 384. … We perform random horizontal flips for data augmentation.” See Ranftl et al. (2021) for more details.
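
To make the resize step in the quoted preprocessing concrete, here is a minimal sketch that computes the target dimensions so that the longer side becomes 384 pixels while the aspect ratio is preserved. The helper name `resize_longer_side` is hypothetical. The sketch covers only the resize arithmetic; the random square crop and the horizontal flip are omitted because the quoted passage does not spell out how they are applied.

```python
from typing import Tuple


def resize_longer_side(width: int, height: int, target: int = 384) -> Tuple[int, int]:
    """Return (new_width, new_height) with the longer side scaled to `target`."""
    scale = target / max(width, height)
    # round to whole pixels and keep each side at least 1 pixel
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height
```

For a 640x480 image this gives 384x288; the result could then be passed to an image library's resize call.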

Quantitative Analyses

Model	Training set	DIW WHDR	ETH3D AbsRel	Sintel AbsRel	KITTI δ>1.25	NYU δ>1.25	TUM δ>1.25
DPT – Large	MIX 6	10.82 (-13.2%)	0.089 (-31.2%)	0.270 (-17.5%)	8.46 (-64.6%)	8.32 (-12.9%)	9.97 (-30.3%)
DPT – Hybrid	MIX 6	11.06 (-11.2%)	0.093 (-27.6%)	0.274 (-16.2%)	11.56 (-51.6%)	8.69 (-9.0%)	10.89 (-23.2%)
MiDaS	MIX 6	12.95 (+3.9%)	0.116 (-10.5%)	0.329 (+0.5%)	16.08 (-32.7%)	8.71 (-8.8%)	12.51 (-12.5%)
MiDaS [30]	MIX 5	12.46	0.129	0.327	23.90	9.55	14.29
Li [22]	MD [22]	23.15	0.181	0.385	36.29	27.52	29.54
Li [21]	MC [21]	26.52	0.183	0.405	47.94	18.57	17.71
Wang [40]	WS [40]	19.09	0.205	0.390	31.92	29.57	20.18
Xian [45]	RW [45]	14.59	0.186	0.422	34.08	27.00	25.02
Casser [5]	CS [8]	32.80	0.235	0.422	21.15	39.58	37.18

Table 1. Comparison to the state of the art on monocular depth estimation. We evaluate zero-shot cross-dataset transfer according to the protocol defined in [30]. Relative performance is computed with respect to the original MiDaS model [30]. Lower is better for all metrics. (Ranftl et al., 2021)
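
The percentages in parentheses can be read as relative change against the MiDaS [30] row trained on MIX 5. The short sketch below recomputes a few of them for DPT-Hybrid from the printed values; the function name is hypothetical. Because the table values are rounded, recomputed percentages for some columns may differ slightly from those printed.

```python
def relative_change_percent(value: float, baseline: float) -> float:
    """Relative change of `value` with respect to `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline


# values copied from the table: MiDaS [30] (MIX 5) baseline and DPT-Hybrid
baseline_midas = {"DIW WHDR": 12.46, "KITTI δ>1.25": 23.90, "NYU δ>1.25": 9.55}
dpt_hybrid = {"DIW WHDR": 11.06, "KITTI δ>1.25": 11.56, "NYU δ>1.25": 8.69}

for name, base in baseline_midas.items():
    change = relative_change_percent(dpt_hybrid[name], base)
    print(f"{name}: {change:+.1f}%")
# prints:
# DIW WHDR: -11.2%
# KITTI δ>1.25: -51.6%
# NYU δ>1.25: -9.0%
```

A negative value means the error dropped relative to the baseline, which is an improvement since lower is better for every metric in the table.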

Ethical Considerations	Description
Data	The training data come from multiple image datasets compiled together.
Human life	The model is not intended to inform decisions central to human life or flourishing. It was trained on an aggregated set of monocular depth image datasets.
Mitigations	No additional risk mitigation strategies were considered during model development.
Risks and harms	The extent of the risks involved in using the model remains unknown.
Use cases	–

Caveats and Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. There are no additional caveats or recommendations for this model.

BibTeX entry and citation info

@article{DBLP:journals/corr/abs-2103-13413,
  author    = {Ren{\'{e}} Ranftl and
               Alexey Bochkovskiy and
               Vladlen Koltun},
  title     = {Vision Transformers for Dense Prediction},
  journal   = {CoRR},
  volume    = {abs/2103.13413},
  year      = {2021},
  url       = {https://arxiv.org/abs/2103.13413},
  eprinttype = {arXiv},
  eprint    = {2103.13413},
  timestamp = {Wed, 07 Apr 2021 15:31:46 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2103-13413.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
