Microsoft Vision Model ResNet-50: Pretrained vision model built with web-scale data

Pretrained computer vision models, combined with transfer learning, can dramatically reduce the cost and time needed to build a model for a vision task such as image classification, object detection, or image retrieval. The Microsoft Vision Model ResNet-50 is a large pretrained vision model, built with Microsoft Bing web-scale image data, that achieves state-of-the-art results on seven popular computer vision benchmarks.

In this webinar, join Junwon Park and Zygmunt Lenyk, respectively the Program Manager and the Software Engineer behind the Microsoft Vision Model, to learn how the team built the model using multi-task learning and web-supervised datasets. You’ll examine how the model lowers cost in a production setting while achieving state-of-the-art performance by optimizing separately for multiple classification tasks, including ImageNet-22k, COCO, and two web-supervised datasets of 40 million image-label pairs collected from image search engines.

The presenters will take you through installing and using the Microsoft Vision Model to build an image classification model. Then, they will go over how the Microsoft Vision Model uses multi-task learning and web supervision to achieve robust performance across multiple computer vision benchmarks.

Together, you’ll explore:

How Microsoft Bing uses the Microsoft Vision Model to achieve cost savings and accuracy improvements simultaneously.
How we built the model using multi-task learning and web supervision.
How you can download and use the Microsoft Vision Model to build an image classifier for any scenario.
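
As a rough illustration of what building an image classifier on top of a pretrained model can look like, the sketch below trains a small logistic-regression head on fixed feature vectors while the backbone stays frozen. It is not the Microsoft Vision Model API: `extract_features` is a hypothetical stand-in for a frozen pretrained backbone, and the toy "images" and labels are invented purely for the example.

```python
import math


def extract_features(image):
    """Hypothetical stand-in for a frozen pretrained backbone.

    A real backbone would map an image to a feature vector. Here the toy
    "image" is already a short list of numbers, so it is returned as-is.
    """
    return list(image)


def train_linear_head(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression head on frozen features by gradient descent."""
    dim = len(features[0])
    n = len(features)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - y                     # gradient of log loss w.r.t. z
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Only the head is updated; the feature extractor is never touched.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, feature):
    """Return the predicted class (0 or 1) for one feature vector."""
    z = sum(w * xi for w, xi in zip(weights, feature)) + bias
    return 1 if z > 0 else 0


# Invented toy data: class 0 sits near (0, 0), class 1 near (3, 3).
images = [
    [0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.0, 0.0],
    [2.9, 3.1], [3.2, 2.8], [3.0, 3.3], [2.7, 3.0],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

features = [extract_features(img) for img in images]
weights, bias = train_linear_head(features, labels)
accuracy = sum(
    predict(weights, bias, f) == y for f, y in zip(features, labels)
) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

In practice the feature extractor would be the pretrained network with its weights frozen, and only the small head would be trained on the task-specific labels, which is where most of the cost and time savings of transfer learning come from.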

Resource list: