Microsoft Vision Model ResNet-50: Pretrained vision model built with web-scale data
Pretrained computer vision models, combined with transfer learning, can dramatically reduce the cost and time needed to build a model for a vision task such as image classification, object detection, or image retrieval. The Microsoft Vision Model ResNet-50 is a large pretrained vision model, built with Microsoft Bing web-scale image data, that achieves state-of-the-art results across seven popular computer vision benchmarks.
In this webinar, join Junwon Park and Zygmunt Lenyk, respectively the Program Manager and the Software Engineer behind the Microsoft Vision Model, to learn how the team built the model using multi-task learning and web-supervised datasets. You’ll examine how the model lowers cost in a production setting while achieving state-of-the-art performance. The approach relies on multi-task learning, optimizing separately for multiple classification tasks, including ImageNet-22k, COCO, and two web-supervised datasets of 40 million image-label pairs collected from image search engines.
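To make the multi-task idea concrete, here is a minimal sketch of one common setup: a shared backbone with a separate classification head and a separate loss for each task. It is a toy illustration under stated assumptions, not the team's training code. The backbone is a single linear layer standing in for a ResNet-50, the two tasks (`binary`, `three_way`) and their synthetic data are invented for the example, and every name in the block is hypothetical.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class MultiTaskModel:
    """Toy model: one shared linear backbone, one softmax head per task."""

    def __init__(self, n_in, n_hidden, task_classes, rng):
        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.backbone = init(n_hidden, n_in)  # shared by every task
        self.heads = {name: init(k, n_hidden) for name, k in task_classes.items()}

    def loss(self, task, x, y):
        p = softmax(matvec(self.heads[task], matvec(self.backbone, x)))
        return -math.log(max(p[y], 1e-12))

    def sgd_step(self, task, x, y, lr):
        """One cross-entropy step: updates this task's head and the shared backbone."""
        head = self.heads[task]
        h = matvec(self.backbone, x)
        p = softmax(matvec(head, h))
        d_logits = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
        # Gradient w.r.t. the shared features, taken before the head changes.
        d_h = [sum(head[k][j] * d_logits[k] for k in range(len(head)))
               for j in range(len(h))]
        for k, row in enumerate(head):
            for j in range(len(row)):
                row[j] -= lr * d_logits[k] * h[j]
        for j, row in enumerate(self.backbone):
            for i in range(len(row)):
                row[i] -= lr * d_h[j] * x[i]


def make_data(rng, n):
    """Synthetic inputs with two different label spaces (assumed, for illustration)."""
    xs = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(n)]
    return {
        "binary": [(x, 1 if x[0] > 0 else 0) for x in xs],
        "three_way": [(x, max(range(3), key=lambda k: x[k + 1])) for x in xs],
    }


def mean_loss(model, task, examples):
    return sum(model.loss(task, x, y) for x, y in examples) / len(examples)


def train(seed=0, epochs=20, lr=0.05):
    rng = random.Random(seed)
    data = make_data(rng, 200)
    model = MultiTaskModel(4, 8, {"binary": 2, "three_way": 3}, rng)
    before = {t: mean_loss(model, t, d) for t, d in data.items()}
    for _ in range(epochs):
        for task, examples in data.items():  # alternate between tasks
            for x, y in examples:
                model.sgd_step(task, x, y, lr)
    after = {t: mean_loss(model, t, d) for t, d in data.items()}
    return before, after


before, after = train()
for task in before:
    print(f"{task}: loss {before[task]:.3f} -> {after[task]:.3f}")
```

Each task keeps its own output layer and its own loss, while gradients from every task flow into the same backbone weights. That sharing is what lets a single set of features serve several label spaces.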
The speakers will take you through installing and using the Microsoft Vision Model to build an image classification model. Then, they will go over how the Microsoft Vision Model uses multi-task learning and web-supervision to achieve robust performance across multiple computer vision benchmarks.
Together, you’ll explore:
- How Microsoft Bing uses the Microsoft Vision Model to cut costs and improve accuracy at the same time.
- How we built the model using multi-task learning and web-supervision.
- How you can download and use the Microsoft Vision Model to build an image classifier for any scenario.
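As a rough illustration of the last point, the sketch below builds a classifier on top of frozen features, which is the general shape of transfer learning with a pretrained backbone. It makes several assumptions: `frozen_embed` is a hypothetical stand-in (a fixed random projection with ReLU) for the pretrained model, the "images" are synthetic 16-value vectors, and the classifier is a nearest-centroid rule chosen for brevity. In practice a linear layer trained on the backbone's features (2048 values per image for a ResNet-50) is a more common choice.

```python
import random


def frozen_embed(image, projection):
    """Hypothetical stand-in for a frozen pretrained backbone (never updated)."""
    return [max(0.0, sum(w * p for w, p in zip(row, image))) for row in projection]


def fit_centroids(features, labels):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(feat))
        for i, value in enumerate(feat):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, feature):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, feature))

    return min(centroids, key=lambda label: dist(centroids[label]))


rng = random.Random(0)
# Fixed weights play the role of pretrained parameters in this toy setup.
projection = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(32)]
# Synthetic class prototypes; real inputs would be preprocessed images.
prototypes = {"cat": [1.0] * 8 + [0.0] * 8, "dog": [0.0] * 8 + [1.0] * 8}


def sample(label):
    return [p + rng.gauss(0, 0.1) for p in prototypes[label]]


train_labels = ["cat", "dog"] * 20
test_labels = ["cat", "dog"] * 10

train_feats = [frozen_embed(sample(y), projection) for y in train_labels]
centroids = fit_centroids(train_feats, train_labels)

hits = sum(predict(centroids, frozen_embed(sample(y), projection)) == y
           for y in test_labels)
accuracy = hits / len(test_labels)
print(f"held-out accuracy: {accuracy:.2f}")
```

Only the small classifier is fitted to the new data; the feature extractor stays untouched. This is why a strong pretrained backbone can cut both labeling and compute requirements for a new task.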
Resource list:
- Project Page: https://aka.ms/microsoftvision
- Career Page: https://aka.ms/bingmultimediajobs
- Announcement Blog: Microsoft Vision Model ResNet-50 combines web-scale data and multi-task learning to achieve state of the art
*This on-demand webinar features a previously recorded Q&A session and open captioning.
- Speakers: Junwon Park, Zygmunt Lenyk
- Affiliation: Microsoft