November 15, 2024
Watch Meta's SAM 2 Model Identify Objects in Videos Using AI

Meta released a new artificial intelligence (AI) model on Monday that can perform complex computer vision tasks. Dubbed Segment Anything Model 2 (SAM 2), it follows its predecessor, which was launched last year and was incorporated into Instagram’s Backdrop and Cutouts tools. The successor comes with advanced capabilities, and the company said it can identify and track segments even in videos. Like most of Meta’s large language models (LLMs), SAM 2 is an open-source AI model.

In a newsroom post, Meta announced the new AI model, which focuses primarily on segment analysis in videos while also improving on its image segmentation capabilities. Highlighting the accomplishments of its predecessor, Meta said the AI model was used in Instagram’s Backdrop and Cutouts features, while marine scientists used it to “segment sonar images and analyse coral reefs, satellite imagery analysis for disaster relief, and in the medical field, segmenting cellular images and aiding in detecting skin cancer”.

SAM 2 can segment objects in both images and videos, and can track an object across different frames of a video in real time. The AI can also track and segment objects in scenarios where they move fast, change in appearance, or are concealed by other objects or by a change of scene.
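To make the tracking idea concrete, here is a minimal, illustrative sketch of associating a segmented object across frames by mask overlap (intersection-over-union). This is not Meta's actual method (SAM 2 uses a learned memory mechanism); greedy IoU matching is a much simpler stand-in, and all names here are hypothetical.

```python
# Hypothetical sketch: track a segmented object across frames via mask IoU.
# SAM 2 itself uses learned streaming memory, not this heuristic.
import numpy as np

def iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(inter / union) if union else 0.0

def track(prev_mask, candidates, threshold: float = 0.3):
    """Pick the candidate mask in the new frame that best overlaps the
    object's mask from the previous frame; return None if the object is
    lost (e.g. fully occluded or the scene changed)."""
    best = max(candidates, key=lambda m: iou(prev_mask, m), default=None)
    if best is None or iou(prev_mask, best) < threshold:
        return None
    return best

# Tiny demo: a 2x2 "object" drifting one pixel to the right between frames.
frame0 = np.zeros((6, 6), dtype=bool); frame0[2:4, 1:3] = True
cand_a = np.zeros((6, 6), dtype=bool); cand_a[2:4, 2:4] = True  # moved object
cand_b = np.zeros((6, 6), dtype=bool); cand_b[0:2, 4:6] = True  # distractor
matched = track(frame0, [cand_a, cand_b])
```

A real video segmenter would compare learned feature embeddings rather than raw pixel overlap, which is what lets SAM 2 survive fast motion and appearance changes where simple IoU matching fails.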

The foundation model for prompt-based visual segmentation is built on a simple transformer architecture with streaming memory, which allows it to process videos in real time. The company also claimed the model was trained on its largest video segmentation dataset to date, dubbed SA-V.
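The "streaming memory" idea can be sketched as follows: frames are processed one at a time, and a small rolling bank of recent frame summaries conditions each new prediction, so the whole video never has to sit in memory at once. The encoder and predictor below are placeholders, not SAM 2's actual modules.

```python
# Illustrative sketch of streaming, memory-conditioned frame processing.
# _embed() is a stand-in for a real transformer image encoder.
from collections import deque

class StreamingSegmenter:
    def __init__(self, memory_size: int = 4):
        # Bounded memory bank: oldest entries are evicted automatically,
        # so cost per frame stays constant regardless of video length.
        self.memory: deque = deque(maxlen=memory_size)

    def _embed(self, frame):
        # Placeholder "encoder" that reduces a frame to one number.
        return sum(frame)

    def process(self, frame):
        # Condition the current prediction on remembered past frames.
        context = list(self.memory)
        prediction = (self._embed(frame), context)
        self.memory.append(self._embed(frame))
        return prediction

seg = StreamingSegmenter(memory_size=2)
outputs = [seg.process(f) for f in ([1, 2], [3, 4], [5, 6], [7, 8])]
```

The fixed-size memory is the design choice that makes real-time video processing tractable: each frame only attends to a constant-size context rather than the entire history.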

Meta said the AI model can ease video editing and AI-based video generation, and power new experiences in the company’s mixed-reality ecosystem. Its object tracking capability in videos can also speed up the annotation of visual data used to train other computer vision systems, the company added.

Since it is an open-source AI model, the company has published its weights on its GitHub page, and interested individuals can download and test out the model. Notably, it is licensed under the Apache 2.0 licence, a permissive licence that allows research, academic, and commercial usage.