January 22, 2025
Robots Learn New Skills to Tackle Real-World Tasks with WildLMa Framework
Researchers from UC San Diego developed the WildLMa framework to boost the abilities of quadruped robots. Designed with imitation learning and AI, the system enables robots to complete multi-step tasks like picking up items and cleaning spaces. The framework incorporates Vision-Language Models and Large Language Models for enhanced adaptability in dynamic environments

A team of researchers from the University of California, San Diego, has unveiled a framework aimed at advancing the real-world capabilities of quadruped robots equipped with manipulators. As outlined in their study, published on the arXiv preprint server, the framework, named WildLMa, seeks to improve robots’ ability to perform loco-manipulation tasks in dynamic and unstructured environments.
According to the research, tasks such as collecting household trash, retrieving specific items, and delivering them to designated locations can be executed by robots combining locomotion with object manipulation. While imitation learning techniques have previously been employed to train robots for such operations, challenges in translating these skills to real-world scenarios have persisted.

In an interview with Tech Xplore, Yuchen Song, lead researcher of the study, explained, “The rapid progress in imitation learning has enabled robots to learn from human demonstrations. However, these systems often focus on isolated, specific skills and they struggle to adapt to new environments.” The framework, according to Song, was designed to address these shortcomings by employing Vision-Language Models (VLMs) and Large Language Models (LLMs) for skill acquisition and task decomposition.

Key Features of the WildLMa Framework

The researchers highlighted several innovative elements of their framework. A virtual reality-based teleoperation system was employed to simplify the collection of demonstration data, enabling human operators to control the robots with a single hand. Pre-trained control algorithms were used to streamline these operations.

Additionally, LLMs were integrated to break complex tasks into smaller, actionable steps. “The result is a robot capable of executing long, multi-step tasks efficiently and intuitively,” Song stated. Attention mechanisms were also incorporated to enhance adaptability and focus on target objects during task execution.

Demonstrated Applications and Future Goals

The potential of the framework was demonstrated through real-world experiments. Tasks such as clearing hallways, retrieving deliveries, and rearranging items were successfully performed. However, as per Song, unexpected disturbances, such as moving individuals, can impact the system’s performance. Efforts to enhance robustness in dynamic environments are ongoing, with a vision of creating accessible, affordable home-assistant robots.