Relevance of Generative Artificial Intelligence in Robotics
Automation solutions and robots have been used successfully for many years for clearly defined tasks in structured environments. However, in unstructured scenarios or when interacting with diverse objects and people, classic systems reach their limits. The underlying algorithms and AI models are usually designed for specific tasks and cannot generalize to unknown situations and objects.
Generative AI models such as large language models (LLMs) and vision-language models (VLMs) extend classic AI algorithms with the ability to generalize. These so-called foundation models are trained on huge amounts of data from the internet and can generate new content, such as text or images, from what they have learned. Applied to robotics, they give robots a more detailed understanding of their environment and enable them to plan specific actions based on that understanding.
By building agent systems, we develop intelligent tools that improve both robots' understanding of situations and their ability to act. While foundation models often deliver impressive results in virtual applications, transferring them to the physical world poses a particular challenge, especially in robotics and in specialized domains such as logistics and intralogistics.
We combine interdisciplinary expertise in robotics, AI, and logistics and use our infrastructure, from motion capture in the PACE Lab to high-performance computing clusters, to integrate models and adapt them to specific domains, including through fine-tuning. This allows us to connect the latest research trends with concrete industrial requirements.