Bringing Robots to Life: Meet Humanoid Robot Sophia

Members: Teng Xu, Taotao Zhou

We are entering a great era driven by the rise of robotics, embodied AI, and LLMs! Our project pushes the boundaries of what humanoid robots can achieve, especially in how they express themselves and interact with us. We present Sophia, developed by Hanson Robotics, one of the most advanced humanoid robots and a platform at the forefront of this innovation. Let's see how we're making Sophia more lifelike and engaging than ever with our novel facial expression control and generation scheme.

Demonstration of Sophia's advanced facial expression capabilities

The Magic of Facial Expressions

Facial expressions are a fundamental part of human communication. They convey emotions and reactions that words alone can't always express. For robots like Sophia to interact with us effectively, they need to replicate these expressions convincingly. This is where our project comes in.

Sophia can already perform a wide range of human facial expressions, but her existing system depends on predefined, artist-created expression animations. These work well for specific pre-programmed scenarios but lack the flexibility and dynamism needed for real-time interaction.

Three Exciting Demos

To showcase our advancements, we have prepared three demonstrations:

  • Control Humanoid Robot Sophia's Expression with iPhone ARKit: Using the face-tracking capabilities of the iPhone's ARKit, we can now control Sophia's facial expressions in real time. ARKit tracks facial movements with the phone's camera and streams this data to Sophia, allowing her to mirror those expressions instantly (see the control-loop sketch after this list). This means you can smile at Sophia, and she'll smile right back at you!
  • Imitating Actors' Performances from Movie Clips: We've taken it a step further by enabling Sophia to imitate actors' performances from movie clips. Our facial expression mapping model converts the actors' facial expressions extracted from video into Sophia's motor parameters, so she can accurately reproduce even the subtlest emotions displayed on screen.
  • Chatting with Sophia, Powered by GPT-4o and Microsoft Azure's TTS API: What's a humanoid robot without the ability to hold a conversation? Leveraging GPT-4o, a cutting-edge language model, and Microsoft Azure's Text-to-Speech (TTS) API, Sophia can now engage in meaningful and dynamic conversations (a sketch of this pipeline also follows the list). As she talks, her facial expressions change in real time, reflecting the emotions behind her words. Whether she's happy, curious, or contemplative, you'll see it all on her face.
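
As a rough illustration of the first demo, the sketch below shows how streamed ARKit blendshape coefficients might be mapped to motor commands in a real-time loop. The port, message format, and the map_blendshapes_to_motors / send_motor_commands helpers are hypothetical placeholders for illustration, not Sophia's actual interface.

```python
# Minimal sketch of an ARKit -> Sophia control loop (hypothetical interfaces).
import json
import socket

ARKIT_PORT = 9000          # assumed port used by the iPhone streaming app
NUM_BLENDSHAPES = 52       # ARKit exposes 52 facial blendshape coefficients

def map_blendshapes_to_motors(blendshapes):
    """Placeholder for the learned mapping from ARKit blendshapes to
    Sophia's motor parameters (see the transformer sketch further below)."""
    # Here we simply clamp values into [0, 1]; the real mapping is learned.
    return {name: max(0.0, min(1.0, value)) for name, value in blendshapes.items()}

def send_motor_commands(motors):
    """Placeholder for the robot-side call that drives Sophia's face motors."""
    print(motors)  # replace with the actual motor interface

def run_control_loop():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", ARKIT_PORT))
    while True:
        packet, _ = sock.recvfrom(4096)
        blendshapes = json.loads(packet)           # e.g. {"jawOpen": 0.3, ...}
        motors = map_blendshapes_to_motors(blendshapes)
        send_motor_commands(motors)

if __name__ == "__main__":
    run_control_loop()
```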

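The conversational demo can be approximated with standard SDK calls, as in the sketch below. The API keys, region, voice name, and the text_to_expressions hook are assumptions for illustration; in the real system, the reply text is fed to our expression-generation model so the face matches the speech.

```python
# Minimal sketch of the chat pipeline: GPT-4o for replies, Azure TTS for speech.
# Keys, region, voice name, and text_to_expressions() are placeholders.
import os

import azure.cognitiveservices.speech as speechsdk
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
speech_config = speechsdk.SpeechConfig(
    subscription=os.environ["AZURE_SPEECH_KEY"],
    region=os.environ["AZURE_SPEECH_REGION"],
)
speech_config.speech_synthesis_voice_name = "en-US-JennyNeural"  # assumed voice
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

def text_to_expressions(reply):
    """Placeholder hook: the reply text would also be sent to the
    expression-generation model that animates Sophia's face."""
    pass

def chat_turn(user_text, history):
    history.append({"role": "user", "content": user_text})
    response = client.chat.completions.create(model="gpt-4o", messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    text_to_expressions(reply)                 # drive the face
    synthesizer.speak_text_async(reply).get()  # speak the reply aloud
    return reply

if __name__ == "__main__":
    history = [{"role": "system",
                "content": "You are Sophia, a friendly humanoid robot."}]
    while True:
        print(chat_turn(input("You: "), history))
```
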
The Technology Behind the Magic

Underpinning these demonstrations is a transformer-based architecture that maps ARKit parameters to Sophia's motor parameters, ensuring accurate, real-time reproduction of facial expressions (a rough sketch follows below). In addition, a real-time pipeline translates text inputs into corresponding facial expressions, making Sophia's conversations feel more natural and engaging.
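
As a minimal sketch, such a mapping model could be a small transformer encoder that takes a window of ARKit blendshape frames and regresses a motor target per frame. The dimensions, layer counts, and motor count below are illustrative placeholders, not the exact architecture we trained.

```python
# Illustrative sketch of a transformer that maps ARKit blendshape sequences
# to motor parameters; all sizes are placeholders, not the trained model.
import torch
import torch.nn as nn

class BlendshapeToMotor(nn.Module):
    def __init__(self, n_blendshapes=52, n_motors=30, d_model=128,
                 n_heads=4, n_layers=4):
        super().__init__()
        self.input_proj = nn.Linear(n_blendshapes, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.output_proj = nn.Linear(d_model, n_motors)

    def forward(self, blendshapes):
        # blendshapes: (batch, time, n_blendshapes), each value in [0, 1]
        x = self.input_proj(blendshapes)
        x = self.encoder(x)
        # Motor targets squashed to [0, 1]; scaling to real ranges is assumed.
        return torch.sigmoid(self.output_proj(x))

if __name__ == "__main__":
    model = BlendshapeToMotor()
    window = torch.rand(1, 30, 52)   # one second of 30 fps ARKit frames
    motors = model(window)           # (1, 30, n_motors) motor targets
    print(motors.shape)
```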

Why This Matters

Our work is more than just a technical achievement; it's a step towards more intuitive and human-like interactions between people and robots. By enhancing Sophia's ability to express emotions and respond in real-time, we are making it easier for humans to connect with robots on a personal level. This has far-reaching implications, from customer service to companionship and beyond.

We are thrilled to share these advancements with you and look forward to a future where humanoid robots like Sophia can seamlessly integrate into our daily lives, offering assistance, companionship, and a touch of humanity.

Thank you for joining us!