Continual Curiosity-Driven Skill Acquisition From High-Dimensional Video Inputs for Humanoid Robots

Developmental Robotics and Artificial Curiosity

Developmental Robotics. A developmental program is embodied in a robot, enabling it to learn by interacting with its environment. Developmental programs should be simple, general, and potentially powerful, driven by the (formally verified) insight that a simpler solution to a problem is more likely than a complex one to generalize well to slightly different problems. Instead of programming every minute detail, some (ideally) basic (but, practically, much more) structure is provided, which the robot can improve upon by learning. Very complex (and very impressive) robot demonstrations can break when the environment changes just a little bit. More intelligent robots could adapt to a wide range of environments by "absorbing structure" from their interactions with the environment.

Artificial Curiosity. Developmental algorithms must be capable of compressing massive amounts of sensory and motor data into a useful form. The robot collects the data incrementally, often through its own actions. It is useful for the robot to have a drive to seek out the best data for its developmental program, i.e., the data that enables it to learn the quickest. In other words, the robot should be curious. Curiosity-driven agents use reinforcement learning to quickly adapt control policies to maximize intrinsic reward, which is a measurable improvement in the compressor, predictor, or world model (see "A brief review of artificial curiosity").
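To make "intrinsic reward as measurable improvement" concrete, here is a minimal sketch, assuming a toy online linear predictor and a squared-error measure. The names LinearPredictor and intrinsic_reward are hypothetical and do not come from the papers listed below; the reward is simply the drop in prediction error caused by one learning step.

```python
import numpy as np


def intrinsic_reward(error_before: float, error_after: float) -> float:
    """Learning progress: how much the prediction error dropped after one update."""
    return error_before - error_after


class LinearPredictor:
    """Toy online model y ~ w.x (an illustrative assumption, not the authors' model)."""

    def __init__(self, dim: int, lr: float = 0.1):
        self.w = np.zeros(dim)
        self.lr = lr

    def error(self, x: np.ndarray, y: float) -> float:
        return float((y - self.w @ x) ** 2)

    def update(self, x: np.ndarray, y: float) -> float:
        """Take one gradient step on (x, y) and return the resulting intrinsic reward."""
        before = self.error(x, y)
        self.w += self.lr * (y - self.w @ x) * x
        after = self.error(x, y)
        return intrinsic_reward(before, after)


if __name__ == "__main__":
    model = LinearPredictor(dim=2)
    x, y = np.array([1.0, 0.0]), 2.0
    rewards = [model.update(x, y) for _ in range(50)]
    print(rewards[0], rewards[-1])  # reward is large at first, then fades
```

Once the predictor has learned a regularity, the reward for revisiting it fades toward zero, which is what pushes a curious agent on to data it can still learn from.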

Projects

Our Katana robot arm curiously plays with wooden blocks, using vision, reaching, and grasping. It is intrinsically motivated to explore its world. As a by-product, it learns how to place blocks stably, and how to stack blocks.

The Upper Confidence Weighted Learning (UCWL) framework calculates intrinsic rewards through estimating the confidence intervals of the agent's predictions, which allows for efficient exploration in human-robot interaction scenarios with binary feedback.
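The following sketch is not the UCWL algorithm itself. It only illustrates the general idea of using confidence intervals to guide exploration under binary (yes/no) feedback, assuming a context-free setting and a standard Hoeffding bound. The function names and the per-label success counts are hypothetical.

```python
import math


def confidence_width(n_trials: int, delta: float = 0.05) -> float:
    """Hoeffding half-width for a Bernoulli mean estimated from n_trials samples."""
    if n_trials == 0:
        return 1.0  # nothing observed yet: maximal uncertainty
    return min(1.0, math.sqrt(math.log(2.0 / delta) / (2.0 * n_trials)))


def pick_label(successes, trials) -> int:
    """Choose the label with the highest upper confidence bound on 'yes' feedback."""
    scores = [
        (s / t if t else 0.0) + confidence_width(t)
        for s, t in zip(successes, trials)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Label 0 was tried 10 times with 1 "yes"; label 1 has never been tried.
    print(pick_label(successes=[1, 0], trials=[10, 0]))
```

Labels with few trials have wide intervals and are therefore tried first; as feedback accumulates the intervals shrink and the choice settles on the labels that are actually confirmed.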

Curiosity-Driven Modular Incremental Slow Feature Analysis (CD-MISFA) is a hierarchical curiosity-driven learning system that learns multiple abstract slow-feature based representations (in order of their complexity) from a robot's high-dimensional visual input stream. Such abstractions encode raw pixel inputs in a form the robot can use to learn skills, an approach inspired by continual learning.
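As an illustration of what a slow feature is, the sketch below implements plain batch linear slow feature analysis with NumPy. This is a simplification: CD-MISFA uses an incremental, modular variant, which is not reproduced here, and the function name linear_sfa is hypothetical.

```python
import numpy as np


def linear_sfa(x: np.ndarray, n_features: int = 1):
    """Batch linear SFA on a signal x of shape (T, D).

    Returns (W, mean) so that (x - mean) @ W gives the slowest features.
    """
    mean = x.mean(axis=0)
    xc = x - mean

    # Step 1: whiten the input so every direction has unit variance.
    cov = xc.T @ xc / len(xc)
    evals, evecs = np.linalg.eigh(cov)
    keep = evals > 1e-10  # drop degenerate directions
    whitener = evecs[:, keep] / np.sqrt(evals[keep])
    z = xc @ whitener

    # Step 2: the slowest features are the directions in which the
    # time difference of the whitened signal has the smallest variance.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / len(dz)
    _, devecs = np.linalg.eigh(dcov)  # eigenvalues in ascending order
    W = whitener @ devecs[:, :n_features]
    return W, mean


if __name__ == "__main__":
    t = np.linspace(0, 2 * np.pi, 1000)
    slow, fast = np.sin(t), np.sin(11 * t)
    x = np.column_stack([slow + 0.5 * fast, 0.3 * slow + fast])
    W, mean = linear_sfa(x)
    y = (x - mean) @ W
    print(abs(np.corrcoef(y[:, 0], slow)[0, 1]))
```

In the usage example, two mixed sensor channels hide a slowly varying source; the extracted feature recovers that source up to sign and scale.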

Planning to achieve intrinsic reward with "perceptual" and "cognitive" learning. While external reward signals are typically stationary, intrinsic signals based on artificial curiosity change rapidly. Model-based RL using intrinsic rewards adapts the state values quickly enough to respond effectively to an ever-changing intrinsic reward landscape.
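A minimal sketch of why a model helps here, assuming a known tabular transition model and standard value iteration (not necessarily the planner used in the paper). When the intrinsic reward moves, the values are simply recomputed from the model, with no need to gather new experience. The four-state chain world and all function names are hypothetical.

```python
import numpy as np


def chain_world(n_states: int = 4) -> np.ndarray:
    """Deterministic chain: action 0 moves left, action 1 moves right."""
    P = np.zeros((2, n_states, n_states))
    for s in range(n_states):
        P[0, s, max(s - 1, 0)] = 1.0
        P[1, s, min(s + 1, n_states - 1)] = 1.0
    return P


def value_iteration(P: np.ndarray, R: np.ndarray, gamma: float = 0.9, sweeps: int = 200):
    """P[a, s, s2] are transition probabilities, R[s, a] the current intrinsic reward."""
    V = np.zeros(P.shape[1])
    for _ in range(sweeps):
        Q = R + gamma * np.einsum("asn,n->sa", P, V)
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)


if __name__ == "__main__":
    P = chain_world()

    R = np.zeros((4, 2))
    R[2, 1] = 1.0  # learning progress is currently found at the right end
    print(value_iteration(P, R)[1])

    R = np.zeros((4, 2))
    R[1, 0] = 1.0  # the interesting region has moved to the left end
    print(value_iteration(P, R)[1])
```

The transition model is reused unchanged between the two calls; only the reward table differs, and the greedy policy flips direction accordingly.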

Neuroevolution is combined with an unsupervised sensory pre-processor or compressor that is trained on images generated from the environment by the population of evolving recurrent neural network controllers. The compressor not only reduces the input dimension for the controllers, but also biases the search toward novel controllers by rewarding those controllers that discover images that it reconstructs poorly.
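The novelty bias can be sketched as follows, assuming a PCA compressor as a stand-in for the unsupervised pre-processor (the paper's actual compressor may differ). The class name PCACompressor and its methods are hypothetical. The low-dimensional code is what a controller would receive as input, and the reconstruction error is what would be added to its fitness as a novelty bonus.

```python
import numpy as np


class PCACompressor:
    """Stand-in compressor: projects flattened images onto top principal components."""

    def __init__(self, n_components: int):
        self.n_components = n_components

    def fit(self, images: np.ndarray) -> "PCACompressor":
        """images has shape (N, D): N flattened images of D pixels each."""
        self.mean = images.mean(axis=0)
        _, _, vt = np.linalg.svd(images - self.mean, full_matrices=False)
        self.components = vt[: self.n_components]
        return self

    def encode(self, image: np.ndarray) -> np.ndarray:
        """Low-dimensional code that a controller would receive as input."""
        return self.components @ (image - self.mean)

    def reconstruction_error(self, image: np.ndarray) -> float:
        """Mean squared reconstruction error, usable as a novelty bonus."""
        recon = self.mean + self.components.T @ self.encode(image)
        return float(np.mean((image - recon) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    basis = rng.normal(size=(2, 16))
    seen = rng.normal(size=(50, 2)) @ basis  # images the population has produced
    comp = PCACompressor(n_components=2).fit(seen)
    print(comp.reconstruction_error(seen[0]))               # familiar: near zero
    print(comp.reconstruction_error(rng.normal(size=16)))   # novel: clearly larger
```

A controller that steers the robot toward images unlike those already collected earns a larger bonus, which is the search bias described above.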

Publications

  • Hung Ngo, Matthew Luciw, Ngo Anh Vien, Jawad Nagi, Alexander Foerster, and Juergen Schmidhuber (2014). Efficient Interactive Multiclass Learning from Binary Feedback. ACM Transactions on Interactive Intelligent Systems. -[Preprint]- -[Video Demos]-

  • Hung Ngo, Matthew Luciw, Alexander Foerster, and Juergen Schmidhuber (2013). Confidence-Based Progress-Driven Self-Generated Goals for Skill Acquisition in Developmental Robots. Frontiers in Cognitive Science. -[Open Access]-

  • Hung Ngo, Matthew Luciw, Ngo Anh Vien, and Juergen Schmidhuber (2013). Upper Confidence Weighted Learning for Efficient Exploration in Multiclass Prediction with Binary Feedback. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI, Beijing).

  • Matthew Luciw*, Varun Raj Kompella*, Sohrob Kazerounian, and Juergen Schmidhuber (2013). An Intrinsic Value System for Developing Multiple Invariant Representations with Incremental Slowness Learning. Frontiers in Neurorobotics. *joint first authors

  • Varun Raj Kompella, Matthew Luciw, Marijn Stollenga, Leo Pape, and Juergen Schmidhuber (2012). Autonomous Learning of Abstractions using Curiosity-Driven Modular Incremental Slow Feature Analysis (Curious Dr. MISFA). In Proceedings of the International Conference on Development and Learning and Epigenetic Robotics (ICDL-EPIROB, San Diego). -[Video]-

  • Hung Ngo, Matthew Luciw, Alexander Foerster, and Juergen Schmidhuber (2012). Learning Skills from Play: Artificial Curiosity on a Katana Robot Arm. In Proceedings of the International Joint Conference on Neural Networks (IJCNN, Brisbane). -[Short "High Impact" Video]- -[Long Video With Music]- -[Slides]-

  • Matthew Luciw, Vincent Graziano, Mark Ring, and Juergen Schmidhuber (2011). Artificial Curiosity with Planning for Autonomous Perceptual and Cognitive Development. In Proceedings of the International Conference on Development and Learning and Epigenetic Robotics (ICDL-EPIROB, Frankfurt). -[Matlab Code]- -[Poster]-

  • Giuseppe Cuccu, Matthew Luciw, Juergen Schmidhuber, and Faustino Gomez (2011). Intrinsically Motivated NeuroEvolution for Vision-Based Reinforcement Learning. In Proceedings of the International Conference on Development and Learning and Epigenetic Robotics (ICDL-EPIROB, Frankfurt). -[A Few Slides]-


Source: https://people.idsia.ch/~luciw/devrob.html
