Saturday, January 22, 2022

g-f(2)846 THE NEW WORLD NEWS (1/22/2022), MIT Technology Review, Meta’s new learning algorithm can teach AI to multi-task

Meta’s new learning algorithm can teach AI to multi-task

  • A single technique for teaching neural networks multiple skills is a step towards general-purpose AI.
  • Deep neural networks have become very good at identifying objects in photos and conversing in natural language, but not at the same time: there are AI models that excel at one or the other, but not both. 
  • Part of the problem is that these models learn different skills using different techniques. 
  • A team at Meta AI (previously Facebook AI Research) wants to change that. 

      • The researchers have developed a single algorithm that can be used to train a neural network to recognize images, text, or speech. The algorithm, called Data2vec, not only unifies the learning process but performs at least as well as existing techniques in all three skills. “We hope it will change the way people think about doing this type of work,” says Michael Auli, a researcher at Meta AI.
      • Data2vec is part of a big trend in AI toward models that can learn to understand the world in more than one way. “It’s a clever idea,” says Ani Kembhavi at the Allen Institute for AI in Seattle, who works on vision and language. “It’s a promising advance when it comes to generalized systems for learning.”
      • The researchers were surprised to find that their approach actually performed better than existing techniques at recognizing images and speech, and performed as well as leading language models on text understanding.

      The first high-performance self-supervised algorithm that works for speech, vision, and text

      • Self-supervised learning — where machines learn by directly observing the environment rather than being explicitly taught through labeled images, text, audio, and other data sources — has powered many significant recent advances in AI. But while people appear to learn in a similar way regardless of how they get information — whether they use sight or sound, for example — there are currently big differences in the way self-supervised learning algorithms learn from images, speech, text, and other modalities.
      • This discrepancy has been a significant barrier to applying advances in self-supervised learning more broadly. Because a powerful algorithm designed for, say, understanding images can’t be directly applied to another modality, such as text, it is difficult to push several modalities ahead at the same rate.
      • This is why Meta AI developed and is excited to announce data2vec, the first high-performance self-supervised algorithm that works for multiple modalities. We applied data2vec separately to speech, images, and text: it outperformed the previous best single-purpose algorithms for computer vision and speech, and it is competitive on NLP tasks. It represents a new paradigm of holistic self-supervised learning, where new research improves multiple modalities rather than just one, and it does not rely on contrastive learning or on reconstructing the input example. In addition to helping accelerate progress in AI, data2vec brings us closer to building machines that learn seamlessly about different aspects of the world around them. It will enable us to develop more adaptable AI, which we believe will be able to perform tasks beyond what today’s systems can do.
      • As part of this announcement, we are sharing code and pretrained models for data2vec so that others in the research community can build upon our work.
      • How data2vec works
      • Toward machines that learn from observing the world around them


