We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on Foundation Models (aka large-scale pre-trained models) and AGI, NLP, MT, Speech, Document AI and Multimodal AI, please send your resume to

The Big Convergence - Large-scale self-supervised pre-training across tasks (predictive and generative), languages (100+ languages), and modalities (language, image, audio, layout/format + language, vision + language, audio + language, etc.)

AGI Fundamentals

TorchScale - Transformers at (any) Scale (repo): fundamental research to improve modeling generality and capability, as well as training stability and efficiency, for Transformers at any scale (a minimal usage sketch follows the lists below).
- Stability - DeepNet: scaling Transformers to 1,000 Layers and beyond
- Generality - Foundation Transformers (Magneto): towards true general-purpose modeling across tasks and modalities (including language, vision, speech, and multimodal)
- Capability - A Length-Extrapolatable Transformer
- Efficiency & Transferability - X-MoE: scalable & finetunable sparse Mixture-of-Experts (MoE)

Foundation Models

General-purpose Foundation Model
- MetaLM: Language Models are General-Purpose Interfaces

Language & Multilingual
- UniLM: unified pre-training for language understanding and generation
- InfoXLM/XLM-E: multilingual/cross-lingual pre-trained models for 100+ languages
- DeltaLM/mT6: encoder-decoder pre-training for language generation and translation for 100+ languages
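As a minimal sketch of how the TorchScale pieces fit together, the snippet below builds a Transformer encoder from a config object, assuming the pip-installable `torchscale` package and its `EncoderConfig`/`Encoder` API; the `deepnorm` flag for DeepNet-style training is an assumption to verify against the repo's config options.

```python
# pip install torchscale
from torchscale.architecture.config import EncoderConfig
from torchscale.architecture.encoder import Encoder

# Build a Transformer encoder from a config object.
# `deepnorm=True` is assumed to enable DeepNet-style residual scaling;
# other variants (e.g. Magneto's sub-LayerNorm) are toggled through
# similar config flags -- verify exact names against the TorchScale repo.
config = EncoderConfig(vocab_size=64000, deepnorm=True)
encoder = Encoder(config)

print(encoder)
```

The same config-driven pattern is used for the decoder and encoder-decoder stacks exposed by the package.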