Large Language Models
=====================

To learn more about using NeMo to train Large Language Models at scale, please refer to the `NeMo Framework User Guide <https://docs.nvidia.com/nemo-framework/user-guide/latest/index.html>`_.

NeMo supports training the following model families:

* GPT-style models (decoder-only)
* T5/BART/UL2-style models (encoder-decoder)
* BERT-style models (encoder-only)
* RETRO models (retrieval-augmented, decoder-only)
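As a quick illustration (not NeMo code), the families above differ chiefly in how attention is masked: decoder-only models use a causal mask, encoder-only models attend bidirectionally, and encoder-decoder models combine a bidirectional encoder with a causal decoder plus cross-attention. A minimal Python sketch of the two basic masks:

.. code-block:: python

    def causal_mask(n):
        # Decoder-only (GPT-style): token i may attend only to tokens 0..i.
        return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

    def bidirectional_mask(n):
        # Encoder-only (BERT-style): every token attends to every token.
        # Encoder-decoder models use this mask in the encoder and the
        # causal mask (plus cross-attention) in the decoder.
        return [[1] * n for _ in range(n)]

    print(causal_mask(3))         # [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
    print(bidirectional_mask(2))  # [[1, 1], [1, 1]]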


.. toctree::
   :maxdepth: 1

   gpt/gpt_training
   batching
   positional_embeddings
   mcore_customization
   reset_learning_rate
   rampup_batch_size


References
----------

.. bibliography:: ../nlp_all.bib
    :style: plain
    :labelprefix: nlp-megatron
    :keyprefix: nlp-megatron-
