Wanted to have a summary of the NLP acronyms and common tasks.

Acronyms in NLP

  • LM Language Modelling (predicting the next word given the previous context)
  • NLP Natural Language Processing
  • NLG Natural Language Generation
  • NLU Natural Language Understanding
  • NLI Natural Language Inference
  • NED Named Entity Disambiguation
  • NER Named Entity Recognition
  • NMT Neural Machine Translation
  • BPE Byte Pair Encoding
  • NSP Next Sentence Prediction (pretraining objective used by BERT)
  • MLM Masked Language Model (like BERT)
  • CLM Causal Language Model (like GPT-2)
  • RTE Recognizing Textual Entailment (pair of sentences, and the task is to predict whether the first entails the second; part of GLUE)
  • MRPC Microsoft Research Paraphrase Corpus (pairs of sentences labeled as semantically equivalent or not; part of GLUE)
  • PBSMT Phrase-Based Statistical Machine Translation
  • WER Word Error Rate
  • BERT Bidirectional Encoder Representations from Transformers
  • GloVe Global Vectors
  • NEL Named Entity Linking
  • NERD Named Entity Recognition and Disambiguation
  • NEN Named Entity Normalization
  • CoLA Corpus of Linguistic Acceptability (whether a sentence is grammatically acceptable or not; T5 uses this; part of GLUE)
  • SWAG Situations With Adversarial Generations
  • WNLI Winograd NLI
  • STS Semantic Textual Similarity
  • SST Stanford Sentiment Treebank (positive or negative movie review; part of GLUE)
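
Of the entries above, WER is the easiest to pin down with a few lines of code. A minimal sketch, assuming whitespace tokenization and a single reference: WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference. The function name `word_error_rate` is just an illustration, not from any library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (previous row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Note that WER can exceed 1.0 when the hypothesis is much longer than the reference, since insertions are counted but the denominator is only the reference length.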

Metrics and benchmarks

  • BLEU Bilingual Evaluation Understudy
  • ROUGE Recall-Oriented Understudy for Gisting Evaluation
  • GLUE General Language Understanding Evaluation (benchmark consisting of 9 tasks)
  • SQuAD Stanford Question Answering Dataset v1.1 and v2.0.
  • GLEU Generalized Language Evaluation Understanding (a BLEU variant; not to be confused with GLUE)
  • seqeval Sequence labeling evaluation (a Python library for scoring tasks such as NER and chunking)
  • XNLI Cross-lingual Natural Language Inference
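
To make BLEU a bit more concrete: its core ingredient is the modified (clipped) n-gram precision. A minimal sketch, assuming whitespace tokenization and a single reference. This is not full BLEU, which also combines several n-gram orders with a geometric mean and applies a brevity penalty. The helper names `ngrams` and `modified_precision` are made up for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference: str, hypothesis: str, n: int = 1) -> float:
    """Clipped n-gram precision of the hypothesis against one reference."""
    ref_counts = Counter(ngrams(reference.split(), n))
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as many times as it
    # occurs in the reference (this is the "clipping").
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return clipped / total


if __name__ == "__main__":
    ref = "the cat is on the mat"
    print(modified_precision(ref, "the the the the the the the", n=1))
    print(modified_precision(ref, "the cat the cat on the mat", n=2))
```

The clipping is what stops a degenerate hypothesis such as "the the the the the the the" from scoring a perfect unigram precision: "the" occurs only twice in the reference, so only two of the seven occurrences are counted.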

Tasks in NLP:

  • Automatic summarization
  • Collocation extraction
  • Information extraction
  • Entity linking
  • Natural language parsing
  • Part-of-speech tagging
  • Phrase chunking
  • Question answering
  • Relationship extraction
  • Semantic parsing
  • Stemming
  • Shallow parsing
  • Text segmentation
  • Textual entailment
  • Text simplification
  • Truecasing
  • Terminology extraction
  • Lemmatisation
  • Downstream tasks (tasks that a pretrained model is fine-tuned on)
  • Word-sense disambiguation
  • POS Part-of-speech tagging (assigning tags such as noun, verb, adjective to tokens)
  • Constituent labeling
  • Dependency labeling
  • Named entity labeling
  • SRL Semantic role labeling
  • Coreference resolution (e.g. "Obama" and "the former president" refer to the same entity)
  • Semantic proto-role (SPR)
  • Relation classification

Links:

  • http://nlpprogress.com/english/common_sense.html
  • https://paperswithcode.com/area/natural-language-processing
  • https://arxiv.org/abs/1905.06316