CS 540 Introduction to Artificial Intelligence: Natural Language Processing

Lecture notes for the CS 540 Introduction to Artificial Intelligence course at the University of Wisconsin-Madison, Fall 2022. The document covers topics in Natural Language Processing (NLP), including language models, classic NLP tasks, word representations, and training issues, and briefly discusses the history of the field and the progress made in it.

CS 540 Introduction to Artificial Intelligence
Natural Language Processing
University of Wisconsin-Madison
Fall 2022

Announcements

  • Homeworks:
    • HW3 in progress.
  • Class roadmap (Machine Learning unit):
    • Tuesday, Sept. 27: NLP
    • Thursday, Sept. 29: ML Intro
    • Tuesday, Oct. 4: ML Unsupervised I
    • Thursday, Oct. 6: ML Unsupervised II
    • Tuesday, Oct. 11: ML Linear Regression

Why is it hard? Many reasons:

  • Ambiguity: “We saw her duck” has several meanings (a duck she owns, or the act of ducking).
  • Non-standard use of language
  • Segmentation challenges
  • Understanding of the world
    • “Bob and Joe are brothers” (of each other).
    • “Bob and Joe are fathers” (of different children — knowing which reading applies requires knowledge of the world).

Approaches to NLP: A Brief History

  • Symbolic NLP: 1950s to 1990s (e.g., the ELIZA program)
  • Statistical/Probabilistic: 1990s to present
  • Neural: 2010s to present

Lots of progress! Lots more work to do.

Language Models

  • Basic idea: use probabilistic models to assign a probability to a sentence.
  • Goes back to Shannon:
    • Information theory modeled sequences of letters probabilistically.

Training the Model

Recall the chain rule.

  • How do we estimate these probabilities?
    • This is the same thing as “training”.
  • From data?
    • Yes, but not directly: there are too many possible sentences, so we can’t estimate their probabilities reliably.
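The chain rule itself did not survive extraction; its standard form, consistent with the unigram and bigram assumptions that follow, is:

```latex
P(w_1, w_2, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})
```

The unigram model drops all conditioning; the bigram model keeps only the immediately preceding word.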

k = 0: Unigram Model

  • Full independence assumption:
    • (Present doesn’t depend on the past)
  • Example output sampled from a unigram model (from Dan Jurafsky’s notes):
    • fifth, an, of, futures, the, an, incorporated, a, a, the, inflation, most, dollars, quarter, in, is, mass
    • thrift, did, eighty, said, hard, 'm, july, bullish
    • that, or, limited, the
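A unigram model can be sketched in a few lines of Python; the toy corpus and helper names below are my own, not from the slides:

```python
from collections import Counter

def train_unigram(tokens):
    """Maximum-likelihood estimate of P(w) from a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def sentence_prob(model, sentence):
    """Full independence: sentence probability is the product of unigram probabilities."""
    p = 1.0
    for w in sentence:
        p *= model.get(w, 0.0)  # an unseen word zeroes out the whole sentence
    return p

# Toy corpus (illustrative, not from the lecture)
tokens = "the cat sat on the mat".split()
model = train_unigram(tokens)
# model["the"] == 2/6, since "the" occurs twice among six tokens
```

Sampling words independently from such a model produces exactly the kind of word salad shown in Jurafsky's example above.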

k = 1: Bigram Model

  • Markov assumption:
    • (Present depends on the immediate past)
  • Example output sampled from a bigram model:
    • texaco, rose, one, in, this, issue, is, pursuing, growth, in, a, boiler, house, said, mr., gurria, mexico, 's, motion, control, proposal, without, permission, from, five, hundred, fifty, five, yen
    • outside, new, car, parking, lot, of, the, agreement, reached
    • this, would, be, a, record, november
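A bigram model conditions each word only on its predecessor; a minimal sketch, again with a toy corpus and names of my own choosing:

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Maximum-likelihood estimate of P(w_i | w_{i-1}) from adjacent token pairs."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    prev_counts = Counter(tokens[:-1])
    model = defaultdict(dict)
    for (prev, cur), c in pair_counts.items():
        model[prev][cur] = c / prev_counts[prev]
    return model

# Toy corpus with sentence-boundary markers (illustrative only)
tokens = "<s> the cat sat </s> <s> the dog sat </s>".split()
bigrams = train_bigram(tokens)
# bigrams["<s>"]["the"] == 1.0: both sentences start with "the"
# bigrams["the"]["cat"] == 0.5: "the" is followed by "cat" once and "dog" once
```

Sampling word-by-word from these conditional distributions yields locally coherent text like the example above, even though the model has no global plan.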

n-gram Training Issues

  1. Multiplying tiny numbers?
     • Solution: use logs; add instead of multiply.
  2. n-grams with zero probability?
     • Solution: smoothing. (Slide credit: Dan Klein)
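Both fixes can be sketched together. Add-one (Laplace) smoothing stands in here for the unspecified "smoothing" on the slide, and all names are illustrative:

```python
import math

def log_prob_laplace(bigram_counts, prev_counts, vocab_size, sentence):
    """Score a sentence in log space with add-one (Laplace) smoothing.

    Summing logs avoids underflow from multiplying many tiny numbers (issue 1);
    add-one smoothing gives unseen bigrams a small nonzero probability (issue 2).
    """
    lp = 0.0
    for prev, cur in zip(sentence, sentence[1:]):
        c = bigram_counts.get((prev, cur), 0)
        n = prev_counts.get(prev, 0)
        lp += math.log((c + 1) / (n + vocab_size))
    return lp

# With count {("the", "cat"): 1}, "the" seen twice, vocabulary size 6,
# each smoothed term is (count + 1) / (context count + V):
lp = log_prob_laplace({("the", "cat"): 1}, {"the": 2}, 6, ["the", "cat"])
# log((1 + 1) / (2 + 6)) = log(0.25)
```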

For issue 2: Backoff & Interpolation

  • Backoff: use the n-gram where there is lots of information, an r-gram (with r << n) elsewhere (e.g., trigrams backing off to bigrams).
  • Interpolation: mix different models (tri- + bi- + unigrams).
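Linear interpolation can be written down directly; the λ weights below are illustrative placeholders, not values from the lecture:

```python
def interpolated_prob(p_tri, p_bi, p_uni, lambdas=(0.6, 0.3, 0.1)):
    """Linear interpolation of trigram, bigram, and unigram estimates.

    The weights must be nonnegative and sum to 1 so the mixture is a
    valid probability; in practice they are tuned on held-out data.
    """
    l3, l2, l1 = lambdas
    return l3 * p_tri + l2 * p_bi + l1 * p_uni

# Even when the trigram estimate is 0, the mixture stays nonzero:
p = interpolated_prob(0.0, 0.5, 0.1)  # 0.6*0.0 + 0.3*0.5 + 0.1*0.1 = 0.16
```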

Vocabulary: Open vs. Closed

  • Possible to estimate the size of the unknown vocabulary
    • Good-Turing estimator
  • Originally developed by Good and Turing as part of the effort to crack the Enigma machine

Evaluating Language Models

How do we know we’ve done a good job?

  • Observation
  • Train/test on separate data & measure metrics
  • Metrics:
      1. Extrinsic evaluation
      2. Intrinsic evaluation: perplexity

Intrinsic Evaluation: Perplexity

Perplexity is a measure of uncertainty. Lower is better!

  • Example: WSJ corpus, 40 million words for training:
    • Unigram: 962, Bigram: 170, Trigram: 109
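Perplexity is the inverse probability of the test set normalized by its length, PP(W) = P(w_1 … w_N)^(−1/N); a sketch computed in log space (the helper name is mine):

```python
import math

def perplexity(log_probs):
    """Perplexity from per-word log probabilities (natural log):
    PP = exp(-(1/N) * sum of log P(w_i | context)). Lower is better."""
    n = len(log_probs)
    return math.exp(-sum(log_probs) / n)

# A model that assigns every word probability 1/170 has perplexity 170,
# the same scale as the bigram WSJ figure.
pp = perplexity([math.log(1 / 170)] * 10)
```

Intuitively, a perplexity of 170 means the model is, on average, as uncertain as if it were choosing uniformly among 170 words at each step.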

Further NLP Tasks

Language modeling is not the only task. Two further types:

  1. Auxiliary tasks: part-of-speech tagging, parsing, etc.
  2. Direct tasks: question answering, translation, summarization, classification (e.g., sentiment analysis)