Week 1,
Week 2,
Week 3,
(Course Project Proposal and Report Notes),
(In-class Exam #1 Notes),
Week 4,
Week 5,
Week 6,
Week 7,
(In-class Exam #2 Notes),
Week 8,
Week 9,
Week 10,
(Course Project Talk Notes),
(In-class Exam #3 Notes),
Week 11,
Week 12,
Week 13,
(end of diary)
mop | mops | [s] |
pot | pots | [s] |
pick | picks | [s] |
kiss | kisses | [(e)s] |
mob | mobs | [z] |
pod | pods | [z] |
pig | pigs | [z] |
pita | pitas | [z] |
razz | razzes | [(e)z] |
Note that the phonetic form of the plural-morpheme /s/ is a function of the last sound (and in particular, the voicing of the last sound) in the word being pluralized. A similar voicing of /s/ often (but not always) occurs between vowels, e.g., "Stasi" vs. "Streisand".
dal | dallar | ``branch'' |
kol | kollar | ``arm'' |
kul | kullar | ``slave'' |
yel | yeller | ``wind'' |
dis | disler | ``tooth'' |
g"ul | g"uller | ``race'' |
The form of the vowel in the plural-morpheme /lar/ is a function of the vowel in the word being pluralized.
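For the curious, here is a small Python sketch of this harmony rule; the vowel classes and the function name are illustrative, it consults only the last (rightmost) vowel of the stem, and it is an approximation rather than a full treatment of Turkish vowel harmony.

BACK_VOWELS = set("aıou")     # back vowels select -lar
FRONT_VOWELS = set("eiöü")    # front vowels select -ler

def turkish_plural(stem):
    # Scan the stem right-to-left for its last vowel and harmonize the suffix.
    for ch in reversed(stem):
        if ch in BACK_VOWELS:
            return stem + "lar"
        if ch in FRONT_VOWELS:
            return stem + "ler"
    return stem + "lar"       # fallback; real stems always contain a vowel

for stem in ["dal", "yel", "gül", "kol"]:
    print(stem, "->", turkish_plural(stem))
# dal -> dallar, yel -> yeller, gül -> güller, kol -> kollar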
In Turkish, complete utterances can consist of a single word in which the subject of the utterance is a root-morpheme (in this case, [c"op], "garbage") and all other relations (usually syntactic, in languages like English) are indicated by suffix-morphemes. As noted above, the vowels in the suffix morphemes are all subject to vowel harmony. Given the (in principle) unbounded number of possible word-utterances in Turkish, it is impossible to store them (let alone their versions as modified by vowel harmony) in a lexicon.
These rules can also hold when a Cantonese speaker writes English text, e.g., "Phone Let Kloss."
dosutoru | ``duster'' |
sutoroberri | ``strawberry'' |
kurippaa | ``clippers'' |
sutoraiki | ``strike'' |
katsuretsu | ``cutlet'' |
parusu | ``pulse'' |
gurafu | ``graph'' |
Japanese has a single liquid phoneme /r/ covering the liquid-phones [l] and [r]; moreover, it allows only very restricted types of multi-consonant clusters. Hence, when words that violate these constraints are borrowed from another language, those words are changed by modifying [l] to [r] (or deleting it) and by inserting vowels to break up invalid multi-consonant clusters.
/fikas/ | "strong" | /fumikas/ | "to be strong" |
/kilad/ | "red" | /kumilad/ | "to be red" |
/fusl/ | "enemy" | /fumusl/ | "to be an enemy" |
katab | kutib | "to write" |
kattab | kuttib | "cause to write" |
ka:tab | ku:tib | "correspond" |
taka:tab | tuku:tib | "write each other" |
/adus/ | "take a bath" | /odasadus/ |
/bali/ | "return" | /bolabali/ |
/bozen/ | "tired of" | /bozanbozen/ |
/dolan/ | "recreate" | /dolandolen/ |
(MEANINGFUL WHISPER: Do you see anything that makes reduplication qualitatively different from all other phonological and morphological processes that we have considered here? (Hint: a^n b^n => ww))
"[[the [quick [brown [fox]]]] [jumped over [the [lazy [dog]]]]]".
SOV | "She him loves." | 45% | Pashto, Latin, Japanese, Afrikaans |
SVO | "She loves him." | 42% | English, Hausa, Mandarin, Russian |
VSO | "Loves she him." | 9% | Hebrew, Irish, Zapotec, Tuareg |
VOS | "Loves him she." | 3% | Malagasy, Baure |
OVS | "Him loves she." | 1% | Apalai, Hixkaryana |
OSV | "Him she loves." | < 1% | Warao |
(MEANINGFUL WHISPER: Is such embedding of constituents qualitatively different from other syntactic patterns considered above? (Hint: a^n b^n))
Hence, ambiguity inherent in grammars and lexicons introduces yet another type of ambiguity into natural language processing (BKL, pp. 317-318).
Once a course project is approved, you must write and submit a project proposal, due at 9am on Thursday, November 2, and worth 5 course marks. This proposal should be about a page long and consist of a three-paragraph proposal text (3.5 marks total), with the first paragraph motivating why your topic is of interest (1 mark), the second summarizing previous work on this topic (1.5 marks), and the third summarizing the focus of your survey or the approach you will use in constructing your software system (1 mark). This text should be followed by at least five full-information literature references particular to the topic of your project, each of which must be cited at least once in your proposal text (1.5 marks total; 0.3 marks per reference, up to 5 references).
Once the project proposal is submitted, you have until noon on Friday, December 1, to do your project. In general, this project will entail (as a minimum) a 15-20 page double-spaced report with 10-30 literature references (for a literature-survey project) or a 5-8 page double-spaced report with 5-15 references in addition to the software (for a software project). All included references MUST be cited in the body of the report. Though purely Web-based references (e.g., blogs, reference manuals) are acceptable, please try, wherever possible, to obtain literature-based references, e.g., books, book chapters, journal papers, and conference papers. In those cases, you MUST give full references listing all information appropriate to the type of reference: author names, publication year, full paper title, book title with editors, journal name, journal volume and number, publisher, and page numbers.
The length and reference-number requirements above may vary depending on the nature of the chosen project and whether it is being done by one person or by several; this should be settled with the course instructor. Ideally, the submitted project should be on the same topic as that described in your project proposal. However, it being an imperfect world, if any difficulties do arise, chat with your course instructor as soon as possible so appropriate action, e.g., revision of the stated goals and/or scope of the project, can be taken.
Each project will also have an associated short in-class presentation scheduled in late November. Details of talk format and scheduling will be posted in early November after project proposals are submitted.
Here's to each of you choosing and carrying out a fun course project!
S -> aA, A -> bS, A -> b
S -> aC, C -> aC, C -> bC, C -> b
S -> aC, C -> aC, C -> bC, C -> E, E -> b
S -> aA, A -> aA, A -> bB, A -> b, B -> aA, B -> bB, B -> b
S -> {*vcd}V, S -> {*unvcd}U, V -> {*vcd}V, V -> {*unvcd}U, V -> {P}z, U -> {*vcd}V, U -> {*unvcd}U, U -> {P}s
I hope the above helps, and I wish you all the best of luck with this exam.
Probabilistic finite-state automata and transducers are special cases of weighted finite-state automata and transducers (Mohri (2004)).
checkAccept(s, q, M)
    if s is empty and q is a final state in M then
        return accept
    else if s is empty and q is not a final state in M then
        return reject
    else
        result = reject
        for transition (q,x,q') in M such that s = x + s' do
            result = checkAccept(s', q', M)
            if result == accept then
                break
        return result

checkAccept(s, start-state of M, M)
Note implicit handling of epsilon-transitions in this algorithm.
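Here is a minimal Python rendering of checkAccept. It assumes an FSA M represented as a (transitions, start, finals) triple with transitions given as (q, x, q') tuples over single-symbol labels; epsilon-labels are left out to keep the sketch simple.

def check_accept(s, q, M):
    transitions, start, finals = M
    if s == "" and q in finals:
        return "accept"
    if s == "" and q not in finals:
        return "reject"
    result = "reject"
    # Try every transition out of q whose label matches the next input symbol.
    for (q1, x, q2) in transitions:
        if q1 == q and s.startswith(x):
            result = check_accept(s[len(x):], q2, M)
            if result == "accept":
                break
    return result

# Toy example (state names are illustrative): an FSA accepting (ab)* over {a, b}.
M = ([("S", "a", "A"), ("A", "b", "S")], "S", {"S"})
print(check_accept("abab", M[1], M))   # accept
print(check_accept("aba", M[1], M))    # reject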
As the MR so created is an NFA, determinization may be a good idea.
As the MR so created is an NFA, determinization may be a good idea.
As the MR so created may be an NFA, determinization may be a good idea.
findWord(w, T)
    tnode = root-node of trie T
    i = 0
    while i < length(w) do
        if tnode.children[w[i]] == nil then
            return notPresent
        else
            tnode = tnode.children[w[i]]
            i = i + 1
    return tnode.value
addWord(w, value, T)
    tnode = root-node of trie T
    i = 0
    while i < length(w) do
        if tnode.children[w[i]] != nil then
            tnode = tnode.children[w[i]]
            i = i + 1
        else
            break
    while i < length(w) do
        tnode.children[w[i]] = new node
        tnode = tnode.children[w[i]]
        tnode.value = notWord
        i = i + 1
    tnode.value = value
    return
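A compact Python version of the same trie operations is sketched below; the dictionary-of-children representation and the NOT_WORD sentinel are implementation choices for illustration, not requirements.

NOT_WORD = None   # sentinel playing the role of notWord above

class TrieNode:
    def __init__(self):
        self.children = {}      # maps a character to a child TrieNode
        self.value = NOT_WORD   # payload stored only at word-final nodes

def add_word(w, value, root):
    tnode = root
    for ch in w:
        tnode = tnode.children.setdefault(ch, TrieNode())
    tnode.value = value

def find_word(w, root):
    tnode = root
    for ch in w:
        if ch not in tnode.children:
            return NOT_WORD     # corresponds to returning notPresent
        tnode = tnode.children[ch]
    return tnode.value

root = TrieNode()
add_word("cat", "N", root)
add_word("cats", "N+PL", root)
print(find_word("cats", root))   # N+PL
print(find_word("ca", root))     # None: a prefix only, not a stored word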
checkAccept(l, u, q, M)
    if l and u are both empty and q is a final state in M then
        return accept
    else if l and u are both empty and q is not a final state in M then
        return reject
    else
        result = reject
        for transition (q,x/y,q') in M such that l = x + l' and u = y + u' do
            result = checkAccept(l', u', q', M)
            if result == accept then
                break
        return result

checkAccept(l, u, start-state of M, M)
reconstructUpper(l, u, q, M)
    if l is empty and q is a final state in M then
        print u
        return
    else if l is empty and q is not a final state in M then
        return
    else
        for transition (q,x/y,q') in M such that l = x + l' do
            reconstructUpper(l', append(u, y), q', M)
        return

reconstructUpper(l, epsilon, start-state of M, M)
An analogous algorithm reconstructs the lower string associated with a given upper string.
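Below is a small Python sketch of reconstructUpper. It assumes the transducer M is given as (transitions, start, finals) with transitions as (q, lower, upper, q') tuples, collects the upper strings in a list rather than printing them, and assumes there are no epsilon-cycles on the lower tape.

def reconstruct_upper(l, u, q, M, results):
    transitions, start, finals = M
    if l == "" and q in finals:
        results.append(u)
        return
    if l == "" and q not in finals:
        return
    # Follow every transition whose lower label is a prefix of l,
    # appending its upper label to the output built so far.
    for (q1, x, y, q2) in transitions:
        if q1 == q and l.startswith(x):
            reconstruct_upper(l[len(x):], u + y, q2, M, results)

# Toy transducer (names illustrative): maps each lower "a" to upper "b".
M = ([("S", "a", "b", "S")], "S", {"S"})
out = []
reconstruct_upper("aaa", "", M[1], M, out)
print(out)   # ['bbb']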
S -> aSb | ab
S -> Np Vp Np -> "the cat" | "the dog" | "the rat" | "the elephant" Vp -> V Np V -> "admired" | "bit" | "chased"
S -> Np Sp "died" | NP "died" Np -> "the cat" | "the dog" | "the rat" | "the elephant" Sp -> that" Vp Sp | "that" Vp Vp -> V Np V -> "admired" | "bit" | "chased"
S -> Np Sp "likes tuna fish" | Np "likes tuna fish" Np -> "the cat" | "the dog" | "the rat" | "the elephant" Sp -> Np Sp V | Np V V -> "admired" | "bit" | "chased"
S -> Np Vp Np -> "John" | "Mary" | "Bob" | Det N | Det N Pp Det -> "the" | "my" N -> "man" | "dog" | "cat" | "telescope" | "park" Pp -> P Np P -> "in" | "on" | "by" | "with" Vp -> V Np | V Np Pp
Unlike the other sample grammars above, this grammar is structurally ambiguous: certain utterances involving prepositional phrases (Pp) have more than one parse because it is not obvious which noun phrase (Np) a Pp attaches to; e.g., in "the dog saw the man in the park", is it the dog or the man that is in the park?
The process terminates when no shift or reduce operations can be applied, and the given sentence is recognized as valid relative to G if only the grammar's start symbol S remains on the stack.
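As a concrete illustration of this shift-reduce process (not a required exam answer), here is a small backtracking recognizer sketch in Python. Rules are (lhs, rhs) pairs with rhs a tuple of symbols, the toy grammar shown is made up for this example, and the sketch assumes no empty right-hand sides and no unit-production cycles.

def sr_recognize(words, rules, start="S"):
    def step(stack, remaining):
        # Accept when all input has been shifted and only the start symbol remains.
        if not remaining and stack == [start]:
            return True
        # Reduce: try every rule whose right-hand side matches the top of the stack.
        for lhs, rhs in rules:
            n = len(rhs)
            if n <= len(stack) and tuple(stack[-n:]) == rhs:
                if step(stack[:-n] + [lhs], remaining):
                    return True
        # Shift: move the next input word onto the stack.
        if remaining and step(stack + [remaining[0]], remaining[1:]):
            return True
        return False
    return step([], list(words))

rules = [("S", ("Np", "Vp")), ("Np", ("Det", "N")), ("Vp", ("V", "Np")),
         ("Det", ("the",)), ("Det", ("a",)),
         ("N", ("man",)), ("N", ("dog",)), ("N", ("saw",)), ("N", ("bit",)),
         ("V", ("saw",)), ("V", ("bit",))]
print(sr_recognize("the dog bit the man".split(), rules))   # True
print(sr_recognize("the dog the man".split(), rules))       # False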
I hope the above helps, and I wish you all the best of luck with this exam.
Problems that are solvable by recursive decomposition have the optimal substructure property, i.e., an optimal solution for an instance of the problem can be constructed from optimal solutions for smaller instances of that problem (subproblems).
FibR(n)
    if ((n == 1) || (n == 2))
        return(1)
    else
        return(FibR(n - 1) + FibR(n - 2))
InitFibT(n, FibT)
    Allocate n-element table and assign to FibT
    FibT[1] = FibT[2] = 1
    for (i = 3; i <= n; i++)
        FibT[i] = INFINITY

FibM(n, FibT)
    if (FibT[n] == INFINITY)
        FibT[n] = FibM(n - 1, FibT) + FibM(n - 2, FibT)
    return(FibT[n])
FibI(n)
    if ((n == 1) || (n == 2))
        return(1)
    else
        Allocate n-element table FibT
        FibT[1] = FibT[2] = 1
        for (i = 3; i <= n; i++)
            FibT[i] = FibT[i - 1] + FibT[i - 2]
        return(FibT[n])
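For reference, here is a compact Python rendering of the memoized and iterative variants above, using a cache and a pair of running values in place of a pre-allocated table (the function names are illustrative).

from functools import lru_cache

@lru_cache(maxsize=None)
def fib_memo(n):
    # Memoized recursion: each subproblem is solved once and then cached.
    return 1 if n <= 2 else fib_memo(n - 1) + fib_memo(n - 2)

def fib_iter(n):
    # Bottom-up iteration: fill in solutions in order of increasing size.
    a, b = 1, 1
    for _ in range(n - 2):
        a, b = b, a + b
    return b

print(fib_memo(40), fib_iter(40))   # both print 102334155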
S -> Np Vp
Np -> Det N
Det -> "the" | "a"
N -> "man" | "dog" | "saw" | "bit"
Vp -> V Np
V -> "saw" | "bit"
function CKY-DParse-1(words, G)
    # Create and initialize DP table
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to Length(words) do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N] = empty array

    # Fill in base individual-word parses in table.
    for j = 1 to Length(words) do
        for each rule N -> words[j] in G do
            append (j, N -> words[j]) to P[j-1][j].NT[N]

    # Fill in remaining cases in table, progressing
    # upwards to the right diagonal-fashion from the
    # base-cases diagonal, by running the rules in
    # given grammar G in reverse.
    for j = 2 to Length(words) do
        for i = j-2 downto 0 do
            # Fill in table entry (i,j) by checking each of the
            # k possibilities for splitting the sentence-fragment
            # defined by markers i through j into two pieces
            # and seeing if you can run a rule in G in reverse wrt
            # the previously-computed parses for those pieces.
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that
                        P[i][k].NT[A] is nonempty and P[k][j].NT[B] is nonempty do
                    append (k, N -> A B) to P[i][j].NT[N]

    return P
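The following Python sketch implements the same table-filling loop, reduced to recognition: it records which non-terminals cover each span but omits the back-pointer tuples that CKY-DParse-1 stores for later parse reconstruction. The function name and rule-list format are illustrative, and the grammar is assumed to be in Chomsky Normal Form (lexical rules N -> word and binary rules N -> A B).

def cky_recognize(words, lexical, binary, start="S"):
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Base cases: single words spanning (j-1, j).
    for j in range(1, n + 1):
        for lhs, w in lexical:
            if w == words[j - 1]:
                table[j - 1][j].add(lhs)
    # Larger spans (i, j), built from two previously filled sub-spans.
    for j in range(2, n + 1):
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for lhs, a, b in binary:
                    if a in table[i][k] and b in table[k][j]:
                        table[i][j].add(lhs)
    return start in table[0][n]

# The toy grammar above, split into lexical and binary rule lists.
lexical = [("Det", "the"), ("Det", "a"), ("N", "dog"), ("N", "man"),
           ("N", "saw"), ("N", "bit"), ("V", "saw"), ("V", "bit")]
binary = [("S", "Np", "Vp"), ("Np", "Det", "N"), ("Vp", "V", "Np")]
print(cky_recognize("the dog bit the man".split(), lexical, binary))  # True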
S -> Vp
Vp -> V
S -> Np Vp
Vp -> V Np
Np -> Det N
Vp -> Vp Pp
Np -> Np Pp
N -> "book" | "flight"
Np -> PN
PN -> "Houston"
Pp -> P Np
Det -> "the" | "a"
V -> "book"
P -> "through"
Parse #1:
Parse #2:
function CKY-DParse-2(words, G)
    # Create and initialize DP table
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to Length(words) do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N] = empty array

    # Fill in base individual-word parses in table.
    for j = 1 to Length(words) do
        # Apply all possible rules of the form N -> word.
        for each rule N -> words[j] in G do
            append (j, N -> words[j]) to P[j-1][j].NT[N]
        # While possible, apply all possible rules of the form
        # N -> N' to basic word-parses.
        change = true
        while change do
            change = false
            for each non-terminal N in G do
                if P[j-1][j].NT[N] is not empty and there is a rule N' -> N in G and
                        (j, N' -> N) is not in P[j-1][j].NT[N'] then
                    append (j, N' -> N) to P[j-1][j].NT[N']
                    change = true

    # Fill in remaining cases in table, progressing
    # upwards to the right diagonal-fashion from the
    # base-cases diagonal, by running the rules in
    # given grammar G in reverse.
    for j = 2 to Length(words) do
        for i = j-2 downto 0 do
            # Fill in table entry (i,j) by checking each of the
            # k possibilities for splitting the sentence-fragment
            # defined by markers i through j into two pieces
            # and seeing if you can run a rule in G in reverse wrt
            # the previously-computed parses for those pieces.
            # Apply all possible rules of the form N -> N' N''.
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that
                        P[i][k].NT[A] is nonempty and P[k][j].NT[B] is nonempty do
                    append (k, N -> A B) to P[i][j].NT[N]
            # While possible, apply all possible rules of the form
            # N -> N' to basic sentence-fragment-parses.
            change = true
            while change do
                change = false
                for each non-terminal N in G do
                    if P[i][j].NT[N] is not empty and there is a rule N' -> N in G and
                            (j, N' -> N) is not in P[i][j].NT[N'] then
                        append (j, N' -> N) to P[i][j].NT[N']
                        change = true

    return P
function CKY-PParse-1(words, G)
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to Length(words) do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N].bestIndex = -1
                P[i][j].NT[N].bestProb = 0

    for j = 1 to Length(words) do
        for each rule N -> words[j] in G do
            P[j-1][j].NT[N].bestIndex = (j, N -> words[j])
            P[j-1][j].NT[N].bestProb = prob(N -> words[j])

    for j = 2 to Length(words) do
        for i = j-2 downto 0 do
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that
                        P[i][k].NT[A].bestProb > 0 and P[k][j].NT[B].bestProb > 0 do
                    if P[i][j].NT[N].bestProb <
                            prob(N -> A B) * P[i][k].NT[A].bestProb * P[k][j].NT[B].bestProb then
                        P[i][j].NT[N].bestIndex = (k, N -> A B)
                        P[i][j].NT[N].bestProb =
                            prob(N -> A B) * P[i][k].NT[A].bestProb * P[k][j].NT[B].bestProb

    return P
function CKY-PParse-2(words, G)
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to Length(words) do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N].bestIndex = -1
                P[i][j].NT[N].bestProb = 0

    for j = 1 to Length(words) do
        for each rule N -> words[j] in G do
            P[j-1][j].NT[N].bestIndex = (j, N -> words[j])
            P[j-1][j].NT[N].bestProb = prob(N -> words[j])
        change = true
        while change do
            change = false
            for each non-terminal N in G do
                if P[j-1][j].NT[N].bestIndex != -1 and there is a rule N' -> N in G and
                        P[j-1][j].NT[N'].bestIndex == -1 then
                    P[j-1][j].NT[N'].bestIndex = (j, N' -> N)
                    P[j-1][j].NT[N'].bestProb = P[j-1][j].NT[N].bestProb * prob(N' -> N)
                    change = true

    for j = 2 to Length(words) do
        for i = j-2 downto 0 do
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that
                        P[i][k].NT[A].bestProb > 0 and P[k][j].NT[B].bestProb > 0 do
                    if P[i][j].NT[N].bestProb <
                            prob(N -> A B) * P[i][k].NT[A].bestProb * P[k][j].NT[B].bestProb then
                        P[i][j].NT[N].bestIndex = (k, N -> A B)
                        P[i][j].NT[N].bestProb =
                            prob(N -> A B) * P[i][k].NT[A].bestProb * P[k][j].NT[B].bestProb
            change = true
            while change do
                change = false
                for each non-terminal N in G do
                    if P[i][j].NT[N].bestIndex != -1 and there is a rule N' -> N in G and
                            P[i][j].NT[N'].bestIndex == -1 then
                        P[i][j].NT[N'].bestIndex = (j, N' -> N)
                        P[i][j].NT[N'].bestProb = P[i][j].NT[N].bestProb * prob(N' -> N)
                        change = true

    return P
function CKY-PParse-3(words, G)
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to Length(words) do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N].bestIndex = -1
                P[i][j].NT[N].bestProb = -INFINITY

    for j = 1 to Length(words) do
        for each rule N -> words[j] in G do
            P[j-1][j].NT[N].bestIndex = (j, N -> words[j])
            P[j-1][j].NT[N].bestProb = log(prob(N -> words[j]))
        change = true
        while change do
            change = false
            for each non-terminal N in G do
                if P[j-1][j].NT[N].bestIndex != -1 and there is a rule N' -> N in G and
                        P[j-1][j].NT[N'].bestIndex == -1 then
                    P[j-1][j].NT[N'].bestIndex = (j, N' -> N)
                    P[j-1][j].NT[N'].bestProb = P[j-1][j].NT[N].bestProb + log(prob(N' -> N))
                    change = true

    for j = 2 to Length(words) do
        for i = j-2 downto 0 do
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that
                        P[i][k].NT[A].bestProb > -INFINITY and P[k][j].NT[B].bestProb > -INFINITY do
                    if P[i][j].NT[N].bestProb <
                            log(prob(N -> A B)) + P[i][k].NT[A].bestProb + P[k][j].NT[B].bestProb then
                        P[i][j].NT[N].bestIndex = (k, N -> A B)
                        P[i][j].NT[N].bestProb =
                            log(prob(N -> A B)) + P[i][k].NT[A].bestProb + P[k][j].NT[B].bestProb
            change = true
            while change do
                change = false
                for each non-terminal N in G do
                    if P[i][j].NT[N].bestIndex != -1 and there is a rule N' -> N in G and
                            P[i][j].NT[N'].bestProb < P[i][j].NT[N].bestProb + log(prob(N' -> N)) then
                        P[i][j].NT[N'].bestIndex = (j, N' -> N)
                        P[i][j].NT[N'].bestProb = P[i][j].NT[N].bestProb + log(prob(N' -> N))
                        change = true

    return P
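A condensed Python sketch in the spirit of CKY-PParse-3 is given below. It keeps only the best log-probability per span and non-terminal (back-pointers and the unary-rule closure are omitted for brevity), and it assumes rules are given as (lhs, rhs, prob) triples with rhs either a word or a pair of non-terminals; working in log-probabilities turns products of rule probabilities into sums and avoids numerical underflow.

import math

def cky_best_logprob(words, rules, start="S"):
    n = len(words)
    best = [[{} for _ in range(n + 1)] for _ in range(n + 1)]   # span -> {NT: log prob}
    # Base cases: lexical rules covering single words.
    for j in range(1, n + 1):
        for lhs, rhs, p in rules:
            if rhs == words[j - 1]:
                lp = math.log(p)
                if lp > best[j - 1][j].get(lhs, -math.inf):
                    best[j - 1][j][lhs] = lp
    # Larger spans: binary rules over the best-scoring sub-spans.
    for j in range(2, n + 1):
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for lhs, rhs, p in rules:
                    if isinstance(rhs, tuple):
                        a, b = rhs
                        if a in best[i][k] and b in best[k][j]:
                            lp = math.log(p) + best[i][k][a] + best[k][j][b]
                            if lp > best[i][j].get(lhs, -math.inf):
                                best[i][j][lhs] = lp
    # Log-probability of the best parse of the whole sentence, or None if none exists.
    return best[0][n].get(start)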
| Pronoun | Non-Pronoun |
Subject | 91% | 9% |
Object | 34% | 66% |
We will look briefly at each of them in turn wrt the following basic operations:
Though machine learning NLP techniques are important in many applications (see Chapter 7 in Kedia and Rasu (2020)), we shall focus in this course on the current state-of-the-art neural network NLP mechanisms.
These problems have been dealt with in more advanced types of neural networks by adding structures and modules to the network architecture such that the embedded FF-NNs are smaller and flatter.
Indeed, the above suggests that rule-based NLP systems may be much more applicable than is sometimes thought, and that a fusion of NN- and rule-based NLP system components (or a fundamental re-thinking of NLP system architecture that incorporates the best features of both) may be necessary to create the practical real-world NLP systems of the future.
Each one- and two-person talk is allotted 10 and 15 minutes, respectively. Multi-person talks are expected to divide the time approximately equally between the people involved. You may use the in-class blackboard and AV facilities as you please during your talks. You may also structure your talks as you please; a general guideline would be not to try to cram all the details of your project into the talk but rather to give an introductory "advertisement" stressing (as in your Course Proposals) why your topic is interesting and summarizing previous work on that topic.
Here's to each of you having fun working on your course project!
Task | Vocabulary | Error Rate (%) |
TI Digits | 11 | 0.5 |
Wall Street Journal read speech | 5,000 | 0.3 |
Wall Street Journal read speech | 20,000 | 3 |
Broadcast news | 64,000+ | 10 |
Conversational Telephone Speech (CTS) | 64,000+ | 20 |
These rates are currently decreasing by about 10% every year due to algorithmic and hardware improvements (J&M, p. 287); however, other factors can still dramatically increase error rates over and above those in the table (e.g., by 3-4x for strongly accented speech and by 2-4x in noisy environments).
Concatenative synthesis is the most widely used technique, and is typically done relative to either diphones (which span from the middle of one phone to the middle of the next phone, to accommodate phone co-articulation effects) or arbitrary-length units (which accommodate word co-articulation effects).
These algorithms differ in performance, knowledge-base requirements, and running-time and space requirements. Hence, no single POS tagger is best in all applications.
Note that the last two characteristics must be stated relative to a particular application and semantic domain.
Many of these shortcomings can be mitigated using more complex representations and representation-manipulation mechanisms; however, given the additional processing costs associated with these more complex schemes, the best ways of encoding and manipulating the semantics of utterances are still a very active area of research.
I hope the above helps, and I wish you all the best of luck with this exam.
as well as use of sentence-forms that establish a common entity-focus, e.g. (J&M, Examples 21.6 and 21.7),
More complex mixed-initiative DMs structure dialogues in terms of semantic frames whose slots can be filled in a user-directed order, e.g., Tellman (2023).
(see also The Personality Forge and The Chatterbot Collection). Though many older chatbots rely on modifications of the mechanisms pioneered in ELIZA and PARRY, modern chatbots are often built using Seq2Seq modeling as implemented by neural network models (Vajjala et al (2020), Chapter 6).
The extreme versions of each school can be seen as the endpoints of a continuum, and it is generally acknowledged that the mechanisms underlying actual language acquisition probably lie somewhere in the middle.
All of these algorithms employ depth-first search on a lexical transducer, with various types of mechanisms for pruning the search relative to a cut-off error distance.
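To make the pruning idea concrete, here is a small Python sketch of depth-first search over a trie-shaped lexicon (rather than a full lexical transducer) that carries a Levenshtein-distance row along each path and abandons any branch whose minimum achievable distance already exceeds the cut-off; it illustrates the general strategy only and is not any one of the specific algorithms above.

def build_trie(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True          # "$" marks the end of a stored word
    return root

def fuzzy_lookup(word, trie, max_dist):
    results = []
    first_row = list(range(len(word) + 1))

    def dfs(node, prefix, prev_row):
        # Record any stored word whose distance to the input is within the cut-off.
        if node.get("$") and prev_row[-1] <= max_dist:
            results.append((prefix, prev_row[-1]))
        for ch, child in node.items():
            if ch == "$":
                continue
            # Extend the edit-distance row by one trie character.
            row = [prev_row[0] + 1]
            for i in range(1, len(word) + 1):
                cost = 0 if word[i - 1] == ch else 1
                row.append(min(row[i - 1] + 1,           # insertion
                               prev_row[i] + 1,          # deletion
                               prev_row[i - 1] + cost))  # substitution / match
            if min(row) <= max_dist:                     # prune hopeless branches
                dfs(child, prefix + ch, row)

    dfs(trie, "", first_row)
    return results

lexicon = build_trie(["cat", "cart", "card", "dog"])
print(fuzzy_lookup("carf", lexicon, 1))   # [('cart', 1), ('card', 1)]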
Created: July 1, 2023
Last Modified: June 5, 2024