Week 1,
Week 2,
Week 3,
(In-class Exam #1 Notes),
(Course Project Proposal and Report Notes),
Week 4,
Week 5,
Week 6,
Week 7,
(In-class Exam #2 Notes),
Week 8,
Week 9,
Week 10,
(In-class Exam #3 Notes),
(Course Project Talk Notes),
Week 11,
(end of diary)
Singular | Plural | Suffix phone |
mop | mops | [s] |
pot | pots | [s] |
pick | picks | [s] |
mob | mobs | [z] |
pod | pods | [z] |
pig | pigs | [z] |
pita | pitas | [z] |
Note that the phonetic form of the plural-morpheme /s/ is a function of the last sound (and in particular, the voicing of the last sound) in the word being pluralized. A similar voicing of /s/ often (but not always) occurs between vowels, e.g., "Stasi" vs. "Streisand".
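The voicing-based choice above can be sketched as a small program. This is my own rough approximation, reading orthography as a proxy for phonetics; the voiced-sound set below is illustrative, not a complete phonology.

```python
# Rough sketch (my own approximation): pick the plural-morpheme phone
# from the voicing of a word's final sound, using spelling as a crude
# stand-in for phonetics. The voiced set below is illustrative only.
VOICED = set("bdgvzjmnlrw") | set("aeiou")  # voiced consonants + vowels

def plural_phone(word):
    """Return '[z]' after a voiced final sound, '[s]' otherwise."""
    return "[z]" if word[-1] in VOICED else "[s]"

print(plural_phone("mop"), plural_phone("pod"))  # [s] [z]
```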
Singular | Plural | Gloss |
dal | dallar | ``branch'' |
kol | kollar | ``arm'' |
kul | kullar | ``slave'' |
yel | yeller | ``wind'' |
dis | disler | ``tooth'' |
g"ul | g"uller | ``rose'' |
The form of the vowel in the plural-morpheme /lar/ is a function of the vowel in the word being pluralized.
In Turkish, complete utterances can consist of a single word in which the subject of the utterance is a root-morpheme (in this case, [c"op], "garbage") and all other relations (which are usually syntactic in languages like English) are indicated by suffix-morphemes. As noted above, the vowels in the suffix morphemes are all subject to vowel harmony. Given the in-principle unbounded number of possible word-utterances in Turkish, it is impossible to store them (let alone their versions as modified by vowel harmony) in a lexicon.
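Two-way vowel harmony for the plural suffix can be sketched as follows. This is a simplified illustration (it ignores loanword exceptions and consonant alternations); the suffix vowel simply agrees in backness with the stem's last vowel.

```python
# Sketch of two-way vowel harmony for the Turkish plural suffix: back
# vowels take -lar, front vowels take -ler. Simplified: loanword
# exceptions and consonant changes are ignored.
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")

def pluralize(stem):
    """Append -lar or -ler based on the stem's last vowel."""
    for ch in reversed(stem):
        if ch in BACK_VOWELS:
            return stem + "lar"
        if ch in FRONT_VOWELS:
            return stem + "ler"
    raise ValueError("stem contains no vowel")

print(pluralize("dal"), pluralize("yel"))  # dallar yeller
```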
These rules can also hold when a Cantonese speaker writes English text, e.g., "Phone Let Kloss."
Borrowed form | Source word |
dosutoru | ``duster'' |
sutoroberri | ``strawberry'' |
kurippaa | ``clippers'' |
sutoraiki | ``strike'' |
katsuretsu | ``cutlet'' |
parusu | ``pulse'' |
gurafu | ``graph'' |
Japanese has a single phoneme /r/ for the liquid-phones [l] and [r]; moreover, it also allows only very restricted types of multi-consonant clusters. Hence, when words borrowed from another language violate these constraints, the words are changed by modifying [l] to [r] (or deleting it) and inserting vowels to break up invalid multi-consonant clusters.
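A very rough sketch of these two repairs follows. The inputs are ad hoc phonemic spellings of the source words (my own), and the epenthetic-vowel choice ('o' after t/d, 'u' elsewhere) is only a common tendency in actual borrowings, not an exact rule.

```python
# Very rough sketch of the two loanword repairs: map [l] to [r], then
# insert an epenthetic vowel after any consonant not followed by a
# vowel. Inputs are ad hoc phonemic spellings; the vowel choice ('o'
# after t/d, 'u' elsewhere) is a common tendency, not an exact rule.
VOWELS = set("aeiou")

def adapt(source):
    chars = source.lower().replace("l", "r")
    out = []
    for i, ch in enumerate(chars):
        out.append(ch)
        nxt = chars[i + 1] if i + 1 < len(chars) else None
        if ch not in VOWELS and (nxt is None or nxt not in VOWELS):
            out.append("o" if ch in "td" else "u")
    return "".join(out)

print(adapt("graf"), adapt("pals"))  # gurafu parusu
```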
/fikas/ | "strong" | /fumikas/ | "to be strong" |
/kilad/ | "red" | /kumilad/ | "to be red" |
/fusl/ | "enemy" | /fumusl/ | "to be an enemy" |
katab | kutib | "to write" |
kattab | kuttib | "cause to write" |
ka:tab | ku:tib | "correspond" |
taka:tab | tuku:tib | "write each other" |
/adus/ | "take a bath" | /odasadus/ |
/bali/ | "return" | /bolabali/ |
/bozen/ | "tired of" | /bozanbozen/ |
/dolan/ | "recreate" | /dolandolen/ |
(MEANINGFUL WHISPER: Do you see anything that makes reduplication qualitatively different from all other phonological and morphological processes that we have considered here? (Hint: a^n b^n => ww))
"[[the [quick [brown [fox]]]] [jumped over [the [lazy [dog]]]]]".
Order | Example | Frequency | Example languages |
SOV | "She him loves." | 45% | Pashto, Latin, Japanese, Afrikaans |
SVO | "She loves him." | 42% | English, Hausa, Mandarin, Russian |
VSO | "Loves she him." | 9% | Hebrew, Irish, Zapotec, Tuareg |
VOS | "Loves him she." | 3% | Malagasy, Baure |
OVS | "Him loves she." | 1% | Apalai, Hixkaryana |
OSV | "Him she loves." | < 1% | Warao |
(MEANINGFUL WHISPER: Is such embedding of constituents qualitatively different from other syntactic patterns considered above? (Hint: a^n b^n))
Hence, ambiguity inherent in grammars and lexicons adds yet another type of ambiguity into natural language processing (BKL, pp. 317-318).
S -> aA; A -> bS; A -> b
I hope the above helps, and I wish you all the best of luck with this exam.
Once a course project is approved, you must write and submit a project proposal, due at noon on Thursday, November 3, and worth 5 course marks. This proposal should be about a page long and consist of a three-paragraph proposal text (3.5 marks total): the first paragraph motivating why your topic is of interest (1 mark), the second summarizing previous work on this topic (1.5 marks), and the third summarizing the focus of your survey or the approach you will use in constructing your software system (1 mark). The text must be followed by at least five full-information literature references particular to the topic of your project, each of which must be cited at least once in your proposal text (1.5 marks total; 0.3 marks per reference, up to 5 references).
Once the project proposal is submitted, you have until midnight on Friday, December 2, to do your project. In general, this project will entail (as a minimum) a 15-20 page double-spaced report with 10-30 literature references (for a literature-survey project) or a 5-8 page double-spaced report with 5-15 references in addition to the software (for a software project). All included references MUST be cited in the body of the report. Though purely Web-based references (e.g., blogs, reference manuals) are acceptable, please try wherever possible to obtain literature-based references, e.g., books, book chapters, journal papers, conference papers. In those cases, you MUST give full references listing all information appropriate to the type of reference: author names, publication year, full paper title, book title with editors, journal name, journal volume and number, publisher, and page numbers.
The length and reference-number requirements above may vary depending on the nature of the chosen project and whether this project is being done by one or more people; this should be settled with the course instructor. Ideally, the submitted project should be on the same topic as that described in your project proposal. However, it being an imperfect world, if any difficulties do arise, chat with your course instructor as soon as possible so appropriate action, e.g., revision of the stated goals and/or scope of the project, can be taken.
Each project will also have an associated short in-class presentation scheduled in late November and early December. Details of talk format and scheduling will be posted in early November after project proposals are submitted.
Here's to each of you choosing and carrying out a fun course project!
S -> aC; C -> aC; C -> bC; C -> b
S -> aC; C -> aC; C -> bC; C -> E; E -> b
S -> aA; A -> aA; A -> bB; B -> aA; B -> bB; B -> b
S -> {*vcd}V; S -> {*unvcd}U; V -> {*vcd}V; V -> {*unvcd}U; V -> {NP}z; U -> {*vcd}V; U -> {*unvcd}U; U -> {NP}s
Probabilistic finite-state automata and transducers are special cases of weighted finite-state automata and transducers (Mohri (2004)).
S -> aSb | ab
S -> Np Vp; Np -> "the cat" | "the dog" | "the rat" | "the elephant"; Vp -> V Np; V -> "admired" | "bit" | "chased"
S -> Np Sp "died" | Np "died"; Np -> "the cat" | "the dog" | "the rat" | "the elephant"; Sp -> "that" Vp Sp | "that" Vp; Vp -> V Np; V -> "admired" | "bit" | "chased"
S -> Np Sp "likes tuna fish" | Np "likes tuna fish"; Np -> "the cat" | "the dog" | "the rat" | "the elephant"; Sp -> Np Sp V | Np V; V -> "admired" | "bit" | "chased"
S -> Np Vp; Np -> "John" | "Mary" | "Bob" | Det N | Det N Pp; Det -> "the" | "my"; N -> "man" | "dog" | "cat" | "telescope" | "park"; Pp -> P Np; P -> "in" | "on" | "by" | "with"; Vp -> V Np | V Np Pp
Unlike the other sample grammars above, this grammar is structurally ambiguous: there may be more than one parse for certain utterances involving prepositional phrases (Pp), as it is not obvious which phrase a Pp is attached to. For example, in "the dog chased the man in the park", is "in the park" describing where the chasing took place or which man was chased?
checkAccept(s, q, F)
    if s is empty and q is a final state in F then
        return accept
    else if s is empty and q is not a final state in F then
        return reject
    else
        result = reject
        for transition (q,x,q') in F such that s = x + s' do
            result = checkAccept(s', q', F)
            if result == accept then break
        return result

checkAccept(s, start-state of F, F)
Note implicit handling of epsilon-transitions in this algorithm.
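The algorithm above translates directly into Python. The encoding here is my own choice: an FSA is a list of (state, symbol, state) transitions, where the symbol may be the empty string for an epsilon-transition, plus a set of final states; epsilon-cycles are assumed absent so the recursion halts.

```python
# Python rendering of checkAccept above. An FSA is encoded (my choice)
# as a list of (state, symbol, state) transitions, where symbol may be
# "" for an epsilon-transition, plus a set of final states. Assumes no
# epsilon-cycles, so the recursion is guaranteed to halt.

def check_accept(s, q, transitions, finals):
    if s == "" and q in finals:
        return True  # accept
    for (q1, x, q2) in transitions:
        if q1 == q and s.startswith(x):
            if check_accept(s[len(x):], q2, transitions, finals):
                return True
    return False  # reject

# FSA for (ab)+: S -a-> A, A -b-> S, A -b-> F (final)
trans = [("S", "a", "A"), ("A", "b", "S"), ("A", "b", "F")]
print(check_accept("abab", "S", trans, {"F"}))  # True
print(check_accept("aba", "S", trans, {"F"}))   # False
```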
As the FR so created is an NFA, determinization may be a good idea.
As the FR so created is an NFA, determinization may be a good idea.
As the FR so created may be an NFA, determinization may be a good idea.
findWord(w, T)
    tnode = root-node of trie T
    i = 0
    while i < length(w) do
        if tnode.children[w[i]] == nil then
            return notPresent
        else
            tnode = tnode.children[w[i]]
            i = i + 1
    return tnode.value
addWord(w, value, T)
    tnode = root-node of trie T
    i = 0
    while i < length(w) do
        if tnode.children[w[i]] != nil then
            tnode = tnode.children[w[i]]
            i = i + 1
        else
            break
    while i < length(w) do
        tnode.children[w[i]] = new node
        tnode = tnode.children[w[i]]
        tnode.value = notWord
        i = i + 1
    tnode.value = value
    return
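The two trie operations above translate directly into Python, with a dict of children per node and None playing the role of notWord:

```python
# Python transcription of findWord/addWord above, using a dict of
# children per node; a value of None plays the role of notWord.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.value = None

def add_word(word, value, root):
    node = root
    for ch in word:
        if ch not in node.children:
            node.children[ch] = TrieNode()
        node = node.children[ch]
    node.value = value

def find_word(word, root):
    node = root
    for ch in word:
        if ch not in node.children:
            return None  # notPresent
        node = node.children[ch]
    return node.value

root = TrieNode()
add_word("cat", "NOUN", root)
add_word("cats", "NOUN+PL", root)
print(find_word("cats", root))  # NOUN+PL
print(find_word("ca", root))    # None: a prefix, but not a stored word
```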
checkAccept(l, u, q, F)
    if l and u are both empty and q is a final state in F then
        return accept
    else if l and u are both empty and q is not a final state in F then
        return reject
    else
        result = reject
        for transition (q,x/y,q') in F such that l = x + l' and u = y + u' do
            result = checkAccept(l', u', q', F)
            if result == accept then break
        return result

checkAccept(l, u, start-state of F, F)
reconstructUpper(l, u, q, F)
    if l is empty and q is a final state in F then
        print u
        return
    else if l is empty and q is not a final state in F then
        return
    else
        for transition (q,x/y,q') in F such that l = x + l' do
            reconstructUpper(l', append(u, y), q', F)
        return

reconstructUpper(l, epsilon, start-state of F, F)
An analogous algorithm reconstructs the lower string associated with a given upper string.
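A Python rendering of reconstructUpper follows. The arc encoding is my own convention: (state, lower, upper, state) quadruples where either tape symbol may be the empty string; all matching upper strings are collected and returned rather than printed, and epsilon-cycles on the lower tape are assumed absent.

```python
# Python rendering of reconstructUpper above. A transducer is encoded
# (my convention) as a list of (state, lower, upper, state) arcs, where
# either tape symbol may be "" (epsilon), plus a set of final states.
# All matching upper strings are collected instead of printed. Assumes
# no epsilon-cycles on the lower tape.

def reconstruct_upper(lower, q, arcs, finals):
    results = []
    if lower == "" and q in finals:
        results.append("")
    for (q1, x, y, q2) in arcs:
        if q1 == q and lower.startswith(x):
            for rest in reconstruct_upper(lower[len(x):], q2, arcs, finals):
                results.append(y + rest)
    return results

# Toy transducer: rewrite every lower 'a' as upper 'b'
arcs = [("S", "a", "b", "S")]
print(reconstruct_upper("aa", "S", arcs, {"S"}))  # ['bb']
```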
The process terminates when no shift or reduce operation can be applied; the given sentence is recognized as valid relative to G if the start symbol S is all that remains on the stack.
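A naive shift-reduce recognizer can be sketched as follows. This greedy version (my own simplification) always reduces when some rule's right-hand side matches the top of the stack; a real parser needs backtracking or lookahead to resolve shift/reduce choices correctly on harder grammars.

```python
# Naive shift-reduce recognizer sketch (greedy: reduce whenever some
# rule's right-hand side matches the top of the stack; real parsers
# need backtracking or lookahead to resolve shift/reduce conflicts).

def shift_reduce(tokens, rules, start="S"):
    stack = []
    tokens = list(tokens)
    while True:
        reduced = False
        for lhs, rhs in rules:
            if stack[-len(rhs):] == rhs:
                stack[-len(rhs):] = [lhs]   # reduce
                reduced = True
                break
        if reduced:
            continue
        if tokens:
            stack.append(tokens.pop(0))     # shift
        else:
            break   # no shift or reduce applies: terminate
    return stack == [start]

# Toy grammar: S -> A B, A -> "a", B -> "b"
rules = [("A", ["a"]), ("B", ["b"]), ("S", ["A", "B"])]
print(shift_reduce(["a", "b"], rules))  # True
```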
Problems that are solvable by recursive decomposition have the optimal substructure property, i.e., an optimal solution for an instance of the problem can be constructed from optimal solutions for smaller instances of that problem (subproblems).
FibR(n)
    if ((n == 1) || (n == 2))
        return(1)
    else
        return(FibR(n - 1) + FibR(n - 2))
InitFibT(n, FibT)
    Allocate n-element table and assign to FibT
    FibT[1] = FibT[2] = 1
    for (i = 3; i <= n; i++)
        FibT[i] = INFINITY

FibM(n, FibT)
    if (FibT[n] == INFINITY)
        FibT[n] = FibM(n - 1, FibT) + FibM(n - 2, FibT)
    return(FibT[n])
FibI(n)
    if ((n == 1) || (n == 2))
        return(1)
    else
        Allocate n-element table FibT
        FibT[1] = FibT[2] = 1
        for (i = 3; i <= n; i++)
            FibT[i] = FibT[i - 1] + FibT[i - 2]
        return(FibT[n])
I hope the above helps, and I wish you all the best of luck with this exam.
function CKY-DParse-1(words, G)
    n = Length(words)
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to n do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N] = empty array
    for j = 1 to n do
        for each rule N -> words[j] in G do
            append (j, N -> words[j]) to P[j-1][j].NT[N]
    for j = 2 to n do
        for i = j-2 downto 0 do
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that P[i][k].NT[A] is nonempty
                        and P[k][j].NT[B] is nonempty do
                    append (k, N -> A B) to P[i][j].NT[N]
    return P
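CKY-DParse-1 is compact in Python. The grammar encoding here is my own: lexical rules as (N, word) pairs and binary CNF rules as (N, A, B) triples, with P[i][j] mapping each non-terminal to its back-pointers for the span words[i:j].

```python
# Python sketch of CKY-DParse-1 for a grammar in Chomsky Normal Form,
# given (my encoding) as lists of lexical rules (N, word) and binary
# rules (N, A, B). P[i][j] maps each non-terminal to the list of
# back-pointers that derive the span words[i:j].
from collections import defaultdict

def cky_parse(words, lexical, binary):
    n = len(words)
    P = [[defaultdict(list) for _ in range(n + 1)] for _ in range(n + 1)]
    for j in range(1, n + 1):                  # length-1 spans
        for (N, w) in lexical:
            if w == words[j - 1]:
                P[j - 1][j][N].append((j, N, w))
    for j in range(2, n + 1):                  # longer spans
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for (N, A, B) in binary:
                    if P[i][k][A] and P[k][j][B]:
                        P[i][j][N].append((k, N, A, B))
    return P

# Toy CNF grammar: S -> A B, A -> "a", B -> "b"
lexical = [("A", "a"), ("B", "b")]
binary = [("S", "A", "B")]
P = cky_parse(["a", "b"], lexical, binary)
print(bool(P[0][2]["S"]))  # True: "a b" is derivable from S
```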
function CKY-DParse-2(words, G)
    n = Length(words)
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to n do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N] = empty array
    for j = 1 to n do
        for each rule N -> words[j] in G do
            append (j, N -> words[j]) to P[j-1][j].NT[N]
        change = true
        while change do
            change = false
            for each non-terminal N in G do
                if P[j-1][j].NT[N] is not empty and there is a rule N' -> N in G
                        and (j, N' -> N) is not in P[j-1][j].NT[N'] then
                    append (j, N' -> N) to P[j-1][j].NT[N']
                    change = true
    for j = 2 to n do
        for i = j-2 downto 0 do
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that P[i][k].NT[A] is nonempty
                        and P[k][j].NT[B] is nonempty do
                    append (k, N -> A B) to P[i][j].NT[N]
            change = true
            while change do
                change = false
                for each non-terminal N in G do
                    if P[i][j].NT[N] is not empty and there is a rule N' -> N in G
                            and (j, N' -> N) is not in P[i][j].NT[N'] then
                        append (j, N' -> N) to P[i][j].NT[N']
                        change = true
    return P
function CKY-PParse-1(words, G)
    n = Length(words)
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to n do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N].bestIndex = -1
                P[i][j].NT[N].bestProb = 0
    for j = 1 to n do
        for each rule N -> words[j] in G do
            P[j-1][j].NT[N].bestIndex = (j, N -> words[j])
            P[j-1][j].NT[N].bestProb = prob(N -> words[j])
    for j = 2 to n do
        for i = j-2 downto 0 do
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that P[i][k].NT[A].bestProb > 0
                        and P[k][j].NT[B].bestProb > 0 do
                    if P[i][j].NT[N].bestProb <
                            prob(N -> A B) * P[i][k].NT[A].bestProb * P[k][j].NT[B].bestProb then
                        P[i][j].NT[N].bestIndex = (k, N -> A B)
                        P[i][j].NT[N].bestProb =
                            prob(N -> A B) * P[i][k].NT[A].bestProb * P[k][j].NT[B].bestProb
    return P
function CKY-PParse-2(words, G)
    n = Length(words)
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to n do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N].bestIndex = -1
                P[i][j].NT[N].bestProb = 0
    for j = 1 to n do
        for each rule N -> words[j] in G do
            P[j-1][j].NT[N].bestIndex = (j, N -> words[j])
            P[j-1][j].NT[N].bestProb = prob(N -> words[j])
        change = true
        while change do
            change = false
            for each non-terminal N in G do
                if P[j-1][j].NT[N].bestIndex != -1 and there is a rule N' -> N in G
                        and P[j-1][j].NT[N'].bestIndex == -1 then
                    P[j-1][j].NT[N'].bestIndex = (j, N' -> N)
                    P[j-1][j].NT[N'].bestProb = P[j-1][j].NT[N].bestProb * prob(N' -> N)
                    change = true
    for j = 2 to n do
        for i = j-2 downto 0 do
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that P[i][k].NT[A].bestProb > 0
                        and P[k][j].NT[B].bestProb > 0 do
                    if P[i][j].NT[N].bestProb <
                            prob(N -> A B) * P[i][k].NT[A].bestProb * P[k][j].NT[B].bestProb then
                        P[i][j].NT[N].bestIndex = (k, N -> A B)
                        P[i][j].NT[N].bestProb =
                            prob(N -> A B) * P[i][k].NT[A].bestProb * P[k][j].NT[B].bestProb
            change = true
            while change do
                change = false
                for each non-terminal N in G do
                    if P[i][j].NT[N].bestIndex != -1 and there is a rule N' -> N in G
                            and P[i][j].NT[N'].bestIndex == -1 then
                        P[i][j].NT[N'].bestIndex = (j, N' -> N)
                        P[i][j].NT[N'].bestProb = P[i][j].NT[N].bestProb * prob(N' -> N)
                        change = true
    return P
function CKY-PParse-3(words, G)
    n = Length(words)
    Set P to a (n+1) x (n+1) matrix
    for j = 1 to n do
        for i = j-1 downto 0 do
            for each non-terminal N in G do
                P[i][j].NT[N].bestIndex = -1
                P[i][j].NT[N].bestProb = -INFINITY
    for j = 1 to n do
        for each rule N -> words[j] in G do
            P[j-1][j].NT[N].bestIndex = (j, N -> words[j])
            P[j-1][j].NT[N].bestProb = log(prob(N -> words[j]))
        change = true
        while change do
            change = false
            for each non-terminal N in G do
                if P[j-1][j].NT[N].bestIndex != -1 and there is a rule N' -> N in G
                        and P[j-1][j].NT[N'].bestIndex == -1 then
                    P[j-1][j].NT[N'].bestIndex = (j, N' -> N)
                    P[j-1][j].NT[N'].bestProb = P[j-1][j].NT[N].bestProb + log(prob(N' -> N))
                    change = true
    for j = 2 to n do
        for i = j-2 downto 0 do
            for k = i+1 to j-1 do
                for each rule N -> A B in G such that P[i][k].NT[A].bestProb > -INFINITY
                        and P[k][j].NT[B].bestProb > -INFINITY do
                    if P[i][j].NT[N].bestProb <
                            log(prob(N -> A B)) + P[i][k].NT[A].bestProb + P[k][j].NT[B].bestProb then
                        P[i][j].NT[N].bestIndex = (k, N -> A B)
                        P[i][j].NT[N].bestProb =
                            log(prob(N -> A B)) + P[i][k].NT[A].bestProb + P[k][j].NT[B].bestProb
            change = true
            while change do
                change = false
                for each non-terminal N in G do
                    if P[i][j].NT[N].bestIndex != -1 and there is a rule N' -> N in G
                            and P[i][j].NT[N'].bestProb < P[i][j].NT[N].bestProb + log(prob(N' -> N)) then
                        P[i][j].NT[N'].bestIndex = (j, N' -> N)
                        P[i][j].NT[N'].bestProb = P[i][j].NT[N].bestProb + log(prob(N' -> N))
                        change = true
    return P
Role | Pronoun | Non-Pronoun |
Subject | 91% | 9% |
Object | 34% | 66% |
We will look briefly at each of them in turn with respect to the following basic operations:
Task | Vocabulary | Error Rate (%) |
TI Digits | 11 | 0.5 |
Wall Street Journal read speech | 5,000 | 0.3 |
Wall Street Journal read speech | 20,000 | 3 |
Broadcast news | 64,000+ | 10 |
Conversational Telephone Speech (CTS) | 64,000+ | 20 |
These rates are currently decreasing by about 10% every year due to algorithmic and hardware improvements (J&M, p. 287); however, other factors can still dramatically increase error rates over and above those in the table (e.g., 3-4x for a strong accent and 2-4x in a noisy environment).
Concatenative synthesis is the most widely used technique, and is typically done relative to either diphones (which span from the middle of one phone to the middle of the next, to accommodate phone co-articulation effects) or arbitrary-length units (which accommodate word co-articulation effects).
These algorithms differ in performance, knowledge-base requirements, and running time and space requirements. Hence, no single POS tagger is best for all applications.
I hope the above helps, and I wish you all the best of luck with this exam.
Note that the last two characteristics must be stated relative to a particular application and semantic domain.
Many of these shortcomings can be mitigated using more complex representations and representation-manipulation mechanisms; however, given the additional processing costs associated with these more complex schemes, the best ways of encoding and manipulating the semantics of utterances is still a very active area of research.
Each one- and two-person talk is allotted 15 and 20 minutes, respectively. The last five minutes of each talk will be for questions, and multi-person talks are expected to divide the remaining time approximately equally among the people involved. You may use the in-class blackboard and AV facilities as you please during your talk. You may also structure your talk as you please; a general guideline is not to try to cram all the details of your project into the talk but rather to give an introductory "advertisement" stressing (as in your Course Proposals) why your topic is interesting and what previous work has been done on that topic.
Here's to each of you having fun working on your course project!
as well as use of sentence-forms that establish a common entity-focus, e.g. (J&M, Examples 21.6 and 21.7),
(see also The Personality Forge and The Chatterbot Collection); however, they all rely on modifications of the mechanisms pioneered in ELIZA and PARRY.
The extreme versions of each school can be seen as the endpoints of a continuum, and it is generally acknowledged that the mechanisms underlying actual language acquisition probably lie somewhere in the middle.
All of these algorithms employ depth-first search on a lexical transducer, with various mechanisms for pruning relative to a cut-off error distance.
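The pruned depth-first search can be sketched with a trie standing in for the lexical transducer (my own simplification; a real implementation would search the transducer itself). Each trie edge advances one row of the edit-distance matrix, and a branch is abandoned as soon as every cell in the current row exceeds the cut-off.

```python
# Sketch of pruned depth-first search for spelling correction: collect
# all lexicon words within a cut-off Levenshtein distance of the input.
# A plain trie stands in for the lexical transducer (my simplification).

def build_trie(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = w          # end-of-word marker storing the word
    return root

def search(word, trie, max_dist):
    """Return (lexicon word, distance) pairs with distance <= max_dist."""
    results = []

    def dfs(node, prev_row):
        if "$" in node and prev_row[-1] <= max_dist:
            results.append((node["$"], prev_row[-1]))
        if min(prev_row) > max_dist:   # prune: no extension can recover
            return
        for ch, child in node.items():
            if ch == "$":
                continue
            # extend the edit-distance matrix by one row for this edge
            row = [prev_row[0] + 1]
            for i in range(1, len(word) + 1):
                cost = 0 if word[i - 1] == ch else 1
                row.append(min(row[i - 1] + 1,           # insertion
                               prev_row[i] + 1,          # deletion
                               prev_row[i - 1] + cost))  # substitution
            dfs(child, row)

    dfs(trie, list(range(len(word) + 1)))
    return results

trie = build_trie(["cat", "cats", "cart", "dog"])
print(sorted(search("cat", trie, 1)))  # [('cart', 1), ('cat', 0), ('cats', 1)]
```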
Created: June 28, 2016
Last Modified: November 9, 2016