Techniques and algorithms for associating relatively surface-level structures and information with natural language corpora, including POS tagging, morphological analysis, preprocessing/segmentation, named-entity recognition, chunk parsing, and word-sense disambiguation. Examines linguistic resources that can be leveraged for these tasks (e.g., WordNet). Prerequisite: a minimum grade of 2.7 in each of CSE 326 or equivalent, STAT 391 or equivalent, and LING 473 or passing score on the placement exam. Offered: A.
Student learning goals
General method of instruction
Prerequisites: (1) If you are a CLMA student, you need to pass either the CLMA placement test or LING 473 first, and you should register for Section C. (2) If you are an NLT (Natural Language Technology Certificate) student, you need to take and pass LING 473 first, and you should register for Section A. (3) Everyone else needs the following background before taking the course: programming (C/C++, Java, Perl, or Python), Unix, a college-level course on statistics and probability, finite-state automata, regular expressions, regular grammars, and context-free grammars. You should register for Section B.
If you believe you meet the prerequisites, please email Joyce at firstname.lastname@example.org for the add code. In your email, please specify the following:
- Which section you are registering for.
- How you meet the prerequisites, e.g., your grades for the placement test and/or LING 473 for CLMA/NLT students, or an unofficial transcript for non-CLMA/NLT students.
Class assignments and grading