Natural Language Processing

This course introduces the basic concepts, techniques, and applications of natural language processing, including lexical analysis, syntactic analysis, semantic understanding, and machine translation.

Instructor: 董兴波 / Dong Xingbo

Term: Spring

Location: Teaching Building, 博南C103

Time: Tuesdays and Thursdays, 10:00-11:30 AM

Course Overview

This course provides a comprehensive introduction to natural language processing, from basic concepts to cutting-edge techniques. Students will:

  • Understand the basic principles and methods of natural language processing
  • Master the core techniques and tools of natural language processing
  • Be able to design and implement simple natural language processing systems
  • Understand the applications of natural language processing in various fields

Prerequisites

  • Basic programming knowledge (preferably Python)
  • Basic linear algebra, probability, and statistics
  • Basic concepts of artificial intelligence

Textbooks

  • 《自然语言处理导论》 (Introduction to Natural Language Processing) by Zong Chengqing, Tsinghua University Press
  • “Speech and Language Processing” by Daniel Jurafsky and James H. Martin
  • “Natural Language Processing with Python” by Steven Bird, Ewan Klein, and Edward Loper

Grading

  • Assignments: 30%
  • Project: 40%
  • Exam: 20%
  • Participation: 10%

Schedule

Week Date Topic Materials
1 Sep 1 Course Introduction

Overview of natural language processing; course schedule and requirements.

2 Sep 8 Lexical Analysis

Tokenization, part-of-speech tagging, and other lexical analysis techniques.
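
As a rough illustration of the tokenization step named above, the following is a minimal sketch only. It assumes English text and a simple regular-expression rule; the function name `tokenize` and the rule itself are illustrative assumptions, not the course's method. Languages written without spaces, such as Chinese, require dedicated word segmentation instead.

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation tokens.

    Illustrative rule (an assumption, not a standard):
    - a run of word characters, optionally followed by an apostrophe
      and more word characters (so "it's" stays one token), or
    - any single character that is neither a word character nor whitespace.
    """
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)

tokens = tokenize("Don't panic, it's NLP!")
print(tokens)
```

A rule like this is easy to read but brittle; in practice, toolkits such as NLTK (covered in the Bird, Klein, and Loper textbook) provide more robust tokenizers.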

3 Sep 15 Syntactic Analysis

Phrase structure analysis, dependency parsing, and other syntactic analysis methods.

4 Sep 22 Semantic Understanding

Semantic role labeling, sentiment analysis, and other semantic understanding techniques.

5 Sep 29 Machine Translation

Statistical machine translation, neural machine translation, and other translation techniques.

6 Oct 6 Question Answering Systems

Rule-based and statistical question answering systems.

7 Oct 13 Text Summarization

Extractive and abstractive text summarization methods.

8 Oct 20 Language Models

N-gram models, neural language models, and related approaches.
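
To make the n-gram idea concrete, here is a minimal sketch of a bigram model with maximum-likelihood estimates and no smoothing. The toy corpus, the boundary markers `<s>` and `</s>`, and the function names `train_bigram` and `bigram_prob` are all assumptions made for illustration.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over pre-tokenized sentences.

    Each sentence is padded with <s> and </s> boundary markers.
    """
    contexts, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        contexts.update(padded[:-1])             # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))  # adjacent token pairs
    return contexts, bigrams

def bigram_prob(contexts, bigrams, prev, word):
    """Maximum-likelihood estimate P(word | prev) = count(prev, word) / count(prev)."""
    if contexts[prev] == 0:
        return 0.0  # unseen context; a real model would apply smoothing here
    return bigrams[(prev, word)] / contexts[prev]

# Hypothetical toy corpus, already tokenized.
corpus = [
    ["i", "like", "nlp"],
    ["i", "like", "python"],
    ["you", "like", "nlp"],
]
contexts, bigrams = train_bigram(corpus)
print(bigram_prob(contexts, bigrams, "like", "nlp"))
```

Without smoothing, any bigram absent from the training data gets probability zero, which is one motivation for smoothing methods and for the neural language models listed for this week.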

9 Oct 27 Pre-trained Models

BERT, GPT, and other pre-trained language models.

10 Nov 3 Dialogue Systems

Task-oriented dialogue systems, chit-chat systems, and related systems.

11 Nov 10 Text Classification

Sentiment analysis, topic classification, and other text classification tasks.

12 Nov 17 Information Extraction

Named entity recognition, relation extraction, and other information extraction tasks.

13 Nov 24 Natural Language Generation

Text generation, machine translation, and other generation tasks.

14 Dec 1 NLP Applications

NLP applications in education, healthcare, finance, and other fields.

15 Dec 8 Course Summary

Summary of course content and future development trends.

16 Dec 15 Final Exam

Final exam for the course.