Natural Language Processing & Speech Recognition

Course details

Natural Language Processing (NLP) and Speech Recognition power many modern AI applications, from chatbots to voice assistants. This course covers both the foundational and the advanced concepts of the two fields. Through a mix of theory and hands-on labs, students learn to develop and optimize real-world applications.

Start Date

TBD

Duration

14-15 weeks

Prerequisites

Foundational knowledge of AI & ML, and basic programming skills.

Learning Objectives

Grasp the fundamental concepts of NLP and its application in AI.
Understand and implement Speech Recognition techniques.
Preprocess and analyze text and speech data.
Create models for sentiment analysis, entity recognition, and machine translation.
Apply advanced techniques in deep learning to NLP & Speech Recognition tasks.

Key Features

Practical labs using popular NLP libraries such as NLTK, spaCy, and Hugging Face Transformers.
Analysis of real-world data sets.
Exposure to current voice technology applications.
Exploration of ethical considerations in NLP & Speech Recognition.

Learning path

Unit 1: Introduction to NLP (3 topics)
Unit 2: Classic NLP Techniques (5 topics)
Unit 3: Advanced NLP with Deep Learning (4 topics)
Unit 4: Introduction to Speech Recognition (3 topics)
Unit 5: Deep Learning in Speech Recognition (4 topics)
Unit 6: Applications & Use Cases (4 topics)
Unit 7: Challenges & Future Directions (4 topics)
Unit 8: Wrap-up and Review (1 topic)

Topics covered

Overview of NLP
Language Models and Text Representations: n-grams, Bag of Words
Text Preprocessing and Normalization

Part-of-Speech Tagging
Parsing and Syntax Trees
Named Entity Recognition
Sentiment Analysis
Topic Modeling with LDA

Word Embeddings: Word2Vec, GloVe
RNNs for Text
Transformers & BERT
Machine Translation

Fundamentals of Speech Processing
Feature Extraction in Speech: MFCCs
Hidden Markov Models (HMMs)

CNNs for Speech Data
DeepSpeech & RNN Transducer Models
Attention Mechanisms in Speech Recognition
State-of-the-Art Models and Benchmarks

Chatbots and Virtual Assistants
Voice Assistants: Siri, Alexa, Google Assistant
Audio-to-Text Services
Multimodal Models (Text, Image, Speech)

Handling Multilingual Data and Diverse Dialects
Bias and Fairness in NLP
Ethical Considerations in Speech Recognition
The Future of Interactive AI

Course Review and Exploration of Current Research

Grading

Class Participation: 10%
Weekly Assignments: 30%
Group Projects: 20%
Mid-Term Examination: 20%
Final Examination: 20%
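
The weights above can be combined into a single course score. The following is a minimal sketch, not the official grading procedure: it assumes each component is scored on a 0-100 scale and that the final score is a plain weighted sum, neither of which the syllabus states. The function name `final_score` and the sample scores are hypothetical and exist only for illustration.

```python
# Weights copied from the grading breakdown; they sum to 100%.
WEIGHTS = {
    "Class Participation": 0.10,
    "Weekly Assignments": 0.30,
    "Group Projects": 0.20,
    "Mid-Term Examination": 0.20,
    "Final Examination": 0.20,
}


def final_score(component_scores: dict[str, float]) -> float:
    """Return the weighted course score (assumes each component is on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    # Assumption: the final score is a plain weighted sum of the components.
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


# Hypothetical student scores, invented purely for illustration.
example_scores = {
    "Class Participation": 90.0,
    "Weekly Assignments": 80.0,
    "Group Projects": 85.0,
    "Mid-Term Examination": 70.0,
    "Final Examination": 75.0,
}

# 0.10*90 + 0.30*80 + 0.20*85 + 0.20*70 + 0.20*75 = 79
print(round(final_score(example_scores), 2))  # prints 79.0
```

Under these assumptions, Weekly Assignments carry the most weight of any single component, and the two examinations together account for 40% of the score.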

Reading Material: Combination of standard textbooks, recent research papers, and articles from the domain of NLP & Speech Recognition.

Assignments: Hands-on labs and projects, focusing on real-world problems and data sets.

Faculty: Expert in the fields of NLP and Speech Recognition.

Teaching Instructors: TBD