fastai numerical linear algebra - course logistics (archived)
This post contains my notes and highlights on the original content. I marked my notes with ‘Baris:’ to differentiate them from the original text.
Notebook
You can read an overview of this Numerical Linear Algebra course in this blog post. The course was originally taught in the University of San Francisco MS in Analytics graduate program. Course lecture videos are available on YouTube (note that the notebook numbers and video numbers do not line up, since some notebooks took longer than 1 video to cover).
You can ask questions about the course on our fast.ai forums.
0. Course Logistics
Ask Questions
Let me know how things are going. This is particularly important since I'm new to MSAN, so I don't know what you have and haven't already seen.
Intro
My background and linear algebra love:
Swarthmore College: linear algebra convinced me to be a math major! (minors in CS & linguistics) I thought linear algebra was beautiful, but theoretical
Duke University: Math PhD. Took numerical linear algebra. Enjoyed the course, but not my focus
Research Triangle Institute: first time using linear algebra in practice (healthcare economics, Markov chains)
Quant: first time working with lots of data, decided to become a data scientist
Uber: data scientist
Hackbright: taught software engineering. Overhauled ML and collaborative filtering lectures
fast.ai: co-founded to make deep learning more accessible. Deep Learning involves a TON of linear algebra
Teaching
Teaching Approach
I'll be using a top-down teaching method, which is different from how most math courses operate. Typically, in a bottom-up approach, you first learn all the separate components you will be using, and then you gradually build them up into more complex structures. The problems with this are that students often lose motivation, don't have a sense of the "big picture", and don't know what they'll need.
If you took the fast.ai deep learning course, that is what we used. You can hear more about my teaching philosophy in this blog post or in this talk.
Harvard Professor David Perkins has a book, Making Learning Whole, in which he uses baseball as an analogy. We don't require kids to memorize all the rules of baseball and understand all the technical details before we let them play the game. Rather, they start playing with just a general sense of it, and then gradually learn more rules and details as time goes on.
All that to say, don't worry if you don't understand everything at first! You're not supposed to. We will start using some "black boxes" or matrix decompositions that haven't yet been explained, and then we'll dig into the lower level details later.
To start, focus on what things DO, not what they ARE.
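In that spirit, here is a minimal sketch (not from the original notes; it uses numpy, which the course imports below) of treating a matrix decomposition as a black box: we can rely on what SVD does, namely factor a matrix into three pieces that reconstruct it, before studying how it is computed.

```python
import numpy as np

# Treat SVD as a "black box": factor A into U @ diag(s) @ Vt
# without worrying (yet) about how the factorization is computed.
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# What it DOES: the three factors multiply back to the original matrix.
assert np.allclose(U @ np.diag(s) @ Vt, A)
```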
People learn by:
doing (coding and building)
explaining what they've learned (by writing or helping others)
Text Book
The book Numerical Linear Algebra by Trefethen and Bau is recommended. The MSAN program has a few copies on hand.
A secondary book is Numerical Methods by Greenbaum and Chartier.
Basics
Office hours: 2:00-4:00 on Friday afternoons. Email me if you need to meet at other times.
My contact info: rachel@fast.ai
Class Slack: #numerical_lin_alg
Email me if you will need to miss class.
Jupyter Notebooks will be available on Github at: https://github.com/fastai/numerical-linear-algebra Please pull/download before class. Some parts are removed for you to fill in as you follow along in class. Be sure to let me know THIS WEEK if you are having any problems running the notebooks from your own computer. You may want to make a separate copy, because running Jupyter notebooks causes them to change, which can create github conflicts the next time you pull.
Check that you have MathJax running (which renders LaTeX, used for math equations) by running the following cell:
$$
e^{\theta i} = \cos(\theta) + i \sin(\theta)
$$
Check that you can import:
import numpy as np
import sklearn
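If both cells run, you can also (as a hypothetical extra cell, not part of the original notebook) verify the Euler identity rendered above numerically with numpy:

```python
import numpy as np

# Numerically verify e^(i*theta) = cos(theta) + i*sin(theta)
theta = 0.7  # arbitrary test angle
lhs = np.exp(1j * theta)
rhs = np.cos(theta) + 1j * np.sin(theta)
assert np.allclose(lhs, rhs)
```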
Grading Rubric:
Assignment | Percent |
---|---|
Attendance | 10% |
Homework | 20% |
Writing: proposal | 10% |
Writing: draft | 15% |
Writing: final | 15% |
Final Exam | 30% |
Honor Code
No cheating or plagiarism is allowed; please see below for more details.
On Laptops
I ask you to be respectful of me and your classmates: refrain from surfing the web, using social media (Facebook, Twitter, etc.), or using messaging programs during class. Using instant messaging, email, etc. during class lectures or quizzes is absolutely forbidden.
Syllabus
Topics Covered:
1. Why are we here?
Matrix and Tensor Products
Matrix Decompositions
Accuracy
Memory use
Speed
Parallelization & Vectorization
2. Topic Modeling with NMF and SVD
Term Frequency-Inverse Document Frequency (TF-IDF)
Singular Value Decomposition (SVD)
Non-negative Matrix Factorization (NMF)
Stochastic Gradient Descent (SGD)
Intro to PyTorch
Truncated SVD, Randomized SVD
3. Background Removal with Robust PCA
Robust PCA
Randomized SVD
LU factorization
4. Compressed Sensing for CT scans with Robust Regression
L1 regularization
5. Predicting Health Outcomes with Linear Regression
Linear regression
Polynomial Features
Speeding up with Numba
Regularization and Noise
Implementing linear regression 4 ways
6. PageRank with Eigen Decompositions
Power Method
QR Algorithm
Arnoldi Iteration
7. QR Factorization
Gram-Schmidt
Householder
Stability
Writing Assignment
Writing Assignment: Writing about technical concepts is a hugely valuable skill. I want you to write a technical blog post related to numerical linear algebra. A blog is like a resume, only better. Technical writing is also important in creating documentation, sharing your work with co-workers, applying to speak at conferences, and practicing for interviews. (You don't actually have to publish it, although I hope you do, and please send me the link if you do.)
Always cite sources, use quote marks around quotes. Do this even as you are first gathering sources and taking notes. If you plagiarize parts of someone else's work, you will fail.
Can be done in a Jupyter Notebook (Jupyter Notebooks can be turned into blog posts) or a Kaggle Kernel
For the proposal, write a brief paragraph about the problem/topic/experiment you plan to research/test and write about. You need to include 4 sources that you plan to use: these can include Trefethen, other blog posts, papers, or books. Include a sentence about each source stating what's in it.
Feel free to ask me if you are wondering if your topic idea is suitable!
Excellent Technical Blogs
Examples of great technical blog posts:
Peter Norvig (more here)
Julia Evans (more here)
find more on twitter
Deadlines
Assignment | Dates |
---|---|
Homeworks | TBA |
Writing: proposal | 5/30 |
Writing: draft | 6/15 |
Writing: final | 6/27 |
Final Exam | 6/29 |
Linear Algebra
We will review some linear algebra in class. However, if you find there are concepts you feel rusty on, you may want to review on your own. Here are some resources:
3Blue1Brown Essence of Linear Algebra videos about geometric intuition (fantastic! gorgeous!)
Lectures 1-6 of Trefethen
Immersive linear algebra free online textbook with interactive graphics
Chapter 2 of Ian Goodfellow's Deep Learning Book
Outgoing Web References (26)
- this blog post: www.fast.ai/2017/07/17/num-lin-alg
- University of San Francisco MS in Analytics: www.usfca.edu/arts-sciences/graduate-programs/analytics
- available on YouTube: www.youtube.com/playlist?list=PLtmWHNX-gukIc92m1K0P6bIOnZb-mg0hY
- our fast.ai forums: forums.fast.ai/c/lin-alg
- in this blog post: www.fast.ai/2016/10/08/teaching-philosophy
- in this talk: vimeo.com/214233053
- Making Learning Whole: www.amazon.com/Making-Learning-Whole-Principles-Transform/dp/0470633719
- Numerical Linear Algebra: www.amazon.com/Numerical-Linear-Algebra-Lloyd-Trefethen/dp/0898713617
- Numerical Methods: www.amazon.com/Numerical-Methods-Analysis-Implementation-Algorithms/dp/0691151229
- https://github.com/fastai/numerical-linear-algebra: github.com/fastai/numerical-linear-algebra
- A blog is like a resume, only better: www.fast.ai/2017/04/06/alternatives
- List of ideas here: nbviewer.org/github/fastai/numerical-linear-algebra/blob/master/nbs/Project_ideas.txt
- Kaggle Kernel: www.kaggle.com/xenocide/content-based-anime-recommender
- Peter Norvig: nbviewer.jupyter.org/url/norvig.com/ipython/ProbabilityParadox.ipynb
- here: norvig.com/ipython
- Stephen Merity: merity.com/articles/2017/deepcoder_and_ai_hype.html
- Julia Evans: codewords.recurse.com/issues/five/why-do-neural-networks-think-a-panda-is-a-vulture
- here: jvns.ca/blog/2014/08/12/what-happens-if-you-write-a-tcp-stack-in-python
- Julia Ferraioli: blog.juliaferraioli.com/2016/02/exploring-world-using-vision-twilio.html
- Edwin Chen: blog.echen.me/2014/10/07/moving-beyond-ctr-better-recommendations-through-human-evaluation
- Slav Ivanov: blog.slavv.com/picking-an-optimizer-for-style-transfer-86e7b8cba84b
- Brad Kenstler: hackernoon.com/non-artistic-style-transfer-or-how-to-draw-kanye-using-captain-picards-face-c4a50256b814
- more on twitter: twitter.com/math_rachel
- 3Blue1Brown Essence of Linear Algebra: www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
- Immersive linear algebra: immersivemath.com/ila
- Chapter 2: www.deeplearningbook.org/contents/linear_algebra.html