borea17
ML101
This is a loose collection of questions that I stumbled over during my studies.
Probability Theory
Why do we use the cross entropy loss in classification?
How can the variance of the score function estimator be decreased?
What is the difference between Pearson and Spearman correlation?
How is linear regression connected to maximum likelihood?
What is a score function estimator (REINFORCE estimator)?
What is the ELBO?
Engineering
How to map numbers in an interval \([a, b]\) onto another interval \([c, d]\)?
Why should an MSE loss be avoided after a sigmoid layer?
How does the attention mechanism (Attention is all you need) work?
How can the average loss be calculated with batch means?