The NLP Student ⁽⁷⁾

May 26, 2023

Summary of the "Generalization Bounds via Convex Analysis" Paper [Draft]

The "Generalization Bounds via Convex Analysis" paper by Gergely Neu and Gábor Lugosi discusses the generalization error of supervised learning algorithms and how it can be bounded in terms of the mutual information between their input and output. The authors generalize this result beyond the standard choice…

April 9, 2023

"Deep Reinforcement Learning From Human Preferences" Paper Explained

This paper is a collaboration between DeepMind and OpenAI that advances the field of deep reinforcement learning. The ideas discussed in this paper are also a key component in the training of GPT-4. RL agents need good reward functions to learn complex tasks. However, it…

April 2, 2023

What is Information Entropy?

In this article, we will go over the concept of information entropy, a vital topic in machine learning and information theory. Before we can understand entropy, we first need to understand the expected value, a topic in probability theory. Expected value is a mathematical concept that…

March 31, 2023

One Shot Learning, Few Shot Learning, and Similarity

In this article, we will go over one-shot learning and few-shot learning. Let's say you take a trip to the zoo with a kid. The kid is excited to learn about the various animals in the zoo, their names, and how they…

October 23, 2021

The Multi-Armed Bandit Problem

To address the exploration vs. exploitation trade-off in reinforcement learning, we use bandit problems to understand and apply algorithms that balance an RL agent's exploration and exploitation behaviors. What exactly is the multi-armed bandit problem, and how can you solve this type of problem in…