
Week 8 - Muddiest Point

Between max and average pooling, which is the better of the two and why? In a CNN, what is the function of the softmax layer?

Week 8 - Reading

Chapter 9: This hidden layer is, in turn, used to calculate a corresponding output, yt. Sequences are processed by presenting one element at a time to the network. The key difference from a feedforward network lies in the recurrent link shown in the figure with the dashed line. This link augments the input to the hidden layer with the activation value of the hidden layer from the preceding point in time. In the commonly encountered case of soft classification, finding yt consists of a softmax computation that provides a normalized probability distribution over the possible output classes. The sequential nature of simple recurrent networks can be illustrated by unrolling the network. For applications that involve much longer input sequences, such as speech recognition, character-by-character sentence processing, or streaming of continuous inputs, unrolling an entire input sequence may not be feasible. In these cases, we can unroll the input into manageable fixed-length segments and treat each segment ...
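To make the recurrent link concrete, here is a minimal NumPy sketch of one forward step of a simple recurrent network and of unrolling it over a sequence; the weight names (W, U, V) and the tanh activation are assumptions for illustration, not necessarily the chapter's exact notation.

```python
import numpy as np

def softmax(z):
    # Normalized probability distribution over the output classes.
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_step(x_t, h_prev, W, U, V):
    # The hidden layer sees the current input x_t plus the hidden
    # activation from the preceding time step (the recurrent link).
    h_t = np.tanh(W @ x_t + U @ h_prev)
    # Soft classification: softmax over the possible output classes.
    y_t = softmax(V @ h_t)
    return h_t, y_t

def rnn_forward(xs, W, U, V, h0):
    # Unrolling: process the sequence one element at a time,
    # carrying the hidden state forward.
    h, outputs = h0, []
    for x_t in xs:
        h, y_t = rnn_step(x_t, h, W, U, V)
        outputs.append(y_t)
    return outputs
```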

Week 7 - Muddiest Points

In Maximum A Posteriori estimation, after seeing a particular number of inputs we choose a hypothesis, but as more data come in, do we change the hypothesis?

Week 6 - Muddiest Point

Apart from AlexNet, are there any other CNN architectures worth mentioning?

Week 7 - Reading

Chapter 20: This kind of feedback is called a reward, or reinforcement. In games like chess, the reinforcement is received only at the end of the game. We call this a terminal state in the state history sequence. The agent can be a passive learner or an active learner. A passive learner simply watches the world going by and tries to learn the utility of being in various states; an active learner must also act using the learned information, and can use its problem generator to suggest explorations of unknown portions of the environment. The agent learns an action-value function giving the expected utility of taking a given action in a given state. This is called Q-learning. We define the reward-to-go of a state as the sum of the rewards from that state until a terminal state is reached. Given this definition, it is easy to see that the expected utility of a state is the expected reward-to-go of that state. A simple method for updating uti...
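As a rough sketch of the Q-learning idea described above (not the book's pseudocode), here is a minimal tabular update rule; the learning rate alpha, the discount gamma, and the toy states and actions are assumptions for illustration.

```python
from collections import defaultdict

def q_learning_update(Q, s, a, reward, s_next, actions, alpha=0.1, gamma=0.9):
    # Move the action-value Q(s, a) toward the observed reward plus the
    # discounted value of the best action available in the next state.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
    return Q

# Q is stored as a table keyed by (state, action) pairs, defaulting to 0.
Q = defaultdict(float)
Q = q_learning_update(Q, s="s1", a="right", reward=0.0, s_next="s2",
                      actions=["left", "right"])
```

If the next state is terminal, the update would use the reward alone, since there is no reward-to-go beyond it.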

Week 5 - Muddiest Point

What are some optimal data structures you can use to store the Bayesian network graph?

Week 6 - Reading

Chapter 18, Sections 18.1 - 18.6: This chapter (and most of current machine learning research) covers inputs that form a factored representation—a vector of attribute values—and outputs that can be either a continuous numerical value or a discrete value. We say that learning a (possibly incorrect) general function or rule from specific input–output pairs is called inductive learning. In unsupervised learning the agent learns patterns in the input even though no explicit feedback is supplied. In reinforcement learning the agent learns from a series of reinforcements—rewards or punishments. In supervised learning the agent observes some example input–output pairs and learns a function that maps from input to output. In semi-supervised learning we are given a few labeled examples and must make what we can of a large collection of unlabeled examples. When the output y is one of a finite set of values (such as sunny, cloudy, or rainy), the learning problem is cal...
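Here is a tiny illustration of learning a function from input–output pairs with a factored representation; the weather attributes and the nearest-example rule are just assumptions for the sketch, not anything from the chapter.

```python
# Each example is a factored representation (a vector of attribute values)
# paired with a discrete output label, so this is a classification problem.
examples = [
    ({"humidity": 0.9, "pressure": 0.2}, "rainy"),
    ({"humidity": 0.4, "pressure": 0.8}, "sunny"),
    ({"humidity": 0.7, "pressure": 0.5}, "cloudy"),
]

def predict(x):
    # Inductive learning in miniature: a (possibly incorrect) general rule
    # that labels a new input with the label of its closest stored example.
    def dist(a, b):
        return sum((a[k] - b[k]) ** 2 for k in a)
    return min(examples, key=lambda ex: dist(x, ex[0]))[1]

print(predict({"humidity": 0.85, "pressure": 0.3}))  # "rainy"
```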

Week 5 - Reading

Chapter 14: Agents almost never have access to the whole truth about their environment. The right thing to do, the rational decision, therefore depends on both the relative importance of various goals and the likelihood that, and degree to which, they will be achieved. Probability provides a way of summarizing the uncertainty that comes from our laziness and ignorance. Probability theory makes the same ontological commitment as logic, namely, that facts either do or do not hold in the world. Degree of truth, as opposed to degree of belief, is the subject of fuzzy logic. Before evidence is obtained we speak of prior or unconditional probability; after the evidence is obtained, we talk about posterior or conditional probability. An agent is rational if and only if it chooses the action that yields the highest expected utility, averaged over all the possible outcomes of the action. If Agent 1 expresses a set of degrees of belief that violate the axioms of probability theory, then there is a betting strategy for Agent 2...
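To make the expected-utility rule concrete, here is a small sketch that averages utility over outcomes and picks the best action; the probabilities and utilities are made up for illustration.

```python
def expected_utility(outcomes):
    # Sum of each outcome's utility weighted by its probability.
    return sum(p * u for p, u in outcomes)

# Hypothetical actions, each with (probability, utility) outcome pairs.
actions = {
    "take_umbrella":  [(0.7, 80), (0.3, 70)],    # rain / no rain
    "leave_umbrella": [(0.7, 20), (0.3, 100)],
}

# A rational agent chooses the action with the highest expected utility.
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # "take_umbrella" (0.7*80 + 0.3*70 = 77 vs. 44)
```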

Week 4 - Muddiest Point

In the umbrella and rain example, what do R1 and r2 mean in the smoothing equation? Is it the day? Also, when we added 1 in smoothing, won't all the smoothing values be the same, or do we change the value of 1? Could you give some more iterations of backward smoothing?