Machine Learning TODEPLETE and ERASE

From NovaOrdis Knowledge Base
Jump to navigation Jump to search

Machine Learning

The science of getting computers to learn without being explicitly programmed (Arthur Samuel).

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E (Tom Mitchell)

Neural Networks

Learning algorithms that mimic how the brain works.

Natural Language Processing (NLP)

Learning Algorithms

Supervised Learning

In supervised learning, we have at our disposal a dataset that tell us which is the "correct answer".

Input variables, output (target) variables.

A pair of one input variable and one output variable is called a training example.

The dataset is called a training set.

Definition: given a training set, learn a function h: X -> Y so that h(x) is a "good" predictor for the corresponding value of y. For historical reason, this function is called hypothesis. The learning algorithm outputs a function (h - hypothesis). The hypothesis function maps input variables to output variables. When the target variable we're trying to predict is continuous, the learning problem is a regression problem. When the target variable can only take a number of discrete values, we call the problem a classification problem.

The accuracy of the hypothesis is measured by a cost function. A common cost function is the squared error function or mean squared error.

SquaredErrorCostFunction.png

We are trying to minimize the cost function.

Gradient descent is an algorithm to minimize the cost function (and other functions). Local minimum (local optimum). Alpha is the learning rate.

GradientDescent.png

"Batch" gradient descent - the algorithm uses the entire training set.

Gradient descent is iterative.

Regression Problem

The goal is to predict a continuous value.

See:

Regression

Classification Problem

A classification problem is the problem of identifying which category (out of a set of categories) an example belongs to. The goal of a classification problem is to predict a discrete value out of a set of possible discrete values.

See:

Classification

Unsupervised Learning

  • Clustering problems.

Reinforcement Learning

Recommender System

Feature

Synonymous with attribute.

Also input

Infinite number of features.

Feature vector.

Parameters

Theta