Machine Learning: Difference between revisions

From NovaOrdis Knowledge Base

Revision as of 22:27, 18 December 2017

Machine Learning

The science of getting computers to learn without being explicitly programmed (Arthur Samuel).

A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E (Tom Mitchell).

Neural Networks

Learning algorithms that mimic how the brain works.

Natural Language Processing (NLP)

Learning Algorithms

Supervised Learning

In supervised learning, we have at our disposal a dataset that tells us the "correct answer" for each example.

Each example in the dataset consists of input variables and an output (target) variable.

A pair of one input variable and one output variable is called a training example.

The dataset is called a training set.

Definition: given a training set, learn a function h: X -> Y so that h(x) is a "good" predictor for the corresponding value of y. For historical reasons, this function is called a hypothesis. The hypothesis is the output of the learning algorithm, and it maps input variables to output variables. When the target variable we are trying to predict is continuous, the learning problem is a regression problem. When the target variable can take only a limited number of discrete values, the problem is a classification problem.

The accuracy of the hypothesis is measured by a cost function. A common cost function for regression problems is the squared error function.

We are trying to minimize the cost function.
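
To make the idea of a cost function concrete, here is a minimal Python sketch of the squared error function. The 1/(2m) scaling is one common convention and is an assumption here, as are the helper names `squared_error_cost` and `make_linear_hypothesis` and the toy training set; none of them come from this page.

```python
def squared_error_cost(h, xs, ys):
    """Return (1 / (2 * m)) * sum((h(x_i) - y_i) ** 2) over m training examples."""
    m = len(xs)
    return sum((h(x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def make_linear_hypothesis(theta0, theta1):
    """Build a hypothesis h(x) = theta0 + theta1 * x (illustrative choice of h)."""
    return lambda x: theta0 + theta1 * x


# Toy training set, made up for illustration: y is exactly 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

perfect = make_linear_hypothesis(0.0, 2.0)  # matches the data exactly
poor = make_linear_hypothesis(0.0, 1.0)     # underestimates every target

cost_perfect = squared_error_cost(perfect, xs, ys)
cost_poor = squared_error_cost(poor, xs, ys)
```

The hypothesis that fits the data exactly has cost zero, while the poorer hypothesis has a strictly larger cost, which is why minimizing the cost function leads toward a better predictor.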

Regression Problem

The goal is to predict a continuous value.

See:

Regression
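
As an illustration of a regression problem, the following sketch fits a straight line to a toy training set using the closed-form least-squares solution for a single input variable. The function name `fit_line` and the data are hypothetical, and a linear hypothesis is only one possible choice.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the line that minimizes the squared error."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    # Covariance-like and variance-like sums (the common 1/m factor cancels out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy training set, made up for illustration: y = 1 + 2 * x.
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(inputs, targets)

# The learned hypothesis predicts a continuous value for an unseen input.
prediction = intercept + slope * 5.0
```

The output of the learned hypothesis is a real number, which is what distinguishes this from a classification problem.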

Classification Problem

A classification problem is the problem of identifying which category (out of a set of categories) an example belongs to. The goal of a classification problem is to predict a discrete value out of a set of possible discrete values.

See:

Classification
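
As an illustration of a classification problem, the sketch below uses a one-nearest-neighbor rule: the predicted category is the label of the closest training example. This is just one simple technique chosen for illustration; the function name `nearest_neighbor_classify`, the labels, and the data are hypothetical.

```python
def nearest_neighbor_classify(training_set, x):
    """Return the label of the training example whose input is closest to x."""
    _, label = min(training_set, key=lambda example: abs(example[0] - x))
    return label


# Toy training set of (input, label) pairs, made up for illustration.
training_set = [
    (1.0, "small"),
    (2.0, "small"),
    (8.0, "large"),
    (9.0, "large"),
]

label_low = nearest_neighbor_classify(training_set, 3.0)
label_high = nearest_neighbor_classify(training_set, 7.0)
```

Whatever the input, the prediction is always one of the discrete labels present in the training set, never a value in between.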

Unsupervised Learning

  • Clustering problems.

Reinforcement Learning

Recommender System

Feature

Synonymous with attribute.

  • Infinite number of features.

Organizatorium