Machine Learning

=External=
* http://svmlight.joachims.org
* https://cornell.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx#folderID=%227e009023-a44a-4594-bb65-362a66b3985f%22
=Internal=

* [[MATLAB Octave|MATLAB/Octave]]
* [[Lakehouse]]


=Machine Learning=
 
The science of getting computers to learn without being explicitly programmed ([https://en.wikipedia.org/wiki/Arthur_Samuel Arthur Samuel]).
 
A computer program is said to ''learn'' from ''experience'' E with respect to some ''task'' T and some ''performance measure'' P, if its performance on T, as measured by P, improves with experience E (Tom Mitchell).
 
=Neural Networks=
 
Learning algorithms that mimic how the brain works.
 
=Natural Language Processing (NLP)=
 
=Learning Algorithms=
 
==Supervised Learning==
 
In supervised learning, we have at our disposal a dataset that tells us the "correct answer" for each input.
 
The dataset contains input variables and output (target) variables.

A pair consisting of one input value and its corresponding output value is called a ''training example''.

The dataset of training examples is called a ''training set''.
 
Definition: given a training set, learn a function h: X → Y so that h(x) is a "good" predictor for the corresponding value of y. For historical reasons, this function is called the ''hypothesis''. The learning algorithm outputs the hypothesis function h, which maps input variables to output variables. When the target variable we are trying to predict is continuous, the learning problem is a [[#Regression_Problem|regression problem]]. When the target variable can only take a number of discrete values, we call the problem a [[#Classification_Problem|classification problem]].
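A minimal sketch of a hypothesis, assuming a single-feature linear form h<sub>θ</sub>(x) = θ<sub>0</sub> + θ<sub>1</sub>x; the function and parameter names are illustrative, not from this page:

<syntaxhighlight lang="python">
# The hypothesis h maps an input x to a predicted output y.
# Assumed linear in a single feature: h(x) = theta0 + theta1 * x.
def hypothesis(theta0, theta1, x):
    return theta0 + theta1 * x

# With theta0=1.0 and theta1=2.0, the prediction for x=3.0 is 7.0.
print(hypothesis(1.0, 2.0, 3.0))  # 7.0
</syntaxhighlight>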
 
The accuracy of the hypothesis is measured by a cost function. A common cost function is the ''squared error function'' or ''mean squared error''.
 
:[[Image:SquaredErrorCostFunction.png]]
 
We are trying to minimize the cost function.
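A minimal sketch of the squared error cost for the linear hypothesis above, assuming the common 1/(2m) scaling over m training examples; variable names are illustrative:

<syntaxhighlight lang="python">
# Squared error cost over a training set of m examples:
# J(theta0, theta1) = 1/(2m) * sum((h(x_i) - y_i)^2)
def cost(theta0, theta1, xs, ys):
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
print(cost(0.0, 2.0, xs, ys))  # 0.0 -- a hypothesis that fits perfectly has zero cost
</syntaxhighlight>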
 
Gradient descent is an algorithm for minimizing the cost function (and other functions). It may converge to a ''local minimum'' (local optimum) rather than the global one. The step size α (alpha) is the ''learning rate''.
 
:[[Image:GradientDescent.png]]
 
"Batch" gradient descent - the algorithm uses the entire training set.
 
Gradient descent is iterative.
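A minimal sketch of batch gradient descent for the cost above, assuming the single-feature linear hypothesis; the update θ<sub>j</sub> := θ<sub>j</sub> − α ∂J/∂θ<sub>j</sub> uses the analytic partial derivatives, and the hyperparameter values are illustrative:

<syntaxhighlight lang="python">
# Batch gradient descent: each iteration computes the gradient of the cost
# over the ENTIRE training set, then updates both parameters simultaneously.
def gradient_descent(xs, ys, alpha=0.1, iterations=1000):
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Partial derivatives of J with respect to theta0 and theta1.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update, scaled by the learning rate alpha.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

print(gradient_descent([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
# converges toward (0.0, 2.0) for this linearly related data
</syntaxhighlight>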
 
===Regression Problem===
 
The goal is to predict a continuous value.
 
See: {{Internal|Regression|Regression}}
 
===Classification Problem===
 
A ''classification problem'' is the problem of identifying which category (out of a set of categories) an example belongs to. The goal of a classification problem is to predict a discrete value out of a set of possible discrete values.
 
See: {{Internal|Classification|Classification}}
 
==Unsupervised Learning==
 
* Clustering problems: grouping the examples of an unlabeled dataset into clusters of similar items (sketched below).
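A minimal clustering sketch, assuming k-means as the example algorithm (this page does not name one) and one-dimensional data; all names and values are illustrative:

<syntaxhighlight lang="python">
# One-dimensional k-means: alternate between assigning points to the nearest
# centroid and moving each centroid to the mean of its assigned points.
def kmeans_1d(points, centroids, iterations=10):
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Empty clusters keep their previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

print(kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], [0.0, 5.0]))
# converges toward centroids near 1.0 and 9.0
</syntaxhighlight>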
 
==Reinforcement Learning==
 
==Recommender System==
 
=Feature=
 
A ''feature'' is synonymous with ''attribute''; it is also referred to as an ''input'' variable. Some learning algorithms can handle an infinite number of features.

The features of an example, taken together, form its ''feature vector''.
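A minimal sketch of a feature vector, assuming an illustrative housing example; the feature names are invented for the example:

<syntaxhighlight lang="python">
# Three features (attributes) of one training example; together they form
# the example's feature vector.
features = {"size_sqft": 2104.0, "bedrooms": 3.0, "age_years": 45.0}
feature_vector = [features[name] for name in ("size_sqft", "bedrooms", "age_years")]
print(feature_vector)  # [2104.0, 3.0, 45.0]
</syntaxhighlight>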
=Subjects=
* [[Machine Learning Concepts and Conventions|Concepts and Conventions]]
* [[Logistic Regression]]
* [[Neural Networks]]
* [[Minimization Algorithms]]
* [[Data Mesh]]
=Tools=
* [[Superset]]
* [[Ray]]


=Parameters=

Theta (θ) is the conventional symbol for the parameters of the hypothesis, which the learning algorithm adjusts to minimize the cost function.

=TODO=
* [[Machine Learning TODEPLETE and ERASE]]
* Machine learning systems: TensorFlow, PyTorch, XGBoost, https://dvc.org
