Author: Prateek Sharma
K-Nearest Neighbor (KNN) is an algorithm that predicts the properties of a new observation from the properties of the existing observations closest to it. KNN applies to both classification and regression problems.
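As a rough illustration of the idea, the sketch below classifies a new point by majority vote among its k nearest labelled points, and predicts a number by averaging the targets of its k nearest points. The function names, the Euclidean distance, and the toy data are assumptions made for this example, not part of the original post.

```python
from collections import Counter
from math import dist  # Euclidean distance, available from Python 3.8


def nearest(train, query, k):
    """Return the k training items whose features lie closest to the query."""
    return sorted(train, key=lambda item: dist(item[0], query))[:k]


def knn_classify(train, query, k=3):
    """Classification: majority vote among the k nearest labels."""
    votes = Counter(label for _, label in nearest(train, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Regression: average of the k nearest target values."""
    neighbours = nearest(train, query, k)
    return sum(target for _, target in neighbours) / len(neighbours)


# Hypothetical toy data: (features, label) and (features, target) pairs.
labelled = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
numeric = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]

predicted_label = knn_classify(labelled, (1.1, 1.0), k=3)
predicted_value = knn_regress(numeric, (2.1,), k=3)
```

The choice of k and of the distance measure both affect the result, so in practice they are usually tuned on held-out data.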
analyse with SPSS, classification in supervised learning, Supervised learning, trend discovery
An instrumental variable is a third variable used to estimate causal relationships in regression analysis when an endogenous variable is present. Instrumental variables are useful when an independent variable in the regression model correlates with the model's error term.
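A minimal sketch of why an instrument helps, assuming one regressor x, one instrument z, and a simulated unobserved confounder u that makes x correlate with the error term. The data-generating process, the coefficients, and the function names are all hypothetical choices for illustration. The simple IV estimator used here is cov(z, y) / cov(z, x), which coincides with two-stage least squares when there is one regressor and one instrument.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Instrumental-variable slope using z as the instrument for x."""
    return covariance(z, y) / covariance(z, x)


def simulate(n=5000, true_slope=2.0, seed=0):
    """Simulated data in which x is endogenous (assumed setup)."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)            # instrument: affects y only through x
        ui = rng.gauss(0, 1)            # unobserved confounder
        xi = zi + ui + rng.gauss(0, 1)  # x depends on both z and u
        yi = true_slope * xi + 3.0 * ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
biased = ols_slope(x, y)       # pulled away from 2.0 by the confounder
corrected = iv_slope(z, x, y)  # should land near the true slope of 2.0
```

The instrument only works under the assumptions built into the simulation: z is correlated with x and is unrelated to the error term.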
detection in supervised learning, Supervised learning
In statistics, LASSO (Least Absolute Shrinkage and Selection Operator) is a popular way to improve the prediction accuracy and interpretability of a model. It is a regression procedure that performs both variable selection and regularisation, and it was popularised by Robert Tibshirani in 1996. Lasso regression is an extension of linear regression that uses shrinkage: it constrains the sum of the absolute values of the model parameters to be no larger than a fixed constant.
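The shrinkage and selection behaviour can be sketched with a small coordinate-descent solver. This example uses the penalised form of the lasso, (1/2n) * sum of squared residuals + lam * sum of |beta_j|, which is the usual equivalent of the constraint form described above. The function names, the value of lam, and the toy data are assumptions for illustration. The data are already centred, so no intercept is fitted.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return 0.0 if it is within it."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n) * RSS + lam * sum(|beta_j|) one coefficient at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j (assumed non-zero)
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Hypothetical centred data: y = 3 * x1 + 0.1 * x2, so x2 matters very little.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.1, 2.9, -2.9, -3.1]

unpenalised = lasso_coordinate_descent(X, y, lam=0.0)  # ordinary least squares
penalised = lasso_coordinate_descent(X, y, lam=0.5)    # shrunk coefficients
```

With the penalty switched on, the large coefficient is shrunk and the weak one is set exactly to zero, which is the variable-selection effect that distinguishes the lasso from ordinary linear regression.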
exploratory model analysis, regressions in supervised learning, Supervised learning
Missing data is one of the most common problems in almost all statistical analyses. If values are unavailable for some observations of the variables in the model, the data set is said to have ‘missing data’.
estimation in supervised learning, Supervised learning, trend analysis
A Markov chain is one of the most important tools for going beyond independent trials processes: it describes a process in which the outcome of each trial depends on the outcome of the previous one. For independent trials processes there are two principal theorems. The first is the ‘Law of Large Numbers’ and the second is the ‘Central Limit Theorem’.
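To make the Markov chain idea concrete, here is a minimal sketch of a two-state chain in which the probabilities for the next state depend only on the current state. The transition matrix and the function names are hypothetical values chosen for this example.

```python
def step(distribution, transition):
    """Advance a probability distribution over states by one step of the chain."""
    n = len(transition)
    return [
        sum(distribution[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]


def evolve(distribution, transition, n_steps):
    """Apply step() repeatedly and return the final distribution."""
    for _ in range(n_steps):
        distribution = step(distribution, transition)
    return distribution


# Hypothetical chain: row i gives the probabilities of moving from state i
# to state 0 or state 1. Each row sums to 1.
transition = [
    [0.9, 0.1],
    [0.5, 0.5],
]

after_one = evolve([1.0, 0.0], transition, 1)    # start in state 0 with certainty
long_run = evolve([1.0, 0.0], transition, 100)   # approaches the stationary distribution
```

For this matrix the long-run distribution settles near 5/6 and 1/6 regardless of the starting state, because the stationary distribution pi must satisfy pi = pi * P.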
estimation in supervised learning, Supervised learning
Bootstrap and jackknife are superficially similar statistical techniques that both involve resampling the data. Both are nonparametric, and both can be used to estimate the standard error and confidence interval of an estimate of a population parameter.
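The difference between the two resampling schemes can be sketched as follows: the bootstrap draws many random samples with replacement, while the jackknife recomputes the statistic leaving out one observation at a time. The function names, the number of resamples, the seed, and the sample values are assumptions for illustration.

```python
import math
import random
import statistics


def bootstrap_se(data, statistic, n_resamples=2000, seed=0):
    """Standard error from resampling the data with replacement."""
    rng = random.Random(seed)
    n = len(data)
    replicates = [statistic(rng.choices(data, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(replicates)


def jackknife_se(data, statistic):
    """Standard error from leave-one-out recomputation of the statistic."""
    n = len(data)
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((v - centre) ** 2 for v in leave_one_out))


# Hypothetical sample; the statistic of interest is the mean.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]

se_bootstrap = bootstrap_se(sample, statistics.mean)
se_jackknife = jackknife_se(sample, statistics.mean)
se_textbook = statistics.stdev(sample) / math.sqrt(len(sample))
```

For the sample mean, the jackknife reproduces the textbook standard error exactly, while the bootstrap value varies with the seed and the number of resamples. Both functions accept any statistic, such as `statistics.median`, in place of the mean.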
estimation in supervised learning, Supervised learning