The Receiver Operating Characteristic (ROC) curve is an extension of such classifications. ROC analysis can be used to test the performance of a binary classifier system.
By Prateek Sharma and Priya Chetty on July 16, 2018
K-Nearest Neighbor (KNN) is an algorithm that predicts the properties of a new observation from the properties of the existing observations closest to it. KNN is applicable to classification as well as regression problems.
By Prateek Sharma and Priya Chetty on May 4, 2018
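The idea in the KNN excerpt above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `knn_predict`, the toy data, Euclidean distance, and `k=3` are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data, made up for illustration: two well-separated groups
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, (1.1, 1.0), k=3)
```

For a regression problem, the majority vote would be replaced by the mean of the neighbours' numeric values.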
An instrumental variable is a third variable used to estimate causal relationships in regression analysis when an endogenous variable is present. Instrumental variables are useful when an independent variable in the regression model correlates with the model's error term.
By Prateek Sharma and Priya Chetty on April 3, 2018
In statistics, LASSO (Least Absolute Shrinkage and Selection Operator) is extremely popular for increasing the prediction accuracy and interpretability of a model. It is a regression procedure that performs both variable selection and regularisation, and was popularised by Robert Tibshirani in 1996. Lasso regression is an extension of linear regression that uses shrinkage. The lasso imposes a constraint on the sum of the absolute values of the model parameters: the sum must not exceed a specified constant.
By Prateek Sharma and Priya Chetty on March 9, 2018
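The constraint described in the lasso excerpt above can be written out as follows. The notation is an assumption chosen for illustration: $y_i$ is the response, $x_{ij}$ the predictors, $\beta_0$ the intercept, $\beta_j$ the coefficients, and $t \ge 0$ the constant that bounds the sum.

```latex
\hat{\beta}^{\text{lasso}}
  = \arg\min_{\beta_0,\,\beta}
    \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
  \quad \text{subject to} \quad
  \sum_{j=1}^{p} \lvert \beta_j \rvert \le t
```

A smaller $t$ shrinks the coefficients more strongly and can set some of them exactly to zero, which is how the lasso performs selection as well as regularisation.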
Missing data is one of the most common problems in almost all statistical analyses. If the data is not available for all the observations of variables in the model, then it is a case of ‘missing data’.
By Prateek Sharma and Priya Chetty on February 27, 2018
A Markov chain is one of the most important ways of extending independent trials processes, since it lets each outcome depend on the one before it. Two principal theorems apply to independent trials processes. The first is the ‘Law of Large Numbers’ and the second is the ‘Central Limit Theorem’.
By Prateek Sharma and Priya Chetty on February 26, 2018
Bootstrap and jackknife are superficially similar statistical techniques that involve re-sampling the data. Both are nonparametric resampling techniques that can estimate the standard error and confidence interval of an estimate of a population parameter.
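To make the difference between the two resampling schemes concrete, here is a minimal Python sketch that estimates the standard error of a sample mean both ways. The function names, the toy data, the number of resamples, and the fixed seed are assumptions made for the example.

```python
import math
import random
import statistics

def bootstrap_se(data, stat=statistics.mean, n_resamples=2000, seed=0):
    """Standard error of `stat` from resampling the data with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    estimates = [stat(rng.choices(data, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(estimates)

def jackknife_se(data, stat=statistics.mean):
    """Standard error of `stat` from leaving out one observation at a time."""
    n = len(data)
    leave_one_out = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    # Jackknife variance: (n - 1) / n times the sum of squared deviations
    return math.sqrt((n - 1) / n * sum((e - centre) ** 2 for e in leave_one_out))

# Toy sample, made up for illustration
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.3, 4.0]

se_boot = bootstrap_se(sample)
se_jack = jackknife_se(sample)
```

The bootstrap draws many random resamples of the same size as the data, while the jackknife is deterministic and uses exactly one resample per omitted observation. For the sample mean, the jackknife estimate coincides with the familiar formula of the sample standard deviation divided by the square root of the sample size.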
A neural network, popularly known as an Artificial Neural Network (ANN), is an information processing system whose structure consists of a large number of nodes and connections, which helps it process complex information.