TABLE OF EIGENVALUES
• This table provides information on each of the discriminant functions (equations) produced. It reports the eigenvalue of each discriminant function and the canonical correlation for that function.
• Predictive DFA addresses the question of how to assign new cases to groups. There are two possible objectives in a discriminant analysis: finding a predictive equation ...
• A discriminant function is a weighted linear combination of the values of the independent variables.
• Discriminant analysis finds a set of prediction equations, based on the independent variables, that are used to classify individuals into groups. It is also used to determine which variables discriminate between two or more naturally occurring groups.
• Discriminant function analysis assumes that the sample is normally distributed for the trait.
• Mahalanobis distance is measured in terms of SD from the centroid; therefore a case that is more than 1.96 Mahalanobis distance units from the centroid has less than a 5% chance of belonging to that group.
• In this analysis, the first function accounts for 77% of the discriminating power of the discriminating variables and the second function accounts for 23%.
• The descriptive technique successively identifies the linear combinations of attributes, known as canonical discriminant functions (equations), that contribute maximally to group separation.
• You must compare the calculated hit ratio with what you could achieve by chance.

STRUCTURE MATRIX TABLE

Structure Matrix
                                           Function 1
self concept score                            .706
anxiety score                                -.527
total anti-smoking policies subtest B         .265
days absent last year                        -.202
age                                           .106

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function.
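The structure matrix above can be screened mechanically. The sketch below is only an illustration: it copies the five correlations from the table and applies the 0.30 rule of thumb that these notes use for separating important from less important variables. The variable names `loadings`, `CUTOFF`, `important` and `less_important` are assumptions made for the example.

```python
# Structure-matrix correlations copied from the table above (Function 1).
loadings = {
    "self concept score": 0.706,
    "anxiety score": -0.527,
    "total anti-smoking policies subtest B": 0.265,
    "days absent last year": -0.202,
    "age": 0.106,
}

# Rule-of-thumb cut-off between important and less important variables.
CUTOFF = 0.30

# The sign only shows the direction of the relationship, so compare absolute values.
important = [name for name, r in loadings.items() if abs(r) >= CUTOFF]
less_important = [name for name, r in loadings.items() if abs(r) < CUTOFF]

print("important:", important)
print("less important:", less_important)
```

With these values only self concept score and anxiety score pass the cut-off.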
NEW CASES – MAHALANOBIS DISTANCES
• Mahalanobis distances (obtained from the Method Dialogue Box) are used to analyse cases, as they measure the distance between a case and the centroid of each group of the dependent variable.
• The proportion of discriminating power attributed to a function is calculated as the ratio of that function's eigenvalue to the sum of all the eigenvalues.
• For example, an educational researcher may want to determine whether a set of variables is effective in predicting category membership.
• For example, I may want to predict whether a student will "Pass" or "Fail" in an exam based on the marks he has been scoring in the various class tests in the run-up to the final exam.
• Discriminant analysis is a classification problem, where two or more groups, clusters or populations are known a priori and one or more new observations are classified into one of the known populations based on the measured characteristics.
• The graphs and box plots reveal very minimal overlap between the groups; a substantial discrimination is revealed.

CLASSIFICATION TABLE
• The classification table is one in which rows are the observed categories of the DV and columns are the predicted categories.

DISCRIMINANT FUNCTION ANALYSIS
• DFA undertakes the same task as multiple linear regression by predicting an outcome.
• It is also used for dimensionality reduction while preserving as much of the class-discrimination information as possible.

LINEAR DISCRIMINANT ANALYSIS
• A special case occurs when all k class covariance matrices are identical. The discriminant function then simplifies to a linear form. This is called Linear Discriminant Analysis (LDA) because the quadratic terms in the discriminant function are eliminated.
• We are using only two groups here, viz. 'smoke' and 'no smoke', so only 1 function is displayed. This is the important difference from the previous example.

GROUP CENTROIDS TABLE
• The table displays the average discriminant score for each group.
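As a rough illustration of the distance between a case and a group centroid, the sketch below computes a Mahalanobis distance with NumPy. It assumes the group's own sample covariance matrix is used for scaling (statistical packages may use a pooled matrix instead), and both the function name `mahalanobis_to_centroid` and the small data set are hypothetical.

```python
import numpy as np

def mahalanobis_to_centroid(case, group_data):
    """Distance from one case to a group's centroid, scaled by the group's covariance."""
    group_data = np.asarray(group_data, dtype=float)
    centroid = group_data.mean(axis=0)          # mean of each variable in the group
    cov = np.cov(group_data, rowvar=False)      # sample covariance (assumption: not pooled)
    diff = np.asarray(case, dtype=float) - centroid
    # Solve cov @ x = diff rather than inverting the covariance matrix explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

# Hypothetical group measured on two variables.
group = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])

d_centre = mahalanobis_to_centroid(group.mean(axis=0), group)  # a case sitting on the centroid
d_far = mahalanobis_to_centroid([10.0, 0.0], group)            # a case far from the group

print(d_centre, d_far)
```

A case lying exactly on the centroid has distance zero, and the distance grows as the case moves away from the group in directions where the group itself varies little.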
• The model is composed of a discriminant function (or, for more than two groups, a set of discriminant functions) based on linear combinations of the predictor variables that provide the best discrimination between the groups.
• Many researchers use the structure matrix correlations because they are considered more accurate than the Standardized Canonical Discriminant Function Coefficients.
• In discriminant analysis, the dependent variable is a categorical variable, whereas the independent variables are metric.
• The eigenvalues and canonical correlations are another way of viewing the effectiveness of the discrimination.

WILKS' LAMBDA
• This table indicates the proportion of total variability not explained by the discriminant function. Here it suggests that the function does discriminate well, as the previous tables indicated.
• Discriminant function analysis includes the development of discriminant functions for each sample and the derivation of a cutoff score.
• The discriminant function coefficients can be used to assess each IV's unique contribution to the discriminant function and therefore provide information on the relative importance of each variable.
• The linear discriminant function for groups indicates the linear equation associated with each group.

CLASSIFICATION TABLE
• The classification results reveal that 91.8% of respondents were classified correctly into the 'smoke' or 'do not smoke' groups.
• It is often used in an exploratory situation to identify those variables, from among a larger number, that might be used later in a more rigorous, theoretically driven study.
• The groups or categories should be defined before collecting the data.
• The percentage of cases on the diagonal is the percentage of correct classifications.
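The hit ratio described above, and its comparison with chance, can be sketched as follows. The counts in `table` are hypothetical and are not the counts behind the 91.8% figure, which these notes do not give. The chance benchmark used is the proportional chance criterion (the sum of squared group proportions), which is one common choice and is an assumption here rather than something the notes specify.

```python
import numpy as np

def hit_ratio(table):
    """Share of cases on the diagonal of an observed-by-predicted table."""
    table = np.asarray(table, dtype=float)
    return float(np.trace(table) / table.sum())

def proportional_chance(table):
    """Expected accuracy from random assignment in proportion to observed group sizes."""
    table = np.asarray(table, dtype=float)
    proportions = table.sum(axis=1) / table.sum()   # rows are the observed groups
    return float(np.sum(proportions ** 2))

# Hypothetical classification table: rows = observed group, columns = predicted group.
table = [[90, 10],
         [20, 80]]

print(f"hit ratio: {hit_ratio(table):.3f}")
print(f"chance:    {proportional_chance(table):.3f}")
```

For these made-up counts the hit ratio is 0.85 against a chance level of 0.50, so the classification does clearly better than chance.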
• The next two tables provide evidence of significant differences between the means of the smoke and no smoke groups for all IVs.
• As with factor loadings, 0.30 is seen as the cut-off between important and less important variables.
• The group centroid is the mean value of the discriminant scores for a given category of the dependent variable.
• With only one function, the canonical correlation provides an index of overall model fit; its square is interpreted as the proportion of variance explained (R2).
• The structure matrix table shows the correlations of each variable with each discriminant function.

Summary of Canonical Discriminant Functions

Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          2.809 (a)    77.4            77.4           .859
2           .820 (a)    22.6            100.0          .671
a. First 2 canonical discriminant functions were used in the analysis.

• By identifying the largest loadings for each discriminant function, the researcher gains insight into how to name each function. The loadings serve like factor loadings in factor analysis.
• Age, absence from work and anti-smoking attitude score were less successful as predictors.
• This cross-validation produces a more reliable function.
• The difference in squared canonical correlation indicates the explanatory effect of the set of dummy variables.

Topics:
• Estimation of the Discriminant Function(s)
• Statistical Significance
• Assumptions of Discriminant Analysis
• Assessing Group Membership Prediction Accuracy
• Importance of the Independent Variables
• Classification functions of R.A. Fisher

• Discriminant Analysis (DA) is used to predict group membership.
• There is only one function for the basic two-group discriminant analysis.
• However, with large samples, a significant result is not regarded as too important.
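The columns of the Eigenvalues table above are linked by simple arithmetic. The sketch below recomputes them from the two eigenvalues in the table, assuming the standard relationships: each function's share of variance is its eigenvalue divided by the sum of the eigenvalues, and the squared canonical correlation equals eigenvalue / (1 + eigenvalue). The variable names are illustrative only.

```python
import math

# Eigenvalues copied from the Eigenvalues table (functions 1 and 2).
eigenvalues = [2.809, 0.820]

total = sum(eigenvalues)

# Share of discriminating power: eigenvalue as a percentage of the sum of eigenvalues.
pct_variance = [100 * ev / total for ev in eigenvalues]

# Canonical correlation: square root of eigenvalue / (1 + eigenvalue).
canonical_corr = [math.sqrt(ev / (1 + ev)) for ev in eigenvalues]

for i, (ev, pct, r) in enumerate(zip(eigenvalues, pct_variance, canonical_corr), start=1):
    print(f"Function {i}: eigenvalue={ev:.3f}  % of variance={pct:.1f}  canonical r={r:.3f}")
```

The recomputed values match the table: 77.4% and 22.6% of variance, with canonical correlations of .859 and .671.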
DISCRIMINANT FUNCTION ANALYSIS
• In a two-group situation, predicted membership is calculated by first producing a score D for each case using the discriminant function.
• Discriminant analysis, or discriminant function analysis, is a parametric technique to determine which weightings of quantitative variables or predictors best discriminate between two or more groups of cases, and do so better than chance (Cramer, 2003).
• Multiple linear regression is limited to cases where the DV (Y axis) is an interval variable, so that estimated mean population numerical Y values are produced for given values of weighted combinations of IV (X axis) values.
• Cases with D values smaller than the cut-off value are classified as belonging to one group, while those with larger values are classified into the other group.
• The aim of the analysis is to determine whether these variables will discriminate between those who smoke and those who do not.

Tests of Equality of Group Means
                                         Wilks' Lambda   F         df1   df2   Sig.
age                                      .980            8.781     1     436   .003
self concept score                       .526            392.672   1     436   .000
anxiety score                            .666            218.439   1     436   .000
days absent last year                    .931            32.109    1     436   .000
total anti-smoking policies subtest B    .887            55.295    1     436   .000

Pooled Within-Groups Matrices (Correlation)
                                         age     self concept   anxiety   days absent   anti-smoking
                                                 score          score     last year     subtest B
age                                      1.000   -.118          .060      .042          .061
self concept score                       -.118   1.000          .042      -.143         -.044
anxiety score                            .060    .042           1.000     .118          .137
days absent last year                    .042    -.143          .118      1.000         .116
total anti-smoking policies subtest B    .061    -.044          .137      .116          1.000

• In ANOVA, an assumption is that the variances are equivalent for each group, but in DFA the basic assumption is that the variance-covariance matrices are equivalent.
• Cases with scores near to a centroid are predicted as belonging to that group.
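The two-group procedure above (compute D for a case, then compare it with a cut-off) can be sketched as below. Everything numeric here is hypothetical: the weights, the constant and the two group centroids are made up for illustration. The cut-off is taken as the midpoint between the two centroids, which assumes the two groups are of equal size.

```python
def discriminant_score(values, weights, constant):
    """D = constant + sum of (weight x value) over the independent variables."""
    return constant + sum(w * v for w, v in zip(weights, values))

def classify(score, centroid_low, centroid_high, labels=("group 1", "group 2")):
    """Assign a case by comparing its D score with the cut-off.

    Assumption: equal group sizes, so the cut-off is the midpoint of the centroids.
    """
    cutoff = (centroid_low + centroid_high) / 2
    return labels[0] if score < cutoff else labels[1]

# Hypothetical unstandardized weights, constant and group centroids.
weights = [0.5, -1.0]
constant = 1.0
centroid_low, centroid_high = -1.2, 1.2

d = discriminant_score([2.0, 3.0], weights, constant)
print(d, classify(d, centroid_low, centroid_high))
```

Here the case scores D = -1.0, which falls below the cut-off of 0 and so nearer the lower centroid, and it is assigned to the first group.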
• Discriminant analysis builds a predictive model for group membership.
• Pearson correlations between the predictors: the intercorrelations are low.
• Observations are a random sample.
• dis_1 is the predicted grouping based on the discriminant analysis, coded 1 and 2.
• dis1_1 are the D scores by which the cases were coded into their categories.
• The maximum number of discriminant functions produced is the number of groups minus 1.
• The interpretation of the standardized canonical discriminant function coefficients is very similar to that in multiple regression.
• In the case of multiple discriminant analysis, groups with very small log determinants should be deleted from the analysis.
• As usual, the output indicates sample size and any missing data.
• The following lines present the linear discriminant analysis of Fisher.
• Discriminant function analysis is used to classify levels of an outcome.
• There are many examples that can explain when discriminant analysis fits.
• For example, histograms and box plots are alternative ways of illustrating the distribution of the discriminant function scores.
• Example predictors: measures of interest in outdoor activity, sociability and conservativeness.
• The covariance matrices do not differ between groups.
• The equation is like a regression equation.
• The dependent variable is a categorical variable indicating whether the employee smoked or not.