British Journal of Science 55, July 2012, Vol. 6 (1) © 2012 British Journals ISSN 2047-3745

Multivariate Statistical Analysis

Mushtak A.K. Shiker
Mathematical Department, Education College for Pure Science – Babylon University

Abstract
Multivariate analysis comprises many techniques that can be used to analyze a set of data. In this paper we survey these techniques together with their uses and difficulties, and we provide an executive-level understanding of them: the appropriate uses for each, the types of research questions that can be formulated, and the capabilities and limitations of each technique in answering those questions. We then give an application of one of these techniques.

Historical View
In 1889 Galton used the normal distribution in his statistical work and established the correlation coefficient and linear regression. In the 1930s Fisher proposed the analysis of variance and discriminant analysis, S. S. Wilks developed multivariate analysis of variance, and H. Hotelling introduced principal component analysis and canonical correlation. In general, most of the theory of multivariate analysis was established in the first half of the 20th century. From the 1960s onward, with the development of computer science, multivariate analysis methods came to be used much more widely in psychology and many other disciplines. Programs such as SAS and SPSS, once restricted to mainframes, are now readily available as Windows-based packages. The marketing research analyst now has access to a much broader array of sophisticated techniques with which to explore the data. The challenge is knowing which technique to select, and clearly understanding its strengths and weaknesses.
Introduction:
Multivariate statistical analysis is the use of mathematical-statistical methods, in both theory and practice, to study and solve problems involving multiple indicators. Over the past 20 years, driven by computer application technology and the urgent needs of research and production, multivariate statistical techniques have been widely applied in geology, meteorology, hydrology, medicine, industry, agriculture, economics and many other fields, and have become an effective way to solve practical problems. To simplify a system's architecture and explore its kernel, one can use principal component analysis, factor analysis, correspondence analysis and related methods to find, among the many factors, the best subset of variables, so that the information contained in that subset describes the behavior of the multivariable system and the impact of the various factors upon it. In multivariate analysis, models fall into two broad categories. One is the predictive model, often built with multiple linear regression, stepwise regression analysis, discriminant analysis, or double-screening stepwise regression. The other is the descriptive model, commonly built with cluster analysis techniques. A multivariate analysis often requires that things or phenomena of a similar nature be grouped together in order to identify the links between them and their inherent regularity; many previous studies treated single factors qualitatively, so their results do not reflect the general characteristics of the system. Numerical classification, for example, generally builds classification models using cluster analysis and discriminant analysis techniques.

In which fields can we use multivariate analysis?
Multivariate techniques are used to study datasets in consumer and market research, quality control and quality assurance, process optimization and process control, and research and development. These techniques are particularly important in social science research, because social researchers are generally unable to use randomized laboratory experiments like those used in medicine and the natural sciences. Here multivariate techniques can statistically estimate the relationships between different variables, quantify how important each one is to the final outcome, and show where dependencies exist between them.

Why do we use multivariate techniques?
Because most data analysis tries to answer complex questions involving more than two variables, such questions are best addressed by multivariate statistical techniques. There are several multivariate techniques to choose from, based on assumptions about the nature of the data and the type of association under analysis. Each technique tests a theoretical model of a research question about associations against the observed data. The theoretical models are based on facts plus new hypotheses about plausible associations between variables. Multivariate techniques allow researchers to look at relationships between variables in an overarching way and to quantify those relationships. They can control for association between variables by using cross-tabulation, partial correlation and multiple regression, and can introduce other variables to determine the links between the independent and dependent variables or to specify the conditions under which an association takes place. This gives a much richer and more realistic picture than looking at a single variable, and provides a more powerful test of significance than univariate techniques.

What are the difficulties of using multivariate techniques?
Multivariate techniques are complex and involve high-level mathematics, so a statistical program is required to analyze the data.
These statistical programs are generally expensive. The results of multivariate analysis are not always easy to interpret, and they tend to rest on assumptions that may be difficult to assess. For multivariate techniques to give meaningful results they need a large sample of data; otherwise the results are meaningless because of high standard errors. Standard errors determine how confident you can be in the results: you can be more confident in results from a large sample than from a small one. Running the statistical programs is fairly straightforward, but it does take a statistician to make sense of the output.

How do we choose the appropriate method to solve a practical problem?
1. The problem itself must be taken into account; a single problem may call for several statistical methods used together. For example, a prediction model can be built by first using biological or ecological principles to determine the theoretical model and the experimental design; collecting the test data based on the experimental results; carrying out a preliminary extraction of the data; and then applying statistical methods (such as correlation analysis, stepwise regression analysis, or principal component analysis) to study the correlations between the variables and select the best subset of variables. On this basis the forecasting model is constructed, then diagnosed and optimized, and finally applied to actual production. Multivariate analysis is the family of statistical methods that takes multiple response variables into account; its main contents include hypothesis tests on two mean vectors, multivariate analysis of variance, principal component analysis, factor analysis, cluster analysis and related model analysis.
2.
The advantages and limitations of the candidate methods must be considered. Each multivariate method has its specific assumptions, conditions and data requirements, such as normality, linearity, equality of variances and so on. Therefore, when applying multivariate analysis methods, these should be examined at the planning stage, when the theoretical framework is set and it is decided what data to collect, how to collect them, and how to analyze them.

The linear model approach
Commonly used multivariate analysis methods fall into three categories:
1. Multivariate analysis of variance, multiple regression analysis and analysis of covariance, known collectively as the linear model approach, used to study the relationship between the independent variables and the dependent variable.
2. Discriminant function analysis and cluster-type analysis, used to study the classification of things.
3. Principal component analysis, canonical correlation and factor analysis, used to study how a smaller number of combined factors can replace a larger number of original variables.

Multivariate analysis of variance divides the total variance according to its sources (the experimental design) into several parts, and uses these parts to test the effects of the various factors on the dependent variable and of the interactions between the factors. For example, in analyzing data from a 2 × 2 factorial design, the total variance can be divided into four parts: the variation belonging to each of the two factors, the interaction between the two factors, and error (i.e., within-group variation); the between-group variation and the interaction are then tested for significance with F tests. The advantage of multivariate analysis of variance is that, in a single study, it can simultaneously test several factors, each with multiple levels, on each dependent variable, together with the interactions among the factors.
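The 2 × 2 factorial partition just described can be sketched in a few lines of Python: the sums of squares for factor A, factor B, the A × B interaction, and within-cell error add up exactly to the total sum of squares. The observations below are illustrative, not from any study in this paper.

```python
# A 2 x 2 factorial design with 2 replicates per cell: partition the total
# sum of squares into factor A, factor B, the A x B interaction, and
# within-cell (error) variation. Observations are illustrative.
cells = [[[10.0, 12.0], [14.0, 16.0]],   # A level 0: B level 0, B level 1
         [[20.0, 22.0], [26.0, 28.0]]]   # A level 1: B level 0, B level 1
n_rep = 2

all_obs = [y for row in cells for cell in row for y in cell]
grand = sum(all_obs) / len(all_obs)

cell_mean = [[sum(cell) / n_rep for cell in row] for row in cells]
a_mean = [sum(cell_mean[a]) / 2 for a in range(2)]
b_mean = [(cell_mean[0][b] + cell_mean[1][b]) / 2 for b in range(2)]

ss_a = 2 * n_rep * sum((m - grand) ** 2 for m in a_mean)       # factor A
ss_b = 2 * n_rep * sum((m - grand) ** 2 for m in b_mean)       # factor B
ss_cells = n_rep * sum((cell_mean[a][b] - grand) ** 2
                       for a in range(2) for b in range(2))
ss_ab = ss_cells - ss_a - ss_b                                 # interaction
ss_within = sum((y - cell_mean[a][b]) ** 2
                for a in range(2) for b in range(2) for y in cells[a][b])
ss_total = sum((y - grand) ** 2 for y in all_obs)

print(ss_a, ss_b, ss_ab, ss_within)   # 242.0 50.0 2.0 8.0
# The four parts reassemble the total variation exactly:
assert abs(ss_total - (ss_a + ss_b + ss_ab + ss_within)) < 1e-9
```

Each sum of squares would then be divided by its degrees of freedom and compared with the error mean square in an F test, as described above.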
A limit on its application is that the sample at each level of each factor must be an independent random sample, the repeated observations must follow a normal distribution, and the population variances must be equal.

Some Types of Multivariate Analysis Techniques:

1. Multiple regression analysis:
Multiple regression analysis is a statistical method used to assess and analyze the linear functional relationship between a number of independent variables and the dependent variable. A dependent variable y has a linear regression relationship with independent variables x1, x2, ..., xm:

y = α + β1x1 + ... + βmxm + ε .......... (1)

where α, β1, ..., βm are parameters to be estimated and ε is a random error term. Several sets of values of x1, x2, ..., xm and the corresponding values of y are obtained through experiment. Multiple regression analysis has the advantage that a phenomenon can be described quantitatively as a linear function of certain factors. Substituting known values of the variables into the regression equation yields an estimate (prediction) of the dependent variable, so the occurrence and development of a phenomenon can be predicted effectively. The method can be used for continuous variables, and it can also be used for dichotomous variables.

In this technique we consider the linear relationship between one or more y's (the dependent or response variables) and one or more x's (the independent or predictor variables). We use a linear model to relate the y's to the x's and are concerned with estimation and testing of the parameters in the model. One aspect of interest is choosing which variables to include in the model, if this is not already known. We can distinguish three cases according to the number of variables:
1. Simple linear regression: one y and one x. For example, suppose we wish to predict college grade point average (GPA) based on an applicant's high school GPA.
2. Multiple linear regression: one y and several x's. For example, we could attempt to improve our prediction of college GPA by using more than one independent variable, such as high school GPA, standardized test scores (ACT or SAT), and a rating of the high school.
3. Multivariate multiple linear regression: several y's and several x's. In the preceding illustration, we may wish to predict several y's (such as the number of years of college the person will complete, and GPA in the sciences, arts, and humanities).

2. Logistic Regression Analysis:
Sometimes referred to as "choice modeling," this technique is a variation of multiple regression that allows for the prediction of an event. Nonmetric (typically binary) dependent variables are permitted, as the objective is to arrive at a probabilistic assessment of a binary choice. The independent variables can be either discrete or continuous. A contingency table is produced, which shows the classification of observations according to whether the observed and predicted events match. The sum of the events that were predicted to occur and actually did occur and the events that were predicted not to occur and actually did not occur, divided by the total number of events, is a measure of the effectiveness of the model. This tool helps predict the choices consumers might make when presented with alternatives.

3. Multivariate Analysis of Variance (MANOVA):
This technique examines the relationship between several categorical independent variables and two or more metric dependent variables. Whereas analysis of variance (ANOVA) assesses the differences between groups (using t tests for two means and F tests for three or more means), MANOVA examines the dependence relationship between a set of dependent measures and a set of groups. Typically this analysis is used in experimental designs, usually with a hypothesized relationship between the dependent measures.
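The comparison of mean vectors across groups that underlies MANOVA can be sketched in its simplest case: two groups measured on two metric outcomes, where the test statistic is Hotelling's T². The data below are illustrative, and the 2 × 2 matrix inverse is hand-coded so the sketch needs no external libraries.

```python
# Two groups, two metric outcomes: compare group mean vectors with
# Hotelling's T^2 = (n1*n2/(n1+n2)) * d' * inv(S_pooled) * d, where d is
# the difference of the mean vectors and S_pooled is the pooled
# covariance matrix. Data are illustrative.

def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def scatter_2x2(rows, m):
    # Sums of squares and cross-products about the group mean vector.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def hotelling_t2(g1, g2):
    n1, n2 = len(g1), len(g2)
    m1, m2 = mean_vector(g1), mean_vector(g2)
    s1, s2 = scatter_2x2(g1, m1), scatter_2x2(g2, m2)
    sp = [[(s1[i][j] + s2[i][j]) / (n1 + n2 - 2) for j in range(2)]
          for i in range(2)]                      # pooled covariance
    det = sp[0][0] * sp[1][1] - sp[0][1] * sp[1][0]
    inv = [[sp[1][1] / det, -sp[0][1] / det],
           [-sp[1][0] / det, sp[0][0] / det]]     # hand-coded 2x2 inverse
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return (n1 * n2 / (n1 + n2)) * quad

group_a = [(4.0, 6.0), (5.0, 7.0), (6.0, 8.0), (5.0, 6.0)]
group_b = [(7.0, 9.0), (8.0, 10.0), (9.0, 11.0), (8.0, 9.0)]
t2 = hotelling_t2(group_a, group_b)
print(round(t2, 2))   # 27.0; a large value suggests the mean vectors differ
```

In practice T² would be converted to an F statistic and compared with a critical value, and MANOVA extends the same idea to more groups and more outcome variables.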
This technique differs slightly in that the independent variables are categorical and the dependent variables are metric. Sample size is an issue, with 15-20 observations needed per cell; with too many observations per cell (over 30), however, the technique loses its practical significance. Cell sizes should be roughly equal, with the largest cell having less than 1.5 times the observations of the smallest cell, because normality of the dependent variables is important in this technique. Model fit is determined by examining the equality of the mean vectors across groups. If there is a significant difference in the means, the null hypothesis can be rejected and treatment differences can be determined.

4. Factor Analysis:
When there are many variables in a research design, it is often helpful to reduce them to a smaller set of factors. This is an interdependence technique, in which there is no dependent variable; rather, the researcher is looking for the underlying structure of the data matrix. Ideally, the variables are normal and continuous, with at least 3 to 5 variables loading onto each factor. The sample size should be over 50 observations, with over 5 observations per variable. There are two main factor analysis methods: common factor analysis, which extracts factors based on the variance shared by the factors, and principal component analysis, which extracts factors based on the total variance of the factors. Common factor analysis is used to look for the latent (underlying) factors, whereas principal component analysis is used to find the fewest variables that explain the most variance. The first factor extracted explains the most variance. Typically, factors are extracted as long as the eigenvalues are greater than 1.0, or a scree test is used to indicate visually how many factors to extract.
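The eigenvalue-greater-than-1.0 rule can be illustrated in the smallest possible case. For two variables the correlation matrix is [[1, r], [r, 1]], whose eigenvalues are exactly 1 + r and 1 − r, so no numerical eigensolver is needed; whenever r > 0 exactly one factor passes the rule. The data below are illustrative.

```python
# Kaiser (eigenvalue > 1) retention rule, two-variable case: the 2 x 2
# correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + r and 1 - r.
# Data are illustrative.

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(x, y)             # 0.8
eigenvalues = [1 + r, 1 - r]    # 1.8 and 0.2
retained = sum(1 for ev in eigenvalues if ev > 1.0)
print(retained)                 # 1: only the first factor is kept
```

With more variables the eigenvalues of the full correlation matrix play the same role, and the scree test simply plots them in decreasing order.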
In factor analysis we represent the variables y1, y2, . . . , yp as linear combinations of a few random variables f1, f2, . . . , fm (m < p) called factors. The factors are underlying constructs or latent variables that "generate" the y's. Like the original variables, the factors vary from individual to individual; but unlike the variables, the factors cannot be measured or observed. The goal of factor analysis is to reduce the redundancy among the variables by using a smaller number of factors.

5. Discriminant function analysis (description of group separation):
Discriminant function analysis is a statistical method used to determine the classification of individuals. The basic principle is this: from observational data on two or more samples of known classes, one or several linear discriminant functions and discriminant indices are identified; the discriminant functions are then applied to another individual's indicators to determine the category to which that individual belongs. There are two major objectives in the separation of groups:
1. Description of group separation, in which linear functions of the variables (discriminant functions) are used to describe or elucidate the differences between two or more groups. The goals of descriptive discriminant analysis include identifying the relative contribution of the p variables to the separation of the groups and finding the optimal plane on which the points can be projected to best illustrate the configuration of the groups.
2. Prediction or allocation of observations to groups, in which linear or quadratic functions of the variables (classification functions) are employed to assign an individual sampling unit to one of the groups. The measured values in the observation vector for an individual or object are evaluated by the classification functions to find the group to which the individual most likely belongs.
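The allocation objective can be sketched with a deliberately simplified stand-in for the classification functions: assign a new observation to the group whose mean vector (centroid) is nearest. This ignores the covariance weighting of a true linear classification function, but it shows the mechanics; training data and new observations are illustrative.

```python
# Allocation by nearest group centroid: a simplified stand-in for the
# classification functions described above. All data are illustrative.

def centroid(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def allocate(obs, centroids):
    # Assign obs to the label whose centroid is nearest (squared
    # Euclidean distance; no covariance weighting in this sketch).
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: dist2(obs, centroids[label]))

group_1 = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5)]
group_2 = [(6.0, 7.0), (7.0, 6.0), (6.5, 6.5)]
cents = {"group 1": centroid(group_1), "group 2": centroid(group_2)}

print(allocate((2.0, 2.5), cents))   # group 1
print(allocate((6.0, 6.0), cents))   # group 2
```

A full linear discriminant analysis would weight the distances by the inverse of the pooled covariance matrix, in the same spirit as the classification functions described above.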
Discriminant analysis is not restricted to continuous variables; by means of quantification methods it can also be used for qualitative data, and it helps to determine classification criteria objectively. However, discriminant analysis applies only when the categories have already been identified. When the classes themselves are uncertain, a preliminary separation into categories is obtained first, either with discriminant analysis or with cluster analysis.

6. Cluster analysis:
In cluster analysis we search for patterns in a data set by grouping the (multivariate) observations into clusters. The goal is to find an optimal grouping for which the observations or objects within each cluster are similar, but the clusters are dissimilar to each other. We hope to find the natural groupings in the data, groupings that make sense to the researcher. Cluster analysis is thus a statistical method for solving classification problems: given observations on n objects, each with p observed characteristics (variables), how should the objects be clustered into classes? Clustering the objects on their observations is known as Q-type analysis; clustering the variables is called R-type analysis. The basic principle of clustering is to make the differences within a class small and the differences between classes large. One family of cluster analysis methods is hierarchical clustering. For example, to group n objects into k classes, each of the n objects first forms a class of its own, giving n classes; the pairwise "distances" between classes are then calculated, the two nearest classes are merged into a new class, and this process is repeated step by step until k classes remain. Cluster analysis differs fundamentally from classification analysis: in classification analysis, we allocate the observations to a known number of predefined groups or populations.
In cluster analysis, neither the number of groups nor the groups themselves are known in advance. To group the observations into clusters, many techniques begin with the similarities between all pairs of observations. In many cases the similarities are based on some measure of distance. Other cluster methods use a preliminary choice of cluster centers or a comparison of within-cluster and between-cluster variability. It is also possible to cluster the variables, in which case the similarity could be a correlation. We can search for clusters graphically by plotting the observations: if there are only two variables (p = 2), we can do this in a scatter plot, and for p > 2 we can plot the data in two dimensions using principal components or biplots. Cluster analysis has also been referred to as classification, pattern recognition (specifically, unsupervised learning), and numerical taxonomy. Its techniques have been applied extensively to data in many fields, such as medicine, psychiatry, sociology, criminology, anthropology, archaeology, geology, geography, remote sensing, market research, economics, and engineering. When the sample size is large, the n samples can first be divided into k classes and then gradually modified according to some optimality principle until the classification is reasonable. Cluster analysis classifies on the basis of the relationships among the individuals or variables, so it is strongly objective; but the various clustering methods can achieve only local optima under certain conditions, so once the final clustering result is established, experts still need to verify it, and several different methods should be compared to choose the classification result most in line with professional requirements. For example, the data matrix can be written as

Y = [rows y′1, y′2, . . . , y′n] = [columns y(1), y(2), . . . , y(p)] .......... (2)

where y′i is a row of Y (an observation vector) and y(j) is a column of Y (corresponding to a variable).

7. Multidimensional Scaling (MDS):
MDS transforms consumer judgments of similarity into distances represented in multidimensional space. It is a decompositional approach that uses perceptual mapping to present the dimensions. As an exploratory technique, it is useful for examining unrecognized dimensions of products and for uncovering comparative evaluations of products when the basis for comparison is unknown. Multidimensional scaling is called a dimension-reduction technique. We begin with the distances δij between each pair of items and wish to represent the n items in a low-dimensional coordinate system in which the distances dij between items closely match the original distances δij, that is, dij ≈ δij for all i, j. The final distances dij are usually Euclidean. The original distances δij may be actual measured distances between observations yi and yj in p dimensions, such as

δij = [(yi − yj)′(yi − yj)]^(1/2) .......... (3)

On the other hand, the distances δij may be only a proximity or similarity based on human judgment, for example the perceived degree of similarity between all pairs of brands of a certain type of appliance. The goal of multidimensional scaling is a plot that exhibits information about how the items relate to each other, or that provides some other meaningful interpretation of the data. For example, the aim may be seriation or ranking: if the points lie close to a curve in two dimensions, the ordering of the points along the curve is used to rank them.

8. Principal component analysis:
In principal component analysis, we seek to maximize the variance of a linear combination of the variables. For example, we might want to rank students on the basis of their scores on achievement tests in English, mathematics, reading, and so on.
An average score would provide a single scale on which to compare the students, but with unequal weights on these subjects we can spread the students out further on the scale and obtain a better ranking. Essentially, principal component analysis is a one-sample technique applied to data with no groupings among the observations and no partitioning of the variables into subsets y and x. All the linear combinations considered previously were related to other variables or to the data structure: in regression, we have linear combinations of the independent variables that best predict the dependent variable(s); in canonical correlation, we have linear combinations of a subset of variables that maximally correlate with linear combinations of another subset of variables; and discriminant analysis involves linear combinations that maximally separate groups of observations. Principal components, on the other hand, are concerned only with the core structure of a single sample of observations on p variables. None of the variables is designated as dependent, and no grouping of observations is assumed. Principal component analysis can help identify the main factors affecting the dependent variable, and the extracted components can also be passed to other multivariate methods, for example a regression analysis on the principal components. Principal component analysis can also serve as the first step in factor analysis. Its disadvantage is that it involves only the interdependencies within a single set of variables; to discuss the relationship between two sets of variables, canonical correlation is required.

9. Correspondence Analysis:
This technique provides dimensional reduction of object ratings on a set of attributes, resulting in a perceptual map of the ratings.
Like MDS, however, it examines both independent variables and dependent variables at the same time. This technique is more similar in nature to factor analysis. It is a compositional technique, and it is useful when there are many attributes and many companies. It is most often used to assess the effectiveness of advertising campaigns, and it is also used when the attributes are too similar for factor analysis to be meaningful. The main structural approach is the development of a contingency table, which means that the model can be assessed by examining the chi-square value for the model. Correspondence analysis is difficult to interpret, as the dimensions are a combination of independent and dependent variables. It is a graphical technique for representing the information in a two-way contingency table, which contains the counts (frequencies) of items for a cross-classification of two categorical variables. With correspondence analysis, we construct a plot that shows the interaction of the two categorical variables along with the relationship of the rows to each other and of the columns to each other.

10. Canonical correlation analysis:
Canonical correlation is an extension of multiple correlation, which is the correlation between one y and several x's, and it is often a useful complement to a multivariate regression analysis. It is a statistical method that describes the relationship between two sets of multivariate random variables through composite canonical variables. Let x be one set of random variables and y another: how can we describe the degree of correlation between them? Examining every pairwise correlation coefficient is tedious and does not reflect the essential relationship between the two sets.
If we use canonical correlation analysis, the basic procedure is to take one linear function of each set of variables, forming a pair chosen so that the pair has the maximum possible correlation coefficient; this is called the first pair of canonical variables. Similarly we can find a second pair, a third pair, and so on, with the successive pairs uncorrelated with one another; the correlation coefficient of a pair of canonical variables is called a canonical correlation coefficient. The number of canonical correlation coefficients obtained is no more than the number of variables in the smaller of the two original sets. Canonical correlation analysis thus contributes a comprehensive description of the typical relationships between two sets of variables. The conditions are that the variables in both sets are continuous and that the data follow a multivariate normal distribution.

11. Conjoint analysis:
Conjoint analysis is often referred to as "trade-off analysis," in that it allows objects and the various levels of their attributes to be evaluated. It is both a compositional technique and a dependence technique, in that a level of preference for a combination of attributes and levels is developed. A part-worth, or utility, is calculated for each level of each attribute, and combinations of attributes at specific levels are summed to develop the overall preference for the attribute at each level. Models can be built that identify the ideal levels and combinations of attributes for products and services.

12. Structural Equation Modeling (SEM):
Unlike the other multivariate techniques discussed, structural equation modeling (SEM) examines multiple relationships between sets of variables simultaneously. It represents a family of techniques. SEM can incorporate latent variables, which either are not or cannot be measured directly, into the analysis.
For example, intelligence levels can only be inferred, through direct measurement of variables like test scores, level of education, grade point average, and other related measures. These tools are often used to evaluate many scaled attributes or to build summated scales.

13. Log-linear models:
This technique deals with classification data, where there is a dependent variable and one or more independent variables. The observed data are classified in a contingency table; log-linear models are then fitted to these data, and the expected values are found for each model, each model having its own formula for the expected counts. We then calculate the Pearson χ² and the likelihood-ratio statistic G² and compare them with the tabulated χ² to see whether the chosen model represents the data well: if the calculated values of χ² and G² are less than the tabulated χ², the model gives a good representation of the data; if not, another model must be chosen.

Application example:
We took our data from Ibn al-Haytham Hospital, the largest eye-disease hospital in Iraq; the sample size is N = 1285 patients, and here we used the log-linear modeling technique. Patients were classified according to:
1. Type of disease (considered here as the dependent variable), with two categories:
A. retinal detachment
B. inflammation of the optic sticks
2. Age in years, divided into three categories:
A. 15-44
B. 45-64
C. more than 64
Age groups under 15 years were neglected, because according to the hospital these two diseases were nonexistent in that age period.
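The model-checking recipe just described, expected counts under independence followed by χ² and G² compared with the tabulated χ², can be sketched in Python. The 2 × 2 counts below are illustrative, not the hospital data of this example.

```python
# Independence-model check for a two-way contingency table: expected
# counts m_ij = (row total)(column total)/N, then the Pearson chi-square
# and the likelihood-ratio statistic G^2, compared with the tabulated
# chi-square on (r-1)(c-1) degrees of freedom. Counts are illustrative.
import math

observed = [[22, 28],
            [28, 22]]
r, c = len(observed), len(observed[0])

N = sum(sum(row) for row in observed)
row_tot = [sum(row) for row in observed]
col_tot = [sum(observed[i][j] for i in range(r)) for j in range(c)]

expected = [[row_tot[i] * col_tot[j] / N for j in range(c)] for i in range(r)]

chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(r) for j in range(c))
g2 = 2 * sum(observed[i][j] * math.log(observed[i][j] / expected[i][j])
             for i in range(r) for j in range(c))

df = (r - 1) * (c - 1)   # 1 degree of freedom for this 2 x 2 table
critical = 3.84          # tabulated chi-square, alpha = 0.05, 1 df
print(round(chi2, 2), round(g2, 2), chi2 < critical)   # 1.44 1.44 True
```

Here both statistics fall below the tabulated value, so the independence model would be accepted for these illustrative counts; the same comparison is carried out for the hospital data below.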
The data are classified in a two-way contingency table:

Table (1): Two-way contingency table of observed counts of patients

Disease \ Age (years)               15-44   45-64   More than 64
retinal detachment                     91     350            162
inflammation of the optic sticks       90     387            205

Now we find the expected values under the independence model,

Log mij = u + u1(i) + u2(j),

whose expected counts have the formula

mij = (xi.)(x.j) / N .......... (4)

We put these expected values in the next table:

Table (2): Two-way contingency table of expected counts of patients (independence model)

Disease \ Age (years)               15-44   45-64   More than 64
retinal detachment                  85.93   345.84        172.23
inflammation of the optic sticks    96.06   391.15        194.79

Now we calculate χ² and G², whose formulas are

χ² = Σi,j (xij − mij)² / mij .......... (5)

G² = 2 Σi,j xij Log(xij / mij) .......... (6)

The results are χ² = 1.9171 and G² = 0.001. The number of degrees of freedom for the independence model is 2, according to the degrees-of-freedom table:

Table (3): Degrees of freedom of the terms of the (saturated and unsaturated) models for two-way contingency tables

Term    Degrees of freedom
u       1
u1      r − 1
u2      c − 1
u12     (r − 1)(c − 1)
Sum     r c

So, with 2 degrees of freedom and significance level α = 0.05, the tabulated value is χ² = 5.99. Comparing the calculated values of χ² and G² with this tabulated value, we see that the calculated values are smaller, which means the independence model gives a good representation of the data.

Conclusions
Each of the above techniques has its strong points and its weak points, which means the analyst must be careful when using them: he must understand the strengths and the weaknesses of each technique. Each multivariate technique is suited to types of data that differ from those of the other techniques.
Statistical programs such as SPSS, SAS, and others make it easy to run any of these procedures.