The constant for the model is represented by the parameter m0. Regression is a tool that allows us to take data from some sample and use these data to estimate m0 and m1. These values are then used to create predicted values of the outcome, with the observed or true value from the data designated as y and the predicted value as ŷ.
Furthermore, in Equation 1, the value m1 measures the effect on Y of a one-unit increase in X. The parameter m1 is also referred to as the regression coefficient for X: it is the average amount by which the dependent variable changes when the independent variable increases by one unit and the other independent variables are held constant. Thus, when the independent measure increases by 1, the dependent variable increases by m1 units.
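As a sketch of how m0 and m1 are estimated from sample data, the least-squares formulas can be written in a few lines of Python; the data below are invented for illustration and chosen so that the fit is exact.

```python
# Least-squares estimates of the intercept (m0) and slope (m1) in
# y = m0 + m1 * x.  A minimal sketch; variable names follow the text.

def fit_simple_regression(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m1 = sxy / sxx
    m0 = mean_y - m1 * mean_x  # intercept recovered from the two means
    return m0, m1

# Data generated exactly from y = 1 + 2x, so the fit should
# recover m0 = 1 and m1 = 2.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]
m0, m1 = fit_simple_regression(x, y)
```

The predicted values ŷ are then simply `m0 + m1 * xi` for each observation.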
Babatunde
In multiple linear regression, there is more than one independent variable in the model. Multiple regression allows researchers to examine the effect of many different factors on some outcome at the same time. The general purpose of multiple regression is to learn more about the relationship between several independent or predictor variables and a dependent variable.
This is because simple linear regression is a simplification of reality. In real life, more than one variable affects the behaviour of the dependent variable.
The intercept of the regression is c0, while c1 and c2 are referred to as the partial regression coefficients. The error term is represented by ui. Hypothesis tests are conducted to show whether the estimated parameters are statistically significant, that is, whether the independent variables contribute to the explanation of variation in the dependent variable.
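A minimal sketch of fitting such a two-predictor model by solving the normal equations (X'X)c = X'y directly. The data are fabricated so that Y = 1 + 2·X1 + 3·X2 holds exactly, so the estimates can be checked by eye; this is an illustration, not the chapter's data.

```python
# Ordinary least squares for Y = c0 + c1*X1 + c2*X2 + u, fitted by
# solving the normal equations (X'X) c = X'y.

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small system.
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ols(predictors, y):
    # predictors: list of (x1, x2, ...) tuples; a leading column of
    # ones yields the intercept c0.
    X = [[1.0] + list(row) for row in predictors]
    n, p = len(X), len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    return solve(XtX, Xty)

# Data generated exactly from y = 1 + 2*x1 + 3*x2, so the fit
# should recover c0 = 1, c1 = 2, c2 = 3.
rows = [(-2, 2), (-1, -1), (0, -2), (1, -1), (2, 2)]
y = [3, -4, -5, 0, 11]
c0, c1, c2 = ols(rows, y)
```

In practice one would use a statistics package rather than hand-rolled elimination; the point here is only that the partial coefficients come from one joint system, not from separate simple regressions.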
If we are able to reject the null hypothesis at an acceptable significance level, then we conclude that the parameter is statistically significant. The goodness of fit of the model is measured by R2, which takes a value between 0 and 1.
High values of R2 indicate that the model fits the data well. A limitation of R2 is that its value increases with the number of explanatory variables: it does not penalize for the consequent loss of degrees of freedom as explanatory variables are added, and the power of the test is therefore affected. The adjusted R2 was developed to address this inadequacy.
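The adjustment can be sketched as follows, using the standard formula adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), which charges for the k explanatory variables; the data here are invented for illustration.

```python
# R-squared and adjusted R-squared.  The adjusted version penalizes
# the loss of degrees of freedom from adding explanatory variables.

def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    # n observations, k explanatory variables
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

y     = [2.0, 4.0, 6.0, 8.0, 10.0]   # observed values (illustrative)
y_hat = [2.5, 3.5, 6.0, 8.5, 9.5]    # fitted values from some model
r2 = r_squared(y, y_hat)
adj = adjusted_r_squared(r2, n=5, k=2)
```

Note that the adjusted value is always at most the unadjusted one, and the gap widens as k grows relative to n.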
In addition, the F-statistic tests for the joint significance of all the parameters in the model. The composite and relative contributions of the independent variables to the dependent variable are usually determined through multiple regression. These are explained in Tables 8, 9 and 10 respectively. The variables under consideration correlated significantly with the dependent variable.
Table 8 gives the R value. If R is squared and the result is multiplied by 100, the percentage contribution of all the independent variables taken together to the variance of the dependent variable is obtained. Table 9 gives the Analysis of Variance.
The value here is significant because the probability is less than 0.05. What this means is that the R value obtained earlier is not due to chance. We also have t values for all the variables and the probability values (Sig.). Note the ranking of the variables according to their weight contributions.
Basic Statistical Techniques in Research
The details of the trimming could not be given here, but trimming involves removing any path whose path coefficient is less than 0.05.
The model in Figure 2 is usually referred to as a parsimonious model. It shows the paths which have direct and indirect influence on the dependent variable.
Path Analysis
This is an extension of multiple regression analysis. The use of path analysis enables the researcher to calculate the direct and indirect influence of independent variables on a dependent variable.
These influences are reflected in the path coefficients, which are actually standardized regression coefficients (beta weights). Path analysis is one of the techniques for the study and analysis of causal relations in ex-post facto research. The researcher starts with a hypothesized model and ends up with a parsimonious model, after trimming the paths by using structural equations. In the hypothesized model below, variables X1, X2, X3, X4, X5, X6, and X7 are called independent variables or exogenous variables, while variable X8 is referred to as the dependent variable or endogenous variable.
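The relation between an unstandardized coefficient b and its beta weight is beta = b · (sd of X / sd of Y). As a quick sanity check, in a simple regression the beta weight coincides with Pearson's r; the data below are invented for illustration.

```python
# Standardized regression coefficients (beta weights):
# beta = b * (sd_X / sd_Y).  In simple regression, beta equals r.

def mean(v):
    return sum(v) / len(v)

def simple_slope(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def std_dev(v):
    m = mean(v)
    return (sum((a - m) ** 2 for a in v) / (len(v) - 1)) ** 0.5

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

x = [1, 2, 3, 4, 5]          # illustrative data
y = [2, 4, 5, 4, 6]
b = simple_slope(x, y)                # unstandardized coefficient
beta = b * std_dev(x) / std_dev(y)    # standardized (beta weight)
```

With several predictors, the beta weights no longer equal the simple correlations, which is exactly what makes them useful for comparing the relative influence of paths.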
The model is hypothetical. For example, data on fifty characteristics for states are unwieldy to handle, descriptively or analytically. Reducing them to their common factor patterns facilitates the management, analysis and understanding of such data.
These factors concentrate and index characteristics without much loss of information. States can be more easily discussed and compared on economic, development, size and public dimensions than on the hundreds of characteristics each dimension involves.
Factor analysis could be carried out to find the commonalities of the test items, such that the 25 items would be reduced to a smaller number and the instrument would still measure the construct "attitude towards Physics" validly and reliably. Also, the scale items could be sorted into their various components so that items which correlate highly with one another are grouped together. It could be observed that the first 9 of the 25 items account for most of the variance. The 25 items have thus been reduced to 9, and the 9 items could be assumed to measure the construct which the 25 items were designed to measure.
The analysis was carried out to establish the number of meaningful factors; nine factors were thus found to be meaningful, or nontrivial.
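One common rule for deciding how many factors are "meaningful" is Kaiser's criterion: retain the components of the correlation matrix whose eigenvalue exceeds 1. The sketch below applies it to a small made-up correlation matrix (not the chapter's Physics-attitude data, and the chapter may have used a different retention rule); it assumes numpy is available.

```python
# Kaiser's criterion for factor retention: keep components of the
# correlation matrix with eigenvalue > 1.
import numpy as np

# Illustrative 3-variable correlation matrix: variables 1 and 2 are
# strongly related, variable 3 is nearly independent of both.
R = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.1],
              [0.1, 0.1, 1.0]])

eigenvalues = np.linalg.eigvalsh(R)        # ascending order
retained = int((eigenvalues > 1.0).sum())  # number of meaningful factors
```

Here one eigenvalue exceeds 1, so one factor would be retained; the eigenvalues always sum to the number of variables, since each standardized variable contributes unit variance.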
These are the factors considered as peculiar factors perceived by the students as their attitudes toward Physics.
Correlations
Correlation is a measure of the relation between two or more variables. It indicates the strength and direction of a linear relationship between two random variables. If the variables are independent then the correlation is 0, but the converse is not true, because the correlation coefficient detects only linear dependencies between two variables.
Pearson Product-Moment Correlation Coefficient
The most widely used type of correlation coefficient is the Pearson product-moment correlation coefficient.
Pearson's correlation coefficient is a parametric statistic. It is a common measure of the correlation between two variables X and Y. When computed in a sample, it is designated by the letter r and is sometimes called Pearson's r. Pearson's correlation reflects the degree of linear relationship between two variables.
The Pearson correlation assumes that the two variables are measured on at least interval scales, and it determines the extent to which values of the two variables are proportional to each other. However, the value of the correlation (that is, the correlation coefficient) does not depend on the specific measurement units used. For example, the correlation between height and weight will be identical regardless of whether inches and pounds or centimeters and kilograms are used as measurement units.
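The unit-invariance claim is easy to verify numerically; the height and weight figures below are invented for illustration.

```python
# Pearson's r, and a check that it is unchanged by a change of
# measurement units (inches/pounds vs. centimeters/kilograms).

def pearson_r(x, y):
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

height_in = [60, 62, 65, 70, 72]        # inches (illustrative)
weight_lb = [110, 120, 140, 155, 165]   # pounds (illustrative)
r_imperial = pearson_r(height_in, weight_lb)

height_cm = [h * 2.54 for h in height_in]      # convert to cm
weight_kg = [w * 0.4536 for w in weight_lb]    # convert to kg
r_metric = pearson_r(height_cm, weight_kg)
```

Both computations give the same r, because r is built from deviations scaled by standard deviations, so any positive linear change of units cancels out.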
Spearman's Rank Correlation Coefficient
The Spearman rank correlation coefficient is a non-parametric measure of correlation; that is, it assesses how well an arbitrary monotonic function could describe the relationship between two variables, without making any assumptions about the frequency distribution of the variables. Unlike the Pearson product-moment correlation coefficient, it can be used for variables measured at the ordinal level.
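A minimal sketch of Spearman's coefficient, computed with the classical formula rho = 1 - 6·Σd²/(n(n² - 1)), which holds when there are no tied values; the data are invented for illustration.

```python
# Spearman's rank correlation: Pearson's r computed on the ranks.
# This sketch assumes no tied values.

def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(x, y):
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    # With no ties: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

x = [10, 20, 30, 40, 50]     # illustrative ordinal-style data
y = [1, 3, 2, 5, 4]          # monotone apart from two swaps
rho = spearman_rho(x, y)
```

Because only the ranks enter the formula, any monotone transformation of either variable leaves rho unchanged, which is what makes it suitable for ordinal data.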
However, Spearman's correlation coefficient does assume that subsequent ranks indicate equidistant positions on the variable measured.
Choosing Appropriate Statistical Technique in a Research Enterprise
Statistical techniques can be used to describe data, compare two or more data sets, determine whether a relationship exists between variables, test hypotheses, and make estimates about population measures.
Not only is it important to have a sample size that is large enough, but it is also necessary to see how the subjects in the sample were selected. Volunteers generally do not represent the population at large.
However, students should realize that computers merely give numerical answers and save the time and effort of doing calculations by hand. It is the duty of students to understand and interpret computer printouts correctly. Note that data can be subjected to parametric or nonparametric statistics depending on the nature of the data.
Purely numerical data, such as students' scores in a chemistry test, can be subjected to a parametric test. On the other hand, the number of people who read newspapers in the morning is not purely numerical data, and it can only be subjected to a nonparametric test. In this case, you cannot perform all the mathematical operations on the data: you can add and subtract, but you cannot determine a mean score or carry out division.
The variable type determines, to some extent, the type of descriptive or inferential statistical method that it will support. Variable types also influence the language of the hypotheses and hence the inferences that can be made from them. In any study, you describe data, compare data, or determine whether a relationship exists between variables.
If the percentage score of group A is higher than that of group B, you are describing the performance of the groups. When you go a step further to find out whether the performances are significantly different, you are in the realm of making inferences, and this informs the use of parametric statistics; in this case, the t-test. If a study involves the determination of a relationship, we can use the Spearman rank-order correlation, the Pearson product-moment correlation, the chi-square statistic or even multiple regression analysis.
It all depends on the nature of the research. Chi-square shows the degree of association between two different bases of classification. It should be noted that the z-test is used only when the population parameters are known and the variable of interest is normally distributed in the parent population. However, if the variable is not normally distributed, a large sample still permits the use of a z-test; in this case, the z-test is regarded as an approximate test.
The z-test can also be used to compare two means from independent samples. Normally, the z-test is used to test the mean of a large sample, and the t-test to test the mean of a small sample.
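A sketch of the one-sample z-test for a mean with known population standard deviation; the numbers below are invented for illustration.

```python
# One-sample z statistic: z = (x_bar - mu0) / (sigma / sqrt(n)),
# compared with the normal critical value (1.96 at the 5% level,
# two-tailed).
import math

def z_statistic(sample_mean, mu0, sigma, n):
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

# H0: mu = 100, with known sigma = 15; a sample of n = 36 gives a
# mean of 105 (all figures illustrative).
z = z_statistic(105, 100, 15, 36)
reject_h0 = abs(z) > 1.96   # two-tailed decision at alpha = 0.05
```

With a small sample and unknown sigma, the same statistic built from the sample standard deviation would instead be compared against the t distribution with n - 1 degrees of freedom.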
Examples of Topics, Research Questions, Hypotheses and Selection of Appropriate Statistical Tools
As pointed out earlier, the selection of a statistical tool in a study is a function of the design of the experiment. Also, the language of the hypotheses is determined by the design. However, a researcher should not shy away from the fact that the topic of a research is an offshoot of the statement of the problem. What, then, determines the language of the topics, the design, and the hypotheses?
The following examples should throw more light on the answer to that question. Questions: i. This is a situation where one variable (independent) could determine another variable (dependent), a kind of causal relationship. The independent variables here are student, teacher and school environment variables, while the dependent variable is achievement in Senior Secondary School Chemistry.
It would be observed that a hypothesis is not necessary here. This is an ex-post facto study of the survey type, in which the researcher need not manipulate the independent variables; they have already manifested.
The first question demands that the experimenter construct a hypothesized causal model, which he has to trim in order to obtain the parsimonious model. The second question has to do with determining the composite effect of the independent variables on the dependent variable; the answer could be obtained through multiple regression analysis. Path analysis is an extension of multiple regression, and path coefficients are the beta weights in multiple regression.
Hypotheses
1. There is no significant main effect of treatment on:
i. Learning outcome in Chemistry
ii. Attitude to Chemistry
2. There is no significant main effect of ability on:
i. Attitude to Chemistry
3. There is no significant main effect of gender on:
i. Attitude to Chemistry
4. There is no significant interaction effect of treatment and gender on:
i.
Then a random variable that is a linear combination of the observations can be used to construct an interval estimate; such interval estimates are also called confidence intervals.
Decision Regions: The observation space is partitioned into an acceptance region R(H0) and a rejection region R(H1); if the observed features fall within the acceptance region, hypothesis H0 is accepted; otherwise, H0 is rejected.
One-Tailed Test vs. Two-Tailed Test
The steps of a hypothesis test are:
1. State the null hypothesis, H0.
2. State the alternative hypothesis, H1.
3. Decide on the level of significance, α.
4. Choose an appropriate testing procedure and determine the acceptance region.
5. Compute the test statistic from the sample data.
Make the decision: reject H0 if the p-value is less than the level of significance α; otherwise do not reject H0. Example: to test H0 against H1, compute the test statistic and its corresponding p-value, and compare the p-value with α.
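The decision step above can be sketched as a one-line rule comparing the p-value with the significance level α; the p-values below are invented for illustration.

```python
# Decision rule for a hypothesis test: reject H0 when the p-value
# falls below the chosen significance level alpha.

def decide(p_value, alpha=0.05):
    return "reject H0" if p_value < alpha else "fail to reject H0"

d1 = decide(0.012)   # p-value below 0.05
d2 = decide(0.30)    # p-value above 0.05
```

Note that failing to reject H0 is not the same as proving H0 true; it only means the data do not give sufficient evidence against it at the chosen level.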