Chapter 10: Multifactorial analyses using SAS
Figure 10.1 (a) Output for one-way analysis of variance
Box 10.3 Presenting the results for one-way analysis of variance
Figure 10.1 and Box 10.3 Code
Click here to show code as text/* Create a library called "source". This means that when you process datasets that have been imported into SAS you can use the word source instead of having to type the path to the folder holding the SAS datasets. SAS will only be able to read data from that folder, not accidentally change the data. */ libname source "C:\Projects\Books\Presenting\data\sasData" access = readonly; /* This is a user defined format that can be used on any variable. It will last until you restart SAS. */ proc format; /* Display numeric values with words. The number before the ) will cause these to display in logical order. */ value groups 1 = "1) never smoked" 2 = "2) ex pre-pregnancy" 3 = "3) ex in pregnancy" 4 = "4) 1-14 cigs/day" 5 = "5) 15+ cigs/day" ; /* The ; goes after all the categories. */ run; /********************************************************************************* The GLM code below wants to include the record numbers for the outliers. Fixing it requires a tweak to the graphic design code. To learn about this kind of manipulation, look at: Getting Started with the Graph Template Language in SAS by Sanjay Matange *********************************************************************************/ ods path(prepend) work.template (update); proc template; link Stat.GLM.Graphics.FitBoxPlot to Common.Zreg.Graphics.FitBoxPlot; define statgraph Common.Zreg.Graphics.FitBoxPlot ; notes "Effect Plot with x=CLASS"; dynamic _TITLE _XLABEL _SHORTXLABEL _YLABEL _SHORTYLABEL _RESPONSENAME _RESPONSELABEL _DATALABEL _DISPLAYMISSING _XVAR_OBS _Y_OBS _FREQ _WEIGHT _OVERALLF _OVERALLP _SHOWINSET _byline_ _bytitle_ _byfootnote_; BeginGraph / includemissingdiscrete=_DISPLAYMISSING; entrytitle "Distribution of " _RESPONSENAME; layout overlay / yaxisopts=(griddisplay=auto_on label=_YLABEL shortlabel=_SHORTYLABEL) xaxisopts=(label=_XLABEL shortlabel= _SHORTXLABEL discreteopts=(tickvaluefitpolicy=rotatethin)); boxplot y=_Y_OBS x=_XVAR_OBS / labelfar=true primary=true; if ((_SHOWINSET EQ 1) AND EXISTS(_OVERALLF)) layout gridded / columns=2 halign=left border=true BackgroundColor=GraphWalls:Color Opaque=true autoalign=( TopRight TopLeft BottomRight BottomLeft); entry halign=left "F" / valign=top; entry halign=right eval (PUT(_OVERALLF,7.2)) / valign=top; entry halign=left "Prob > F" / valign=top; entry halign=right eval (PUT(_OVERALLP,PVALUE6.4)) / valign= top; endlayout; endif; endlayout; if (_BYTITLE_) entrytitle _BYLINE_ / textattrs=GRAPHVALUETEXT; else if (_BYFOOTNOTE_) entryfootnote halign=left _BYLINE_; endif; endif; EndGraph; end; run; proc template; define statgraph Stat.GLM.Graphics.BoxPlot ; dynamic _DEPVAR _CLASSVAR _GROUPDFBY _SHORTYLABEL _SHORTCLABEL _OVERALLF _OVERALLP _byline_ _bytitle_ _byfootnote_; BeginGraph; entrytitle "Distribution of " _DEPVAR; layout overlay / yaxisopts=(display=all gridDisplay=auto_on shortlabel=_SHORTYLABEL) xaxisopts=(shortlabel=_SHORTCLABEL discreteopts=(tickvaluefitpolicy=rotatethin)); boxplot y=_DEPEN x=_GROUPDFBY / freq=_FREQ labelfar=on; layout gridded / columns=2 halign=left border=true BackgroundColor =GraphWalls:Color Opaque=true autoalign=(TopRight TopLeft BottomRight BottomLeft Right Left Top Bottom Center); entry halign=left "F" / valign=top; entry halign=right eval (PUT(_OVERALLF,7.2)) / valign=top; entry halign=left "Prob > F" / valign=top; entry halign=right eval (PUT(_OVERALLP,PVALUE6.4)) / valign=top ; endlayout; endlayout; if (_BYTITLE_) entrytitle _BYLINE_ / textattrs=GRAPHVALUETEXT; else if (_BYFOOTNOTE_) entryfootnote halign=left _BYLINE_; endif; endif; EndGraph; end; run; ods exclude GLM.Means.smokegroup.birthwt.BoxPlot; /********************************************************************************* That ends the code to suppress the outlier labels.... *********************************************************************************/ /* Create a general linear model that will use the format to set the order of the predictor categories. */ proc glm data = source.smokingbwt order = formatted plots(only)=BoxPlot; format smokegroup groups.; /* Use the format created above. */ class smokegroup; /* Treat smokegroup as a categorical variable. */ model birthwt = smokegroup; /* Predict birth weight with smoking category. */ /* Calculate the means for the continuous birth weight variable grouped by the classification variable smokegroup and the comparisons between groups. */ means smokegroup / clm /* Include confidence intervals on the means. */ cldiff /* Include confidence intervals on the differences. */ scheffe /* Use the Scheffe method to adjust for multiple comparisons. */ nosort; /* Don't sort means into descending order. */ /* Calculate the predicted population means and differences between levels of smoking group. Use the Scheffe method to adjust for multiple comparisons. */ lsmeans smokegroup / adjust=scheffe; run; quit; /* To allow many models, proc glm keeps running until you quit */
Figure 10.2 Output for multiple regression
Box 10.5 Presenting the results of multiple regression
Figure 10.2 and Box 10.5 Code
Click here to show code as textlibname source "C:\Projects\Books\Presenting\data\sasData" access = readonly; proc reg data = source.pefr ; /* Predict pefr and include confidence limits on the parameter estimates (betas). The text before the : is an optional label on the model. */ UseAllPredictors: model pefr = age gas smokers wheeze / clb; Use3Predictors: model pefr = age gas wheeze / clb; Use2predictors: model pefr = age gas / clb; run; quit; /* To allow many models, reg keeps running until you quit. */
Figure 10.3 Output from multiple regression with categorical predictor variables
Box 10.8 Presentation of multiple regression with categorical predictor variable
Box 10.8 Figure 10.3 Code
Click here to show code as textlibname source "C:\Projects\Books\Presenting\data\sasData" access = readonly; /* Display numeric values with words. When modeling with SAS, the reference level of the categorical variable is the last one listed when the phrases/levels are in alphabetical order. The leading spaces force the reference group to be the last level of the predictor. */ proc format; value heavy 1 = " 1) Light" 2 = "2) Heavy" ; value alc 0 = "0) 0 g/week" 1 = " 1) 1-19 g/week" 2 = " 2) 20-49 g/week" 3 = " 3) 50-99 g/week" 4 = " 4) 100+ g/week" ; run; /* Model and set the categorical levels with the formats. */ proc glm data = source.smokingbwt order = formatted; format cigsgp heavy. alcgroup alc.; /* Set the display and sort order. */ class cigsgp alcgroup; /* These are categorical variables. */ model adjbw = cigsgp alcgroup / /* Predict weight with 2 predictors. */ solution /* Show the beta weights. */ clparm /* Show the confidence limits. */ ; /* The ; is the end of the model statement. */ run; quit; /* To allow many models, reg keeps running until you quit. */
Figure 10.5 Output for logistic regression
Box 10.12 Presenting the results for logistic regression
Figure 10.5 Code
Click here to show code as textlibname source "C:\Projects\Books\Presenting\data\sasData" access = readonly; proc logistic data = source.bpd; /* Remember that any record with missing values on the predictors or the outcome will not be used in the calculations. This dropping happens automatically. This where statement is just a reminder. If you are changing predictors, you can drop like this to make sure that all the models use the same subjects. */ where not missing (chest) and not missing (sex) and not missing(multiple) and not missing(o2dep36w) and not missing(o2depdis); /* There are four categorical/classification variables. */ /* The odds ratios are for being in group 1 instead of the baseline 0. */ class sex (ref = First) multiple (ref = First) o2dep36w (ref = First) o2depdis (ref = First) / param = REF /* Code the parameterization as 0 vs. 1. */ ; /* This ends the class statement. */ /* Predict having event = 1 (the last alphabetized level of the outcome) with four variables. */ model chest (event = LAST) = sex multiple o2dep36w o2depdis; run; proc logistic data = source.bpd; /* This where statement is here to make sure that the SAME PEOPLE ARE INCLUDED IN BOTH models. */ where not missing (chest) and not missing (sex) and not missing(multiple) and not missing(o2dep36w) and not missing(o2depdis); /* There are three categorical/classification variables. */ class sex (ref = First) multiple (ref = First) o2dep36w (ref = First) / param = REF ; model chest (event = LAST) = sex multiple o2dep36w; run;
Figure 10.6 Output for logistic regression with categorical predictor variables
Box 10.14 Presenting logistic regression with categorical predictor variables
Figure 10.6 and Box 10.14 Code
Click here to show code as textlibname source "C:\Projects\Books\Presenting\data\sasData" access = readonly; proc logistic data = source.smokingbwt; /* There is no where statemen,t so the output notes who is dropped. */ class smoker1 (ref = First) socialclass (ref = First) / param = ref ; model nausea1 (event = LAST) = smoker1 socialclass; run;
Box 10.15 Presenting unadjusted and adjusted logistic regression
Box 10.15
SAS can not easily produce this output.code will be here.