Okay, in this video we're going to work through an analysis of a general linear model that has one factor and one covariate. The data are real data that were collected by undergraduates at the University of Edinburgh during a practical, or lab, in which they measured a number of physiological aspects of their own bodies and then analysed them to see what they found. What we're interested in today is the influence of weight and sex on systolic blood pressure. The main focus of this video is interpreting and reporting an analysis that does not have a significant interaction between the covariate and the factor, but we will also briefly discuss what we would report if there had been evidence for an interaction between the factor and the covariate. We have a number of biological questions we can address with this dataset. First, we can ask whether or not systolic blood pressure varies with weight. We can also ask whether or not systolic blood pressure differs between females and males after having accounted for the effect of weight. And finally, we can ask whether or not the relationship between systolic blood pressure and weight is similar or different between females and males. Those are the questions we're going to address. Let's now have a look at the data. The data are in the file MS1 data blood pressure dot CSV, and we're going to save the contents of that CSV file in an object which we'll call BP, for blood pressure. So we'll run that. As always, let's start by getting familiar with the dataset, so let's look at its structure. You can see we have three columns. One is called sex, which is a factor with two levels, F or M for female and male. We then have a numeric column, which is our adjusted weight. I can tell you I wasn't involved in collecting the original data, so I'm not sure how the weight was adjusted; we're just going to treat it as a measure of weight. And then finally we have our measure of systolic blood pressure. Let's get a quick summary. We can see here that we have an unbalanced dataset: 265 females and 100 males. Again, some students have pointed out that the output from their summary function doesn't look like mine. If you see differences, for example if you're not getting this type of output with counts of the number of females and males, that may just be because you're using a more recent version of R than I am. Okay, let's quickly look at another way of getting that. Let's imagine that your summary output does not give you the numbers of females and males, and we want to know whether the data are balanced. In previous videos I was using the function summaryBy in order to count the number of data points in various combinations of our dataset. We can also use the function table, where we give our data frame and then specify the column that we're interested in. We want to tabulate the number of data points in the column sex. If we run that, we get the same information: 265 females, 100 males. Just as a slight tangent: if we were interested in an analysis that had two factors and wanted to know whether the data were balanced across those two factors, then we could simply list both of those factors in the call to table, giving BP and then the name of the other factor as well. That would be another option for seeing whether your data are balanced without using summaryBy. A rough sketch of these first steps is shown below.
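Here is a minimal sketch of the data-loading and inspection steps described above. The exact file name and the column names (Sex, Adjusted.Weight, Syst.BP) are my guesses at what appears in the actual dataset, so treat them as placeholders.

    # Read the blood pressure data (the file name is a guess at the one shown in the video)
    BP <- read.csv("MS1data_bloodpressure.csv")
    # In R 4.0 or later, make Sex a factor explicitly (older versions did this automatically)
    BP$Sex <- factor(BP$Sex)
    # Structure of the data: one factor (Sex) and two numeric columns
    str(BP)
    # Summary; older versions of R report counts of F and M for factor columns here
    summary(BP)
    # Count data points per level of Sex to check whether the design is balanced
    table(BP$Sex)
    # For a design with two factors you could cross-tabulate both, e.g.
    # table(BP$Sex, BP$OtherFactor)   # 'OtherFactor' is hypothetical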
Finally, let's actually have a look at the data themselves. We're not being shown all the data points, but you can see that we have females and males interspersed. Okay, so now that we're broadly familiar with our data, let's start just by plotting them, and we'll begin with some box plots. Our dependent variable is going to be systolic blood pressure, because our hypothesis is that sex and/or weight influence systolic blood pressure. Since we're imagining that systolic blood pressure is influenced by these other variables, systolic blood pressure is our dependent variable. So we will start by looking at a box plot of how systolic blood pressure depends on sex. What do we get here? A couple of things. First of all, we can see that the variance seems to be very similar between our two groups, females and males; you can see that because the spread of the two distributions is very similar. We can also see that the boxes are nicely symmetrical, which suggests that the data are likely to be roughly normally distributed. And we can see that systolic blood pressure certainly tends to be higher for males than for females. Let's look at the data from another perspective. What was that column called? Adjusted weight, capitalised. Okay. We can make fairly similar claims about weight: males tend to weigh more than females, the variance is fairly equal between the groups, and both of these distributions are nice and symmetrical. So based on these plots we have good reason to think that systolic blood pressure will be higher in males than in females, and we can also see that weight tends to be higher in males than in females. What about the relationship between systolic blood pressure and weight? What do we get there? Oops, typo. Okay, well, here's something interesting: it seems that systolic blood pressure increases with weight. So we have ourselves a conundrum. Let's imagine that we wanted to know specifically whether being classified as female or male influences the average systolic blood pressure that you're likely to have; that is, we're interested specifically in the effect of being either female or male on your blood pressure. What we're learning from these plots so far is that this is a potentially complicated question to answer. We can see that, on average, systolic blood pressure seems to be higher for males than for females. But we can also see that systolic blood pressure is higher for individuals that weigh more, and, if you remember, we also saw that males tend to weigh more than females. So this raises the question: does systolic blood pressure tend to be higher for males than for females not because there are inherent differences between females and males, but simply because males tend to be larger than females, and it's the difference in size that actually causes the difference in systolic blood pressure? That's a fairly sophisticated question. The plots described here are sketched below.
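A rough sketch of the exploratory plots described above, again assuming the column names Sex, Adjusted.Weight and Syst.BP (these names are my guesses, not taken from the original file):

    # Box plots of the dependent variable and the covariate by sex
    boxplot(Syst.BP ~ Sex, data = BP, ylab = "Systolic blood pressure")
    boxplot(Adjusted.Weight ~ Sex, data = BP, ylab = "Adjusted weight")
    # Scatter plot of systolic blood pressure against adjusted weight
    plot(Syst.BP ~ Adjusted.Weight, data = BP,
         xlab = "Adjusted weight", ylab = "Systolic blood pressure")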
So in a moment we're going to do an analysis that will allow us to test exactly that. If we fit a general linear model that includes both the effect of sex and the effect of weight on systolic blood pressure, that analysis can allow us to determine whether or not we have evidence that systolic blood pressure differs between females and males after having controlled for the effect of weight. That's really what we want to do: if our goal is to understand whether there are inherent biological differences between females and males in their systolic blood pressure, and we want to make sure that we're controlling for other factors that could be associated with systolic blood pressure, like weight, then this is the kind of approach we would want to take. We'll do that in a moment. Before we go on, however, I want to show you that we can learn a little bit more by plotting these data a little more carefully. We can colour-code our data points by whether they come from a female or a male. To do that we can say col equals, and then use what's called an ifelse statement. This kind of statement is very commonly used in programming: we make some sort of comparison, and if the outcome of that test is true we do one thing, and if it is false we do something else. That's exactly what we're going to do here. Our test is whether or not Sex is equal to F, the code for female. I can't remember exactly where the quotation marks need to go, so we'll find out in a moment. I have a double equals sign here, because a double equals sign means that we are making a comparison between the thing on the left and the thing on the right and asking whether they are equal. If we only had a single equals sign, we would be assigning the value on the right to the name on the left. We don't want to assign anything; we want to compare them. So we're saying that if Sex equals F, make the point red; otherwise make it blue. We'll see whether my notation works. That did not work. There we go: you do not want quotation marks around the column name Sex itself, only around the value we're comparing it to. So what we've done here is said we want the colour to be either red or blue: if Sex equals F, make it red, else make it blue. That's how that works. You can see that the females tend to have a lower weight than the males, because the red points tend to be towards the left. You can also see that, especially within the males, there is a bit more variation, because they cover a broader range of weights than the females do. And you can see that, even within the sexes, there is a relationship between weight and systolic blood pressure, or at least it certainly looks that way. We'll get a better sense of that in our analysis. Okay, so we now want to analyse these data. Let's start by creating a linear model. We'll save the output from our lm function in an object called BP.lm. We'll say that systolic blood pressure is a function of sex plus adjusted weight plus their interaction, which we specify by putting a colon between them, and then we say data equals BP. That code is sketched below.
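A sketch of the colour-coded scatter plot and of the linear model with the interaction, as described above. The column names and the factor level "F" are assumptions based on what is said in the video:

    # Colour points red for females ("F") and blue for males ("M")
    plot(Syst.BP ~ Adjusted.Weight, data = BP,
         col = ifelse(BP$Sex == "F", "red", "blue"),
         xlab = "Adjusted weight", ylab = "Systolic blood pressure")
    # Linear model with sex, adjusted weight, and their interaction
    BP.lm <- lm(Syst.BP ~ Sex + Adjusted.Weight + Sex:Adjusted.Weight, data = BP)
    summary(BP.lm)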
So all we've done here is take the column names in our dataset that supply the information about which sex a data point is associated with and the weight associated with that measurement of blood pressure. We've listed those columns, and then we've listed them again with a colon in between to say that we want to include an interaction between those variables in our model. So we have named our variables and specified the interaction between them. Let's now run this model. Remember that we have unbalanced data, so we're not going to calculate p-values with this model yet. The first thing we're going to do is look at the coefficients, because I want you to become familiar with how to interpret the coefficients in a model like this. So I'll say summary of our model, and here is what we see. Our output has four different coefficients: an intercept, a term for sex M, a term for adjusted weight, and a term for the interaction between sex (where sex is M) and weight. What do these terms mean? The first two lines refer to the y-intercepts of our model. What we're doing in this model is fitting two different straight lines through our data: each line describes the relationship between weight and systolic blood pressure, and we're fitting one line for each sex. The fact that we have included an interaction means that we allow those two lines to have different slopes. With that in mind, we can now start to interpret this output. The first two lines here refer to the y-intercepts for the lines that we have fitted through the female data and the male data. Since M is listed in the second term, we know by process of elimination that the intercept refers to females. We also know that R assigns the intercept based on alphanumeric order, and since female comes before male, it sets female as the intercept. So this value of 86.49 is the y-intercept for the line that goes through the female data. What does the value of 8.38 refer to? That is the difference between the y-intercept for females and the y-intercept for males. So if we want the y-intercept for males, we take the y-intercept for females and add to it the difference between the female and male intercepts, giving a value of about 94.87. That's what these first two coefficients refer to: the y-intercepts for the lines fitted through the female and male data. It should come as no surprise, then, that the last two lines refer to the slopes. The coefficient for adjusted weight is the slope of the line fitted through the female data. The interaction term represents the difference between the slope for females and the slope for males. Remember, we said that by fitting an interaction we allow for differences in the slopes between females and males; well, this term represents the difference between those two slopes.
So if we want to calculate the slope for the line that goes through the male data, we take the slope value for the females and add to it the difference reported for the males, and there we go: that's our slope for the males. Remember these values for the y-intercepts, because we're going to use them later on in this video. So now we have learned to interpret these coefficients. Remember that if we include contrasts in our lm function, which we will do shortly, then we cannot interpret the coefficients from that model's output in the same way. So if our goal is to obtain estimates of the slopes and the intercepts, we want to obtain them this way, from the model without contrasts. Alright, let's now also look at our assumptions. We can check our assumptions just by looking at the residuals, with a plot of BP.lm. Before we do that, we can remind ourselves that our data represent, as far as we can imagine, a random sample of a student population. We don't really know whether students who study physiology differ in any particular way from other groups, but we'll assume they represent a random sample of the student population. We can also say that the data points are very likely to be independent. So how do our other assumptions, equal variance and normally distributed residuals, look? Let's look at our residuals. The first plot, as always, allows us to assess the assumption of equal variance, and what we're looking for is a plot with no trend whatsoever in the vertical spread of the points as we move from left to right. Stop, look at this, and come to your own decision first. My impression is that these data look beautiful. You can see that the vertical spread is very consistent as we move from left to right. We get what looks like a little less vertical spread on the right, but that's probably just because we're now in a region of the data with very few points: we're looking at individuals with much higher weights, and individuals with higher weights are simply much less common. That's probably why we're not seeing the same amount of variation there, or at least that would be one explanation. So this looks very nice to me. We have a lot of data points in the lower range of fitted values compared to the upper range, and that's just because we happen to have more subjects, more students, with weights in the smaller range than in the higher range. So, equal variance: beautiful. Normality: wow, that is probably the best evidence for normally distributed residuals that we've seen in any of our videos. The vast majority of the points lie almost directly on the line, so we have very good reason to believe that our residuals are normally distributed. A sketch of how to pull the intercepts and slopes out of the model, and of the residual plots, is shown below.
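A sketch of how you might extract the female and male intercepts and slopes from the fitted model and check the residual plots. The coefficient names depend on the actual column names, so the ones below (SexM, Adjusted.Weight) are assumptions:

    # Coefficients from the model without contrasts
    cf <- coef(BP.lm)
    # Female intercept and slope are the reference level
    intercept.F <- cf["(Intercept)"]
    slope.F     <- cf["Adjusted.Weight"]
    # Male intercept and slope = female value plus the reported difference
    intercept.M <- cf["(Intercept)"] + cf["SexM"]
    slope.M     <- cf["Adjusted.Weight"] + cf["SexM:Adjusted.Weight"]
    # Residual diagnostics: the first plot checks equal variance, the Q-Q plot checks normality
    plot(BP.lm)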
So at this point we are in a position to look at our p-values, except we don't want to do that using the output of this model, because we have unbalanced data and we want to account for that in our analysis. I'm just changing the name of this object to add on T3, to remind ourselves that we've modified the model so that we can calculate our p-values using Type III sums of squares; that's why I've added T3. We're going to add a contrasts argument, which takes a list, and we only need to list the factor in our model. We have one factor, which is sex, so we're going to say that Sex is equal to contr.sum. And there we go; now we can run this model. Adding the contrasts will not influence our residuals, so we don't need to check our assumptions again. Let's go straight to checking our p-values. We're going to use the Anova function with a capital A, which is found in the car library, so I'll load that. Then we'll say Anova, with a capital A, of BP.lm.T3, and we want to say type equals 3, because we want to calculate Type III sums of squares. Just to reiterate what I said in a previous video, I'm teaching you Type III sums of squares at this stage simply because that's the default for many statistical packages; I'm not trying to say that Type III sums of squares is the best approach. So let's now look at our results. Here's what we find. Based on the p-value for the interaction, there is very little evidence to suggest that there is a difference in the slopes between females and males. As far as we can tell, we have every reason to believe that the slopes are similar between females and males. We should also plot the data to corroborate this p-value, and we're going to do that in a little bit. But since at this point we don't have any evidence for an interaction, we can now interpret the p-values for the main effects. These p-values suggest that we have strong evidence for a relationship between weight and systolic blood pressure, but the p-value for sex suggests that there is no difference in blood pressure between females and males after we've accounted for the effect of weight. A rough sketch of this Type III analysis is shown below.
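A minimal sketch of the refitted model with sum-to-zero contrasts and the Type III table, assuming the same column names as before:

    # Refit with sum-to-zero contrasts on the factor for the Type III tests
    BP.lm.T3 <- lm(Syst.BP ~ Sex + Adjusted.Weight + Sex:Adjusted.Weight,
                   data = BP, contrasts = list(Sex = "contr.sum"))
    # Type III sums of squares from the car package
    library(car)
    Anova(BP.lm.T3, type = 3)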
I'm going to stop for just a moment and say: at this point in the analysis, once we've seen that so far we have no evidence for an interaction, we have two different ways to proceed, and I'm going to show you both. To be honest, I'm not entirely settled on which way I prefer; I lean towards the second, for reasons that I will talk about in future videos, but I'll show you both. The first way is probably the most common approach, and it involves running our model again but removing the interaction. We might be justified in doing that because, if our real interest in the biology here is to understand the effect of sex, then we could argue that the interaction is not really the main focus of our analysis: we're not really interested in the effect of weight on blood pressure for its own sake, we're just interested in controlling for the effect of weight so that we can understand the effect of sex better. If that's really the line of thinking driving our analysis, then some people would argue that we're justified in removing the interaction, because it involves something we're not terribly interested in and we have no evidence for it, as indicated so far by this very large p-value. So let's see what happens when we take that approach. We'll do that just by updating the model we had before, and we'll change the object name to say main, just to remind us that we only have the main effects and no interaction. Let's run this model and check our residuals again. We do need to check them, because we are now fitting a different kind of model: when we remove one of the terms, we're changing how we're modelling the data, and we need to make sure that this new model still meets the assumptions. So let's plot these. That still looks fine: equal variance seems fine, normality seems fine. Okay, so we're still satisfied that the data meet these assumptions. (Now, we wouldn't formally have to say type equals 3 here if we only have main effects, but forget I said that.) Let's just run this and see what we get. Now we can see that once we have removed the interaction, we get a very different picture: there is strong evidence, given by a very small p-value, that sex does influence blood pressure. Why do we get these two different answers depending on whether the interaction is in or out of the model? I'm not going to give you a particularly satisfactory explanation here. What I will say is that the reason has to do with something called variance inflation, which we're going to discuss in another series of videos. Again, there are different perspectives on how to model data when you have the potential for variance inflation, so this is another large topic in itself. Given that we get very different views of the effect of sex depending on which model we use, and given that I'm telling you there's this effect of something called variance inflation, you might ask why I'm giving you a dataset that is not straightforward when we're trying to learn how to run a general linear model with a factor and a covariate. Well, the answer is simply that this is the real world. Even with a relatively simple dataset that we might get just by measuring undergraduates in a practical or a lab, we can still get a dataset that requires a fairly sophisticated understanding of how to model the data. A sketch of the refit without the interaction is shown below.
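A sketch of this first approach, refitting without the interaction. Using update() to drop the interaction term is one way to do what is described above; the object names are again my own assumptions:

    # Drop the interaction, keeping the main effects of sex and adjusted weight
    BP.lm.main <- update(BP.lm.T3, . ~ . - Sex:Adjusted.Weight)
    # Re-check the residual diagnostics for the new model
    plot(BP.lm.main)
    # p-values for the main effects
    Anova(BP.lm.main, type = 3)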
So here's my advice. If you were modelling these data on your own and, let's say, as you were playing with the data you tried removing the interaction like this and got this puzzling result that the effect of sex changes dramatically between the two models, you should question that. You should ask what's going on. As a scholar, or a developing scholar or scientist, it is your responsibility to try to understand your data, and how you're modelling your data, as best you possibly can. So if, because you're in the real world, you are analysing a dataset like this and you get results like this, then my advice is that you need to stop and learn more about the techniques that you're using. That's the bottom line, and that advice holds for any kind of test you're working on, not just this kind of situation. That's really why I mention this as a big kind of life lesson in the life of data analysis. Let's step away from that big lesson for a moment and focus back on this particular case, because our goal here is not just to make things confusing; our goal is to learn how we can analyse data when we have both a factor and a covariate in our linear model. Just to step back: I told you earlier that I'm going to show you two different ways of approaching this problem. One of the ways is to do exactly what we've done: if we see that we have a non-significant interaction, or, more generally, if we have data for which there is really no evidence of an interaction at all (and you can base that assessment both on the p-value and on plotting the data), then one legitimate line of thought is to remove the interaction, as we've done, and then look at your results again. We're going to continue with that line of thought so you can learn properly how to analyse your data and understand them from that perspective. So, for this first approach, we've removed the interaction and looked at our results again, and you can see that we now have evidence both for an effect of weight and for an effect of sex. What we'd like to do now is understand these variables in more detail. Specifically, we want to get the slope for weight, and we also want to understand the difference between females and males. So we're going to finish off the analysis from this perspective by doing those things, and then we'll analyse these data again from our second perspective. If we want to understand these terms better, then of course we can use the emmeans function, so let's load the emmeans library. Let's start by getting a sense of the difference between females and males. We want to work with the main-effects model, so we're going to ask emmeans for the estimated marginal means for each sex; that is, we want separate estimates for each sex. I'm going to call the result BP.emmeans.1, because this is our first perspective on the data. We can run this, and let's see what we get. Here are the estimated marginal means for females and males, with standard errors and confidence intervals for those estimated marginal means. Let's stop and think about what these marginal means actually mean and how to interpret them. These are the mean systolic blood pressures for females and males once we have accounted for the effect of our adjusted weight. In other words, these mean values have been adjusted for the influence of body size. If we were just to take the raw mean systolic blood pressure of females and males, we would find that they do not equal these values. Let's do that right now, just to show you: we'll load the doBy library and use summaryBy. You can see that these means for females and males do not simply reflect the raw means; these values have been adjusted for the effect of weight. A sketch of these calls is shown below.
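A sketch of the estimated marginal means for the first approach, and the comparison with the raw group means. Object and column names are assumptions carried over from earlier:

    library(emmeans)
    # Estimated marginal means for each sex, adjusted for adjusted weight
    BP.emmeans.1 <- emmeans(BP.lm.main, ~ Sex)
    BP.emmeans.1
    # Raw (unadjusted) group means for comparison
    library(doBy)
    summaryBy(Syst.BP ~ Sex, data = BP, FUN = mean)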
Now, what we'd like to do is compare females and males; we want to get a sense of the effect size of being one sex or the other. We can do that, as we've already learned, using the pairs function. And there we go: we can see that the difference between females and males in their blood pressure, having accounted for the effect of weight, is about -9.25. It's negative because males have a higher systolic blood pressure than females, so subtracting the male value from the female value gives a negative number. So the difference between females and males in blood pressure, once we've accounted for the effect of weight, is this estimate, and we now have a standard error for it. These would be the effect sizes we could report if we wanted to write this up in some form of report or publication. We can also get our confidence intervals, and now we have confidence intervals for that difference between females and males. So that's how we can understand the effect of sex. We'd also like to get the slope, and to do that we can use a different function within the emmeans library, called emtrends. We give it our model and tell it that we want the slope separately for each sex. Now, the answer is that the slope is going to be the same for each sex, because we've removed the interaction from the model; since there's no interaction, we're forcing the slope to be the same for females and males. But this is still useful for you to see. What we can do with the emtrends function is calculate the slope for the effect of adjusted weight, separately for each sex. One of the reasons I'm showing you this is that if we'd gotten very different results from our data, if instead of finding no interaction we had found evidence for an interaction, then this emtrends output would be the main result we would be looking to report. Remember that if you have a significant interaction, you cannot interpret the main effects on their own. So if you had a significant interaction between a covariate and a factor, really the best thing you can do to understand that interaction is to characterise the slopes for each of the different levels of that factor, and that is what we would do here. So telling you this now lets me say what you would do if we'd gotten completely different results. But we can also use it in this case. You can see that this value of 0.354 is our slope for the relationship between adjusted weight and systolic blood pressure, and we get exactly the same value for females and males because, as I said earlier, there is no interaction in our model: we're forcing the effect of weight to be the same for each of our two levels of sex. That's one way of getting the slope. These steps are sketched below.
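A sketch of the pairwise contrast, its confidence interval, and the common slope from emtrends, using the assumed names from before (the variable name passed to emtrends must match the actual column name):

    # Effect size for sex: pairwise difference in adjusted means (F - M)
    pairs(BP.emmeans.1)
    confint(pairs(BP.emmeans.1))
    # Slope of systolic blood pressure on adjusted weight, estimated per sex
    # (identical for both sexes here because the model has no interaction)
    emtrends(BP.lm.main, ~ Sex, var = "Adjusted.Weight")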
Alternatively, we could run our model again, but this time without the contrasts, because we can also get our slope directly from the output of that model. I've taken out the contrasts because, if you remember, including contrasts in the lm function changes the interpretation of the coefficients. So if we wanted to get the slope for the relationship between weight and systolic blood pressure, we could get it simply by running the model without contrasts and then looking at that coefficient. You can see that the estimated value of the slope is 0.3544, which is the same as we got above. That's what I wanted to show you. So that is how we would analyse our data from the first perspective: since we don't have a significant interaction, or any evidence for an interaction, we could be justified in removing the interaction, running the model again, and focusing on characterising the main effects, by getting the slope for the covariate, determining the adjusted mean values for each level of our factor, and getting an effect size for the difference between the levels of our factor. That's our first way of analysing these data. Let's now go back to our original model, which we had up above; I'm just going to paste it again down here so we don't have to keep scrolling up so much. Let's run that again and remind ourselves of the results. What we could do is this: even though we don't have evidence for an interaction, we could leave the interaction in the model and still try to understand the main effects, the main effect of sex and the main effect of our covariate, weight. We can focus on trying to understand these main effects because so far we have no evidence for an interaction between these two terms in our model. I'll say that again: because we have no evidence for an interaction, our next step is to try to understand the main effects of sex and adjusted weight. Earlier we did that by removing the interaction term from the model and then focusing on the effects of sex and weight. In our second approach, we can instead go directly to looking at our effect sizes using the emmeans function, while still using the model that includes the interaction. We would not do this if we had any real reason to believe there was a difference between the slopes for our two sexes. So if, from plotting the data or from the p-value, you had any reason to believe that there could be a difference between the slopes for the two sexes, then you would be very wary of taking the approach I'm going to take next. But because we have no evidence for an interaction, which I'll show you more of in a moment, this is a fine thing to do. So let's focus now on understanding these main effects, and we're going to do that using emmeans. Let's start again by looking at the effect of sex. Oops, didn't mean to type down there. We'll say BP.emmeans.2, because we're taking our second approach, and that will be emmeans of BP.lm.T3, which is the model we want, and we want to characterise the estimated marginal means for each of our levels of sex. There we go. Now we get a warning, because we have an interaction in the model, but we're going to ignore it for now because, based on our p-value, we have no reason to believe that there actually is an interaction. Later on we're going to estimate the slopes for each of our two sexes, and you'll see from that, again, that there's every reason to believe there is no interaction between sex and weight. So, because we have no evidence for an interaction, we're going to ignore this warning. A sketch of this second approach is shown below.
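A sketch of the second approach: asking emmeans for the sex means directly from the model that keeps the interaction. As above, the names are assumptions; emmeans will note that the results may be misleading because of the interaction, which is the warning discussed here:

    # Estimated marginal means for sex from the model that retains the interaction
    BP.emmeans.2 <- emmeans(BP.lm.T3, ~ Sex)
    BP.emmeans.2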
So let's look at our estimated marginal means. Here they are: again we get values of about 108 and 117. I can't remember exactly what we got before, so let's compare the standard errors specifically. Here we have standard errors of about 0.72 and 1.46, so our standard errors are very slightly different between the output from these two models. That's not unexpected, because the models are doing different things: in one case we forced the effect of weight to be the same for the two groups, and in the case we're working with now we've allowed the effect of weight to differ between the two groups. So it's not surprising that our estimates of the marginal means are very slightly different between the two approaches. We could report these estimated marginal means and their standard errors. We also want an effect size, which we can get just by calling pairs, and there is our effect size, just like we got before. If we want a confidence interval, we can get it by passing the output of pairs directly to confint. And if I remember right, the confidence intervals here are pretty similar to what we got above in the first analysis. So that's how we can understand the effect of sex even when we keep the interaction in our model. And when we do this, emmeans is telling us we have strong reason to believe that there is an effect of sex: the p-value for our contrast between females and males indicates strong evidence for a difference in blood pressure between females and males, and we reach the same conclusion from the confidence interval for that difference. So emmeans is telling us, yes, we have good reason to believe there's a difference between females and males, even though the p-value we got much earlier, from the Type III table for the model with the interaction, told us there was no effect of sex. And again, if you remember, the reason we get this contradictory result is something called variance inflation, which we're not going to talk about in this video. So that is how we would characterise the effect of sex using the second approach. If we want to calculate the slopes, then again we use our emtrends function: we use it on this model, we calculate the slope separately for each sex, and the variable for which we want the slope is adjusted weight. Looking at that output, we can see that we're now getting two separate estimates of the slope for the relationship between weight and systolic blood pressure: one slope for females and one for males. And you can see that there's very little difference between them: for females the slope is roughly 0.35, and for males it's roughly 0.36, and based on the standard errors there's really no reason to believe that these slopes differ from one another. These calls are sketched below.
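A sketch of the effect size, confidence interval, and per-sex slopes for the second approach; again, the variable name given to emtrends is my assumption about the column name:

    # Effect size for sex and its confidence interval, keeping the interaction
    pairs(BP.emmeans.2)
    confint(pairs(BP.emmeans.2))
    # Separate slope estimates for females and males
    emtrends(BP.lm.T3, ~ Sex, var = "Adjusted.Weight")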
So when I said earlier that we can use additional evidence to decide whether or not there's an interaction, this is what I was referring to. If we calculate the slope separately for females and males, we can see that our estimates of these slopes are remarkably similar. So, based both on our estimates of the slopes and on the p-value we got from our Type III sums of squares, both lines of reasoning suggest that there's no sign of an interaction. That is largely how we would go about analysing these data, and I've given you two different approaches for analysing this dataset. Let's finish by asking how we would report these results. Let me just check my list to make sure I've talked about everything I wanted to; I think I've covered everything, so we can now talk about how we could report these results, and I'll show you how you could report them under each approach. This first way of reporting the data is what we might report if we had removed the interaction. First of all, we'd present a nice figure like this (that's the thing I forgot to show you; after we walk through these results I'll go back and show you the code to create a figure like this). We could present a figure where we colour-code the female and male data points differently, and colour-code the lines we fit through the data, and these lines have the same slope. We could give a figure caption like this, where we say: systolic blood pressure of female (red) and male (blue) first-year biomedical students as a function of adjusted weight, with lines estimated from a general linear model that included sex and adjusted weight. Oops, that's a typo: forget that it says "and their interaction"; we should remove that. Then we're just going to present one slope, because as we saw earlier, if we remove the interaction we only get one estimate of the slope, and that's what we report here: the slope with its standard error and a 95% confidence interval. Then we could report our main results with something like this. I'm not going to read through all of it, because that would put everyone to sleep; if you want to know exactly what it says, just pause the video and have a look. What I want to point out, however, is that I'm reporting the F statistic, with its degrees of freedom, for each of the different terms in our model. I'll just read the first sentence: a general linear model including sex, adjusted weight and their interaction indicated that the slope of the relationship between adjusted weight and systolic blood pressure did not differ between females and males. So, referring to the interaction, here's the F value for that interaction and its degrees of freedom. You can get the degrees of freedom from the output of Anova with a capital A; the output of anova with a small a will also give you the degrees of freedom that you need. We need the degrees of freedom for our interaction and for the residuals.
And then we present our p-value, and I've referred to the figure. Later on in this write-up I report the F values with their degrees of freedom. Whoops, it looks like I've made a typo somewhere: these degrees of freedom, I just noticed, are 362 in one place and 361 in another; apologies for these typos. We report the F values and p-values for those main effects, and then we go on to describe the effect of sex and the slope. So we say that the marginal means of females and males are equal to this and that, with their standard errors, and then we give our effect size: after accounting for the effect of adjusted weight, the mean systolic blood pressure for females was this much smaller than for males. So we give our effect size with its standard error, and we can give confidence intervals for that effect size as well. I've not reported the slope here because we did that on the previous slide, in the figure caption. And now we can report something very similar based on the approach where we keep the interaction in the model; I'll just point out the differences. First, if we keep the interaction we can calculate two different slopes for females and males, so I report the slope for females and the slope for males in the figure caption; that's the main difference there. And in the write-up, which again you can pause to read, the main thing I've changed is that I'm not reporting the p-values for the main effects; instead, we go directly to the effects from our contrasts. And that's pretty much where we will end, except for showing you the code for creating that figure. I had hidden it from myself earlier, so here it is: we can create this plot just by running this code, most of which should be familiar to you by now. I've already explained, within the plot function, how you can colour-code the dots using the ifelse statement. Previously we've talked about how you can control the range of the y-axis with ylim, by giving a vector that specifies the minimum and maximum for the y-axis. I've labelled the x and y axes with the xlab and ylab options; this time I decided not to use the mtext function. And this option here lets us change the font size of the axis labels. The main new thing I wanted to show you is that we can add lines to our plot using the function abline, where we list first the y-intercept and then the slope. You'll remember that we calculated the y-intercepts way back earlier in this video, when I told you to remember them; we're now making use of those y-intercepts to plot these lines on our figure. So the abline function takes the y-intercept as its first argument and then the slope; that's all it needs, but you can add other options, like the colour of the line, the thickness of the line, the line type, and so on. Here I've only changed the colour, and I've done that separately for the male data and the female data. A rough sketch of that plotting code is given below. And so that's how you can create a plot like this. We'll stop the video there. We've covered tons of ground; I hope this has been helpful, and I'll say thank you very much.
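For reference, here is a rough sketch of plotting code along the lines described above. The intercepts and slopes are taken from the coefficients of the interaction model calculated earlier, and the column names, axis limits, units, and the level code "F" are assumptions rather than the exact values used in the video:

    # Scatter plot with points coloured by sex
    plot(Syst.BP ~ Adjusted.Weight, data = BP,
         col = ifelse(BP$Sex == "F", "red", "blue"),
         ylim = c(80, 180),                        # assumed axis range
         xlab = "Adjusted weight (kg)",            # assumed units
         ylab = "Systolic blood pressure (mm Hg)", # assumed units
         cex.lab = 1.2)                            # larger axis labels
    # Fitted lines: abline(intercept, slope), using the coefficients from the model
    cf <- coef(BP.lm)
    abline(cf["(Intercept)"], cf["Adjusted.Weight"], col = "red")   # female line
    abline(cf["(Intercept)"] + cf["SexM"],
           cf["Adjusted.Weight"] + cf["SexM:Adjusted.Weight"],
           col = "blue")                                            # male line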