Statistical Significance in Hypothesis Testing

You started taking vitamin D a couple of weeks ago, and you notice that it takes you less time to fall asleep at night. Is that a result of the vitamin D, or is something else helping you fall asleep more easily?

So you decide to do an experiment.

You take 50 coworker volunteers and split them into two groups: one group of 25 coworkers gets vitamin D, the same one you took, while the other group takes a placebo.

You notice that the people who took the real vitamin did fall asleep faster. Was the vitamin D the cause?

It could be, but it could also be something else. Maybe they share a project that is going well, so they fall asleep more easily, while the others are having a hard time. In that case the difference has nothing to do with whether they take vitamin D or not.

This is where hypothesis testing and significance come into play.

Hypothesis testing is almost what we did here with the experiment, but we also want to know: was the vitamin D really related to the change in sleep behavior, or not?

Basically, we want to answer this question: how unlikely was it for the sleep pattern to change by chance alone? If such a change is likely anyway, because sleep patterns do change for these coworkers, then we cannot explain it with the vitamin D.

If sleep patterns change from time to time for our coworkers, then it is not so strange that they changed while taking the vitamin D. In other words, it is just a thing that happens.

We have two things to check in order to estimate whether the change was random or not.

1. Check how much the sleep pattern changed. Did they go from 30 minutes to fall asleep to 30 seconds? If that's the case, we might be onto something, because this change is huge! If it only went from 30 minutes to 29 minutes, then this could well be a random thing. The bigger the change, the more significant we say it is, and the higher the number we give this significance.

So the first thing we check is how much change we had.

2. The second thing we check, in order to estimate whether this change was just due to some random stuff going on in their lives or could actually be related to the pill we gave them, is whether the set of results we got from them is very spread out or not.

For example, if every one of them reduced the time to fall asleep by exactly x minutes, then we have no variance in the results, and it looks more likely that the change is related to the pill.

However, if one person reduced it by one second, another by 29 minutes, and for a third it went up, then we have a lot of variance, and it is less likely that we can deduce something about the pill with strong significance.

Therefore, we measure two things in order to check whether we can make any claims about the hypothesis. First, how far our average value is from the original average value they had: the farther it is, the bigger the effect and the more significant the result. Second, how much our measurements vary relative to one another: the more spread out they are, the less we can say the result is significant.

So when we check whether we can make any claims about our hypothesis with high significance, what we are doing is looking at the results we got, comparing them to the original result to see how far apart they are, and looking at how spread out the results themselves are.
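The two ingredients described above can be sketched in a few lines of Python. This is only an illustration: the minutes-to-fall-asleep numbers are made up, and `change_and_spread` is a hypothetical helper name, not a function from a statistics library.

```python
import statistics

def change_and_spread(before, after):
    """Return (average change, standard deviation of the individual changes)."""
    changes = [a - b for a, b in zip(after, before)]
    return statistics.mean(changes), statistics.stdev(changes)

# Made-up data: minutes to fall asleep before and after taking the pill.
steady_before = [30, 32, 28, 31, 29]
steady_after = [20, 22, 18, 21, 19]   # everyone improves by 10 minutes

mixed_before = [30, 30, 30, 30, 30]
mixed_after = [1, 35, 29, 10, 25]     # improvements are all over the place

steady = change_and_spread(steady_before, steady_after)
mixed = change_and_spread(mixed_before, mixed_after)
print("steady group:", steady)
print("mixed group: ", mixed)
```

Both made-up groups improve by 10 minutes on average, but the changes in the second group are far more spread out, so the same average tells us much less about the pill.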

However, I skipped something important that I haven't told you yet, and it stands at the basis of the significance check. When we look at the standard deviation of our data, we don't just look at the standard deviation of the measurements; we look at the standard deviation of the average value of all our measurements.

StdDev of the Average of Measurements

Why is that? Why do we care about the stddev of the average of the measurements, and not the stddev of the measurements themselves, when talking about significance?

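This is because it has been proven, and you can also see intuitively, that the standard deviation of the mean decreases when we have more measurements: for n independent measurements, it is the standard deviation of a single measurement divided by the square root of n.

On top of that, it almost does not matter what the source distribution of the measurements is: when you look at the average of enough measurements, it behaves approximately like a normal distribution! This is the central limit theorem, and it holds as long as the measurements are independent and their variance is finite.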
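The shrinking spread of the average can be checked with a small simulation. This is a sketch under stated assumptions: the population (30 minutes on average with a standard deviation of 10) is invented for illustration, and `spread_of_sample_means` is a hypothetical helper name.

```python
import random
import statistics

def spread_of_sample_means(sample_size, repetitions=2000, seed=0):
    """Repeat an experiment many times and return the std dev of its averages."""
    rng = random.Random(seed)
    means = []
    for _ in range(repetitions):
        # Assumed population: minutes to fall asleep, mean 30, std dev 10.
        sample = [rng.gauss(30, 10) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)

for n in (5, 25, 100):
    print(f"n={n:3d}  stddev of the average: {spread_of_sample_means(n):.2f}")
```

The printed values should land close to 10 divided by the square root of n, so each individual measurement stays as noisy as before while the average gets steadier as the group grows.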

If you toss a coin and call heads 1 and tails 0, then when you toss it 10 times, the chance of getting heads or tails is 0.5 for each coin toss. It's uniform.

However, if you toss a coin 2 times and sum the results, then toss it another 2 times and sum the results, and so on, things look different. With the same 10 tosses, if you add up all the results, the sum could be 10, but it could also be 5 or 0; it is no longer only 1 or 0 as it is for a single toss. Sums near the middle are also more likely than the extremes, because more combinations of heads and tails produce them.

So if we look at the average of these coin tosses, we get approximately the normal distribution.

This is because we look at the average, or the sum, of the results. When you look at sums and averages, there are multiple possible outcomes, and with enough measurements they behave approximately in the form of the normal distribution.
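The coin example can be tried out directly. The following is a minimal sketch, assuming 10 tosses per experiment and 10,000 experiments; both numbers, and the helper name `simulate_toss_sums`, are chosen only for illustration.

```python
import random
from collections import Counter

def simulate_toss_sums(tosses_per_experiment=10, experiments=10_000, seed=1):
    """Count how often each total number of heads shows up."""
    rng = random.Random(seed)
    sums = Counter()
    for _ in range(experiments):
        # Heads counts as 1 and tails as 0, as in the text.
        total = sum(rng.randint(0, 1) for _ in range(tosses_per_experiment))
        sums[total] += 1
    return sums

counts = simulate_toss_sums()
for total in range(11):
    # One '#' per 100 experiments gives a rough text histogram.
    print(f"{total:2d} {'#' * (counts[total] // 100)}")
```

Each single toss is still a 50/50 affair, yet the histogram of the sums piles up around 5 and thins out toward 0 and 10, which is the bell shape the text describes.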

So now that we know our averages follow a specific distribution, we can look at our results and deduce things. We can say: hey, this result was really far out in the tail of the normal distribution curve; there was so little chance this would happen by accident that it must be significant.
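To put a number on "so little chance", here is a rough sketch of how the chance of such a result could be computed from the normal curve. It assumes we know the population average and standard deviation (30 and 10 minutes here, both invented), which makes it a z-test; when the standard deviation is estimated from a small sample, Student's t distribution is normally used instead. The function name `two_sided_p_value` is hypothetical.

```python
import math

def two_sided_p_value(sample_mean, population_mean, population_std, n):
    """Chance of a sample average at least this far from the population average."""
    standard_error = population_std / math.sqrt(n)  # std dev of the average
    z = (sample_mean - population_mean) / standard_error
    # For a standard normal variable Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Invented numbers: coworkers usually need 30 minutes (std dev 10),
# and our 25 vitamin D takers averaged 26 minutes.
p = two_sided_p_value(sample_mean=26, population_mean=30, population_std=10, n=25)
print(f"p-value: {p:.3f}")
```

The smaller this number, the harder it is to explain the result as a random fluctuation. A common, though arbitrary, convention is to call a result significant when the value is below 0.05.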

To sum up

Significance is a number, and we calculate this number with formulas. However, there is an intuition behind them. When you look at any of the statistical formulas that compute the significance of our experiment results, you will always see that they check how different the average we got in the experiment is from the population average (if we know it). The bigger this difference, the more significant our result is. However, the higher the variation of the averages we got in our samples, the less stable our sample results are, so it is harder to draw conclusions about the experiment, and therefore we have a lower significance.

