Non-Parametric: Wilcoxon's Signed-Rank Test for One Sample, with Examples

Non-Parametric Test

Wilcoxon's signed-rank test:

In the previous post we looked at the sign test. The sign test converts each observation into a plus (+) or minus (-) sign and ignores the magnitude of the observation. That is acceptable when the variable is measured on an ordinal scale, but when the data are measured on an interval or ratio scale it is a drawback, because the information about magnitude is thrown away. Wilcoxon's signed-rank test overcomes this drawback by using both the sign and the magnitude of each difference. Because it uses more information than the sign test, it is generally more powerful than the sign test when its assumptions hold. It is the non-parametric alternative to the one-sample t-test.

Assumptions: Wilcoxon's signed-rank test requires the following assumptions.

1. The sample is selected at random from a population with unknown median.

2. The variable under study is continuous and its distribution is symmetric about the median.

3. The scale of measurement is at least interval.

Procedure: The procedure of Wilcoxon's signed-rank test.

Let X1, X2, X3, ..., Xn be a random sample of size n, arranged in order of occurrence, from a population with unknown median m. We wish to test the hypothesis that the population median equals a specified (hypothetical) value m0.

Hypothesis: The null and alternative hypotheses are

H0: m = m0   vs

H1: m ≠ m0

The test consists of the following steps.

Step I: Subtract m0 from each observation to obtain the differences with their plus or minus signs, i.e. di = Xi - m0.

If an observation is equal to m0 (so that di = 0), discard it from the data. The sample size is reduced accordingly, and the reduced sample size is denoted by n.

Step II: Take the absolute value of each difference, |di|.

Step III: Rank the |di| from smallest to largest: the smallest gets rank 1, the second smallest gets rank 2, and so on. If ties occur, assign each tied observation the average of the ranks the tied group would otherwise occupy.

Step IV: Attach the original sign of each difference to its rank.

Step V: Calculate the sum of the positive ranks, denoted T+, and the sum of the negative ranks (taken without sign), denoted T-. As a check, T+ + T- = n(n+1)/2.

Step VI: The test statistic is the smaller of T+ and T-. It is compared with the critical value for the given level of significance, obtained from the table of critical values for Wilcoxon's signed-rank test.
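Steps I to V can be sketched in a few lines of Python. This is only an illustrative sketch: the function name signed_rank_sums and the toy data are assumptions made for this example and are not part of the original procedure.

```python
def signed_rank_sums(data, m0):
    """Return (T_plus, T_minus, n) for a one-sample Wilcoxon signed-rank test."""
    # Step I: differences from the hypothesised median; zero differences are discarded.
    diffs = [x - m0 for x in data if x != m0]
    n = len(diffs)

    # Steps II-III: rank the absolute differences, averaging the ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + 1 + end + 1) / 2  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1

    # Steps IV-V: attach the signs and sum the positive and negative ranks.
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return t_plus, t_minus, n


# Hypothetical toy data, chosen only to illustrate the steps.
sample = [12, 15, 9, 10, 14, 7]
t_plus, t_minus, n = signed_rank_sums(sample, m0=10)
print(t_plus, t_minus, n, min(t_plus, t_minus))
```

The observation equal to m0 is dropped, so the toy sample ends up with n = 5, and the two sums always add up to n(n+1)/2.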

The decision rule depends on the number of observations, i.e. on the sample size n.

1. If the number of observations is less than or equal to 25, we use the small-sample test.

For the small-sample test, the test statistic is

T = Min(T+, T-)

If the calculated T is less than or equal to the critical value Tα/2 at the α% level of significance, i.e. T ≤ Tα/2,

then we reject the null hypothesis at the α% level of significance. Otherwise we do not reject the null hypothesis.
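If a printed table is not at hand, the small-sample critical value can be computed directly. Under the null hypothesis (and with no ties) every pattern of plus and minus signs is equally likely, so T+ behaves like the sum of a randomly chosen subset of the ranks 1 to n. The sketch below counts those subsets; the function name wilcoxon_critical_value is an assumption made for this example. It uses the "reject if T ≤ critical value" convention, and some printed tables use a slightly different convention, so their entries can differ by one.

```python
def wilcoxon_critical_value(n, alpha=0.05):
    """Largest c with 2 * P(T <= c) <= alpha under H0 (no ties); None if there is none."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of the ranks 1..n whose sum is s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]

    total = 2 ** n  # every sign pattern is equally likely under H0
    critical = None
    cumulative = 0
    for c in range(max_sum + 1):
        cumulative += counts[c]
        if 2 * cumulative / total <= alpha:
            critical = c  # two-sided tail probability is still within alpha
        else:
            break
    return critical


for size in (10, 13):
    print(size, wilcoxon_critical_value(size, alpha=0.05))
```

For very small n the function returns None, because even T = 0 is not rare enough to reject at the chosen level.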

2. If the number of observations is greater than 25, we use the large-sample test.

For the large-sample test, the test statistic T is approximately normally distributed with mean

E(T) = {n(n+1)}/4

and variance Var(T) = {n(n+1)(2n+1)}/24

Therefore the test statistic of the Z-test is

Z = {T - E(T)}/S.D.(T), where S.D.(T) = √Var(T), and Z has approximately the standard normal distribution.

We calculate Z and compare it with the critical value at the α% level of significance, and take the decision about the hypothesis.

If the calculated Z value lies in the non-rejection region, we do not reject the null hypothesis. Check the signs of these values carefully.

e.g. if the calculated value of Z = -0.31 and the critical values are ±1.96,

then the calculated value -0.31 is greater than -1.96 and less than +1.96, which means it lies in the non-rejection region; hence we do not reject the null hypothesis.
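The large-sample formulas translate directly into code. The sketch below is a minimal version that assumes no correction for ties or continuity, exactly as in the formulas above; the function names wilcoxon_z and reject_two_sided are assumptions made for this example.

```python
import math


def wilcoxon_z(t, n):
    """Normal approximation for the signed-rank statistic T with n non-zero differences."""
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    return (t - mean) / math.sqrt(variance)


def reject_two_sided(z, z_critical=1.96):
    """True when z falls outside the interval (-z_critical, +z_critical)."""
    return abs(z) > z_critical


# The illustration from the text: Z = -0.31 against critical values of +-1.96.
print(reject_two_sided(-0.31))
```

Comparing the absolute value of Z with 1.96 avoids the sign confusion that arises when a negative Z is compared with -1.96.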

Examples of Wilcoxon's signed-rank test for one sample.

Ex. 1. A random sample of 15 children of one month or older shows the following pulse rates (beats per minute): 119, 120, 125, 122, 118, 117, 126, 114, 115, 126, 121, 120, 124, 127, 126. Assuming that the distribution of pulse rate is continuous and symmetric about its median, test whether the median pulse rate is 120 beats per minute at the 5% level of significance.

Answer: Only the values are given; the distribution of pulse rate is not given, so the assumption of normality is not established and we do not use a parametric test. The assumptions of Wilcoxon's test are fulfilled, so we use Wilcoxon's signed-rank test.

We want to test whether the median pulse rate is 120 or not, so we define the null and alternative hypotheses as

H0: m = 120  (i.e. the median pulse rate is 120)

vs

H1: m ≠ 120  (i.e. the median pulse rate is not equal to 120)

The test statistic is T = Min(T+, T-)

where T+ = sum of positive ranks and T- = sum of negative ranks.

We now calculate T+ and T- using the table.

Sr. No.   Pulse rate X   Di = (Xi - m)   |Di|   Rank   Signed Rank
1         119            -1              1      1.5    -1.5
2         120            -               -      -      -
3         125            5               5      7.5    7.5
4         122            2               2      3.5    3.5
5         118            -2              2      3.5    -3.5
6         117            -3              3      5      -5
7         126            6               6      10.5   10.5
8         114            -6              6      10.5   -10.5
9         115            -5              5      7.5    -7.5
10        126            6               6      10.5   10.5
11        121            1               1      1.5    1.5
12        120            -               -      -      -
13        124            4               4      6      6
14        127            7               7      13     13
15        126            6               6      10.5   10.5

From the above table we calculate T+ and T-.

T+ = 7.5 + 3.5 + 10.5 + 10.5 + 1.5 + 6 + 13 + 10.5 = 63

T- = 1.5 + 3.5 + 5 + 10.5 + 7.5 = 28

Here n = total number of plus and minus signs = 13 (the two observations equal to 120 are discarded). As a check, T+ + T- = 91 = 13(14)/2.

Since n is not greater than 25 (i.e. n ≤ 25), we use the small-sample test.

The test statistic is T = Min(T+, T-)

T = Min(63, 28) = 28

This is the calculated value of the T statistic. The critical value for n = 13 at the 5% level of significance (two-sided), from the table, is 17.

Comparing these values, the calculated value is greater than the critical value (i.e. 28 > 17), so we do not reject the null hypothesis. Hence we conclude that the data do not provide evidence that the median pulse rate differs from 120.
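The arithmetic of Ex. 1 can be re-checked from the raw pulse rates. The sketch below is only a cross-check under stated assumptions: the variable names are chosen for this example, and the critical value is entered by hand as the two-sided 5% value for n = 13 under the "reject if T ≤ critical value" convention.

```python
from collections import Counter

pulse = [119, 120, 125, 122, 118, 117, 126, 114, 115, 126, 121, 120, 124, 127, 126]
m0 = 120

# Differences from the hypothesised median; zero differences are discarded.
diffs = [x - m0 for x in pulse if x != m0]
n = len(diffs)

# Average rank for each distinct absolute difference.
counts = Counter(abs(d) for d in diffs)
rank_of = {}
next_rank = 1
for value in sorted(counts):
    k = counts[value]
    rank_of[value] = next_rank + (k - 1) / 2  # mean of next_rank .. next_rank + k - 1
    next_rank += k

t_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
t_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
t = min(t_plus, t_minus)

critical_value = 17  # assumed: two-sided 5% table value for n = 13
reject = t <= critical_value
print(n, t_plus, t_minus, t, reject)
```

Because the data contain ties, the tabulated critical value is only approximate here, but T is far enough from it that the decision is not affected.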


Ex. 2. The following data show the weights (in kg) of 34 college students.

49, 50, 51, 48, 47, 48, 46, 47, 45, 25, 65, 59, 58, 47, 49, 46, 41, 40, 58, 49, 57, 45, 85, 48, 48, 47, 69, 58, 64, 57, 59, 52, 51, 42.

Test whether the median weight of the students is 50 kg at the 5% level of significance.

Answer: Only the values are given; the distribution of weight is not given, so the assumption of normality is not established and we do not use a parametric test. The assumptions of Wilcoxon's test are fulfilled, so we use Wilcoxon's signed-rank test.

H0: m = 50 (i.e. the median weight is 50 kg)

H1: m ≠ 50 (i.e. the median weight is not equal to 50 kg)

The test statistic is T = Min(T+, T-)

where T+ = sum of positive ranks and T- = sum of negative ranks.

We now calculate T+ and T- using the table.

Sr. No.   Weight X   Di = (Xi - m)   |Di|   Rank   Signed Rank
1         49         -1              1      3      -3
2         50         -               -      -      -
3         51         1               1      3      3
4         48         -2              2      8      -8
5         47         -3              3      12.5   -12.5
6         48         -2              2      8      -8
7         46         -4              4      15.5   -15.5
8         47         -3              3      12.5   -12.5
9         45         -5              5      17.5   -17.5
10        25         -25             25     32     -32
11        65         15              15     30     30
12        59         9               9      26     26
13        58         8               8      22.5   22.5
14        47         -3              3      12.5   -12.5
15        49         -1              1      3      -3
16        46         -4              4      15.5   -15.5
17        41         -9              9      26     -26
18        40         -10             10     28     -28
19        58         8               8      22.5   22.5
20        49         -1              1      3      -3
21        57         7               7      19.5   19.5
22        45         -5              5      17.5   -17.5
23        85         35              35     33     33
24        48         -2              2      8      -8
25        48         -2              2      8      -8
26        47         -3              3      12.5   -12.5
27        69         19              19     31     31
28        58         8               8      22.5   22.5
29        64         14              14     29     29
30        57         7               7      19.5   19.5
31        59         9               9      26     26
32        52         2               2      8      8
33        51         1               1      3      3
34        42         -8              8      22.5   -22.5

The four differences with |Di| = 8 come after the two with |Di| = 7 (ranks 19 and 20), so they occupy ranks 21 to 24 and each receives the average rank 22.5.

From the above table we calculate T+ and T-.

T+ = 295.5

T- = 265.5

T = Min(295.5, 265.5) = 265.5

Here n = total number of plus and minus signs = 33 (the one observation equal to 50 is discarded). As a check, T+ + T- = 561 = 33(34)/2.

Since n is greater than 25 (i.e. n > 25), we use the large-sample test.

For the large-sample test, the test statistic T is approximately normally distributed with mean

E(T) = {n(n+1)}/4 = 33(34)/4 = 280.5

and variance Var(T) = {n(n+1)(2n+1)}/24 = (33*34*67)/24 = 3132.25

Therefore the test statistic of the Z-test is

Z = {T - E(T)}/S.D.(T) = {265.5 - 280.5}/√(3132.25) = -15/55.97 = -0.2680

where S.D.(T) = √(Var(T)).

We compare the calculated Z with the critical value at the 5% level of significance.

The calculated value of Z = -0.2680 and the critical values are ±1.96.

The calculated value -0.2680 is greater than -1.96 and less than +1.96, which means it lies in the non-rejection region; hence we do not reject the null hypothesis. The data do not provide evidence that the median weight differs from 50 kg.
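With 33 non-zero differences and many ties, the ranking is easy to get wrong by hand, so it is worth recomputing the whole of Ex. 2 from the raw weights. The sketch below does that; the variable names are assumptions made for this example, and the normal approximation is used without any tie or continuity correction, matching the formulas in the text.

```python
import math
from collections import Counter

weights = [49, 50, 51, 48, 47, 48, 46, 47, 45, 25, 65, 59, 58, 47, 49, 46, 41,
           40, 58, 49, 57, 45, 85, 48, 48, 47, 69, 58, 64, 57, 59, 52, 51, 42]
m0 = 50

# Differences from the hypothesised median; zero differences are discarded.
diffs = [x - m0 for x in weights if x != m0]
n = len(diffs)

# Average rank for each distinct absolute difference.
counts = Counter(abs(d) for d in diffs)
rank_of = {}
next_rank = 1
for value in sorted(counts):
    k = counts[value]
    rank_of[value] = next_rank + (k - 1) / 2  # mean of the k ranks shared by the tie
    next_rank += k

t_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
t_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
t = min(t_plus, t_minus)

# Large-sample normal approximation.
mean = n * (n + 1) / 4
variance = n * (n + 1) * (2 * n + 1) / 24
z = (t - mean) / math.sqrt(variance)
reject = abs(z) > 1.96

print(rank_of)
print(n, t_plus, t_minus, round(z, 4), reject)
```

Printing rank_of shows the average rank given to each distinct |Di|, which makes it easy to compare against the Rank column, and T+ + T- should equal n(n+1)/2.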

