
B. Com. i. OE- II Practical No. 2 Measures of Dispersion – II

 

      Measures of Dispersion – II: Range, Quartile Deviation, Standard Deviation and their respective relative measures for Grouped data.

 

Measures of Dispersion

 

I. Introduction.

II. Requirements of good measures.

III. Uses of Measures of Dispersion.

IV. Methods Of Studying Dispersion:

    i. Absolute Measures of Dispersions

         i. Range (R) 

        ii. Quartile Deviation (Q.D.)

          iii. Standard Deviation (S. D.)

        iv. Variance

   ii. Relative Measures of Dispersions

         i. Coefficient of Range 

        ii. Coefficient of Quartile Deviation (Q.D.) 

        iii. Coefficient of Standard Deviation (S. D.)

        iv. Coefficient of Variation (C.V.) 

                                                                                                              

I. Introduction.

Measures of central tendency such as the Mean, Median and Mode give a single figure that represents the whole data. We now want to study whether this figure (i.e. the measure of central tendency) is a proper representative of the actual values of the data. If most of the actual values are close to the average, it represents the data properly; if the actual values are far from the average, it does not. We are therefore interested in how far the actual values lie from the average, and this is known as Dispersion, i.e. Dispersion means the spread of the actual values from the average or mean.

For example, consider the following data:

           Observations             Total    Mean

Set A :    101    98    99    102   400      100

Set B :      1     0     1    398   400      100

 

The size of the data and the mean of both sets are the same: each set has 4 observations and a mean of 100. The question is whether the mean is a good representative of the data or not. In Set A the values are very close to the mean, so the mean is a good representative. In Set B the values are far from the mean, hence the mean is not a good representative of the data in Set B. From this we see that when the differences between the mean and the actual values are small, the mean represents the data properly (as in Set A). Therefore we measure the variation in the data, that is, the Dispersion.
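The comparison of Set A and Set B can be checked with a short Python sketch. It uses only the standard library; the variable names are chosen here for illustration, and the population standard deviation (pstdev) is used because the post divides by N.

```python
from statistics import mean, pstdev

# The two data sets from the table above.
set_a = [101, 98, 99, 102]
set_b = [1, 0, 1, 398]

for name, data in (("Set A", set_a), ("Set B", set_b)):
    spread = max(data) - min(data)  # range = largest value - smallest value
    print(f"{name}: mean = {mean(data)}, range = {spread}, S.D. = {pstdev(data):.2f}")
```

Both sets report the same mean, but the range and S.D. of Set B are far larger, which is exactly why the mean alone is not enough to describe the data.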

    i. Definition:- Dispersion is the spread of the values from the mean, or the deviation of the different values of the data from its mean is known as Dispersion.

Following are the objectives of Measures of Dispersion.

    i. To measure the reliability of an average.

    ii. To compare the variability of different distributions.

    iii. To control the variability.

                                                                                                              

II. Requirements of good measures.

The main objective of Dispersion is to measure the reliability of an average. Following are the properties of a good measure of dispersion.

        i. It should be simple to understand and rigidly defined.

        ii. It should be easy to calculate.

        iii. It should be based on all the observations in the data.

        iv. It should have sampling stability.

        v. It should not be unduly affected by extreme values.

                                                                                                              

III. Uses of Measures of Dispersion.

A measure of dispersion is also known as a measure of variability or spread. Following are some uses of Measures of Dispersion.

    1. Understanding the distribution of data: A measure of dispersion helps us understand how the data points are spread out around the central value (the mean, median or mode). A small dispersion indicates that the data points are close to the central value, while a large dispersion indicates large variability in the data set.

    2. Comparing data sets: Measures of dispersion show which data set has the greater variability, so data sets can be compared on the basis of the variability of their data points.

                                                                                                              

IV. Methods Of Studying Dispersion:

There are two types of measures of dispersion: i) Absolute Measures of Dispersion and ii) Relative Measures of Dispersion.

    i) Absolute Measures of Dispersion: Measures of dispersion that are expressed in the original units of the data are called Absolute Measures of Dispersion. Following are the Absolute Measures of Dispersion.

        i. Range (R) 

        ii. Quartile Deviation (Q.D.) 

        iii. Mean Deviation (M.D.)

        iv. Standard Deviation (S. D.)

        v. Variance

 

    ii) Relative Measures of Dispersion: Measures of dispersion that are expressed as a ratio or percentage are called Relative Measures of Dispersion. Following are the Relative Measures of Dispersion.

        i. Coefficient of Range 

        ii. Coefficient of Quartile Deviation (Q.D.) 

        iii. Coefficient of Mean Deviation (M.D.)

        iv. Coefficient of Standard Deviation (S. D.)

        v. Coefficient of Variation (C.V.) 

                                                                                                              

We now study the measures of dispersion one by one.

I. Range

Definition:- The range is one of the simplest methods of measuring Dispersion. It is defined as the difference between the largest and smallest (maximum and minimum) values in the data set.

It is formulated as

Range = L - S

where L:- the largest value in the data

S:- the smallest value in the data.

Sometimes the Range is denoted as R.

It is an absolute measure of Dispersion.

The relative measure of Dispersion corresponding to the Range is called the coefficient of range. It is given by

Coefficient of range = (L-S)/(L+S)

📑 MERITS OF RANGE

I. It is simple to understand.

II. It is easy to calculate. 

📑DEMERITS OF RANGE

I. It has no sampling stability.

II. It is affected by extreme values.

The range is the simplest measure and gives a quick idea about the spread in the data. A large range indicates wider variability, and a smaller range indicates a narrower spread (i.e. the data points in the data set are closer together and there is less dispersion). The greatest drawback of the range is that it is affected too much by extreme values.

Examples on Range for Grouped data:

 

Find the Range and Coefficient of range  for the following data.

 

I.Q.     60-70   70-80   80-90   90-100   100-110   110-120

Freq.    7       12      28      42       30        10

 

Aim: To find the Range and Coefficient of range for the given data.

Formula:

Range = L – S

Coefficient of range = (L-S)/(L+S)

Calculation :

To find the range of grouped data we take the difference between the upper limit of the highest class and the lower limit of the lowest class.

The upper limit of the highest class = 120

and the lower limit of the lowest class = 60

Range = 120-60  = 60

Range = 60

and coefficient of range = (120-60) / (120+60)   = 60 / 180 = 0.3333

 

The Range = 60 and Coefficient of Range = 0.3333

Result: Range = 60 and Coefficient of Range = 0.3333
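The calculation above can be reproduced with a small Python sketch. The function name grouped_range and the list-of-tuples layout for the classes are assumptions made for this illustration, not part of the practical itself.

```python
# Class intervals (lower limit, upper limit) from the I.Q. example above.
classes = [(60, 70), (70, 80), (80, 90), (90, 100), (100, 110), (110, 120)]

def grouped_range(classes):
    """Return (range, coefficient of range) for a continuous frequency distribution."""
    largest = max(upper for _, upper in classes)    # L: upper limit of the highest class
    smallest = min(lower for lower, _ in classes)   # S: lower limit of the lowest class
    value_range = largest - smallest
    coefficient = value_range / (largest + smallest)
    return value_range, coefficient

value_range, coefficient = grouped_range(classes)
print(value_range, round(coefficient, 4))  # 60 0.3333
```

Note that the frequencies are not needed: for grouped data the range depends only on the extreme class limits.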

 

                                                                                                              

 

II. Quartile Deviation: 

We have already seen that the greatest drawback of the range is that it is affected too much by extreme values. This drawback can be overcome by ignoring the extreme values: we ignore the lowest 25% and the highest 25% of the observations and find the range of the remaining data, that is, the difference between the third quartile and the first quartile (the inter-quartile range). Half of this range is called the Quartile Deviation or semi-inter-quartile range. It is denoted as Q.D.

It is given by

Quartile Deviation = (Q3 – Q1) / 2

where Q3 is the third quartile of the data

and Q1 is the first quartile of the data.

For individual and discrete data the quartiles are calculated as Q1 = size of {(N+1) / 4}th item,

where N is the number of observations in the data, and

   Q3 = size of {3(N+1) / 4}th item.

For a continuous (grouped) distribution the {N/4}th and {3N/4}th items are used instead, as in the grouped-data example later in this section.

The Q.D. is an absolute measure of dispersion and the coefficient of Q.D. is the relative measure of dispersion.

Therefore the coefficient of Q.D. = (Q3 – Q1) / (Q3 + Q1)

📑 MERITS OF Quartile Deviation

1. It is simple to understand.

2. It is easy to calculate.

3. It is not affected by extreme values.

4. It is especially useful in the case of open-end classes.

 

📑DEMERITS OF Quartile Deviation

1. It is not useful for further mathematical calculations.

2. It ignores the first 25% and the last 25% of the data.

3. It is not based on all the observations.

4. It does not have sampling stability.

 

The quartile deviation ignores the first 25% and the last 25% of the data, which means it is based on the middle 50% of the data.

A large quartile deviation indicates greater dispersion, i.e. greater variation in the data set, while a smaller quartile deviation indicates smaller variation in the data set.

 

The Quartile Deviation is based on the first and third quartiles. An example of the Quartile Deviation and the Coefficient of Quartile Deviation for grouped data follows.

 

Quartile Deviation:

Quartile Deviation is calculated for i. Individual data, ii. Discrete data, iii. Continuous data

i. For Continuous Distribution: Grouped data

Example. Calculate the quartile deviation and coefficient of quartile deviation for the following data.

Marks        Frequency

60-70        2

70-80        7

80-90        12

90-100       28

100-110      42

110-120      36

120-130      18

130-140      10

140-150      3

150-160      2

Aim: To calculate the quartile deviation and coefficient of quartile deviation.

Statistical Formula:

Quartile Deviation = (Q3 – Q1) / 2

Coefficient of Q.D. = (Q3 – Q1) / (Q3 + Q1)

First we arrange the classes in ascending order and add a cumulative frequency (C.F.) column.

Observation table:

Marks        Frequency    C.F.

60-70        2            2

70-80        7            9

80-90        12           21

90-100       28           49

100-110      42           91

110-120      36           127

120-130      18           145

130-140      10           155

140-150      3            158

150-160      2            160

Total        160

 

Calculation :

Q1 = Size of {n/4}th item

and Q3 = Size of {3n/4}th item.

Q1 = Size of {n/4}th item = size of {160/4}th item = 40th observation

The 40th observation falls in the class 90-100 (the first class whose C.F. is at least 40), so the first quartile class is 90-100 and its lower limit is 90.

Q1 = lower limit of first quartile class + {([n/4] - C.F.)/f} x i

where C.F. = cumulative frequency of the previous class, i.e. C.F. = 21

n/4 = 40

f = frequency of the first quartile class, i.e. f = 28

i = class width = upper limit - lower limit = 100 - 90 = 10

Q1 = 90 + {(40-21)/28} x 10

Q1 = 90 + 6.78 = 96.78

Q3 = Size of {3n/4}th item = 120th observation

The third quartile class is 110-120, and the lower limit of the third quartile class is 110.

C.F. = cumulative frequency of the previous class, i.e. C.F. = 91

i = class width = upper limit - lower limit = 120 - 110 = 10

f = frequency of the third quartile class, i.e. f = 36

3[n/4] = 3[160/4] = 120

Q3 = lower limit of third quartile class + {(3[n/4] - C.F.)/f} x i

Q3 = 110 + {(120-91)/36} x 10

Q3 = 110 + 8.06

Q3 = 118.06

Quartile Deviation  =  (Q3 – Q1)/2  = (118.06 - 96.78)/2

Quartile Deviation = 10.64

Coefficient of Quartile Deviation =  (Q3 – Q1)/ (Q3 + Q1)

Coefficient of Quartile Deviation = 21.28/ 214.84   =  0.0990

Coefficient of Quartile Deviation  = 0.0990

therefore the Quartile Deviation is 10.64 and Coefficient of Quartile Deviation is  0.0990

Result:  the Quartile Deviation is 10.64 and Coefficient of Quartile Deviation is  0.0990
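The interpolation steps of this example can be sketched in Python as follows. The helper grouped_quantile and its argument layout are hypothetical names introduced here; the sketch assumes the classes are in ascending order with no gaps between them.

```python
def grouped_quantile(classes, freqs, position):
    """Interpolate the value of the `position`-th item in a grouped distribution.

    classes: list of (lower limit, upper limit) in ascending order
    freqs:   frequency of each class
    """
    cumulative = 0  # C.F. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cumulative + f >= position:
            # lower limit + ((position - C.F.) / f) x class width
            return lower + (position - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("position exceeds the total frequency")

classes = [(60 + 10 * k, 70 + 10 * k) for k in range(10)]  # 60-70, ..., 150-160
freqs = [2, 7, 12, 28, 42, 36, 18, 10, 3, 2]
n = sum(freqs)  # total frequency, 160

q1 = grouped_quantile(classes, freqs, n / 4)      # 40th item
q3 = grouped_quantile(classes, freqs, 3 * n / 4)  # 120th item
qd = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}, Q.D. = {qd:.2f}, coefficient = {coeff_qd:.4f}")
```

The sketch carries full precision through every step, so its last digit can differ slightly from the hand calculation, which rounds Q1 and Q3 to two decimals before subtracting.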

See the examples for individual, discrete and continuous data to understand how problems on Quartile Deviation are solved. The formula for the quartiles depends on the type or nature of the data set, so use the proper formula to calculate the Quartile Deviation and its coefficient.

 

III. Standard Deviation: 

We have seen measures of dispersion such as the Range and Quartile Deviation, but these are not based on all the observations in the data set, and the Mean Deviation ignores the signs of the deviations, so they are less reliable and less suited to further calculation. Among all the measures of dispersion the Standard Deviation is the most reliable and most important. The standard deviation is based on all the observations in the data set, hence it is more reliable than any other measure. It is defined as "the square root of the arithmetic mean of the squares of the deviations from the mean", so it is sometimes called the root mean square deviation. It is denoted as σ (Sigma)

and it is written as S.D.

For grouped data it is formulated as S.D. = σ = √[(∑ f m² / N) – (x̄)²] = Square-root of [(∑ f m² / N) – (x̄)²]

where m is the mid-point of a class, f is the frequency of that class and N = ∑ f is the total frequency,

and x̄ is the arithmetic mean, x̄ = ∑ f m / N
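The definition in words (root mean square deviation) and the computational formula above describe the same quantity. A short derivation, assuming grouped data with class mid-points m, frequencies f, N = ∑ f and x̄ = ∑ f m / N:

```latex
\begin{aligned}
\sigma^2 &= \frac{\sum f\,(m - \bar{x})^2}{N} \\
         &= \frac{\sum f m^2}{N} - 2\bar{x}\,\frac{\sum f m}{N} + \bar{x}^2\,\frac{\sum f}{N} \\
         &= \frac{\sum f m^2}{N} - 2\bar{x}^2 + \bar{x}^2 \\
         &= \frac{\sum f m^2}{N} - \bar{x}^2,
\qquad
\sigma = \sqrt{\frac{\sum f m^2}{N} - \bar{x}^2}.
\end{aligned}
```

The second form is the one used in practice because it needs only the column totals ∑ f m and ∑ f m² from the observation table.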

Merits of Standard Deviation i.e. S.D.

i. Easy to define: It is rigidly defined.

ii. Based on all observations: It is based on all the observations in the data set, therefore if one value in the data changes, the S.D. also changes.

iii. It has sampling stability: It is less affected by sampling fluctuations than other measures of dispersion.

iv. The standard deviation can be used for further mathematical calculation and analysis.

v. The standard deviation is more reliable than any other measure of dispersion.

Demerit

i. It is not simple to understand or to calculate.

The standard deviation is widely used in different fields for the analysis of data.

 

IV. Variance: 

The square of the standard deviation is called the variance. The variance has its own importance in statistics

and it is denoted as σ².

Variance = σ² = (∑ f m² / N) – (x̄)²

where m is the mid-point of a class, f is the frequency of that class and N = ∑ f is the total frequency,

and x̄ is the arithmetic mean, x̄ = ∑ f m / N

Now we look at the relative measures (coefficients) based on the standard deviation.

1. Coefficient of variation :

The standard deviation is an absolute measure of dispersion; it shows the variability of the actual values from their mean. To compare the variability of two different groups, the variability of each group is expressed as a percentage of its mean. This is called the coefficient of variation and it is denoted as

C. V.

Coefficient of Variation = C. V. = (S.D. / Mean) x 100

If we are interested in comparing the variability of different groups then we use the Coefficient of Variation. A higher C. V. means the data have higher variability, i.e. the values in that data set are far from the mean, relative to the size of the mean.

We can also find the coefficient of standard deviation as

coefficient of standard deviation = (S.D. / Mean)

The difference between the Coefficient of S.D. and the C. V. is that the Coefficient of S.D. is the ratio of the S.D. to the Mean, while the C. V. is the same ratio multiplied by 100 to express it as a percentage.

Note that the C. V. is usually less than 100%, but when the S.D. is larger than the Mean the C. V. is greater than 100%.

Example: Calculate the standard deviation, variance and C.V. for the given data.

Marks    0-10    10-20    20-30    30-40    40-50    50-60

F        17      27       36       47       24       12

Aim: To calculate the standard deviation, variance and C.V. for the given data.

Statistical Formula:

S.D. = σ = √[(∑ f m² / N) – (x̄)²]

Variance = σ² = (∑ f m² / N) – (x̄)²

C.V. = (S.D. / Mean) x 100

Observation Table:

Marks      F      mid-point m    fm      fm^2

0-10       17     5              85      425

10-20      27     15             405     6075

20-30      36     25             900     22500

30-40      47     35             1645    57575

40-50      24     45             1080    48600

50-60      12     55             660     36300

Total      163                   4775    171475

Calculation :

x̄ = ∑ f m / N

x̄ = 4775 / 163

x̄ = 29.2945

S.D. = σ = √[(∑ f m² / N) – (x̄)²]

S.D. = σ = √[(171475 / 163) – (29.2945)²]

S.D. = σ = √[(1051.9939) – (29.2945)²]

S.D. = σ = √(193.8262) = 13.9221

Variance = σ² = (∑ f m² / N) – (x̄)²

Variance = σ² = (171475 / 163) – (29.2945)²

Variance = σ² = (1051.9939) – (29.2945)²

Variance = σ² = 193.8262

C.V. = (S.D. / Mean) x 100

C.V. = ( 13.9221 / 29.2945) x 100

C.V. = 0.4752 X 100

C.V. = 47.52%

Result: The standard deviation, variance and C.V. of the given data are

S.D. = σ = 13.9221, Variance = σ² = 193.8262, C.V. = 47.52%
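The observation table and the three formulas can be reproduced with a short Python sketch. The variable names are chosen for this illustration, and the mid-point of each class is taken as (lower limit + upper limit) / 2, as in the table.

```python
from math import sqrt

# Classes and frequencies from the Marks example above.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [17, 27, 36, 47, 24, 12]

mids = [(lower + upper) / 2 for lower, upper in classes]  # mid-point m
n = sum(freqs)                                            # N = sum of f
sum_fm = sum(f * m for f, m in zip(freqs, mids))          # sum of f*m
sum_fm2 = sum(f * m * m for f, m in zip(freqs, mids))     # sum of f*m^2

mean = sum_fm / n
variance = sum_fm2 / n - mean ** 2
sd = sqrt(variance)
cv = sd / mean * 100

print(f"N = {n}, sum fm = {sum_fm:.0f}, sum fm^2 = {sum_fm2:.0f}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}, S.D. = {sd:.4f}, C.V. = {cv:.2f}%")
```

Because the sketch does not round the mean before squaring it, the variance comes out near 193.8274 rather than 193.8262; the S.D. and C.V. agree with the hand calculation to about two decimal places.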

 

 

 

 

 

 

 

 

 

 

Thank you for visiting Shree GaneshA Statistics. Share it with your friends.

If you have any questions, feel free to contact me:

https://gsstats.blogspot.com/p/contact-us.html 


 
