#### Chapter 11 Language of Descriptive Statistics

Section 11.2 Frequency Distributions and Percentage Calculation

# 11.2.1 Introduction

Let $X$ be a given property. A sample of size $n$ yields the original list (sample)

$x=\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right).$

##### Info 11.2.1

If $a$ is a possible property value, then

${H}_{x}\left(a\right)=\text{number of }{x}_{j}\text{ within the original list }x\text{ with }{x}_{j}=a$

is called the absolute frequency of the property value $a$ in the original list $x=\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)$.

If ${a}_{1},{a}_{2},\dots ,{a}_{k}$ are the (pairwise distinct) possible property values in the original list $x=\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)$, then we have

${H}_{x}\left({a}_{1}\right)+{H}_{x}\left({a}_{2}\right)+\dots +{H}_{x}\left({a}_{k}\right)=n$

or in words: each of the $n$ values is counted by exactly one of the frequencies.
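
Counting absolute frequencies is easy to try out on a computer. The following is a minimal sketch in Python, assuming the original list is given as a plain list; the function name `absolute_frequencies` and the sample of colours are made up for illustration and are not part of the course material.

```python
from collections import Counter


def absolute_frequencies(sample):
    """Map each property value a to H_x(a), its number of occurrences."""
    return Counter(sample)


# Hypothetical original list of size n = 6 (illustration only).
x = ["red", "blue", "red", "green", "blue", "red"]
H = absolute_frequencies(x)

print(H["red"])    # 3
print(H["black"])  # 0, a value that does not occur has frequency 0

# The absolute frequencies of all occurring values add up to n.
print(sum(H.values()) == len(x))  # True
```

A `Counter` returns $0$ for values that do not occur, which matches the definition of ${H}_{x}\left(a\right)$ for a possible but unobserved property value.
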
##### Info 11.2.2

The relative frequency of the property value $a$ in the original list $x=\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)$ is defined by

${h}_{x}\left(a\right)=\frac{1}{n}\cdot {H}_{x}\left(a\right).$

If ${a}_{1},{a}_{2},\dots ,{a}_{k}$ are the (pairwise distinct) possible property values in the original list $x=\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)$, then we have

${h}_{x}\left({a}_{1}\right)+{h}_{x}\left({a}_{2}\right)+\dots +{h}_{x}\left({a}_{k}\right)=1.$

Relative frequencies always lie in the interval $\left[0;1\right]$ and are often specified as percentages, e.g. ${h}_{x}\left({a}_{1}\right)=34\%$ instead of ${h}_{x}\left({a}_{1}\right)=0.34$.
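
The relation ${h}_{x}\left(a\right)=\frac{1}{n}\cdot {H}_{x}\left(a\right)$ can be checked numerically. The sketch below is one possible way to do this in Python; the function name `relative_frequencies` and the sample values are assumptions chosen for illustration. Exact fractions are used instead of floating-point numbers so that the sum is exactly $1$.

```python
from collections import Counter
from fractions import Fraction


def relative_frequencies(sample):
    """Map each occurring value a to h_x(a) = H_x(a) / n as an exact fraction."""
    n = len(sample)
    return {a: Fraction(count, n) for a, count in Counter(sample).items()}


# Hypothetical original list of size n = 10 (illustration only).
x = [1, 2, 2, 3, 3, 3, 3, 2, 1, 3]
h = relative_frequencies(x)

for a in sorted(h):
    percent = h[a] * 100  # still an exact Fraction
    print(f"h_x({a}) = {h[a]} = {float(h[a])} = {float(percent):g}%")

# Each relative frequency lies in [0; 1], and together they add up to 1.
print(all(0 <= value <= 1 for value in h.values()))
print(sum(h.values()) == 1)
```
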
##### Info 11.2.3

Collecting the absolute or relative frequencies of all occurring (or possible) property values in the original list (sample) $x=\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)$ in a table results in the empirical frequency distribution.

##### Example 11.2.4

In a data centre, the processing time (in seconds, rounded to one decimal place) of $20$ program jobs was determined. This resulted in the following original list of a sample of size $n=20$:
 $3.9$ $3.3$ $4.6$ $4.0$ $3.8$ $3.8$ $3.6$ $4.6$ $4.0$ $3.9$ $3.9$ $3.9$ $4.1$ $3.7$ $3.6$ $4.6$ $4.0$ $4.0$ $3.8$ $4.1$

The smallest value is $3.3$ s, the largest value is $4.6$ s, and the increment is $0.1$ s. This gives the empirical frequency distribution listed in tabular form below. To keep the table short, values less than $3.3$ and greater than $4.6$ are not listed.

| Result $a$ (in s) | ${H}_{x}\left(a\right)$ | ${h}_{x}\left(a\right)$ | Percentage |
|---|---|---|---|
| $3.3$ | $1$ | $\frac{1}{20}=0.05$ | $5\%$ |
| $3.4$ | $0$ | $0$ | $0\%$ |
| $3.5$ | $0$ | $0$ | $0\%$ |
| $3.6$ | $2$ | $\frac{2}{20}=0.1$ | $10\%$ |
| $3.7$ | $1$ | $\frac{1}{20}=0.05$ | $5\%$ |
| $3.8$ | $3$ | $\frac{3}{20}=0.15$ | $15\%$ |
| $3.9$ | $4$ | $\frac{4}{20}=0.2$ | $20\%$ |
| $4.0$ | $4$ | $\frac{4}{20}=0.2$ | $20\%$ |
| $4.1$ | $2$ | $\frac{2}{20}=0.1$ | $10\%$ |
| $4.2$ | $0$ | $0$ | $0\%$ |
| $4.3$ | $0$ | $0$ | $0\%$ |
| $4.4$ | $0$ | $0$ | $0\%$ |
| $4.5$ | $0$ | $0$ | $0\%$ |
| $4.6$ | $3$ | $\frac{3}{20}=0.15$ | $15\%$ |
| Sum | $20$ | $1$ | $100\%$ |
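
The frequency table of this example can be recomputed from the original list. The following Python sketch is one possible approach, not a prescribed method: it reads the $20$ processing times as exact decimals, walks from the smallest to the largest value in steps of $0.1$ s, and records the absolute and relative frequency for each step, including values that do not occur. The variable names (`times`, `rows`, `step`) are chosen for illustration.

```python
from collections import Counter
from decimal import Decimal
from fractions import Fraction

# Original list from the example: processing times in seconds.
times = ("3.9 3.3 4.6 4.0 3.8 3.8 3.6 4.6 4.0 3.9 "
         "3.9 3.9 4.1 3.7 3.6 4.6 4.0 4.0 3.8 4.1")
x = [Decimal(t) for t in times.split()]  # exact decimals, no rounding error
n = len(x)
H = Counter(x)  # absolute frequencies H_x(a)

# Walk through all possible values between minimum and maximum.
step = Decimal("0.1")
rows = []
a = min(x)
while a <= max(x):
    rows.append((a, H[a], Fraction(H[a], n)))  # (a, H_x(a), h_x(a))
    a += step

print("a    H_x(a)  h_x(a)  percent")
for a, abs_freq, rel_freq in rows:
    print(f"{a}  {abs_freq:6d}  {float(rel_freq):6.2f}  {float(rel_freq * 100):6.0f}%")

# Column sums: n for absolute, 1 for relative frequencies.
print("Sum", sum(r[1] for r in rows), sum(r[2] for r in rows))
```

Using `Decimal` rather than `float` avoids rounding problems when adding $0.1$ repeatedly, so every step value matches the recorded data exactly.
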