Chapter 11 Language of Descriptive Statistics

Section 11.1 Terminology and Language

11.1.1 Introduction


In a statistical observation (survey), the values or attributes of one or more properties are determined for appropriately chosen units of observation (also called units of investigation or experimental units). Here, a property is a characteristic of the unit of observation that is to be investigated. The terminology of descriptive statistics is as follows:
  • The unit of investigation (also: unit of observation) is the smallest unit on which the observations are made.
  • The characteristic or property is the statistical variable of the unit to be investigated. Characteristics are often denoted by upper-case Latin letters (X, Y, Z, …).
  • Characteristic attributes or property values are values that properties can take. They are often denoted by lower-case Latin letters (a, b, …, x, y, z, a_1, a_2, …).
  • The set of units of observation that is investigated with respect to a property of interest is called the universe or the population. It is the set of all possible units of observation.
  • A sample is a "random finite subset" of a certain population of interest. If this set consists of n elements, then this set is called a "sample of size n".
  • Data are the observed values (attributes) of one or more properties, recorded for the units of observation in a sample from a certain population.
  • The original list is the protocol that lists the sampled data in chronological order. Thus, the original list is an n-tuple (or vector, written here mostly in coordinate form):

    x = (x_1, …, x_n).

    This n-tuple is often called a "sample of size n".

Example 11.1.1
From the daily production of components in a factory, n = 20 samples of 15 parts each are taken, and the number of defective parts in each sample is determined. Here, x_i is the number of defective parts in the i-th sample, i = 1, …, 20. The original list (sample of size n = 20) contains the following data:

x  =  (0,4,2,1,1,0,0,2,3,1,0,5,3,1,1,2,0,0,1,0).

In the second sample, x_2 = 4 defective parts were found. The population in this example is the set of all 15-element subsets of the daily production. In this case, the property of interest is

X = number of defective parts in a sample of 15 elements.
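
The data of this example can be written down directly as a tuple. The following is a minimal Python sketch; the variable names are illustrative assumptions and not part of the course text. It stores the original list, reads off the sample size and the second entry, and checks that every property value is a possible count for a sample of 15 parts.

```python
# Original list from Example 11.1.1:
# number of defective parts in each of the samples of 15 parts.
x = (0, 4, 2, 1, 1, 0, 0, 2, 3, 1, 0, 5, 3, 1, 1, 2, 0, 0, 1, 0)

n = len(x)             # sample size
x_2 = x[1]             # second entry (Python counts positions from 0)
parts_per_sample = 15  # size of each sample, taken from the example

# A count of defective parts must lie between 0 and 15.
all_valid = all(0 <= x_i <= parts_per_sample for x_i in x)

print(n, x_2, all_valid)
```

Note that the mathematical index i = 1, …, 20 corresponds to the Python positions 0, …, 19.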


Info 11.1.2
 

The variables in a statistical observation are called characteristics or properties. Values that the properties can take are called property values or characteristic attributes.

Properties are roughly classified into qualitative properties (that can be ascertained in a descriptive way) and quantitative properties (that can naturally be ascertained numerically):
  • Qualitative properties:
    • Nominal properties: attributes classified according to purely qualitative aspects. Examples: skin colour, nationality, blood type.
    • Ordinal properties: attributes with a natural hierarchy, i.e. they can be ordered or sorted. Examples: grades, ranks, surnames.

  • Quantitative properties:
    • Discrete properties: property values are isolated values (e.g. integers). Examples: counts, years, age in years.
    • Continuous properties: property values can (at least in principle) take any value within an interval. Examples: body height, weight, length.
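
The classification above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Scale`, `is_quantitative` and `can_be_ordered` are hypothetical and chosen for this sketch; the example properties are taken from the list above.

```python
from enum import Enum


class Scale(Enum):
    """The four types of properties named in the classification."""
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"


# Nominal and ordinal properties are qualitative,
# discrete and continuous properties are quantitative.
QUALITATIVE = {Scale.NOMINAL, Scale.ORDINAL}

# Example properties from the text with their type.
examples = {
    "blood type": Scale.NOMINAL,
    "grades": Scale.ORDINAL,
    "age in years": Scale.DISCRETE,
    "weight": Scale.CONTINUOUS,
}


def is_quantitative(scale: Scale) -> bool:
    """True for discrete and continuous properties."""
    return scale not in QUALITATIVE


def can_be_ordered(scale: Scale) -> bool:
    """Only nominal properties have no natural order."""
    return scale is not Scale.NOMINAL


for name, scale in examples.items():
    print(name, scale.value, is_quantitative(scale), can_be_ordered(scale))
```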


The transition between continuous and discrete properties is somewhat fluid once rounding is taken into account: a continuous property that is recorded only to a fixed precision takes isolated values in practice.
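
The effect of rounding can be illustrated with a short Python sketch. The heights below are hypothetical values made up for this illustration; they are not data from the course.

```python
# Hypothetical body heights in metres (continuous property).
heights_m = [1.7234, 1.6891, 1.8049, 1.7251]

# Recording the heights in whole centimetres yields integers,
# i.e. isolated values as for a discrete property.
heights_cm = [round(h * 100) for h in heights_m]

print(heights_cm)
```

After rounding, the recorded values can only be whole numbers of centimetres, although the underlying property is continuous.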