Table 2 Quality indicators for cell deviations from target (full population version of BHP)—district#3-digit-industry level

From: Population aggregates from administrative data samples–how good are they?

 

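| | (i) | (ii) | (iii) | (iv) | (v) |
|---|---|---|---|---|---|
| Outcome | Workers | Workers | Workers | Establishments | Establishments |
| Data source | BHP 50% | SIAB Individual File | SIAB Basis Establishment File | BHP 50% | SIAB Basis Establishment File |
| **Absolute deviation** | | | | | |
| Mean | −2.5 | −1.0 | −1.1 | 0.0 | −0.4 |
| p-value | 0 | 0 | 0 | 0.26 | 0 |
| rmse | 596.3 | 148.8 | 101.3 | 6.1 | 31.7 |
| mae | 144.1 | 82.9 | 59.1 | 3.4 | 14.1 |
| Ratio | 0.323 | 0.186 | 0.132 | 0.092 | 0.376 |
| **Percentage deviation** | | | | | |
| Mean | 0 | 0 | 0 | 0 | −0.001 |
| p-value | 0.679 | 0.893 | 0.907 | 0.438 | 0.793 |
| mape | 0.519 | 0.787 | 0.739 | 0.372 | 0.953 |
| **Percentage deviation (size weighted)** | | | | | |
| Mean | −0.006 | −0.002 | −0.002 | 0 | −0.01 |
| p-value | 0 | 0 | 0 | 0.26 | 0 |
| mape | 0.323 | 0.186 | 0.132 | 0.092 | 0.376 |
| N | 789,616 | 789,616 | 789,616 | 789,616 | 789,616 |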

  1. The table shows quality indicators for cell deviations from the target dataset (the full population version of the BHP) at the district#3-digit-industry level: absolute deviations, percentage deviations and percentage deviations weighted by cell size. Approximations are calculated for the number of workers and the number of establishments. Calculations are based on the 50 percent sample of the BHP, the SIAB Individual File and the SIAB Basis Establishment File, respectively. Indicators are the mean error (mean), root mean squared error (rmse), mean absolute error (mae), mean absolute percentage error (mape) and the ratio of the total sum of errors to the total sum of cell counts (ratio). The p-value rows report p-values for a t-test of the mean against zero.
  2. Sources: Establishment History Panel (BHP)—Version 7519 v2 (https://doi.org/10.5164/IAB.BHP7519.de.en.v2); Sample of Integrated Labour Market Biographies (SIAB)—Version 7519 v1 (https://doi.org/10.5164/IAB.SIAB7519.de.en.v1), own calculations
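
The indicators defined in the table note can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `deviation_indicators` and its arguments are hypothetical, and two readings are assumptions. First, cells with a zero target count are skipped when computing percentage errors. Second, "ratio" is taken to be the sum of absolute errors divided by the sum of target cell counts. Under that second reading the ratio equals a size-weighted mape, which is consistent with the identical "Ratio" and size-weighted "mape" rows in the table. The t-test p-value is left out to keep the sketch free of external dependencies.

```python
import math


def deviation_indicators(estimate, target):
    """Summarise cell-level deviations of estimated counts from target counts.

    `estimate` and `target` are equal-length sequences of cell counts
    (hypothetical inputs, one entry per district-by-industry cell).
    """
    if len(estimate) != len(target) or len(target) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    errors = [e - t for e, t in zip(estimate, target)]
    n = len(errors)
    abs_error_sum = sum(abs(err) for err in errors)

    mean = sum(errors) / n                                 # mean error
    rmse = math.sqrt(sum(err ** 2 for err in errors) / n)  # root mean squared error
    mae = abs_error_sum / n                                # mean absolute error

    # Assumption: cells with a zero target count are skipped for percentage errors.
    pct = [abs(err) / t for err, t in zip(errors, target) if t > 0]
    mape = sum(pct) / len(pct) if pct else float("nan")

    # Assumption: ratio = sum of absolute errors / sum of target cell counts.
    total = sum(target)
    ratio = abs_error_sum / total if total > 0 else float("nan")

    return {"mean": mean, "rmse": rmse, "mae": mae, "mape": mape, "ratio": ratio}


if __name__ == "__main__":
    # Toy example with three cells; the numbers are made up for illustration.
    print(deviation_indicators(estimate=[90, 220, 10], target=[100, 200, 10]))
```

In the toy example the unweighted mape and the ratio differ because the ratio gives large cells more weight, which mirrors the gap between the unweighted and size-weighted percentage deviations in the table.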