
Table 1 Quality indicators for cell deviations from target (full population version of BHP)—district level

From: Population aggregates from administrative data samples–how good are they?

 

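| Indicator | (i) Workers: BHP 50% | (ii) Workers: SIAB Individual File | (iii) Workers: SIAB Basis Establishment File | (iv) Establishments: BHP 50% | (v) Establishments: SIAB Basis Establishment File |
| --- | --- | --- | --- | --- | --- |
| Absolute deviation | | | | | |
| Mean | −488 | −196 | −207 | 1 | −74 |
| p-value | 0.001 | 0 | 0 | 0.288 | 0 |
| rmse | 8844 | 2103 | 1539 | 83 | 515 |
| mae | 4727 | 1574 | 1137 | 62 | 351 |
| Ratio | 0.054 | 0.018 | 0.013 | 0.008 | 0.047 |
| Percentage deviation | | | | | |
| Mean | 0 | −0.003 | −0.002 | 0 | −0.006 |
| p-value | 0.849 | 0 | 0 | 0.039 | 0 |
| mape | 0.06 | 0.024 | 0.018 | 0.011 | 0.057 |
| Percentage deviation (size weighted) | | | | | |
| Mean | −0.006 | −0.002 | −0.002 | 0 | −0.01 |
| p-value | 0 | 0 | 0 | 0.288 | 0 |
| mape | 0.054 | 0.018 | 0.013 | 0.008 | 0.047 |
| N | 4010 | 4010 | 4010 | 4010 | 4010 |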

  1. The table shows quality indicators for cell deviations from the target dataset (the full population version of the BHP) at the district level: absolute deviations, percentage deviations and percentage deviations weighted by cell size. Approximations are calculated for the number of workers and the number of establishments, based on the 50 percent sample of the BHP, the SIAB Individual File and the SIAB Basis Establishment File, respectively. The indicators are the mean error (mean), root mean squared error (rmse), mean absolute error (mae), mean absolute percentage error (mape) and the ratio of the total sum of errors to the total sum of cell counts (ratio). The rows labelled p-value report p-values for a t-test of the mean against zero.
  2. Sources: Establishment History Panel (BHP)—Version 7519 v2 (https://doi.org/10.5164/IAB.BHP7519.de.en.v2); Sample of Integrated Labour Market Biographies (SIAB)—Version 7519 v1 (https://doi.org/10.5164/IAB.SIAB7519.de.en.v1), own calculations
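
The indicators defined in the table notes can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code, and it rests on several assumptions that the notes do not state: a deviation is taken as estimate minus target, percentage deviations are expressed as fractions of the target cell count, the ratio uses absolute errors in the numerator, size weights are proportional to the target cell count, and the p-value uses a normal approximation to the t distribution (reasonable for N = 4010 cells). The function name `quality_indicators` and the dictionary keys are hypothetical.

```python
import math
from statistics import mean, stdev


def quality_indicators(estimates, targets):
    """Compare estimated cell counts with target cell counts.

    Assumption: a deviation is estimate minus target.
    """
    if len(estimates) != len(targets) or len(targets) < 2:
        raise ValueError("need two equally long sequences with at least 2 cells")
    if any(t <= 0 for t in targets):
        raise ValueError("target cell counts must be positive")

    n = len(targets)
    total = sum(targets)
    errors = [e - t for e, t in zip(estimates, targets)]
    pct = [err / t for err, t in zip(errors, targets)]
    # Assumption: size weights are each cell's share of the total target count.
    weights = [t / total for t in targets]

    def p_value(values):
        # Two-sided test of the mean against zero, using a normal
        # approximation to the t distribution (adequate for large n).
        se = stdev(values) / math.sqrt(n)
        if se == 0:
            return 1.0 if mean(values) == 0 else 0.0
        return math.erfc(abs(mean(values) / se) / math.sqrt(2))

    return {
        # Absolute deviation
        "mean": mean(errors),
        "p_value": p_value(errors),
        "rmse": math.sqrt(mean(err * err for err in errors)),
        "mae": mean(abs(err) for err in errors),
        # Assumption: absolute errors in the numerator of the ratio.
        "ratio": sum(abs(err) for err in errors) / total,
        # Percentage deviation (as fractions)
        "pct_mean": mean(pct),
        "pct_p_value": p_value(pct),
        "mape": mean(abs(p) for p in pct),
        # Percentage deviation, size weighted
        "wpct_mean": sum(w * p for w, p in zip(weights, pct)),
        "wmape": sum(w * abs(p) for w, p in zip(weights, pct)),
    }


# Toy example with three cells (made-up numbers, not data from the table).
result = quality_indicators([90, 210, 310], [100, 200, 300])
print(result["rmse"], result["mae"], result["ratio"])
```

Under these assumptions the size-weighted mape reduces algebraically to the ratio, since weighting each cell's absolute percentage error by its share of the total count gives the sum of absolute errors divided by the sum of cell counts. The Ratio row and the size-weighted mape row of the table show identical values in all five columns, which is consistent with this reading but does not confirm it.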