Table 3 Quality indicators for cell deviations from target (full population version of BHP)—district#3-digit-industry level (with and without smoothing)

From: Population aggregates from administrative data samples–how good are they?

 

| | (i) | (ii) | (iii) | (iv) |
|---|---|---|---|---|
| Outcome | Workers | Workers | Establishments | Establishments |
| Data | SIAB Basis Establishment File (no smoothing) | SIAB Basis Establishment File (smoothing) | SIAB Basis Establishment File (no smoothing) | SIAB Basis Establishment File (smoothing) |
| Absolute deviation | | | | |
| Mean | −0.8 | −0.8 | −0.3 | −0.3 |
| p-value | 0 | 0 | 0 | 0 |
| rmse | 86.2 | 96.9 | 27.0 | 23.6 |
| mae | 42.8 | 38.5 | 10.2 | 6.4 |
| Ratio | 0.132 | 0.119 | 0.376 | 0.234 |
| Percentage deviation | | | | |
| Mean | −0.547 | 0.545 | −0.693 | 0.445 |
| p-value | 0 | 0 | 0 | 0 |
| mape | 0.849 | 0.808 | 1,035 | 0.83 |
| Percentage deviation (size weighted) | | | | |
| Mean | −0.002 | −0.002 | −0.01 | −0.01 |
| p-value | 0 | 0 | 0 | 0 |
| mape | 0.132 | 0.119 | 0.378 | 0.236 |
| N | 1,090,720 | 1,090,720 | 1,090,720 | 1,090,720 |

  1. The table shows quality indicators for cell deviations from the target dataset (the full population version of the BHP) at the district#3-digit-industry level: absolute deviations, percentage deviations, and percentage deviations weighted by cell size. Approximations are calculated for the number of workers and the number of establishments, based on the SIAB Basis Establishment File with and without a smoothing adjustment. The indicators are the mean error (mean), root mean squared error (rmse), mean absolute error (mae), mean absolute percentage error (mape), and the ratio of the total sum of errors to the total sum of cell counts (ratio). The p-value rows report a test of the mean against zero.
  2. Sources: Establishment History Panel (BHP)—Version 7519 v2 (https://doi.org/10.5164/IAB.BHP7519.de.en.v2); Sample of Integrated Labour Market Biographies (SIAB)—Version 7519 v1 (https://doi.org/10.5164/IAB.SIAB7519.de.en.v1), own calculations
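
The indicators named in note 1 can be illustrated with a minimal sketch. This is not the authors' code, and it rests on several assumptions: a deviation is taken as approximation minus target, percentage deviations are expressed as fractions of the target cell count, every target cell is strictly positive, the size weights are each cell's share of the total target count, and "ratio" uses absolute errors in the numerator (an assumption suggested by the ratio row nearly coinciding with the size-weighted mape row). The function name `quality_indicators` and the toy cell counts are hypothetical, and the p-value test is not reproduced.

```python
import math


def quality_indicators(estimate, target):
    """Deviation indicators for cell counts (illustrative sketch only).

    estimate: approximated cell counts (e.g. scaled up from a sample)
    target:   cell counts in the full-population target dataset
    Assumes deviation = estimate - target and target > 0 in every cell.
    """
    if len(estimate) != len(target) or not target:
        raise ValueError("estimate and target must be non-empty and equally long")
    if any(t <= 0 for t in target):
        raise ValueError("this sketch assumes strictly positive target counts")

    n = len(target)
    total = sum(target)
    errors = [e - t for e, t in zip(estimate, target)]   # absolute deviations
    pct = [err / t for err, t in zip(errors, target)]    # percentage deviations
    weights = [t / total for t in target]                # cell-size weights

    return {
        "mean": sum(errors) / n,
        "rmse": math.sqrt(sum(err ** 2 for err in errors) / n),
        "mae": sum(abs(err) for err in errors) / n,
        # Assumed definition: sum of absolute errors over sum of target counts.
        "ratio": sum(abs(err) for err in errors) / total,
        "pct_mean": sum(pct) / n,
        "mape": sum(abs(p) for p in pct) / n,
        "pct_mean_weighted": sum(w * p for w, p in zip(weights, pct)),
        "mape_weighted": sum(w * abs(p) for w, p in zip(weights, pct)),
    }


# Toy example with three hypothetical cells.
result = quality_indicators(estimate=[12, 8, 50], target=[10, 10, 50])
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the size-weighted mape equals the ratio algebraically, because weighting each cell's absolute percentage error by its share of the total count cancels the cell count in the denominator.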