The labour market effects of the Polish educational reform of 1999

We estimate the effect of the 1999 education reform in Poland on employment and earnings. The 1999 education reform in Poland replaced the previous 8 years of general and 3/4/5 years of tracked secondary education with 9 years of general and 3/3/4 years of tracked upper-secondary education. The reform also introduced new curricula, national examinations, teacher standards, and a transparent financing scheme. Our identification strategy relies on a difference-in-differences approach using a quasi-panel of pooled year-of-survey and age-of-respondent observations from the Polish sample of the EU-SILC database. The results indicate that the reform has increased employment probability (by around 3 percentage points) and earnings (by around 4%).


Introduction
In this study, we look at the Polish education reform of 1999 that, among other changes, increased general education for all students, introduced a national core curriculum, standardised examinations at the end of each education stage, introduced new teacher qualification requirements, and adopted a transparent financing scheme.
Previous results on the effect of education reforms on labour market outcomes are mixed. Comprehensive education reforms in Scandinavia are usually hailed for their positive effects on labour market outcomes. Sweden carried out a major education reform in the 1950s, increasing compulsory education from 7/8 to 9 years, abolishing tracking (i.e. sorting students into general and vocational tracks) based on academic achievement after the sixth grade, and introducing a national curriculum. Meghir and Palme (2005) showed that this reform increased educational attainment and the future earnings of children with low educated parents. At the same time, the earnings of those with highly educated parents decreased, but the average effect was positive.
The reform in Finland in the 1970s delayed the tracking of students from age 11 to 16 by extending comprehensive education from 4 to 9 years. Pekkarinen, Uusitalo, and Kerr tested the effects of this reform on the income elasticity (Pekkarinen et al., 2009) and the average test scores (Kerr et al., 2013) and concluded that it had a small but overall positive effect on both dimensions. In Norway, a school reform implemented between 1960 and 1972 increased compulsory education from 7 to 9 years. The 9 years of comprehensive education consisted of 6 years of primary school and a 3-year lower secondary school. Lower secondary schools existed before the reform, too, but attending them was voluntary and they were only available in some municipalities. This reform also improved the quality of education by standardising the curriculum. Aakvik et al. (2010) find that the reform increased the level of education in the country, and it also increased the returns to upper secondary and tertiary qualifications. However, some papers find no returns to extending compulsory education in Germany (Pischke and von Wachter 2008) or France (Grenet 2013), and the story is similar if we look at education reforms that increased the time of education particularly for vocational students: no (substantial) effects were found for the Netherlands (Oosterbeek and Webbink 2007).
These differences in the results might stem from the differences in the type of education extended. While in Germany and France, the increase in education was after tracking, in the Anglo-Saxon (Grenet 2013) and Scandinavian countries, where a positive effect of additional years of schooling was found, it was before tracking. 1 We argue that the reason for this is that "before tracking reforms" not only extended education but also improved the quality of inputs, such as the curriculum, 2 teachers, or peers, as these are crucial in the education production function (e.g. Chetty et al. 2014; Rivkin et al. 2005; Sacerdote 2011). When additional years of education are "inserted" before students are selected into tracks, the effect is different from when they are already selected into academic or vocational tracks.
Offering increased general training to vocational students does not affect teachers and peers, as these inputs are unlikely to change due to the increased length of vocational education or increased general content. 3 However, if the system is 'de-tracked', i.e. the age of selection is increased, the composition of teachers and peer groups will remain as they were before selection in that "inserted" year, and thus, at least for the later vocational track students, a higher peer and teacher effect might improve their long-term outcomes.
The 1999 reform of Poland, which we analyse in this study, was a comprehensive one like the Scandinavian reforms. However, it was unique in the sense that besides extending comprehensive education, it altered all of the above-mentioned essential inputs in the education production function. This reform replaced the former 8 year general primary school with a 6 year general primary and a 3 year general lower secondary school. Therefore, the age of first selection was postponed from age 15 to 16. The length of tracked upper-secondary education was shortened by one year so that the overall length of primary and secondary education remained 12 years, except in the basic vocational track, which was 3 years before the reform and remained 3 years.
Besides these structural changes, the reform changed other major parts of the education system: Poland introduced a national core curriculum, standardised examinations at the end of each education stage, introduced new teacher qualification requirements, and adopted a transparent financing scheme. Using PISA data, Jakubowski (2015) showed that the between-school variance of test scores decreased after the reform. Jakubowski et al. (2016) also showed, using difference-in-differences approach adjusted with propensity score matching, that the fast increase in Polish students' PISA test scores was driven by potential vocational students, who after the 1999 reform were in the comprehensive lower secondary school instead of already having been tracked into the low-tier vocational schools. They argue that the increased resources of an additional year of better teachers, peers, and general curriculum have improved these students' test scores. They also show that the improvements in student achievement at the age of 15 are still substantial at the age of 16 or 17 when students are already tracked to different secondary school programs. These results confirm our expectation that improved inputs result not only in better achievement but provide a stronger basis for further development of foundational skills among students who continue education in vocational tracks.
Additional years of education translate into higher wages (Psacharopoulos and Patrinos, 2018). This effect is partly explained by the human capital theory, which assumes that better education translates into skills that increase our productivity (Becker, 1975). An alternative explanation is that by passing through the education system, one signals one's innate ability and employers reward it with higher wages (see Page, 2010, for a review). In our case, both theories imply that the Polish reform should increase wages as it improved skills but also educational attainment. Research also shows that at the country level the improvement of cognitive skills translates into higher economic growth (Hanushek and Woessmann, 2012). Moreover, better skills bring non-economic benefits (Heckman et al., 2018).
The empirical research documents well that the Polish reform was beneficial for students' school performance, but we are aware of only a couple of attempts to assess its long-term impacts on labour market outcomes. Liwiński (2020) applies a regression discontinuity design (RDD) focusing on the differences in employment probability and wages among adults with basic vocational degrees. He found that the reform improved hourly wages by 13%. However, as the reform has also changed patterns of selection into different upper secondary tracks, with a decreasing number of students attracted to basic vocational schools, the results comparing just adults with basic vocational training might provide a biased picture of the overall effect of the reform. Strawinski and Broniatowska (2021) applied a similar RDD approach to the Polish LFS data, but limited their sample to people with a secondary education degree only. Their result suggests a rather small but positive overall impact of the reform. They provide additional results showing larger benefits for vocational school graduates in rural areas. However, these results might be biased for the same reasons as the analysis of Liwiński (2020), as a larger number of students in the post-reform cohorts continued education into tertiary level.
1 A counterexample is presented by Malamud and Pop-Eleches (2010), who find no effect before tracking.
2 Improving only the curriculum, however, might not be sufficient. In Sweden in the 1990s, academic content in all vocational tracks was increased. This change reduced curriculum differences between the academic and vocational tracks at the upper secondary level, also allowing vocational graduates to apply to university. The reform was preceded by a pilot scheme, introducing the new system in selected schools throughout the country. Using this time and spatial variance, Hall tested the effect of this reform on tertiary enrolment and wages (Hall 2012) and on unemployment chances (Hall 2016). The results of these analyses have shown no difference between pre- and post-treatment cohorts in outcomes. Hall argues that a potential reason for the non-effect is the increased dropout rate of the vocational track students induced by the academic content.
3 At least in the short run, and thus, the typically utilised regression-discontinuity approach shows no effects.
These papers use RDD and rely on the assumption that cohorts before and after the reform are as if they were randomly allocated. This is a strong assumption. Typically, RDD studies use high-density data around the cut-off; for these studies, this is clearly not the case (but see Grenet 2013 for a similar approach). Thus, using a quasi-DID identification strategy could offer a strong robustness check to the RDD results. This study aims at filling this gap by adding more evidence on the labour market outcomes of the reform. We show that the reform had a significant positive effect on the earnings and employment chances of the post-reform cohorts. We argue that these results come from the comprehensive nature of the reform, i.e. general education was extended, and its quality was improved before students were tracked to upper secondary schools.
Unlike the Scandinavian reforms, the Polish reform did not have a pilot stage; instead, it was introduced simultaneously in the whole country. Thus, our estimations cannot exploit any time or spatial variation. Due to the lack of high-frequency micro-data before and after the reform, it is impossible to use a regression discontinuity design. However, by pooling several years of cross-sectional surveys, we can generate a quasi-panel of year-of-survey and age brackets, which we will use to estimate difference-in-differences models. Unlike in the Scandinavian studies, the variance does not come from the time of implementation but from the time (year) of observation. This way, we will directly compare the employment chances and real earnings of pre-reform (control) and post-reform (treatment) cohorts. In Sect. 2, we discuss the 1999 reform in more detail. Section 3 presents the data and descriptive analyses. In Sect. 4, we present our estimation methodology and baseline results. Section 6 shows robustness checks, and Sect. 7 concludes.

The educational reform of 1999 in Poland
The 1999 educational reform was one of the four reforms (social security, healthcare, public administration, and education) implemented by the government elected in 1997. The three main goals of the education reform were to increase the level of education in society, provide equal educational opportunities to everyone, and improve the quality of education (Bialecki et al. 2002). While the other three reforms were implemented simultaneously, they similarly affected all students and all adults in the population. The education reform was rolled out so that the older cohorts followed the old system, while only the younger cohorts, born from 1986, followed the new system with the new structure, curricula, and standardised national exams.
The reform consisted of many distinct parts, of which we concentrate mainly on the structural changes but briefly discuss all parts here (see Jakubowski, 2021, for a more detailed discussion of the reform components). In sum, the 1999 reform (1) extended comprehensive general education by one year, and (2) changed the structure of general education by dividing the previous 8-year primary school into 6 years of primary and 3 years of general lower secondary school. It also (3) shortened academic and technical upper secondary education by one year, but the basic vocational school remained 3 years long. Besides these structural changes, the reform (4) introduced a core curriculum, which guided teachers in determining their syllabi, giving them more autonomy, and it (5) introduced the teacher professional attainment ladder with four levels and strong incentives for professional development and obtaining master's degrees for all teachers. The third line of changes concerned the testing and admission system: (6) new, standardised tests were introduced at the end of each education stage (primary, lower secondary, and upper secondary), and parallel to that, (7) the admission system to each stage shifted from entrance examinations to using the results of these end-of-stage standardised exams. Finally, (8) the reform also transferred ownership of nearly all schools to local governments and introduced a new per-pupil formula-based financing system. Figure 1 shows the structural changes in the Polish education system.
Most importantly, the newly established comprehensive lower secondary schools (gimnazjum in Polish) had to follow the same general curriculum and admit all students from their catchment area without any additional requirements. They could accept additional applicants from other areas to the remaining places. Lower secondary schools were larger, and there were fewer of them than primary schools. They opened in larger settlements, which was especially important in rural areas, where one gimnazjum collected the children from neighbouring villages who previously attended different local primary schools. So, students spent their last 3 years of general education in a comprehensive institution intended to give the same high-quality education for students all over the country.
Before the reform, there were three upper secondary tracks: a 4 year academic secondary school (liceum), a 5 year technical secondary school (technikum), and a 3 year basic vocational school. Only liceum and technikum ended with a maturity exam, which did not change with the reform. After the reform, academic and technical secondary schools became one year shorter, but the basic vocational schools remained 3 years long. Therefore, lower secondary graduates and those who chose the vocational track received one additional year of comprehensive education. A new institution was operating for a short time, the so-called profiled academic secondary school, but it was abolished after a few years. In general, the upper secondary education curricula and programs were not changed substantially. In academic and technical upper secondary schools the curricula were adjusted where necessary as parts of the material taught in first grades were now moved to general lower secondary schools.
Before 1999, education was compulsory until age 17 with the possibility of part-time education after finishing the 8 year primary school. With the extension of comprehensive education, the end of compulsory education increased to age 18. 4 Finally, Poland introduced a new curriculum and standardised final exams to measure whether students achieve the curricular goals at the end of the 6th and 9th grades. The 6th-grade test was low-stakes and provided information on student and school performance. The 9th-grade tests were high-stakes and were used as an entrance exam to the upper secondary stage. These exams were first launched in 2002 for the first cohorts that followed the new curriculum in the new comprehensive lower secondary schools. For the same cohort, the new standardised maturity exams were introduced in 2005 and replaced higher education admission exams. The standardised examinations strengthened the impact of the extended general education as they were compulsory for all students.
Overall, the reform revolutionised all parts of the school system. However, one can reasonably expect that these changes differently affected various groups of students and had different impacts over time. One can expect a delayed effect on student outcomes of changes in curriculum, professional teacher standards, or system governance structure. Even the reformers assumed that several years were needed to benefit from these changes. However, one year's general education extension immediately provided students with one more year of the general curriculum. The new accountability system established by new standardised examinations provided additional incentives to students and teachers to cover the general curriculum. However, that made a difference only to students who would otherwise have gone to the basic vocational schools. Before the reform, the other groups had to follow the academic curriculum and pass secondary school entrance examinations.
The evaluation of changes in student achievement confirms that the main group that the reform affected immediately was the students of basic vocational schools. Jakubowski et al. (2016) demonstrate that the immediate impact of the reform on 15-year-olds' achievement was close to one standard deviation for students whose socioeconomic background is identical to former students of basic vocational schools. Moreover, additional comparisons for 16- and 17-year-old students suggest that this effect is long-lasting. After 1-2 years of basic vocational education, students still show improved reading and mathematics outcomes close to an equivalent of one year of instruction. A similar achievement effect for students in the academic track was negligible even 6 years after the reform was implemented.
As the new comprehensive lower secondary schools started to operate already in 1999/2000, the first cohort that started their 7th grade in the new system is the 1986 cohort. So, we categorise everyone born in 1986 or later as a member of the treated group. 5 We argue that structural changes immediately affected every cohort from 1986 onwards, while the other changes were beneficial in the long run, affecting both pre- and post-treatment cohorts, although to a slightly different extent. 6 So, if we compare a few cohorts before and after the reform, the potential differences in their labour market outcomes are most likely caused by the additional year of general education before the delayed tracking.

Data and descriptive statistics
We use the EU Statistics on Income and Living Conditions (EU-SILC) data between 2005 and 2013. 7 The EU-SILC contains detailed income and labour, education, and health status data at the personal and household level. The data consists of private households with all household members surveyed, but only those above age 16 are interviewed personally for the income data. In this paper, we use the cross-sectional database of EU-SILC.
To generate a balanced "quasi-panel", we pool the cross-sectional datasets between 2005 and 2013 and restrict the sample to participants between ages 20 and 27. This allows us to compare the pre-reform and post-reform groups at the same age: in 2005, the youngest control group members were 20 years old, and in 2013, the oldest treatment group members were 27 (see Table 1 below). This means we have 16 cohorts in the sample, eight in the treatment group (T86 to T93, where 86, 87, etc., signal the birth year of the cohort) and eight in the control group (C78 to C85). These people were born between 1978 and 1993 (see Tables 8 and 9 in the Appendix). There are around 48,500 observations in the sample, with 23,500 in the control group and 25,000 in the treatment group. In Poland, the school starting age is 7, and the threshold is January 1. In the sample, everyone born on or after January 1, 1986, is considered to be treated (to have studied in the new system), and everyone born until December 31, 1985, is considered to be in the control group. There is scope for misclassification around the threshold because of grade retention or skipping a grade. However, grade repetition was always rare in Polish schools and was mainly limited to students with special needs. The earliest data on grade repetition are available for 2005, and they show that in primary schools, only 0.6% of students were repeating a grade (GUS, 2006).
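The assignment rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: it assumes that birth year can be approximated as survey year minus age, and the names `cohort_label` and `grid` are hypothetical.

```python
# Sketch of the cohort assignment behind the quasi-panel (illustrative only).
# Assumption: birth year is approximated as survey year minus age.
FIRST_TREATED_BIRTH_YEAR = 1986   # first cohort schooled in the new system
SURVEY_YEARS = range(2005, 2014)  # EU-SILC waves 2005 to 2013
AGES = range(20, 28)              # sample restricted to ages 20 to 27


def cohort_label(survey_year: int, age: int) -> str:
    """Return a label such as 'C85' (control) or 'T86' (treatment)."""
    birth_year = survey_year - age
    group = "T" if birth_year >= FIRST_TREATED_BIRTH_YEAR else "C"
    return f"{group}{birth_year % 100:02d}"


# One label per year-of-survey by age cell.
grid = {(year, age): cohort_label(year, age)
        for year in SURVEY_YEARS for age in AGES}

control = sorted({label for label in grid.values() if label.startswith("C")})
treated = sorted({label for label in grid.values() if label.startswith("T")})

print(grid[(2005, 20)], grid[(2013, 27)])  # youngest control, oldest treated
print(len(control), len(treated))          # number of cohorts per group
```

Under these assumptions, every age between 20 and 27 is observed for both groups, which is what allows pre- and post-reform cohorts to be compared at the same age.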
For educational attainment, we rely on the ISCED classification 8: those with ISCED 2 (lower secondary) qualification or below are considered low-educated, those with ISCED 3 (upper secondary) or 4 (post-secondary non-tertiary) are at the medium level, and those with ISCED 5 (tertiary) are highly educated. We see the highest education level attained in the data and the year when it was achieved, but unfortunately, we do not observe the exact type of school the person graduated from. So, in the case of upper secondary education, we see how old the person was when she finished the upper secondary level, but we do not see whether it was a liceum, a technikum, or a basic vocational school. Figure 2 shows the distribution of the ages when each level of education was achieved, separately for the treatment and control group. 9 The distribution of finishing ages is different for all three education levels for the two groups. The most visible change is in the median finishing age of the lowest educated: the median age increased from 15 to 16, mainly because of the one additional year of comprehensive education. The difference for upper secondary graduates (ISCED 3) is not so salient, but we also see the distribution shift to the right, driven by the one extra year for vocational students. The distribution of tertiary (ISCED 5) graduation age became bimodal, as, in the Bologna system, a BA degree also counts as a tertiary qualification, while earlier only long-cycle programmes existed. Note that studying the differences between treated and non-treated cohorts in this dimension is not straightforward. In 1999, Poland and 28 other European countries signed the Bologna Declaration, which introduced the typical three-level system of tertiary education (bachelor, master, and doctorate). The three-cycle structure of higher education was implemented gradually. In 2003 Poland was already implementing the Bologna structure (Kwiek 2014). By the school year 2004/2005, 10% of state higher education institutions had already adopted the 2-cycle model (Bachelor and Master) in all fields of study and 50% of the institutions in at least 50% of the fields of study (see European Commission 2005). In 2008 all tertiary students were enrolled in the Bologna system (see Kwiek 2014). Consequently, the first bachelor-level graduates of the new system entered the labour market in around 2006. Therefore, the 1984 and 1985 cohorts also had the opportunity to study in the Bologna system, depending on their institution and field of study. However, the first full "Bologna cohort" was the 1989 cohort (who finished in 2011). This coincidence makes it hard for us to study the effect of the reform on the upper end of the education distribution.
5 School enrollment cut-off date in Poland is January 1.
6 For example, the first 6th-grade standardised exams took place in 2002, so the 1986 and 1987 cohorts did not have to take them yet.
7 Previous studies looking at the effect of the Polish education reform of 1999 have used the Polish LFS data (Liwiński 2020; Strawinski and Broniatowska 2021). We also planned our initial analysis on the Polish LFS; however, when we asked the Polish Statistical Office for the official database, their offer vastly exceeded our budget. So we decided to use the EU-SILC instead, which was freely available through Eurostat. Admittedly, the national LFS measures the level of education better than the SILC. However, as we point out below, education is a 'bad control' in this reform. We also admit that the number of observations is higher in the LFS than in the SILC; however, earnings are much better recorded in the EU-SILC. Thus we believe that the SILC is not an inferior option compared to the LFS for this analysis.
8 The summary of the variables can be found in the Appendix (Table 7).
9 Note that each figure shows only those people who have that particular education level as the highest finished level at the date of the survey.
Appendix Table 11 explores the age of finishing education in a regression framework. Treated cohorts, on average, tend to stay just as long in education as the control cohorts, but this average zero effect masks a significant composition effect: the low educated stay about 0.9 years longer in school, while the average upper secondary graduate stays a little over 1 month (0.09 years) longer in school. This average effect for the medium educated is probably due to the ca. 15% of students in basic vocational tracks, who stay about one year longer in school. On the other hand, higher educated people finish education about 0.7 years earlier, which is probably due to the previously non-existent BA degree. When looking only at those who are currently not pursuing any education, which means that they have finished their educational career, at least for a while, the pattern is similar. However, effect sizes are slightly different: low educated stay in school 0.7 years longer, medium level educated over 2 months longer and higher educated about 0.8 years less.
The two main outcome variables we are interested in are employment status and earnings. EU-SILC classifies activity status into four categories: at work, unemployed, in retirement or early retirement, and other inactive. The first category covers those who work either full-time or part-time or are self-employed full-time or part-time. When looking at employment chances, we compare employed people to unemployed, as we find that the share of the active population compared to the inactive did not change during the sample period. We drop those in retirement or in early retirement as there are only 55 of these people in our sample.
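The construction of the employment outcome can be illustrated with a short Python sketch. The status labels below are hypothetical stand-ins for the four EU-SILC activity categories named in the text, not the actual codes used in the database, and the function name is likewise an assumption.

```python
from typing import Optional

# Hypothetical labels for the four activity categories named in the text.
AT_WORK = "at work"
UNEMPLOYED = "unemployed"
RETIRED = "in retirement or early retirement"
OTHER_INACTIVE = "other inactive"


def employment_outcome(status: str) -> Optional[int]:
    """Return 1 if employed, 0 if unemployed, None if outside the active sample.

    Retired and other inactive respondents return None, so they are left
    out of the employed-versus-unemployed comparison.
    """
    if status == AT_WORK:
        return 1
    if status == UNEMPLOYED:
        return 0
    if status in (RETIRED, OTHER_INACTIVE):
        return None
    raise ValueError(f"unknown activity status: {status!r}")


statuses = [AT_WORK, UNEMPLOYED, OTHER_INACTIVE, AT_WORK]
outcomes = [employment_outcome(s) for s in statuses]
active = [y for y in outcomes if y is not None]
print(sum(active) / len(active))  # share employed among the active
```

Returning `None` for inactive respondents keeps the exclusion explicit, so the employment share is computed over the active population only.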
Table 1 Distribution of treatment and control group cohorts by age and year of survey

       Year of survey
Age    2005  2006  2007  2008  2009  2010  2011  2012  2013
20     C85   T86   T87   T88   T89   T90   T91   T92   T93
21     C84   C85   T86   T87   T88   T89   T90   T91   T92
22     C83   C84   C85   T86   T87   T88   T89   T90   T91
23     C82   C83   C84   C85   T86   T87   T88   T89   T90
24     C81   C82   C83   C84   C85   T86   T87   T88   T89
25     C80   C81   C82   C83   C84   C85   T86   T87   T88
26     C79   C80   C81   C82   C83   C84   C85   T86   T87
27     C78   C79   C80   C81   C82   C83   C84   C85   T86

The table shows the distribution of the treatment and control cohorts in our sample. The numbers in the cells indicate the birth year of the cohort that falls into that year-of-survey-age cell. The letters C and T show if the cohort belongs to the control group or the treatment group. For the number of observations, see the Appendix, Table 8.

Income data is collected as gross current monthly earnings before the deduction of taxes and social insurance contributions. Income is given in Euros and current prices, so we converted this data to Polish Złoty in 2005 prices. The database contains data on experience, expressed as the number of years spent as an employee or self-employed since the respondent first started a regular job. As low educated people studied for one more year after the reform, and this change was immediate, there was a year (2001) when, in theory, no one graduated from the lowest education level. 10 There is a similar gap year (2004) for vocational graduates because their studies also became longer by one year. This, and other differences in labour market characteristics, make it very important to control for labour market entry in our estimations. There is data on the year of starting the first job, which, of course, is only available for those who have ever worked.
To correct for this, we generated a variable called 'labour market entry year' that equals the year of starting the first job, or, when it is not available, the year when the respondent finished their highest education level (provided in the EU-SILC database). We corrected this variable so that the labour market entry age of respondents is at least 18, as this is the legal age when young people can start to work full time in Poland. 11 Tables 2a-c show when each age group of pre- and post-reform cohorts started their first job. Since post-reform cohorts finished their education later, they started working later (at 20.47 vs 20.35 years, p < 0.01 12 ). The difference is significant for the younger cohorts and disappears in the older cohorts. Surprisingly, starting work later does not go together with less experience: treated people of the same age tend to have more years of experience than the control group (2.87 vs 2.58 years on average, p < 0.01, see Tables 3a-c for more detail). This could be due to different employment chances. If treated cohorts have higher employment chances than the control cohorts, then, on average, even if they start working later, they might secure more stable jobs and gain more experience over a shorter period. Tables 4a-c seem to support this assumption, as we see a higher share of employed people in younger cohorts among the treatment group than in the control group, and the differences are even higher within the lowest educated people. Across all age groups in the full sample the means are 79 percent vs 77 percent (p < 0.01). Figure 3 shows the real earnings distribution of the pre- and post-reform cohorts. Treated people tend to earn more on average (PLN 1664 vs PLN 1388), mainly because fewer people are at the bottom of the earnings distribution in the treatment group. That is, the earnings distribution is shifted to the right, moving those at the bottom of the distribution to the middle.
This shift is apparent in the full sample and for the low educated. From the raw data and from the research before us, we assume that Poland's 1999 comprehensive education reform had a non-negligible and positive effect on the Polish labour market. We believe that it was especially young people (where the level of education and skills gained matter most) and those at the bottom of the education distribution, who stayed one more year in school, who benefited the most from the reform.
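The 'labour market entry year' variable described above could be constructed roughly as follows. This is a minimal sketch under stated assumptions: the text does not say how the age-18 correction was applied, so the sketch simply raises earlier entry years to the year the respondent turns 18, and the function and argument names are hypothetical.

```python
from typing import Optional

MIN_ENTRY_AGE = 18  # minimum labour market entry age stated in the text


def labour_market_entry_year(birth_year: int,
                             first_job_year: Optional[int],
                             education_end_year: Optional[int]) -> Optional[int]:
    """Year of first job, falling back to the year the highest education ended.

    Assumption of this sketch: entries implying an age below 18 are moved
    up to the year in which the respondent turns 18.
    """
    entry = first_job_year if first_job_year is not None else education_end_year
    if entry is None:
        return None  # neither source is available
    return max(entry, birth_year + MIN_ENTRY_AGE)


# Worked examples with made-up respondents.
print(labour_market_entry_year(1986, 2003, 2005))  # first job at 17, raised to 2004
print(labour_market_entry_year(1986, None, 2006))  # never worked, uses 2006
print(labour_market_entry_year(1980, 2001, 1999))  # first job at 21, kept as 2001
```

The fallback to the education end year is what makes the variable available for respondents who have never worked.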

Methodology and baseline results
We now turn to our causal estimates. Year of birth determined the assignment into treatment, which means that self-selection into the treatment or control group was impossible. However, there are cohort-specific differences, as the treatment and control group members were born in different years. To handle this issue, we control for age fixed effects. There are also differences between the years in which each survey was taken, so we control for survey-year fixed effects. 13 We assume that, in the absence of the treatment, changes in outcomes between two consecutive survey years would have been the same for all ages in both the treatment and the control group, and vice versa: changes between two consecutive ages would have been the same for all survey years (parallel trends assumption). 14 For this reason, as a baseline, we opted for a difference-in-differences method, where age and year-of-survey act as the two dimensions of the estimation (the first differences) and the treatment variable as the diff-in-diff (second difference) estimator. 15 The baseline specification of the multivariate model is the following:
$$Y_{asri} = \alpha + \beta\,\mathrm{Treat}_{as} + \rho X_{asri} + \gamma_a + \delta_s + \mu_r + \varepsilon_{asri},$$
where Y is the outcome variable (current educational status, employment status, or earnings) for each individual (i). Treat is the treatment dummy, which can vary across ages (a) and years of survey (s). X are individual-level variables (gender and, in some specifications, highest level of education), and γ, δ, and μ are age, year-of-survey, and region (r) fixed effects. ε is the idiosyncratic error term, while α, β, and ρ are parameters to be estimated. 16 For a similar estimation framework, see, for instance, Pischke (2007).

15 The treatment variable is basically a simplified interaction term between age and year-of-survey, as shown in Table 1.
16 We have tested for potential differences in composition across regions (see Bukowski 2019). Including region fixed effects in the regressions does not change any of the results.
17 For people with no experience, we imputed their year of labour market entry with the year in which they finished their highest education (see above).

Tables 2a-c show the mean age at which treated and control group members started their first regular job, separately by age groups. The third column shows the differences between the two groups at each age, with t-statistics in parentheses. Table 2a presents the data for the whole sample, Table 2b for those who have at most a lower secondary qualification, and Table 2c for those with an upper secondary qualification.

Table 3 (continued)
Tables 3a-c show the mean years of experience in the treatment and the control group, separately by age groups. The third column shows the differences between the two groups at each age, with t-statistics in parentheses. Table 3a presents the data for the whole sample, Table 3b for those who have at most a lower secondary qualification, and Table 3c for those with an upper secondary qualification.

Table 5 shows the results of our baseline regression on labour market outcomes. In columns 1 and 2 we estimate linear probability models of the employment probability on the sample of the active population. Columns 3 and 4 show the results of regressions of log real earnings. We estimate all models with age and year-of-survey fixed effects as well as region fixed effects. While these should take out much of the unobserved heterogeneity across cohorts in educational outcomes, pre- and post-treatment cohorts also differ in their year of labour market entry, which means they might have faced very different labour market conditions. This difference can easily bias the estimated effect of the treatment on longer-run labour market outcomes. Consequently, in columns 2 and 4 we also include year-of-labour-market-entry fixed effects. 17 On the one hand, controlling for the year when one enters the labour market seems essential, as different demand-side factors can alter entrants' employment probabilities and initial earnings. On the other hand, the year of labour market entry might be considered a 'bad control' (see Angrist and Pischke 2008), as it correlates well with years spent in schooling, which depend on the reform. Moreover, year of labour market entry correlates strongly with age and year-of-survey, which inflates the variance of the model. Nevertheless, the substantive results of the models do not differ much with or without the year-of-labour-market-entry fixed effects. The results in Table 5 show that the treated group is about 3 percentage points more likely to be employed and earns 4 to 5% more. 18 Additional results presented in Fig. 4a and b show how these average treatment effects vary across age cohorts. The estimates for employment probability are too imprecise to detect any statistically significant and consistent age-related patterns. However, people closer to age 20 benefited from an increase of around 10% in their earnings after the reform compared to pre-treatment people of similar age. This effect declines with age and disappears around age 24-26. These results can be explained by arguing that the reform improved labour market entry, but the effects disappear with age as experience becomes more important than general skills learned at school.
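As a rough illustration of how the baseline specification identifies β, the sketch below builds a synthetic, noise-free grid of age by survey-year cells and recovers the treatment coefficient by OLS with age and survey-year dummies. Everything in it is an assumption made for illustration: the function names (`make_cells`, `twfe_did`), the approximation of birth cohort as survey year minus age, the age and year ranges, and the coefficient values. It uses no EU-SILC data, omits X and the region fixed effects, and is not the authors' estimation code.

```python
import numpy as np

# Assumption: the first treated birth cohort is 1986, following the
# 1 January 1986 cut-off mentioned in the text.
CUTOFF_COHORT = 1986


def make_cells(ages, survey_years):
    """Return one row per (age, survey year) cell: age, year, treatment dummy."""
    rows = []
    for a in ages:
        for s in survey_years:
            cohort = s - a  # hypothetical approximation of birth year
            rows.append((a, s, int(cohort >= CUTOFF_COHORT)))
    return np.array(rows)


def twfe_did(y, age, year, treat):
    """OLS of y on the treatment dummy plus age and survey-year dummies.

    Returns the estimated coefficient on the treatment dummy (beta).
    """
    ages = np.unique(age)
    years = np.unique(year)
    cols = [np.ones(len(y)), treat.astype(float)]
    # Drop the first category of each fixed effect to avoid collinearity
    # with the intercept.
    cols += [(age == a).astype(float) for a in ages[1:]]
    cols += [(year == s).astype(float) for s in years[1:]]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


# Synthetic grid: ages 20-27 observed in survey years 2008-2013 (illustrative).
cells = make_cells(range(20, 28), range(2008, 2014))
age, year, treat = cells[:, 0], cells[:, 1], cells[:, 2]

# Noise-free outcome with additive age and year effects and a true beta of 0.04.
true_beta = 0.04
y = 1.0 + 0.10 * (age - 20) + 0.02 * (year - 2008) + true_beta * treat

beta_hat = twfe_did(y, age, year, treat)
print(round(beta_hat, 6))
```

Because the treatment dummy depends on the difference between survey year and age, it is not an additive function of the age and survey-year dummies, which is why β remains identified once both sets of fixed effects are included.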

Robustness checks
In a robustness check, we simplify our models and compare only two cohorts, those born right before and right after the reform cut-off. This might decrease the power of our analysis substantially, but it also highlights the importance of the "inserted" year of education before tracking, as all other reform elements affected these two cohorts similarly. Table 6 presents the same models as in Table 5, estimated only on the 1985 and 1986 cohorts, that is, those born right before and right after the 1 January 1986 cut-off. The effects on employment probability are not significant, while the 1986 cohort earns 9-11% more than the 1985 cohort on average.
In another robustness check, we test the mechanisms by using the reform as an instrument for the finishing age of schooling. Unfortunately, the database does not contain data on the years spent in education, but we know the age at which the highest educational level was attained. As shown above, due mainly to the Bologna Process, we cannot compare the average finishing age for the full pre- and post-treatment sample. However, we can compare the lowest educated sub-population, as there was no compositional change in this education group. Appendix Fig. 5 shows the first stage of our IV: the distribution of finishing age for those with only ISCED 2 or less, for each pair of cohorts. The median finishing age for this level was 15 up to the 1985 cohort and became 16 with the 1986 cohort. Columns 1 and 2 in Appendix Table 12 show that, for this population, the age at which the highest degree was obtained has in itself zero association with log earnings and employment probability. First-stage estimates in columns 3 and 5 show that the reform can act as a strong instrument: treated cohorts are 0.4 or 0.6 years older than the pre-treatment cohorts when they finish their highest degree of schooling. In columns 4 and 6 we see that the 2SLS coefficients of finishing age on earnings and employment are high and positive, 13.6% and 12 percentage points, respectively. Unfortunately, they are insignificant, most likely due to the low power of our analysis, but for earnings they are of a very similar magnitude to the significant estimates of Table 6. 19 Nevertheless, both estimations show that the lowest educated population would have been more likely to earn more had they attended comprehensive lower secondary schooling for an additional year.
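The logic of using the reform as an instrument can be sketched with the simplest case of 2SLS: one binary instrument and no covariates, where the estimate reduces to the Wald ratio of the reduced form to the first stage. The function name `wald_iv` and all numbers below are hypothetical and chosen only so that the arithmetic is easy to follow. The models in Appendix Table 12 include fixed effects and controls, which this sketch leaves out.

```python
import numpy as np


def wald_iv(y, x, z):
    """2SLS with a single binary instrument and no covariates (Wald estimator).

    y: outcome (e.g. log earnings), x: endogenous regressor (finishing age),
    z: binary instrument (1 = post-reform cohort).
    Returns (first_stage, iv_effect).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=bool)
    first_stage = x[z].mean() - x[~z].mean()    # effect of reform on finishing age
    reduced_form = y[z].mean() - y[~z].mean()   # effect of reform on the outcome
    return first_stage, reduced_form / first_stage


# Hypothetical data: four pre-reform and four post-reform respondents.
z = np.array([0, 0, 0, 0, 1, 1, 1, 1])
finish_age = np.array([15, 15, 15, 16, 16, 16, 16, 15], dtype=float)
# Noise-free outcome in which one more year of finishing age adds 0.1 log points.
log_earn = 7.0 + 0.1 * finish_age

first_stage, iv_effect = wald_iv(log_earn, finish_age, z)
print(first_stage, round(iv_effect, 6))
```

With a first stage well below one year, as in the estimates reported above, the reduced-form difference is scaled up by the inverse of the first stage. This is one reason why IV estimates tend to be less precise than the corresponding reduced-form comparison.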

Conclusion
We analysed the effects of the 1999 education reform on labour market outcomes, in particular on employment probability and earnings. This reform was comprehensive, as it extended general education by one year before students were tracked into upper secondary education. It also improved the quality of education for the low-track students before they were tracked and increased the general skills of potential vocational students and those who later chose the vocational track. We show that the reform resulted in a 3 percentage point increase in employment probability. The reform also increased earnings by 4-5% on average. The results suggest that the positive effects on earnings decrease with age, so the reform could have been beneficial at labour market entry, but its long-term impact is relatively small.
Our research tries to answer the question of why there are mixed results in the literature about the returns to education. Increasing the length of compulsory education or offering an additional year of education to vocational students usually does not help their employment chances or increase their earnings. We show that a comprehensive type of reform can successfully improve labour market outcomes, and the effects are driven by the young and the low educated (and likely by vocational students). This study has its drawbacks: due to the lack of a proper education measure in the EU-SILC, we can only estimate an average effect for all education levels. This tells us little about the potential mechanisms driving the results, so we can only speculate that the increased general education and decreased tracking (better peers and teachers) caused the effects. After 1999, students in Poland had to spend an additional year in less selected classes than before and were taught by teachers who were less selected. This change was likely beneficial for the low-track children, as the composition of their peers and teachers improved substantially. Our results suggest that the reform has reached its initial goal of decreasing inequalities.

Tables 4a-c show the share of employed respondents in the active population in the treatment and the control group, separately by age groups. The third column shows the differences between the two groups at each age, with t-statistics in parentheses. Table 4a presents the data for the whole sample, Table 4b for those who have at most a lower secondary qualification, and Table 4c for those with an upper secondary qualification.
Finally, despite being successful in improving the education outcomes (as shown by the PISA study) and the labour market outcomes of post-reform students (as shown by this study, but also by Liwiński (2020) and Strawinski and Broniatowska (2021), using different data, methods and subgroups), the Polish government reversed the reform after 18 years, in January 2017, and re-introduced the old system. So, from September 2017, students again study in the old 8 + 4/5 system. Along with this reversal, the government is trying to improve the quality of vocational education. Based on the current paper and earlier research, we would warn against a retracking reform like this, as it might not be the best way to improve the labour market conditions of vocational students.

Table 5 The effect of the reform on labour market outcomes, linear models

Robust standard errors in parentheses
All models are linear regressions on the sample of the active population. Columns 1-2 show the results of an LPM on employment probability, while columns 3-4 show the results of the linear regression on earnings. In columns 2 and 4, labour market entry fixed effects are included, too.
***p < 0.01, **p < 0.05, *p < 0.1

Variables                        (1)  (2)  (3)  (4)
Year-of-survey fixed-effect      y    y    y    y
Region fixed-effect              y    y    y    y
Year of LM entry fixed-effect    n    y    n    y

L. F. Drucker et al.

Fig. 4 (a) The effect of the reform on employment probability and earnings by age groups. The figure shows the effects of the reform on each age cohort estimated by models 2 and 4 in Table 5, extended with treatment*age interactions. The left figure shows the treatment effect on the probability of being employed (interpreted as percentage points), while the right figure shows the effect on the logarithm of earnings (interpreted as percentage change in earnings). The controls include gender and age, year-of-survey, region, and year-of-labour-market-entry fixed effects. The bars show the average effects with 95% confidence intervals. (b) The effect of the reform on employment probability and earnings by age groups, without labour market entry fixed effects. The figure shows the effects of the reform on each age cohort estimated by models 1 and 3 in Table 5, extended with treatment*age interactions. The left figure shows the treatment effect on the probability of being employed (interpreted as percentage points), while the right figure shows the effect on the logarithm of earnings (interpreted as percentage change in earnings). The controls include gender and age, year-of-survey, and region fixed effects. The bars show the average effects with 95% confidence intervals.

Appendix A
See Tables 7, 8, 9, 10, 11, 12, 13.

Year-of-survey fixed-effect      y    y    y    y
Region fixed-effect              y    y    y    y
Year of LM entry fixed-effect    n    y    n    y

Year-of-survey fixed-effect      y    y    y    y

Table 12 The effect of the reform on employment probability and earnings: IV estimation for people with ISCED 2 or less

Robust clustered standard errors in parentheses
Columns 1-2 show the association of the age when the highest degree was obtained with log earnings and with employment probability, respectively, for those who finished at most ISCED 2 level. Columns 3-4 and 5-6 show the 2SLS estimations of log real earnings and employment probability with the treatment used as an instrument for the age when they finished the ISCED 2 level.
***p < 0.01, **p < 0.05, *p < 0.1

Year-of-survey fixed-effect      y    y    y    y    y    y

Table 13 The effect of the reform on employment probability and earnings separately by gender

Year-of-survey fixed-effect      y    y    y    y
Region fixed-effect              y    y    y    y
Year of LM entry fixed-effect    n    y    n    y

Robust standard errors in parentheses
***p < 0.01, **p < 0.05, *p < 0.1

Figure 5

Appendix B
Unfortunately, we cannot do a proper common trend analysis with our data. As this is a quasi-panel of age-of-respondent and year-of-observation, we could either compare people of similar age in different survey years or people of different ages in the same survey year. Figure 6 compares age cohorts in different survey years. The increase in log wages as we move to the right in the graph is therefore partly caused by economic trends. Figure 7 compares people of different ages in the same survey year, so as we move to the right in the figure, the cohorts presented become younger. Thus, neither of these graphs is informative enough on its own. The only way to compare pre- and post-reform cohorts is by accounting for both dimensions in a regression framework, including both age and year-of-observation as fixed effects, and this is what we do in the paper.

Figures 6, 7