KWReq—a new instrument for measuring knowledge work requirements of higher education graduates

Starting from the observation that questionnaires for appropriately measuring the changing working conditions and requirements of the highly qualified workforce do not exist, we developed a new German-language instrument focussing on knowledge work. Based on theoretical considerations, we first identified three basic dimensions that constitute knowledge work: novelty, complexity, and autonomy. During the subsequent process of questionnaire development with higher education graduates, including a cognitive pretest, a quantitative development study, and a replication study, these dimensions were operationalised by initially 173 and finally 22 items. Confirmatory factor analysis and structural equation modelling of the data of both the development and the replication study show that the 22-item instrument validly and reliably measures novelty (4 items), complexity with three subdimensions (9 items), and autonomy, also with three subdimensions (9 items). An English version of the questionnaire is available. However, the empirical test of the English-language questionnaire as well as possible refinements of the measurement instrument, which will be discussed in the final section of the paper, are left to future research.


Introduction
The amount of research output has been rising for decades. The number of scientific publications increased from 455,315 in 1991 (Tindemans 2005) to 733,305 in 2002 (Hollanders and Soete 2010) and 1,270,425 in 2014 (Soete et al. 2015). Not only knowledge creation but the entire knowledge sector, embracing industries in the fields of knowledge production, knowledge infrastructure, knowledge management, and knowledge mediation, has grown considerably, both in terms of employment and value added (cf. EFI 2020; Rohrbach 2007). At the same time, the role of traditional factors of production of goods and services such as labour and capital has decreased (Stehr et al. 2013), while knowledge and information have gained considerable importance (Hube 2005). For example, the production costs of software consist almost entirely of costs for high-skilled labour (ibid.), that is, costs for carriers and processors of knowledge. Furthermore, digitalisation and other innovations increase the relevance of knowledge across different sectors and result in a greater complexity of work tasks (Hube 2005; Spitz-Oener 2006). These changes toward a knowledge-based economy require adjustments not only by the management of enterprises (Kablouti 2007) but also by the workforce and educational institutions (Välimaa and Hoffman 2008).
One of the tasks of higher education is to prepare students for the world of work. In Europe, triggered by the Bologna Process, this educational function of higher education has been emphasized more strongly during the last two decades, and employability has become a much sought-after educational outcome of higher education (Artess et al. 2017; Schaeper and Wolter 2008; Schaeper 2009). Thus, educational institutions need to reflect the developments in the economy and the labour market, to analyse the job requirements of higher education graduates, and to discuss their implications for higher education institutions (Tiemann 2013; van der Velden and Allen 2011).
In the past, the demand for academic qualifications was often determined by macroeconomic indicators, especially economic growth, the association between income and the level of qualification or the composition of the labour force according to educational qualifications (Alesi and Teichler 2013; Henseke 2019; Teichler 2009). The percentage of higher education graduates was often taken as an indicator for the knowledge intensity of the economy, specific trades, firms or jobs (Tiemann 2013) and continues to be used as such, for example in the German reports on research, innovation and technological performance (EFI 2020). However, these indirect measures do not reflect the actual professional tasks of the highly qualified and do not directly assess knowledge work. To describe the aforementioned changes in the labour market more precisely, the "task approach", that is, the measurement of the job tasks, is considered to be more appropriate (Tiemann 2013).
Knowledge work is not an exclusive characteristic of the highly qualified labour force but can be found in all occupational and qualification groups. However, knowledge work is particularly prevalent among higher education graduates (see Spitz-Oener (2006) and Tiemann (2013) for Germany; Brinkley et al. (2009) for the UK). Given the generally increased role of knowledge work in the economy and its prevalence in graduate employment, an instrument suitable for measuring knowledge work among higher education graduates can be expected to have a high analytical potential and to allow for new and more differentiated insights into graduate employment and the relationship between higher education and the world of work.
Several German-language instruments have been developed to collect information on jobs or job tasks. Some of these instruments are specifically designed for higher education graduates. For example, Braun and Brachem (2015) developed a questionnaire for assessing a broad range of generic job-related activities and requirements. On the basis of existing questionnaires, publications on job requirements and graduate employment, and interviews, they derived nine dimensions (e.g., planning and organising of work processes, promoting others, communicating in foreign languages, physical performance, dealing autonomously with challenging tasks) and operationalised them through 49 items. Thus, the questionnaire is quite long, and it does not focus on knowledge work, although some items could be used to measure certain facets of this construct. The internationally comparative REFLEX project (The Flexible Professional in the Knowledge Society; Allen and van der Velden 2011), which also included Germany, measured job requirements in several dimensions (e.g., professional expertise, functional flexibility, innovation and knowledge management, mobilisation of human resources) using 29 items (van der Velden and Allen 2011). However, although the questionnaire is relatively economical, it does not rely on the task-based job requirements approach but asks graduates directly about the competences required by the job.
Other measurement instruments, which use a task approach, are directed towards the general work population. For example, Matthes et al. (2014) constructed a 48-item questionnaire capturing five dimensions: analytic tasks (reading, writing, mathematics), interaction and communication (e.g., customer contact, counselling), manual tasks (physical requirements), task complexity, and autonomy. The instrument has been used in the German National Educational Panel Study (NEPS). The Programme for the International Assessment of Adult Competencies (PIAAC; OECD 2013), which was conducted in 39 countries, Germany among them, collected information on job tasks and activities through 49 items organised in eleven task clusters (e.g., reading, writing, problem solving, co-operation, influencing, learning, physical requirements, task discretion).
Again, these questionnaires are rather lengthy. Like the survey instrument proposed by Braun and Brachem (2015), they contain some items suitable for measuring certain aspects of knowledge work but do not capture the broader meaning of the construct. In addition, when collecting data on the general work population the measurement instruments must be able to cover the full range of task or skill levels. Usually, such instruments have little variance when it comes to groups with a smaller range of task levels. In other words, while these questionnaires are well suited for analysing differences between different qualification levels or status groups, they are less capable of revealing more subtle distinctions within more homogeneous groups.
In view of the above-mentioned limitations of existing questionnaires in assessing knowledge work in the highly qualified workforce, we developed a new German-language survey instrument. The aim was, first, to measure job tasks in a theory-based way among higher education graduates, laying a focus on knowledge work. Second, the questionnaire should be short enough to be used in multi-topic surveys. Third, the items should be able to discriminate between higher education graduates. And fourth, the survey instrument was intended to meet psychometric standards.
The aim of this paper is to describe the development and the psychometric properties of the instrument. We begin in Sect. 2 with some conceptual clarifications, from which we derived three dimensions of knowledge work. We then continue by giving an account of the process of instrument construction, including cognitive pretesting and a quantitative instrument development study. In Sect. 4 the results of reliability and validity analyses are presented. The analyses primarily used data from the development study but also from the graduate panel survey conducted by the German Centre for Higher Education Research and Science Studies (DZHW, Deutsches Zentrum für Hochschul- und Wissenschaftsforschung). We conclude our paper with a discussion of the results, limitations, possible future improvements, and the usefulness of the questionnaire.

The "Job Requirements Approach"
Methodologically, our approach to measuring skills requirements of higher education graduates largely follows the Job Requirements Approach (JRA). This approach was applied in several international and national surveys such as the UK Skills Survey (Felstead et al. 2007), PIAAC (Allen and van der Velden 2014; Klaukien et al. 2013), NEPS (Allmendinger et al. 2019) and the DZHW graduate panel surveys (Braun and Brachem 2015). Its distinctive feature is to measure job requirements by describing the job tasks from the perspective of the job holder. This principle implies the assumptions that "what people do at their workplace reflects demands and requirements of work" and that "the best way to get information about job-related activities and requirements is to ask the employees themselves" (Braun and Brachem 2015, p. 576). The description of job tasks not only provides information on job requirements but also on competences: "Because a certain match between employees' activities at work and their own competences can be assumed, the JRA allows for measuring job-related activities and requirements that can serve as a potentially less biased proxy for job-related competences than direct self-rated levels of competences" (ibid.).

Knowledge work
The concept of knowledge work has been addressed from various disciplinary and research perspectives. Correspondingly, definitions of knowledge work are manifold and there is no common understanding of what this concept means and is comprised of (Kelloway and Barling 2000; Palvalin 2019; Pyöriä 2005). Early conceptualisations drew a basic distinction between mental and manual work. However, this unidimensional distinction is too simple to capture the specific features of knowledge work. Both manual and mental work can be knowledge work, as shown by the example of a surgeon performing a complicated operation (cf. Hube 2005, p. 36). More recent definitions of knowledge work are more complex and entail several dimensions.
For example, Hube (2005) introduces the dimensions novelty and complexity in his extensional definition and conceives of knowledge work as "mentally objectifying activities, which refer to novel and complex work processes and results […]" (ibid., p. 61, our translation; for an English-language summary of Hube's approach see Sobbe et al. 2016). The definition of Hermann (2004), too, includes novelty and complexity. She suggests that "knowledge work is always done when tasks are to be fulfilled that-at least for the person concerned-are so complex and new that the existing knowledge and personal experiences do not suffice to reach an appropriate solution; thus, it becomes necessary to resort to knowledge of others or to generate new knowledge herself or himself" (ibid., p. 214, our translation). Similarly, Haner et al. (2009) use complexity and novelty as core features of knowledge work. However, they add autonomy as a third dimension and propose to describe knowledge work along the three basic dimensions complexity, novelty, and autonomy (ibid., p. 19). Empirically, they distinguish four types of knowledge work: knowledge-based work, knowledge-intensive work, strongly knowledge-intensive work, and knowledge work in a narrow sense. The latter is characterised by frequently dealing with complex and new tasks, a high level of autonomy, and the necessity to continuously revise, improve and renew the acquired knowledge (ibid., p. 25).
Our own conceptual approach, which guided questionnaire construction, refers to the definitions given by Hube (2005) and Hermann (2004) and follows Haner et al. (2009) insofar as it includes autonomy as an additional aspect. Thus, we consider novelty, complexity, and autonomy to be central dimensions of knowledge work.

Novelty
As implied by the definitions of Hube (2005) and Hermann (2004), the novelty criterion can refer either to the novelty of the task itself or the novelty of work results. Our operationalisation of knowledge work focuses on novelty of work results for two reasons. On the one hand, we aimed at creating a parsimonious questionnaire and, therefore, had to make a selection. On the other hand, we consider new work results to be more demanding than new tasks and, therefore, more suitable to measure knowledge work in highly qualified populations.
Novelty in this sense is closely linked to the terms creativity and innovation, which are often used as synonyms (Scott and Bruce 1994). The similarity of these concepts becomes obvious when considering the definitions of creativity and innovation. Creativity mainly refers to the "production of novel and useful ideas" (ibid., p. 581). Innovation goes beyond creativity as in many conceptualisations the term has to do not only with idea generation but also with the application and realisation of ideas (Janssen 2000; Scott and Bruce 1994; Stashevsky et al. 2006) or with transforming ideas into "new/improved products, service or processes" (Baregheh et al. 2009, p. 1034). In this refined and more precise interpretation of novelty, we relate this dimension of knowledge work to innovative work or innovative work behaviour, which has been studied extensively (e.g., de Jong and den Hartog 2010; Janssen 2000; Rehman et al. 2019; Scott and Bruce 1994).

Complexity
In research on work and organisations complexity has been conceptualised at different levels and referring to various phenomena (e.g. systems, organisations, products, jobs, tasks) (Haerem et al. 2015). Corresponding to the different perspectives on complexity, definitions are manifold (Blockus 2010). However, they often use similar attributes to characterise the level of complexity: the number and variety of elements or components, the relations between these elements, and the changeability of elements and relations (Blockus 2010, summarised by Harlacher et al. 2017). For example, Luhmann (2013) defines system complexity in terms of (1) the number of elements, (2) the number and (3) the diversity of relationships between these elements, and (4) the changes of these factors over time. Regarding our level of analysis, task complexity, Wood (1986) distinguishes three dimensions: (1) component complexity (number of distinct acts and information (= task inputs)), (2) coordinative complexity ("nature of relationships between task inputs and task products" (Wood 1986, p. 68)), and (3) dynamic complexity (changeability of task inputs and of the relationship between inputs and products). Another early conceptualisation was proposed by Campbell (1988). He considers task complexity to be related to information load, diversity, and rate of change and to be a function of specific task attributes such as multiple ways to solve a task and uncertainty about the linkage between alternative paths to a solution and the desired outcome. In a similar vein and referring to Wood (1986), Campbell (1988) and others, Blockus (2010) identifies four characteristics of task complexity: (1) the amount of (sub)tasks, (2) the diversity of (sub)tasks, (3) the changeability of (sub)tasks, and (4) the interdependence between different (sub)tasks.
On the basis of these conceptualisations, we consider (1) quantity and diversity of information and/or tasks or task elements ("variety"), (2) interdependence and (3) dynamics to be constituent attributes of task complexity. The first component refers to the degree to which the job of an individual entails a variety of different activities and information to be processed. It resembles the dimension "skill variety" of the Job Diagnostic Survey developed by Hackman and Oldham (1975) and is related to the number of different skills, competencies, and knowledge that are required. According to Blockus (2010), the second dimension, interdependence, can be understood as the extent to which (sub)tasks are (mutually) dependent. This type of interdependence can apply to the (sub)tasks of one person or to the tasks performed by different individuals. Another type of interdependence concerns the relationship between an individual's tasks and the context, the environment or external factors. In this respect, this dimension of complexity incorporates part of Willke's (2006) definition of system complexity, which refers to the system-environment relationship (cited from Blockus 2010). Integrating the suggestions of the scholars cited above, dynamics can be defined in terms of changeability of tasks, necessary information, dependencies or the relation between work activities and outcomes. Changeability implies uncertainty (Campbell 1988), ambiguity, and unpredictability. Therefore, Klabunde (2003), who conceived of system complexity as being made up of variety, connectivity, and dynamics, equates dynamics with uncertainty and unpredictability.

Autonomy
A most influential definition of work autonomy was given by Hackman and Oldham (1975). In their view, job autonomy is "the degree to which the job provides substantial freedom, independence, and discretion to the employee in scheduling the work and in determining the procedures to be used in carrying it out" (Hackman and Oldham 1975, p. 162). Although Hackman and Oldham (1975) implicitly distinguish different facets of job autonomy, early conceptualisations and operationalisations often considered autonomy a one-dimensional, global construct (Breaugh 1985; Theurer et al. 2018). Nowadays a multi-dimensional concept of workplace autonomy is common (Theurer et al. 2018) and several measurement instruments address at least two dimensions (e.g. Breaugh 1985; Little 1988; Morgeson and Humphrey 2006; Sprigg et al. 2000): work method autonomy (freedom to choose procedures and methods to accomplish a task) and work scheduling autonomy (control over the timing, sequencing, and scheduling of work). On the basis of a literature review, Breaugh (1985) adds "criteria autonomy" as an important third dimension and defines it as the "ability of employees to influence the types of tasks they work on or the goals they are supposed to accomplish" (Breaugh 1999, p. 359). He chose the term "criteria autonomy" because the freedom to decide on tasks and goals gives workers "control over the criteria which will be used to evaluate them" (Breaugh 1999, p. 359). What Breaugh (1985, 1999) calls criteria autonomy is similar to a concept that other researchers refer to as "strategic autonomy". Strategic autonomy enables "a team (or individual) to not only solve problems, but to actually define the problem and the goals that will be met in order to solve that problem" (Lumpkin et al. 2009, p. 50).
When developing the comprehensive Work Design Questionnaire (WDQ), Morgeson and Humphrey (2006), too, distinguished freedom in work scheduling and work methods as two subareas of work autonomy. In contrast to Breaugh (1985), they do not identify criteria or strategic autonomy as a third dimension but "decision making". Unfortunately, Morgeson and Humphrey (2006) neither explain nor substantiate their decision, nor do they define decision-making autonomy. While in our view control over tasks and goals (criteria autonomy) constitutes a dimension of autonomy that is clearly different from scheduling and method autonomy, decision-making autonomy should not be considered a distinct facet of work autonomy. It rather represents a general dimension, which includes, among other decision areas, method autonomy and scheduling autonomy. The wording of the three items that are used to operationalise decision-making autonomy supports our point of view.
In conclusion, we follow Breaugh (1985, 1999) in conceptualising work autonomy in terms of work scheduling, work method and criteria autonomy. Nonetheless, we decided to include decision-making autonomy in the process of instrument development to examine our assumption that decision-making autonomy is a generic concept and that work scheduling and work method autonomy are part of decision-making autonomy.

Our concept of knowledge work is linked to the international discussion and other approaches in many respects. Regarding novelty, Ramirez and Steudel (2008) identify creativity and innovation as one of eight dimensions of knowledge work. According to Reich (1991; cited from Jacobs 2017), knowledge work involves the use of symbolic analytic processes. The job tasks of symbolic analysts in turn "require creativity and innovativeness" (Pyöriä 2005, p. 120). And Jacobs (2017) considers the familiarity or novelty of work situations a key point in the discussion about knowledge work. Complexity is mentioned as a central attribute of knowledge work, for example, by Benson and Brown (2007), Davenport (2005), Jacobs (2017), and Ramirez and Steudel (2008). Benson and Brown (2007) also characterise knowledge work by dimensions that we define as sub-dimensions of complexity, namely variation and dynamics and reciprocal interdependence. Finally, autonomy plays an important role in the conceptual considerations of Benson and Brown (2007), Pyöriä (2005), and Ramirez and Steudel (2008).
Definitions of knowledge work often refer to the distinction between routine and non-routine work (e.g., Benson and Brown 2007; Pyöriä 2005; Ramirez and Steudel 2008; Reinhardt et al. 2011). In our approach we did not model routine/non-routine as a distinct dimension but captured it indirectly through the dimensions novelty, complexity, and autonomy. This seems to be justified since routine/non-routine is often defined in terms of novelty, complexity and autonomy. For example, Davenport (2005), who uses the level of complexity and the degree of collaboration (or interdependence) as central criteria for categorizing knowledge work into four types, defines routine as one pole of the complexity dimension (and interpretation/judgement as the other). Benson and Brown (2007, p. 125) equate non-routine with variation and dynamic, reciprocal interdependence, and autonomy by saying that the distinction between routine and non-routine allows knowledge work to be broken into these three dimensions. And Matthes et al. (2014) propose to measure routine and non-routine tasks as defined by Autor et al. (2003) by complexity and autonomy.

Questionnaire development
We developed the instrument for measuring knowledge work requirements of higher education graduates in four steps. First, we inspected existing questionnaires in German and English to find suitable indicators of the dimensions and subdimensions of knowledge work described above. This search yielded a pool of 173 items including some self-developed items. From this pool we selected 46 items for further consideration and translated or adapted them if necessary (see Table 1 for information on the number of items in the different steps). Except for three items that did not need to be tested, these items were included in the second step, the cognitive pretest. The guided interviews were conducted with 14 higher education graduates from different disciplines and focused on comprehension, predominantly using the cognitive technique of probing. In addition, the cognitive interviews also addressed issues of information retrieval and response selection. Category-selection probing was also applied to answer the question as to whether the two response rating scales presented in the questionnaire were appropriate, accurate and easy to use. The cognitive techniques applied followed the guidelines described in Prüfer and Rexroth (2005). As a main result of the cognitive pretesting, eight items were removed because of unintended interpretations (e.g., "creating new knowledge" was understood in terms of teaching). In addition, one item was split into two because of its multidimensional stimulus ("My job allows me to take initiative and exercise discretion.").
The remaining 39 items were used in the third step, the instrument development study (16 items for the dimension complexity, 9 for novelty and 13 for autonomy; see Table 2 for examples and Table 10 in the appendix for all items). After thorough analysis of the data (results are provided in Sect. 4), 22 items were retained for the final questionnaire (printed in bold in Table 10 in the appendix).
The instrument development study was carried out as a web survey among former students at German higher education institutions who had previously participated in the "HISBUS online panel". The HISBUS online surveys are carried out repeatedly by the German Centre for Higher Education Research and Science Studies (DZHW) to gather information on current issues in German higher education. A total of 652 panel members who had finished higher education and given their consent to participate in further surveys were invited to the development study in winter 2017. To increase the response rate, we used a lottery incentive and sent two reminder emails. Within eight and a half weeks, a total of 580 respondents (83%) at least started the survey. After excluding respondents who had multiple missing values, did not obtain a higher education degree or did not work after having left higher education, the sample used for analysis consisted of 411 cases. An overview of basic sample characteristics is given in Table 3.
In addition to the core items measuring knowledge work, the questionnaire included questions and items on socio-demographic factors, education, and the current or last job. As far as this information was used for testing the criterion-related validity of the newly developed instrument, the corresponding variables are described in more detail in Sect. 4.1.
Because of technical difficulties, 13 variables of the questionnaire on knowledge work requirements had a substantial number of missing observations. In the sample used for analysis, 7 variables have between 309 and 344 valid observations, while 6 variables have only 62 to 95 valid observations. There are even some pairs of variables with no joint observations. Fortunately, the variables concerned, which are marked in Table 10 (appendix), do not belong to the same dimension or subdimension.
The final instrument with 22 items in seven (sub)dimensions was also used in the third wave of a panel survey with higher education graduates of the academic year 2009 carried out by the DZHW. The third wave was implemented as a three-part web-based survey approximately ten years after graduation. The first part took place between April and June 2019 and primarily collected data on the occupational and educational life course since the second panel wave in 2015 and the current occupational situation. The response rate was 61 percent. The second part was conducted between August and October 2019 and included the questionnaire on knowledge work requirements, which took an estimated two minutes to complete.

(Table 2, example items for the subdimension decision making: "The job allows me to set my own priorities"; "The job allows me to make a lot of decisions on my own".)

Because of the data problems encountered in the KWReq development study, we used the data of the DZHW graduate panel 2009 for a second test of the factorial validity and dimensional structure. The results, based on a sample of 3369 cases, are presented in Sect. 4.5. It is noteworthy that the sample composition differs considerably from that of the development study (see Table 3). At the time of the interview, the respondents of the graduate panel survey were much older, they had more often obtained a master's or doctoral degree, and they more often held managerial positions.

Data analysis strategy
Our approach to analysing the data of the development study can be best described as a combination of exploratory and confirmatory elements. The aim was threefold: (1) to obtain a parsimonious questionnaire with as few items as possible (at least three items for each factor) and as many items as necessary to measure the dimensions and subdimensions of the construct "knowledge work" in a valid and reliable way; (2) to confirm the hypothesised factor structure of the construct "knowledge work" and, in case it is not confirmed, to find an appropriate dimensional model; (3) to assess the reliability as well as the convergent, divergent and criterion validity. To reach these aims, we first examined the three dimensions (novelty, complexity, autonomy) separately. In a subsequent step, we analysed the complete model. Finally, we tested whether the model fitted to the data of the development study also holds in the sample of the DZHW graduate panel.
For assessing construct validity and selecting the most appropriate variables, we primarily performed confirmatory factor analysis (CFA), supplemented by exploratory factor analysis (EFA), and calculated scale and item reliabilities using Cronbach's alpha and corrected item-total correlations, respectively. To evaluate criterion validity, structural equation models (SEM) were estimated. For CFA, EFA, and SEM the statistical software Mplus 8.4 was used. All other analyses were conducted in Stata 16. EFA was performed using maximum likelihood (ML) estimation and applying oblique rotation. As a method of handling missing data we used full information maximum likelihood (FIML) estimation.
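For readers unfamiliar with the two reliability statistics, both can be computed directly from a respondents-by-items matrix. The following minimal sketch (Python with NumPy and simulated data, not the Stata code actually used in the study) implements the standard formulas for Cronbach's alpha and the corrected item-total correlation:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of the sum score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def corrected_item_total(items: np.ndarray) -> np.ndarray:
    """Correlation of each item with the sum of the *remaining* items
    (the item itself is removed from the total, hence 'corrected')."""
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Simulated three-item scale: one latent factor plus item-specific noise
rng = np.random.default_rng(0)
latent = rng.normal(size=500)
scale = np.column_stack([latent + rng.normal(size=500) for _ in range(3)])
print(round(cronbach_alpha(scale), 2))
print(corrected_item_total(scale).round(2))
```

With this simulation design the true inter-item correlation is 0.5, so alpha comes out around 0.75 and the item-total correlations around 0.58, illustrating the scale of values reported later for the three-item subscales.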
To evaluate the measurement models, we used several criteria: (1) The RMSEA (root mean square error of approximation), which compares the model-implied covariances with the observed covariances and favours parsimonious models. Threshold values for acceptable model fit are much debated. Hu and Bentler (1999) propose a cut-off value "close to 0.06". Steiger (2007) sets an upper limit of 0.07. Other scholars are even less conservative and consider RMSEA values less than 0.08 to be indicative of a good fit, values in the range between 0.08 and 0.10 as mediocre but acceptable, and values above 0.10 as unacceptable (overview in Brown 2015; Hooper et al. 2008). (2) The comparative fit index (CFI), which measures the increase in model fit relative to the "independence" model. (3) The Tucker-Lewis index (TLI), which is closely related to the CFI but imposes a greater penalty for lack of parsimony than the CFI. Initially, a lower bound of 0.90 was proposed for both indices. More recently, a minimum value of close to 0.95 has become the gold standard. We report the results of the model's chi-square test of overall fit, which tests the hypothesis that the covariances predicted by the specified model do not deviate from the population covariances. However, we do not use these results for model evaluation because the model chi-square value is, among other things, affected by sample size: with increasing sample size the test statistic becomes more sensitive to even slight differences between observed and model-implied covariances and would suggest rejecting the model. Marsh et al. (2004) warn against strictly adhering to the rules of thumb mentioned above. We therefore also considered other criteria such as the scales' internal consistency as measured by Cronbach's alpha, the size of the factor loadings, and theoretical arguments.
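The three fit indices can be made concrete with their standard formulas. The sketch below (Python; the chi-square values are invented illustrative numbers, not results from our models, and RMSEA is shown in the common N-1 convention, which software packages implement slightly differently):

```python
import math

def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation (N-1 convention);
    excess chi-square over df, scaled by df and sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """Comparative fit index: improvement of the target model (m) over
    the baseline 'independence' model (b) in excess chi-square."""
    d_m = max(chi2_m - df_m, 0.0)
    d_b = max(chi2_b - df_b, d_m, 0.0)
    return 1.0 - d_m / d_b

def tli(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """Tucker-Lewis index: built from chi-square/df ratios, which is why
    it penalises lack of parsimony more strongly than the CFI."""
    return ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1.0)

# Hypothetical target model (chi2 = 300, df = 200) against a hypothetical
# baseline model (chi2 = 3000, df = 231), n = 411 as in the development sample
print(round(rmsea(300, 200, 411), 3))
print(round(cfi(300, 200, 3000, 231), 3))
print(round(tli(300, 200, 3000, 231), 3))
```

For these invented values, RMSEA is about 0.035 and both CFI and TLI are about 0.96, i.e. a model that would pass the thresholds discussed above.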
Because alpha depends on the number of items a scale is composed of, we consider a value of around 0.60 to represent an acceptable internal consistency of a three-item scale. Regarding standardised factor loadings, values of 0.30 or 0.40 are conventionally accepted as cut-off points (Wang and Wang 2012). We are more restrictive and require a minimum factor loading of between 0.50 and 0.60 for an item to be considered as a valid indicator of a construct.
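All three fit indices are simple functions of the model and baseline chi-square statistics. The following sketch uses the common textbook formulas; note that software conventions differ in detail (e.g. some programs use n rather than n − 1 in the RMSEA denominator):

```python
def rmsea(chi2, df, n):
    """Root mean square error of approximation for a model estimated on n cases."""
    return (max(chi2 - df, 0.0) / (df * (n - 1))) ** 0.5

def cfi(chi2_m, df_m, chi2_b, df_b):
    """Comparative fit index: improvement of model m over the baseline ('independence') model b."""
    d_m = max(chi2_m - df_m, 0.0)
    d_b = max(chi2_b - df_b, 0.0)
    return 1.0 - d_m / max(d_m, d_b)

def tli(chi2_m, df_m, chi2_b, df_b):
    """Tucker-Lewis index: like the CFI, but penalises lack of parsimony via the df ratios."""
    return (chi2_b / df_b - chi2_m / df_m) / (chi2_b / df_b - 1.0)
```

A model whose chi-square equals its degrees of freedom yields RMSEA = 0 and CFI = 1; the sample-size sensitivity of the chi-square test itself is visible in the RMSEA formula, where the same chi2 − df discrepancy shrinks as n grows.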
Apart from the items constituting the core instrument for measuring knowledge work, we used additional variables in our analyses. To assess divergent validity, seven items concerning the job-research nexus or research involvement were analysed. Three items formed a scale referring to the consumption and application of research (examples: reading scientific literature, converting research results into processes/applications/products; α = 0.79); four items constituted a scale called "active research" (examples: working in research, conceptualising research or development projects; α = 0.88).
To evaluate concurrent criterion validity, we selected three variables measuring the education-job match regarding the professional position (extent to which the job corresponds to the higher education qualification in terms of the professional position), the task level (extent to which the higher education qualification matches the level of the job tasks), and the professional qualification (extent to which the employment corresponds to the field of study). The three items were presented with a five-point Likert-type response scale ranging from "no match" to "good match". In addition, we included the question as to which academic degree is most appropriate for the job (master/PhD, bachelor, no academic degree) and whether the survey participants hold a leadership or highly qualified position.
The results of the analyses using the KWReq development study are described in Sects. 4.2, 4.3 and 4.4. In Table 11 in the appendix, the means, standard deviations and correlations of the variables of the knowledge work questionnaire are reported.

The novelty dimension
The single-factor CFA model fitted the data well (RMSEA = 0.063; CFI = 0.976; TLI = 0.968; see Table 4). Exploratory factor analysis yielded an eigenvalue greater than 5 for the first factor and eigenvalues of less than 1 for all other factors. Thus, the EFA results also suggested a single-factor model.
Based on the highest factor loadings and corrected item-total correlations, we selected four items (n3, n4, n6, n9) for inclusion in the final instrument. These items refer to different aspects of the concept and different levels of abstraction. The item concerning the development of new products or services (n5) would have added an additional facet but was excluded from further consideration because of its low factor loading and item-rest correlation. In addition, such innovative behaviour is rarely required, as indicated by the low arithmetic mean of 1.98 (see Table 10 in the appendix).
The fit of the CFA model with the reduced item set was slightly better (RMSEA = 0.055; CFI = 0.996; TLI = 0.988) than the fit of the model with all nine items. Even the chi-square statistic was not significant. Because Cronbach's alpha is sensitive to the number of items and the final scale was reduced by five items, the value for the internal consistency decreased from 0.925 for the nine-item scale to 0.892 for the four-item scale. Nonetheless, a Cronbach's alpha of nearly 0.90 indicates a high internal consistency.
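How much alpha should drop when a scale is shortened can be checked with a back-of-the-envelope calculation via the average inter-item correlation (the function names are ours, for illustration only):

```python
def avg_interitem_r(alpha, k):
    """Average inter-item correlation implied by alpha for a k-item scale."""
    # Solves alpha = k*r / (1 + (k-1)*r) for r
    return alpha / (k - (k - 1) * alpha)

def alpha_from_r(r, k):
    """Alpha of a k-item scale with average inter-item correlation r (Spearman-Brown logic)."""
    return k * r / (1 + (k - 1) * r)

r9 = avg_interitem_r(0.925, 9)   # ≈ 0.578 for the nine-item novelty scale
predicted = alpha_from_r(r9, 4)  # ≈ 0.846 predicted for a four-item scale
```

The observed four-item alpha of 0.892 exceeds the 0.846 predicted from the nine-item scale's average inter-item correlation, which suggests that the four retained items are more homogeneous than the full item pool.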

The complexity dimension
The hypothesised dimensional structure of the complexity dimension was not supported by confirmatory factor analysis (RMSEA = 0.111; CFI = 0.796; TLI = 0.757). An exploratory factor analysis with all complexity variables showed that, according to the eigenvalue-greater-than-1 criterion, a four-factor solution should be preferred. The factor pattern of this model revealed that the dependency items cde4 (consideration of possible consequences for other people or areas; see Table 10 in the appendix) and cde5 (performing tasks independently of others) as well as the dynamics items cdy1 (updating professional knowledge) and cdy2 (examining and adjusting the way of working) substantially contributed to several factors (results not shown). Because of their ambiguity, these items were excluded from further analyses. Exploratory factor analysis with the remaining items yielded three factors with an eigenvalue greater than 1. However, the results of the χ² difference test comparing the three-factor solution with the four-factor model indicated that the latter fitted the data better than the former (Δχ² = 33.47; df = 9; p = 0.0001). Therefore, we decided in favour of the four-factor solution, which had an acceptable model fit (RMSEA = 0.068; CFI = 0.975; TLI = 0.931; see Table 5).

Table 5 shows the estimated EFA factor loadings (also known as the factor pattern). When oblique rotation is performed, factor loadings represent the standardised regression coefficients for predicting the variables by a particular factor. By and large, they revealed a clear factor pattern: Factor 2 significantly and substantially predicted only the variables cde1, cde2, and cde3. In addition, these variables were not strongly associated with other factors. Thus, the analysis confirmed the dependency subdimension. Regarding factor 3, four significant coefficients could be observed.
However, the loading of variable cv2 (0.28) was far smaller than those of the variables cdy3 (0.81), cdy4 (0.93), and cdy5 (0.51), which in turn were only weakly related to other factors. We therefore consider factor 3 to be the common factor of the last-mentioned variables, which are part of the dynamics subdimension and describe rather reactive work behaviours. Factor 4 represents a second and more proactive part of the dynamics subdimension. The variables cdy6, cdy7, and cdy8 are most strongly predicted by factor 4 and are not significantly affected by other factors. Regarding the significance and size of the factor loadings, factor 1 is best described by variables cv1, cv2, and cv3, that is, the variables of the variety subdimension. However, the coefficient associated with variable cv2 was rather low (0.28) and similar to its loadings on factor 3 and factor 4. On the other hand, the correlation of variable cv2 with factor 1 was quite high (0.61; factor structure not shown) and the internal consistency was good (α = 0.84), suggesting that this variable be kept as an indicator of factor 1.
According to the factor correlation matrix (in the bottom part of Table 5), factor 2, which represents the dependency subdimension, seems to measure something other than the remaining factors or complexity subdimensions. The correlations were quite low (0.14 ≤ r ≤ 0.18), while factors 1, 3 and 4 were more strongly correlated with each other (0.53 ≤ r ≤ 0.63). We therefore disregarded the dependency subdimension and kept only the variety subdimension and the two dynamics subdimensions "reactive" and "proactive". This decision was also justified by the results of the validity analyses: while all of the external criteria examined (e.g. education-job match, leadership position) were significantly related to the latent factors representing variety, reactive dynamics and proactive dynamics (see Table 8), none of them correlated with our operationalisation of the dependency subdimension (results not shown).
Confirmatory factor analysis showed that the reduced model had an overall acceptable model fit (RMSEA = 0.065; CFI = 0.973; TLI = 0.957; Table 5) and satisfactory factor loadings (0.58 ≤ λ ≤ 0.82). It should be noted that we allowed the residuals of items cdy3 and cdy4 to be correlated. This decision was based on the modification index, whose high value suggested that the two items share common variance that cannot be explained by the underlying latent factor and may be attributed to linguistic aspects: both items use the German verb reagieren ("to react"), which may serve as a signal word. It should also be noted that although the estimates and fit indices did not exceed or fall below the threshold values, the subdimension "dynamics-proactive" was not particularly well measured.
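The p-value of the χ²-difference test reported above (Δχ² = 33.47, df = 9) can be reproduced without a statistics package. A stdlib sketch using the upper incomplete gamma recurrence (our own implementation, not part of the study's analysis pipeline):

```python
import math

def chi2_sf(x, df):
    """Survival function P(X > x) of a chi-square variable with df degrees of freedom.
    Uses the recurrence G(s+1, t) = s*G(s, t) + t^s * exp(-t) on the upper incomplete
    gamma function, seeded with G(1/2, t) = sqrt(pi)*erfc(sqrt(t)) or G(1, t) = exp(-t)."""
    t = x / 2.0
    if df % 2:  # odd df: start at s = 1/2
        s = 0.5
        upper = math.sqrt(math.pi) * math.erfc(math.sqrt(t))
        full = math.sqrt(math.pi)
    else:       # even df: start at s = 1
        s = 1.0
        upper = math.exp(-t)
        full = 1.0
    while s < df / 2.0 - 1e-12:
        upper = s * upper + t ** s * math.exp(-t)
        full *= s
        s += 1.0
    return upper / full

p = chi2_sf(33.47, 9)  # ≈ 0.0001, matching the reported p-value
```

For two nested models, the difference of their chi-square values is itself chi-square distributed with df equal to the difference in degrees of freedom, which is what this survival function is evaluated on.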

The autonomy dimension
We conceptualised autonomy as a three-dimensional construct with work scheduling, work method and criteria autonomy as subdimensions. Confirmatory factor analysis supported the hypothesised model (see Table 6). The indices of overall model fit were acceptable (RMSEA = 0.059; CFI = 0.975; TLI = 0.963), as were the factor loadings (0.63 ≤ λ ≤ 0.86) and Cronbach's alpha (0.75 ≤ α ≤ 0.88). The newly developed criteria scale was not as well measured as the method and scheduling dimensions, the latter two being operationalised using the well-established and validated instruments developed by Morgeson and Humphrey (2006) and translated into German by Stegmann et al. (2010).
As stated above, Morgeson and Humphrey (2006) proposed decision-making autonomy as a third dimension of autonomy. Following Breaugh (1985) and assuming that decision making constitutes a general dimension encompassing several subdimensions, we preferred to include the more specific category of criteria autonomy. We tested the hypothesis concerning the character of the decision-making factor by estimating an additional measurement model including the decision-making items and specifying a second-order factor underlying method autonomy, scheduling autonomy and criteria autonomy. It turned out that our assumption was partially supported. Confirmatory factor analysis with four first-order factors yielded the result that decision making and method autonomy were, indeed, very highly correlated (0.97), suggesting that these constructs more or less measure the same thing. However, the correlations between decision making and scheduling autonomy or criteria autonomy proved to be considerably lower (0.71 and 0.83). We decided to keep method autonomy because it is a more specific construct than decision making. This decision was corroborated by a correlation of 0.94 between decision making and the second-order factor with method, scheduling and criteria autonomy as first-order factors.

The full model
We first estimated a first-order CFA including the novelty dimension; the complexity dimension with the three subdimensions variety, dynamics-reactive, and dynamics-proactive; and the autonomy dimension with the three subdimensions method, scheduling, and criteria autonomy. The fit for this model was RMSEA = 0.056, CFI = 0.941, and TLI = 0.927. We then estimated a second-order CFA specifying a second-order factor for complexity and another second-order factor for autonomy. This model fitted the data less well than the first-order model (RMSEA = 0.063; CFI = 0.922; TLI = 0.910) and also yielded larger values of AIC and BIC. We therefore opted for the first-order model. The estimation results are displayed in Fig. 1. Although CFI and TLI fall below the threshold value of 0.95, we consider the model fit to be acceptable. On the one hand, the RMSEA statistic was smaller than 0.06 and even the upper bound of the confidence interval (CI) was only slightly greater than 0.06. On the other hand, all estimates of the factor loadings exceeded the value of 0.60 (range from 0.62 to 0.85) and were significant. In addition, the correlations between the latent factors support the postulated dimensionality of the construct "knowledge work": the correlations between factors within a dimension (i.e. complexity, autonomy) were higher than the correlations between factors belonging to different dimensions (i.e. complexity, autonomy, novelty).

Table 7 makes it easier to discern this pattern. It presents the latent factor correlations estimated by a CFA that was performed to assess the divergent validity of the knowledge work questionnaire and therefore included the two research dimensions described in Sect. 4.1. Although this CFA model differs from the "pure" knowledge work CFA model, the correlations between the knowledge work factors are very similar. Table 7 also shows that the two factors representing different forms of research involvement measured a distinct construct.
While the research factors correlated at 0.68, the correlation coefficients between the research factors and the factors of the knowledge work instrument were much smaller.

Finally, we examined the concurrent criterion validity by estimating structural equation models, regressing the knowledge work factors separately on each of the external variables described in Sect. 4.1. The independent variables were analysed as manifest variables. The unstandardised regression coefficients and explained variances (R²) are presented in Table 8; the R² values are reproduced here:

Table 8 (excerpt)  Explained variances (R²) of the knowledge work factors, regressed separately on each external variable

External variable                        Novelty  Variety  Dyn.-react.  Dyn.-proact.  Method   Scheduling  Criteria
Education-job match regarding …
  … the professional position            0.12***  0.14***  0.02         0.24***       0.20***  0.13***     0.16***
  … the task level                       0.21***  0.21***  0.07*        0.34***       0.26***  0.14***     0.19***
  … the professional qualification       0.17***  0.12**   0.04         0.22***       0.18***  0.11**      0.13***
Most appropriate acad. degree            0.16***  0.18***  0.06*        0.30***       0.15***  0.11**      0.10**
Leadership or highly qualified
  position (yes vs. no)                  0.08**   0.08**   0.03         0.14***       0.05*    0.04        0.03

It turned out that most of the regression coefficients were significantly different from zero, but some of the explained variances were not. The independent job-related variables were able to explain a considerable proportion of the variance in the proactive dynamics dimension: R² values ranged between 0.14 and 0.34, and the estimation of a SEM model including all independent variables yielded an R² value of 0.43 (results not shown). To put it differently, the dynamics-proactive factor discriminated best between groups of higher education graduates with different job characteristics. In contrast, the independent variables examined accounted for only a very low proportion of the variance in the reactive dynamics dimension (0.02 ≤ R² ≤ 0.07; R² (full model) = 0.11). In other words, the job requirements represented by this dimension are relatively independent of other job characteristics.
As indicated by the mean of this factor (Table 12 in the appendix), they are also quite common in the work of higher education graduates and have a low variance. In this respect the reactive dynamics scale resembles the variety dimension.
Regarding the relative weight of the external variables as predictors or correlates of the knowledge work dimensions, the education-job match concerning the task level seems to be particularly important. Among the criterion variables analysed, this variable is associated with the highest R² for all dimensions. Which variable comes next depends on the dimension: the education-job match regarding the professional position is the second most important external variable where the autonomy dimensions are concerned, whereas the type of academic degree that is considered appropriate for doing the job comes second for the complexity dimensions. All in all, the results suggest sufficient criterion validity.
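Each of the criterion-validity models above regresses one knowledge-work factor on a single manifest variable. The slope and R² arithmetic of such a bivariate regression can be sketched as follows (illustrative data only, not the study's):

```python
from statistics import mean

def simple_ols(x, y):
    """Slope, intercept, and R-squared of the regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R^2 = 1 - residual sum of squares / total sum of squares
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical scores: education-job match rating (x) vs. a factor score (y)
slope, intercept, r2 = simple_ols([1, 2, 3, 4], [1, 3, 2, 4])
print(round(r2, 2))  # → 0.64
```

In the study the outcomes are latent factors estimated within a SEM rather than observed sums, but the interpretation of R² as the proportion of outcome variance accounted for by the predictor is the same.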

A second test: application of the KWReq questionnaire to the DZHW graduate panel
Because of differences in the highest academic degree and the ten-year interval between graduation and the third panel wave, the careers of the higher education graduates of the academic year 2009 were more advanced than those of the respondents of the KWReq development study (Table 3). Correspondingly, the former rated most items of the knowledge work questionnaire on average higher than the latter (exact figures not shown). The exception is the dynamics-reactive dimension, for which the sample of the DZHW graduate panel survey 2009 showed equally high or even lower means. Nonetheless, the dimensional structure of the measurement instrument was similar to that of the KWReq development study.
With fit indices comparable to those obtained with the data of the development study (RMSEA = 0.056; CFI = 0.944; TLI = 0.931), the fit of the CFA model proved to be acceptable (see Table 5). All factor loadings were sufficiently high (0.60 ≤ λ ≤ 0.89) and significant. The internal consistency as measured by Cronbach's alpha was satisfactory to good, although often lower than in the development study.
The correlations between the latent factors, too, were weaker than in the development study. However, the correlative pattern was comparable. Factors belonging to the same dimension were more strongly correlated than factors belonging to different dimensions, with one exception: The correlation between the proactive dynamics factor and variety was less strong (0.66) than the correlation between dynamics-proactive and novelty (0.74).

Discussion
Starting from the observation that instruments for appropriately measuring the knowledge work of the highly qualified workforce do not exist and that higher education institutions need to reflect the changes in the world of work, we developed a new questionnaire. This questionnaire consists of 22 items, which mainly describe job activities, thereby following the job requirements approach, and partly also refer to job characteristics. The instrument focusses on knowledge work as a key characteristic of the changing economy. It was implemented as a web-based survey but can easily be included in other survey modes as well, as shown by the PIAAC and NEPS studies, which used computer-assisted telephone interviewing and partly face-to-face interviews for collecting information on job tasks. It is short enough to be included in multi-topic surveys. It was able to discriminate between different groups of higher education graduates. And it proved to be a reliable and valid measure of the three dimensions of knowledge work that were identified on theoretical grounds.
We postulated that the concept of knowledge work involves novelty, complexity, and autonomy. Novelty was mainly interpreted and, consequently, operationalised in terms of innovative work. Complexity was conceptualised as being composed of the three subdimensions variety, dependency, and dynamics. Autonomy was also considered to be a multi-dimensional construct, including scheduling autonomy, work method autonomy, and criteria (or strategic) autonomy. However, statistical analyses did not confirm our operationalisation of dependency to be a constituent part of complexity. And they revealed two distinct forms of dynamic complexity: a more reactive one, indicating that knowledge work entails responding ad hoc to new requirements; and a more proactive one, which refers to the demand of actively dealing with uncertainty, ambiguity and unpredictability. Because the proactive facet of the dynamics subdimension accounted for a substantial proportion of the variance in job characteristics like occupational position and education-job match while the association of these variables with the reactive facet was considerably weaker, we conclude that the proactive-dynamics factor is a stronger indicator of knowledge work than the reactive-dynamics factor.
Regarding the result that dependency did not emerge as a valid subdimension of complexity, we cannot exclude the possibility that this concept was poorly operationalised. As mentioned above, dependency or interdependence refers to the degree to which the tasks of one person or of different individuals depend on each other. In addition, dependency concerns the relationship between the individual and the environment. In contrast to these manifold meanings of dependency, we mainly restricted the measurement to the extent to which one's job is affected by the work of others, a phenomenon that Morgeson and Humphrey (2006) call "received interdependence". In fact, two of the three dependency items remaining in the analyses (see Table 10) were taken from the subscale "received interdependence" of the WDQ as translated by Stegmann et al. (2010), while the third item is informed by this questionnaire. Since this type of dependency also applies to assembly-line workers, for example, it seems plausible that it does not correlate with other components of complexity and knowledge work. Future research should search for a more appropriate operationalisation of dependency. It should also address the question as to which types of dependency are essential to complexity and how they can be conceptualised.
Another limitation of our study refers to the measurement of the novelty dimension. While we are convinced that the focus on innovative work is justified, the assumption of a reflective measurement model may be questioned. In reflective measurement models, the indicators are expected to correlate highly; they are interchangeable manifestations of the underlying latent construct (Jarvis et al. 2003). The correlations between the indicators occur because they are influenced or caused by the latent factor. In contrast, formative measurement models are characterised by indicators that capture different aspects of the latent construct and, therefore, do not necessarily correlate with each other (ibid.); the causal relationship runs from the manifest measure to the latent factor. Because the correlations of the item "In my job I develop new products or services." (n5, see Table 9) with the other variables of the novelty dimension are rather weak, we eliminated it from further consideration. However, this item can be considered a strong indicator of innovation. Therefore, it might be more adequate to choose a formative measurement approach and to use appropriate statistical techniques such as partial least squares structural equation modelling (PLS-SEM; Hair et al. 2017, 2018) for analysing this knowledge work dimension.

To develop an English-language version of the questionnaire and to examine ways to further shorten the instrument are other tasks that remain to be accomplished in the future. Even though we constructed and tested a German-language questionnaire, an English version is also available, consisting of original items in English and items that we translated from German to English. These translations have not yet been checked and refined using backward translation or other methods for ensuring the quality of the translated items. In addition, an empirical cognitive and standardised test is still to be conducted.
Regarding the length of the questionnaire, the empirical results presented in this study provide indications of how the measurement instrument could be shortened if need be. While we do not recommend reducing the number of items per (sub)dimension, dropping an entire factor might be worth being considered.
Despite the limitations and potential for improvement, the present version of the questionnaire for measuring the knowledge work requirements of higher education graduates is ready to be used in future empirical studies. The technical and data problems encountered in the development study raised doubts as to whether the results are trustworthy, but it was possible to replicate the findings in a representative survey of higher education graduates who were more advanced in their careers. Interesting research questions that could be addressed using data collected by the questionnaire refer to types of knowledge work (person-centred instead of variable-centred approach) and their variation within the highly qualified workforce, changes in the occurrence and intensity of knowledge work over time, and intra-individual changes in job requirements during the course of the professional career. Empirical evidence on these issues can shed more light on the relationship between higher education and employment and might provide assistance for higher education curriculum development. But of course the new survey instrument and data collected only from graduates are of limited use when it comes, for example, to assessing the demand for and distribution of graduate jobs, that is, jobs where "a substantial portion of the skills used are normally acquired in the course of higher education" (Green and Henseke 2016, p. 3).

M. Trommer et al.