THE DETERMINANTS OF HUMAN CAPITAL FORMATION DURING THE EARLY YEARS OF LIFE: THEORY, MEASUREMENT, AND POLICIES

In this paper, I discuss a research agenda on the study of human capital accumulation in the early years, with a particular focus on developing countries. I discuss several methodological issues, from the use of structural models to the importance of measurement and the development of new measurement tools. I present a conceptual framework that can be used to frame the study of human capital accumulation and view the current challenges and gaps in knowledge within such an organizing structure. I provide an example of the use of such a framework to interpret the evidence on the impacts of an early years intervention based on a randomized controlled trial. (JEL: O15)


Introduction
In recent years, a considerable amount of attention has been devoted to human capital accumulation. Scholars have looked at the role of human capital in the process of economic development and stressed the fact that many developing economies that have experienced fast increases in growth have also experienced considerable increases in human capital. Macroeconomists and development economists have been interested in the relationship between human capital and GDP growth and have proposed models with human capital externalities. The process of growth and development, at the same time, if associated with the adoption of skill intensive technologies, will induce an increase in the returns to skills and, therefore, a change in the incentives to accumulate skills. Moreover, human capital is seen as relevant for distributional issues: cross sectional inequalities in a variety of dimensions (including cognition, health, socio-emotional skills) among individuals in many societies seem to emerge very early in life and seem to be strongly linked to inequality of human capital. This is particularly true of certain regions, such as Latin America, as discussed, for example, in Lopez and Perry (2008). One could therefore argue that understanding the process of formation of human capital over the life cycle, and in particular how specific skills that are remunerated by the market develop, is key for the design of policies that aim to reduce inequality in the long run.
It is becoming increasingly clear that human capital is a complex object with many different dimensions. Labour markets in different economies reward different skills in different ways, or, in other words, different skills play different [...] in the early years. These may include, for instance, the effect of nutrition or exposure to toxins on brain development in utero or in the first few years, or even the effect that specific parental practices and traits (attachment, stimulation and so on) might have on development. On the other, the mechanisms that might give rise to specific forms of investment on the part of parents that eventually generate extremely unequal outcomes. The scope for interactions and synergies among different disciplines, including medicine, neuroscience, psychology, psychiatry, epidemiology, genetics and economics, is obvious.
The early years seem to be extremely important in the whole process, both because events during those years seem to have very long run consequences and because very young children seem to be very malleable or, conversely, particularly vulnerable to negative environmental factors and different types of shocks. These considerations make the early years particularly salient for policy interventions. Not only might early years interventions be more effective in closing developmental gaps, but they could also make subsequent policies, aimed at, say, school-aged children, more effective. Heckman and his collaborators have been particularly vocal in stressing the importance of the early years.
The fact that the early years are important does not mean, however, that everything is determined by (say) age 3 or by some other specific date. Indeed, much recent research shows that there exist other important windows of opportunity, such as, for instance, adolescence (see for instance Blakemore and Mills (2014)). The early years, however, can be particularly important not only because of the development that is achieved in those years, but because that same development might facilitate and enhance subsequent growth and the productivity of subsequent investments. An interesting research question is whether different ages should be targeted by different interventions and whether specific traits and domains develop more rapidly during certain phases of the child's life cycle.
The importance of the early years and their salience for policy is particularly relevant in developing countries. The Lancet series in 2007 and 2011 (see McGregor et al. (2007), Walker et al. (2011), Engle et al. (2011)) have claimed that there are 200 million children at risk of not developing their full potential and that most of these children are in developing countries. These children are particularly vulnerable because of the high incidence and burden of infectious diseases, undernutrition in the perinatal period and early childhood, micronutrient deficiency, lack of clean water and limited hygiene, as well as many psychosocial factors, such as violence, lack of stimulation, maternal depression and poor parenting practices. The damage inflicted on these children is likely to be permanent, and delays accumulated in the early years will be difficult if not impossible to make up.
There is overwhelming evidence that socio-economic disparities are associated with developmental delays and that these delays emerge very early on and grow dramatically during the first few years of life. For instance, Rubio-Codina et al. (2014) show that in Bogotá, Colombia, significant differences in cognitive and language development among children of different socio-economic backgrounds emerge at around 12 months and grow considerably over time. Paxson and Schady (2007) show that, in Ecuador, the difference in vocabulary at age 6 between children in the fourth decile and children in the first (poorest) decile of the wealth distribution is equivalent to 3 standard deviations of a z-score. This is equivalent to a delay of 2.5 years in language development. These children, who will start attending schools designed for 6 year olds, will not be able to benefit from that experience and will accumulate further delays. Fernald et al. (2012) report similar evidence from India, Indonesia, Peru and Senegal.
While these analyses are based on cross sectional data, a few studies have used longitudinal data from developing countries. Hamadani et al. (2014), using a longitudinal data set from Bangladesh, show that significant cognitive delays between children of different socio-economic backgrounds emerge as early as 7 months after birth and increase as the children age. By the time they are 64 months old, the difference in cognitive development between the poorest and less poor children is as large as 1.2 standard deviations of a z-score. This is a remarkable difference, as all the households in the study live in small rural villages and are quite poor. Schady et al. (2015) report evidence based on longitudinal studies from several other developing countries.
The salience of the early years for policy is also confirmed by the growing evidence that well designed and well targeted interventions can achieve spectacular results. A number of long term longitudinal studies that have followed children who received intense and high-quality interventions in the 1960s, 1970s and 1980s are now available and, in some cases, show strong effects on a variety of adult outcomes. Some of the best known programmes, which I discuss in some detail in Section 3, are the High Scope Perry Preschool Project, the Abecedarian, and, in a developing country context, the INCAP nutrition intervention in Guatemala and the home visits and stimulation intervention in Jamaica.
Of course, given that some of these interventions are intensive and costly, they should be justified by a cost-benefit analysis. However, when such analyses have been performed, despite their intrinsic difficulties (partly related to the long term nature of the benefits), the implied internal rates of return seem extremely high.
An example of such an exercise for the Perry Preschool Project is contained in Schweinhart et al. (2005). Many recent discussions have stressed that the rate of return on early years investments is very high, and presumably higher than that of a number of alternative investments. Heckman et al. (2009) and Heckman (2012), for instance, put the return to the High Scope Perry Preschool Project at between 6 and 10%.
The existence of such a differential is an indication of important frictions that prevent investment in human capital in the early years.
The type of frictions that generate such inefficiencies can be many, ranging from basic credit constraints and imperfections in credit and insurance markets, to information problems and myopic behaviour, to the lack of altruism.
Imperfections in credit markets can, in turn, be generated by many factors, linked to asymmetric information and difficulties in enforcing contracts on investments whose return is uncertain and is received many years after the initial investment. The fact that returns on human capital are enjoyed by individuals who are different from those who make the investment (children and parents) might also be a problem. Poor parents might also lack the information and sophistication to assess the size of the returns to education. Or, given the stress to which they are subject, they might lack the ability to formulate and execute long term plans that include constant investment of time (stimulation) and resources for their children. In addition to these 'efficiency' arguments that can justify policy interventions in human capital, an important justification for interventions targeted at the early years is a redistributive one: given the size of the returns of these interventions and their very dynamic nature, they might be extremely effective in reducing inequalities and in fostering 'equality of opportunities'. The fact that early years interventions can be effective and the fact that large gaps in development (which are later associated with large differences in earnings, health and other welfare indicators) emerge very early make these interventions potentially very important. These are policies that have the potential of greatly increasing the efficiency of an economy whilst at the same time reducing the level of inequality and disparities both in economic and other domains. However, not all policies are effective. And the design of policies that are effective at scale, given the available resources (including human resources), is particularly difficult.
Having established that interventions to foster the accumulation of human capital in the early years are desirable, the biggest challenge is to develop policies that are scalable in a variety of different contexts and can be implemented with the resources available. This challenge is what motivates the research agenda that I describe in this paper. A well-designed and effective policy needs a good understanding of the mechanisms that drive its impacts, and an understanding of these mechanisms requires a unifying model that frames the main issues. I start my discussion below with the elements of such a framework in Section 2, where I sketch the main components of the framework without specifying its details. As I discuss in Section 3, much has been learned but much is still unknown.
The framework I present in Section 2 helps in organising what we know and what we need to learn. In Section 4, I present a specific example of the conceptual framework and illustrate its use by applying it to the analysis of the impacts of a specific intervention. I borrow from two recent articles that have performed this analysis. In Section 5, I discuss the role that parental beliefs can play in child development. After that, I discuss two methodological issues: the controversy about the use of structural models versus an a-theoretical analysis of policy interventions, and the importance of measurement for the entire research agenda. Section 7 concludes with some reflections on future challenges.

A theoretical framework
A first step towards the understanding of the mechanisms behind human capital formation is the construction of a coherent theoretical framework. In this section, I sketch one such framework and discuss its features. In section 4, I will then use a particular specification of the framework, which I present here without details, to illustrate a possible use and interpret the evaluation of a specific intervention. The main components of the conceptual framework I consider are the process of human capital formation and the decision process that determines investment decisions. The latter, in turn, depends on household preferences, information and resources.

The production function of human capital
The work on the production function for human capital has a long tradition in economics, going back to the seminal work of Gary Becker (see Becker and Tomes (1994), Becker (1975)). More recently, Heckman and his collaborators have greatly advanced the study of human capital formation and proposed a very useful framework (see, for instance, Cunha et al. (2006), Cunha and Heckman (2008), Heckman (2007)). I consider human capital as a multidimensional object that starts evolving very early in life, possibly before birth. I will call these different dimensions 'factors'. One factor could be cognition, another factor could be health and nutritional status, yet another factor could be socio-emotional skills. I will not specify how many factors are relevant and whether a given factor could or should be decomposed into several factors. The different human capital factors change over time according to a process that depends on past levels of the factors and on several environmental variables, some of which are fixed (such as parental background) while others change over time.
Among the latter set of variables, one could distinguish between variables that are chosen by parents or other individuals and/or institutions and others that can be safely considered as exogenous. The main difference between the two sets of environmental factors is that the former, which I will call 'investments', are chosen by agents who might be reacting to the evolution of the various factors, while the latter can be safely considered as having an evolution that is independent of what happens to the various dimensions of human capital.
I will call the process of formation of human capital its production function. Environmental factors and shocks, inputs of various kinds and the existing level of human capital (in its various dimensions) enter the production function in complex and non-linear ways. Some arguments of the production function could be complements, while others might be substitutes. The presence of lagged values of the factors in the production function makes the whole process dynamic and, in the presence of complementarities among different arguments, can create windows of opportunity that make investment in certain periods particularly salient and important for future developments. A flexible specification of the production function, when bringing this framework to data, is therefore essential, in order not to preclude the identification of interactions and complementarities.
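One way the structure described above could be written down is sketched here. The notation is an assumption introduced purely for illustration and is not necessarily the specification used later in the paper: $H_{i,t}$ denotes the vector of human capital factors of child $i$ at age $t$, $X_{i,t}$ the investments, $Z_i$ the fixed background variables, $W_{i,t}$ the exogenous time-varying environmental variables and $\varepsilon_{i,t}$ a vector of shocks.

```latex
H_{i,t+1} = g_t\left(H_{i,t},\; X_{i,t},\; Z_i,\; W_{i,t},\; \varepsilon_{i,t}\right)
```

In this sketch the function $g_t$ is indexed by age, so that the productivity of each argument may differ across periods, and its functional form is left unrestricted, so that whether any two arguments are complements or substitutes remains an empirical question.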
From the point of view of researchers, some factors are observable, while others are not. The same applies to the environmental factors and investments that enter the production function. The omission of relevant inputs can introduce important biases in the estimation of the production function. I will briefly discuss these issues in what follows: they are an important area of research. Investments are chosen by parents, making them endogenous variables in the production function. The endogeneity of investment clearly poses a problem for the empirical identification of the parameters of the production function. If parents react to specific shocks to the child's development that might be unobservable to the researcher, the productivity of investment will be underestimated if parents compensate for these shocks, while it will be overestimated if they tend to reinforce them. It is therefore important to model parental behaviour and determine if enough data are available to identify the parameters that inform it as well as the parameters of the production function.
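The direction of the bias described above can be illustrated with a small simulation. This is a minimal sketch under strong assumptions that are not taken from the paper: a linear, one-factor production function, a single shock that parents observe and the researcher does not, and a hypothetical parameter `response` that governs how investment reacts to that shock (negative for compensating parents, positive for reinforcing ones). All names and numerical values are illustrative.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


def estimated_productivity(response, true_beta=0.5, n=20000, seed=0):
    """Naive OLS estimate of the productivity of investment.

    response < 0: parents invest more after a negative shock (compensate).
    response > 0: parents invest more after a positive shock (reinforce).
    """
    rng = random.Random(seed)
    investment, outcome = [], []
    for _ in range(n):
        shock = rng.gauss(0, 1)                   # seen by parents only
        x = response * shock + rng.gauss(0, 1)    # parental investment
        y = true_beta * x + shock                 # child development
        investment.append(x)
        outcome.append(y)
    return ols_slope(investment, outcome)


if __name__ == "__main__":
    for label, response in [("compensating", -0.5),
                            ("no reaction", 0.0),
                            ("reinforcing", 0.5)]:
        print(label, round(estimated_productivity(response), 3))
```

With the true productivity set to 0.5, the naive estimate falls below 0.5 when parents compensate and rises above it when they reinforce; it is close to 0.5 only when investment does not react to the unobserved shock.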

Preferences
In the model I am proposing, parents are assumed to maximise an objective function which depends on their current consumption and on their children's developmental status. Higher development implies higher welfare, as smarter, healthier children are more likely to command higher resources as adults. The dependence of the objective function on children's development can be driven by altruism towards the children or by the fact that children can provide support during old age. The fact that parents maximise some sort of objective function does not necessarily mean, as I discuss below, that they make optimal choices.
A first issue that needs to be addressed is whether the number of children is taken as given or whether fertility choices are also modelled. Of course, the choice between these two modelling alternatives depends on the nature of the problem that one is interested in analysing. However, if it is assumed that the number of siblings is a variable that enters the production function of the human capital of a given child, it might be necessary to take a stance on this issue. There is an extensive literature on the quantity/quality tradeoffs in the determination of fertility choices in developing countries that is relevant in this context (see, for instance, Becker and Lewis (1973), Willis (1973) and Becker (1991)).
In the presence of more than one child, another important issue is the specification of parental preferences across different children. One view could be that parents maximise the total resources their children can command and, therefore, might want to focus investment on the smarter children, if, given the nature of the production function, these are the children for whom such an investment would be most productive. A possible justification of such an assumption is that parents could enforce transfers among siblings to compensate the children who receive the lowest investment. If such transfers are unenforceable, or perceived as such by parents, then it is possible that they would try to compensate for initial differences among siblings and, possibly, focus their investment on the weakest children. This would be the case if they have a taste for equality among their offspring.
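The contrast between reinforcing and compensating allocations can be made concrete with a stylised two-child example. The sketch below is an illustration under assumptions that are not part of the paper: each child's outcome is `ability * investment ** gamma`, parents split a unit budget between the two children, and their welfare is a CES aggregate of the two outcomes with a hypothetical parameter `rho` (`rho = 1` corresponds to maximising the children's total resources, while negative values capture a taste for equality). The optimal split is found by a simple grid search.

```python
def outcome(ability, investment, gamma=0.5):
    """Child outcome with diminishing returns to investment (0 < gamma < 1)."""
    return ability * investment ** gamma


def welfare(y1, y2, rho):
    """CES welfare over the two outcomes (rho != 0).

    Dividing by rho keeps the function increasing in each outcome
    for negative values of rho as well.
    """
    return (y1 ** rho + y2 ** rho) / rho


def best_share(ability_1, ability_2, rho, steps=999):
    """Share of a unit budget allocated to child 1 that maximises welfare."""
    best_s, best_w = None, float("-inf")
    for k in range(1, steps + 1):
        s = k / (steps + 1)
        w = welfare(outcome(ability_1, s), outcome(ability_2, 1 - s), rho)
        if w > best_w:
            best_s, best_w = s, w
    return best_s


if __name__ == "__main__":
    # Child 1 is assumed to be the more able sibling.
    print("total-resources parents:", best_share(1.5, 1.0, rho=1.0))
    print("equality-minded parents:", best_share(1.5, 1.0, rho=-2.0))
```

In this parameterisation, parents who care only about total resources give more than half of the budget to the abler child, while parents with a sufficiently strong taste for equality give more than half to the weaker child.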
Often in models of parental behaviour, households are considered as a unitary decision unit. In reality, households often include more than one adult and these adults might not share the same objectives and tastes. How decisions are made within the household will then be determined by implicit or explicit bargaining processes between fathers and mothers or possibly other adults present, such as grandparents.

Resources, information and beliefs
An obvious constraint parents face is that of resources. The resources parents can access depend on their human capital (the wage they can command on the labour market) and their non-labour income. The evolution of these variables can depend on a variety of factors, including changes in economy-wide prices and wages and idiosyncratic shocks to productivity. In the presence of uncertainty, parental investment strategies will depend on the ability they have to absorb shocks, which in turn depends on the availability of different smoothing mechanisms, ranging from individual savings, to formal and informal insurance contracts, to credit, to changes in the labour supply of various family members. A resource that could constitute an important constraint on parents' behaviour, and that is often ignored in the literature, is information.
Parents make decisions taking as given the production function of human capital. They invest time and material resources in their children as they expect these investments to have a return in terms of human capital development. How much they invest will depend, in addition to their tastes and their material resources, on their perception of the production function and, in particular, on their beliefs about the productivity of the various inputs.
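A stylised one-period version of the parental problem, without uncertainty, can be sketched as follows. The notation is an assumption introduced here for illustration only: $C_t$ is parental consumption, $X_t$ investment in the child, $p_t$ the price of investment, $Y_t$ the resources available to the household, $H_t$ the child's current human capital, $Z$ fixed background variables, and $\tilde{g}_t$ the production function as perceived by the parents.

```latex
\max_{C_t,\,X_t}\; U\left(C_t,\, H_{t+1}\right)
\quad \text{subject to} \quad
C_t + p_t X_t \le Y_t,
\qquad
H_{t+1} = \tilde{g}_t\left(H_t,\, X_t,\, Z\right)
```

Writing the constraint in terms of $\tilde{g}_t$, rather than the true production function, makes explicit that the chosen investment reflects parental beliefs about the productivity of inputs: if $\tilde{g}_t$ understates the true return to $X_t$, parents who solve this problem exactly will still invest less than they would under correct beliefs.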
Assuming that parents maximise a certain objective function taking as given resources and the production function does not necessarily mean that parents behave optimally. It is possible that they mis-estimate the returns to certain types of investment. Information can, indeed, be an important constraint and a scarce resource. I discuss these issues in Section 6. This theoretical framework needs to be fleshed out with specific details.
The analysis of different problems requires the specification of different details of the model. In section 5, I use a similar model (with some stark simplifications and remorseless omissions) to analyse and interpret the results of a randomized controlled trial run to evaluate a policy. This general structure is also useful to organise the various components of a research agenda and to take stock of what we know, what we do not know, and what we need to learn for a better understanding of the process of human capital formation and for the design of policies to foster it.

Knowns
Much has been learned about the importance of the early years and about some of the mechanisms that make these years so salient for human development and for adult outcomes. The evidence that early years events have long run consequences is extremely strong. Almond [...] and other methods to isolate the causal impact of early life shocks on subsequent outcomes. Almond (2006), for instance, documents the impacts of in utero exposure to the 1918 influenza pandemic in the US. He finds that individuals exposed to the pandemic in utero experienced 'reduced educational attainment, increased rates of physical disability, lower income, lower socio-economic status, and higher transfer payments' (Almond (2006), p.672).
There is a huge literature that associates child development with socio-economic factors. Duncan et al. (1994), for instance, stress the effect that poverty, as well as the duration and timing of exposure to poverty, can have on children's development. More recently, Hackman et al. (2010) review the approaches taken in neuroscience in this context and stress the need to understand the causal links and to identify the processes that lead to the observed associations. The analysis of specific mediating factors (such as parenting practices) can be particularly informative. Hackman et al. (2010) argue that useful evidence on these pathways can come from animal studies that can shed light on the biological channels that can be affected by specific practices.
The [...] Barker (1995)). Economists have more recently paid attention to this hypothesis and have unearthed a substantial amount of evidence on long run effects of foetal growth on a variety of variables, including test scores, earnings and educational attainment (see Almond and Currie (2011b)).
An impressive study of individuals born around the time of the Dutch famine caused by the Nazi embargo in 1944-45 (see Heijmans et al. (2008)) identified epigenetic modifications, in particular in the expression of the insulin-like growth factor 2 (IGF2) gene. The Dutch famine study identified such effects by comparing the genetic material of subjects exposed to the famine while in the womb to that of their siblings born after the famine.
After birth, nutrition in the very early years seems to be important. Some studies have found associations between breastfeeding, early height for age and other indicators of nutritional status in the early years and subsequent outcomes, both in cognitive development and health. Whilst it is difficult to establish the causal link between breastfeeding and subsequent development, a number of papers have now presented some strong evidence suggesting that breastfeeding causes a number of positive outcomes. Kramer et al. [...]
In addition to breastfeeding, nutrition seems to be particularly relevant for children's health status and, more generally, for children's development. Many papers have shown that stunting in the early years can lead to long term adverse consequences. Below, I discuss the evidence from the influential INCAP intervention in Guatemala, in which children were followed over a period of 40 years. The INCAP study was one of a number of cohort studies in five countries (Brazil, Guatemala, India, the Philippines, and South Africa) that followed children over a period of time, and related both maternal and child nutrition to long term outcomes. These influential studies, reviewed in Victora et al. (2008), found strong associations between the nutrition status of mothers and children and a variety of outcomes, such as height, schooling, income or assets, offspring birthweight, body-mass index, glucose concentrations, and blood pressure.
Similar associations are also found in a data set from Bangladesh analysed in Hamadani et al. (2014), which I have cited above. This study, however, while controlling for nutrition and physical growth factors in the first months of life, focuses on the home environment and stimulation. In particular, the study finds a strong association between indicators of the home environment, as measured at 18 and 60 months, and cognitive development at 60 months among poor Bangladeshi children.
As I mentioned above, socio-economic variables are strongly associated with cognitive development in that sample. Similar associations are documented in Paxson and Schady (2007) and Rubio-Codina et al. (2014) with data from Ecuador and Colombia, and by Fernald et al. (2012) and Schady et al. (2015) with data from several other developing countries. However, in the Bangladesh study, after controlling for the quality of the home environment, the association is much weaker. Similar results are found in the mediation analysis conducted in Rubio-Codina et al. (2015). This evidence stresses the importance of the home environment and stimulation: these factors seem to explain a large fraction of the variability in children's cognitive development and, presumably, adult outcomes.
Along the same lines, Schady (2011) shows that, in a longitudinal study of relatively poor children in Ecuador, the unimodal distribution of PPVT scores at age 3 becomes a bi-modal distribution by age 5, and that the two modes of the distribution correspond very closely to children of high- and low-TVIP mothers, respectively. This evidence illustrates powerfully the importance that maternal (and more generally parental) inputs have in the development of children's language and cognitive skills.
The other fact that seems apparent from the literature is that human capital cannot be considered a monolithic and unidimensional object. Rather, it is a complex construct that is made of many different components. This multidimensionality is important and relevant in two different ways. On the one hand, from an economic point of view, it is clear that different skills command different prices in the labour market, reflecting, probably, the different roles they play in the production process. Cognitive skills are certainly important, but other skills, which have been called socio-emotional or 'soft' skills, also play a very important role.
Socio-emotional skills, which include the ability to interact with others, but also to delay gratification, to focus and pay attention, to be organised, are important for several reasons. First, they might have a direct value in the production process and, therefore, might be remunerated in the labour market.
Second, and more subtly, they might facilitate the accumulation of cognitive and other aspects of human capital. Third, there is some evidence that these skills are malleable over longer time periods, while there is evidence that cognitive skills might become difficult to affect after the first few years. As such, these skills might be particularly salient for policy.
The fact that certain skills developed in the early years might affect the ability to accumulate other dimensions of human capital later on is a reflection of the fact that the different domains of human capital follow, over the life cycle of children who enter young adulthood and adulthood, intertwined paths that interact continuously with one another and with other inputs in the process of human development. This process is characterised by what the literature defines as 'dynamic complementarities' (see, for example, Cunha et al. (2006) and Cunha et al. (2010)). Certain skills, such as 'socio-emotional' ones (see, for instance, Duckworth et al. (2009)), accumulated in the first five years of life, seem to be key to the ability to accumulate cognitive skills in subsequent periods. The presence of these interactions and dynamic complementarities might give rise to 'key' periods and 'windows of opportunity' that could be particularly salient from the point of view of policy design.
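The notion of dynamic complementarity can be illustrated numerically. The sketch below assumes, purely for illustration, a one-factor CES production function in which next-period human capital depends on the current stock `h` and on investment `x`; the functional form, the share parameter `delta` and the substitution parameter `rho` are hypothetical choices and not estimates from the literature. Complementarity means that the marginal product of investment rises with the existing stock, which in this form holds whenever `rho < 1`.

```python
def ces(h, x, rho, delta=0.5, scale=1.0):
    """Next-period human capital from current stock h and investment x.

    rho = 1 gives a linear technology (perfect substitutes);
    lower values of rho make h and x more complementary. rho must be non-zero.
    """
    return scale * (delta * h ** rho + (1 - delta) * x ** rho) ** (1.0 / rho)


def marginal_product_of_investment(h, x, rho, eps=1e-6):
    """Central finite-difference approximation of d ces / d x."""
    return (ces(h, x + eps, rho) - ces(h, x - eps, rho)) / (2 * eps)


if __name__ == "__main__":
    for rho in (1.0, -0.5):
        low = marginal_product_of_investment(h=1.0, x=1.0, rho=rho)
        high = marginal_product_of_investment(h=2.0, x=1.0, rho=rho)
        print(f"rho={rho}: return to investment at low stock {low:.3f}, "
              f"at high stock {high:.3f}")
```

Under the linear technology the return to investment is the same whatever the child's initial stock, whereas with `rho = -0.5` the same investment is more productive for the child who starts from a higher stock. This is the mechanism through which early development can raise the productivity of later investments.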

Unknowns
The picture that is emerging from this voluminous and growing literature, which spans different fields, is therefore one that starts to make clear several important features of the process of human development and of the gaps that are accumulated by vulnerable children. The Lancet 2011 review, for instance, states: "Three translational processes influence how risk factors and stress affect brain and behavioural development: the extent and nature of deficits depend on timing, co-occurring and cumulative influences, and differential reactivity" (Walker et al. (2011), p.1326).

The mysteries of human development
And yet, many important details are still unknown or extremely vague. These range from the biological mechanisms that affect the process of human development from conception and during the first years of life, to the factors that influence parental decisions and parental practices. For example, the evidence on the impact that micronutrient deficiency during the first years of life may have on children's development is still very patchy, as is apparent from the discussion in the recent Lancet series. Despite the fact that many children in developing countries present important deficiencies in many micronutrients, the authors of the series conclude that "... [t]here are insufficient data to establish whether supplementation with multiple micronutrients is more effective than iron alone in improving development." (Walker et al. (2011), p.1328). Analogously, when discussing infectious diseases, the survey states that "...[e]vidence is insufficient to establish if early parasitic infections affect child development" (Walker et al. (2011), p.1329). In a similar vein, although the emergence of evidence of epigenetic effects (in animal studies) is fascinating, whether this evidence is of conceptual and practical relevance for the development of human capital is still a contentious issue. Analogously, whether specific genetic configurations mediate the impact of environmental factors is also not completely established, despite some studies pointing to these effects. Several recent studies have stressed the importance of complementarities among different inputs, which is echoed in the importance of 'co-occurring and cumulative influences' mentioned in the quote above from the Lancet review. The work of Heckman and several co-authors has been particularly forceful in this respect (see, for instance, Cunha et al. (2010)).
At the same time, the size of these complementarities and the nature of the dynamic relationship between different inputs are still not fully understood. A number of studies now reject the linearity of the production function. 10 But we still do not know the details of how the production function of human capital evolves in the early years and how the foundations for further learning are posed.

Parental behaviour
In addition to the characterisation of the production function for human capital, the other aspect that is key for the design of policies targeted at reducing developmental gaps of vulnerable children, both in developed and developing countries, is the characterisation of parental investment and practices. What determines parental choices? What are the constraints that parents face? How do parents react to a specific policy? These are all questions that are key to the successful design of early years interventions. And yet, much still needs to be learned.
Parental decisions are complex and several factors, such as available resources, the mother's labour supply possibilities and beliefs about optimal parental practices, interact to determine them. Parents will invest in children by dedicating time to them and buying toys and books, depending on the costs of these investments, on how effective they think these activities are and on the amount of resources they have. They will also consider the tradeoffs between spending time with children, work and leisure. Moreover, it is likely that parents' choices react to the evolution of the child's development, to possible shocks that might affect children, and to their understanding of how their investments can remediate such shocks when they are negative.
Finally, parents often have to make decisions to allocate scarce resources among several children, who differ in their age, gender, perceived ability and so on. In his seminal contribution, Griliches (1979) conjectured that parents might tend to alleviate pre-existing differences in abilities. Despite the importance of these issues, not many studies have looked at them (see, for instance, Behrman et al. (1982), Behrman et al. (1994a), Becker and Tomes (1976)). There are several papers that consider gender biases in investment, which is an important special case of within household allocation of resources. 11 Rosenzweig and Wolpin (1988) find some evidence in favour of Griliches' conjecture, while Rosenzweig and Zhang (2009) find that parents in China exhibit higher education expenditure on children with higher birth weight, therefore exhibiting reinforcing behaviour. Behrman (1988) finds that parents in South India exhibit some degree of inequality aversion (although they seem to favour boys). In a very recent paper, Yi et al. (2015) consider different dimensions of human capital and find that, in response to early health shocks affecting a sample of twins, parents in China might be pursuing a compensating strategy in terms of health investment and a reinforcing strategy in terms of educational investment. A recent survey (Almond and Mazumder (2013)) discusses some of these issues and, in particular, whether parents reinforce or compensate the effect of shocks to the accumulation of human capital or initial conditions. In the conclusions, they state: "[w]e expect this area to be a focus of continued research attention because the nature of the behavioral response and its importance to long-term effects are still being debated" (Almond and Mazumder (2013), p. 318).
There is a vibrant literature on models of intrahousehold allocations that I cannot summarise here. It is clear, however, that in the presence of two decision makers who differ in their tastes, it is likely that, following changes in their relative bargaining power, allocations could change. Thomas (1990) was one of the first papers to recognise that male and female labour incomes have a different impact on children's development. Economists have looked at many different models of intrahousehold allocations that differ from those that would prevail under a unitary framework. One of the most successful approaches has been that of the so-called collective model proposed in Chiappori (1988) and Chiappori (1992). The collective model is attractive because it is agnostic about the specific bargaining process couples engage in and it only assumes that the resulting allocation of resources is efficient. In this context, an important observation about the resources allocated to children is made by Blundell et al. (2005), who note that in the collective model a shift in relative bargaining power in favour of one of the two partners results in an increase in the resources allocated to children only if the marginal propensity to consume children's goods for that person is higher than that of their partner. That is, it is not the absolute taste for children that determines the effect of a shift in bargaining power on the resources that go to children, but the relative marginal propensity to consume. This result has implications for the effect of programmes that target specific subsidies to women, such as most recent Conditional Cash Transfers. Related to the issue of intrahousehold allocation of resources is the more general issue of the role played by the family and the family environment, over and above the resources that different family structures can provide to children's care givers. In many different contexts, vulnerable children often grow up within single adult households. Our understanding of the implications that these different family environments have for children's development is still very limited.

Interventions and policies
In the introduction, I have mentioned a few interventions, both in developed and developing countries, that have been shown, with the help of randomised controlled trials and longitudinal data, to have had large and sustained impacts that have been visible over long periods of time. One of the best known cases is that of the HighScope Perry Preschool Project (PPP), developed in Ypsilanti, Michigan in the mid-1960s. 123 disadvantaged and high risk children, living near the Perry elementary school in that town, were recruited into a study when aged between 3 and 4. 58 of them (randomly chosen) were assigned to a high quality pre-school programme. The study followed them into adulthood. The pattern of results that emerged from that study, which have been analysed in a number of papers, 12 is particularly interesting for a variety of reasons. Although the intervention initially boosted cognition (as measured by the Stanford-Binet IQ test), this effect faded a few years later. By age 8, treated boys were indistinguishable, in terms of IQ, from their control counterparts. For girls, the effect of the programme on IQ was reduced but remained statistically different from zero. However, as noted by Heckman et al. (2013), the programme's effects on other personality and social skills, such as those measured by 'externalising behaviour', remained statistically significant. More importantly, the programme seemed to affect academic achievement and, in the long run, a variety of economic outcomes and criminal behaviour. One possible interpretation of these results, therefore, is that even when interventions (especially those delivered after age 3) 13 have a limited impact on IQ, they might affect the long run welfare of children and adult outcomes through other channels, for instance, through the impact on socio-emotional skills.
Another well known study is that of the Abecedarian (ABC) project that was developed in the mid-1970s in North Carolina. In that study, 111 disadvantaged children were randomly assigned between a treatment (57) and control (54) group. The programme consisted of two stages, one designed for children between 0 and 5 and one for children aged between 6 and 8. The first stage was very intense, including play-based adult-child activities to support children's language, motor, cognitive development and socio-emotional competence, including task orientation, for up to 9 hours each day for 50 weeks/year (see Ramey et al. (1976) and Sparling and Lewis (1979)). The two stages of the intervention were evaluated with a double randomization design and the first stage has been shown to have a variety of long run impacts. 14 Most recently, Campbell et al. (2014) show that ABC had an impact on a variety of health outcomes, including the prevalence of obesity and blood pressure, when the subjects were in their mid-30s. In addition to PPP and the ABC project, many other interventions have been studied in the US and other developed countries. 15 Some successful interventions, however, have also been implemented in developing countries. A first programme that is worth mentioning is the INCAP study in Guatemala, a nutrition intervention that was evaluated through a randomised controlled trial and whose subjects were followed for over 40 years. Remarkably, even the offspring of the original subjects were observed. The intervention consisted in providing, from 1969 to 1977, a nutritional supplement, rich in calories, in the treatment villages. The children in the control villages were instead provided with a similar beverage, which, however, lacked the additional calories. Since 1971, both treatment and control beverages were fortified with micronutrients.
As the study went on for several years, children in both treatment and control villages entered the study at different ages: some from birth, some when they were already a few years old. This intervention found impressive long run impacts. Interestingly, the gains in various dimensions, including adult wages, were significant only for those children that were exposed sufficiently early to the intervention (see, for instance, Hoddinott et al. (2008)  intervention, both treatments (nutrition and stimulation) seemed to have an impact on cognitive development, and the effect seemed to be cumulative, to the point that the development of children receiving both of them was not very different from that of non-stunted children from the same neighbourhoods observed over the same period. After the end of the intervention, the children were observed at ages 7-8, 11-12 and 17-18. Whilst the effect of the nutrition intervention faded completely, that of the stimulation one was significantly different from zero at all observation points and by sizeable amounts. A more recent follow up (Gertler et al. (2014)), at age 22, observed significant effects on earnings, which were increased by 25%, enough for the treated to catch up with the earnings of a nonstunted comparison group.
The few examples I have cited demonstrate that well designed and well targeted interventions can yield spectacular results. This is particularly true for early years interventions. Notice, for instance, that while in the case of the PPP the initial impact on the IQ of the treated children fades away a few years after the end of the intervention (although gains in other dimensions, in particular socio-emotional skills, remain significant), in the case of the Jamaica intervention, the IQ impacts remain significant many years after the end of exposure and into adulthood. Such a difference might be explained by the fact that the Jamaica study was targeted to children much younger than those targeted by the PPP. In the case of the ABC project, the IQ impacts also lasted longer. The fact that the ABC, like the Jamaica intervention, also started earlier than PPP is intriguing. But one should also consider the fact that the ABC was probably more intensive than both PPP and the Jamaica intervention. These unanswered questions resonate even in my brief summary of the impacts of well known interventions, such as the Perry Pre-school Project, the ABC project and the Jamaica intervention. PPP, which started, by and large, past age 3, seemed to have affected socio-emotional and 'soft' skills in the long run, which in turn seem to have had an impact on other outcomes, ranging from health to economic variables. ABC and in particular the Jamaica intervention, instead, seemed to have affected cognition and intelligence in a sustainable fashion. Are these differences due to the timing or the content of the intervention? Should the content of intensive interventions be targeted to specific domains? To what extent do the gains in specific domains (such as socio-emotional skills) allow children to exploit better education opportunities?
There is still not enough evidence about these issues. And these questions, to a large extent, overlap with the main research questions that I have discussed above: How do interventions get their impacts? What is the nature of the production function of human capital? What do parents do and how do they react to interventions? Do interventions crowd parental investment in or out?
Above all, policy makers struggle to build cost effective and affordable interventions that can be expanded and sustained at scale. Cost is only one aspect of scalability. The availability of appropriate infrastructure, the human resources in the territory, monitoring and supervision schemes that guarantee fidelity and effectiveness of interventions are big issues, especially in developing countries.
A proper understanding of the mechanisms behind human development in the early years, both in terms of the features of the production function for human capital and of the determinants of investment in human capital, is key to the scalability of policies. In addition, to design policies that are effective and that can be deployed on a large scale, it is also key to understand individual behaviour and how it reacts to specific interventions.

A theoretical framework and its use
In this section, I will present a specific example of the theoretical framework I sketched in Section 2 and then use it to interpret the impacts of an intervention evaluated with a randomised controlled trial. In the process, I will draw on Attanasio, Cattan, Fitzsimons, Meghir, and Rubio-Codina (2015) and Attanasio, Fernández, Fitzsimons, Grantham-McGregor, Meghir, and Rubio-Codina (2014).

The model
In what follows, I will borrow from the model used by , who extend the approach proposed by Cunha et al. (2010). I will use some of the empirical results in this former article in my discussion in Section 5.2.
I will assume that parents in household i choose investment to maximise utility that depends on their children's human capital and consumption. Their choices are made considering a budget constraint and a production function for human capital. At this point I assume that parents have information about the production function of human capital that corresponds to the actual production function. In Section 6, I will explore models in which parents have a distorted view of the production function of human capital. Given what I want to stress and the context to which I will apply this model, I use a static framework. If the focus had been on liquidity constraints and on 'crucial windows' in the process of development, it would have been better to formulate the problem as a dynamic one, where parents enjoy utility at different points in time and, possibly, enjoy the returns to human capital investments only long after the investment in human capital was made.
To formalise, let $H_{i,t}$ be the human capital of a child of age $t$ being raised in household $i$. $H_{i,t}$ is a multidimensional vector, reflecting the different components of human capital, such as cognition, socio-emotional skills and health. The production function of human capital is assumed to depend on the initial level of human capital $H_{i,t}$, on background variables $Z_{i,t}$ (either fixed or time varying, including mother $m$, father $f$ and other $r$), on investments in human capital $X_{i,t}$ (including materials $M$ and time $T$) and on a vector of random shocks $e^H_{i,t}$. The 'shocks' $e^H_{i,t}$ can also be interpreted as reflecting inputs in the production function that are not directly observed or considered by the researcher. 16 The production function is given by:

$$H_{i,t+1} = g_t(H_{i,t}, Z_{i,t}, X_{i,t}, e^H_{i,t}) \qquad (1)$$

The variables $H_{i,t}$, $Z_{i,t}$, $X_{i,t}$ and $e^H_{i,t}$ are multidimensional:

$$H_{i,t} = (H^c_{i,t}, H^s_{i,t}, H^h_{i,t}), \quad Z_{i,t} = (Z^m_{i,t}, Z^f_{i,t}, Z^r_{i,t}), \quad X_{i,t} = (X^M_{i,t}, X^T_{i,t}), \quad e^H_{i,t} = (e^c_{i,t}, e^s_{i,t}, e^h_{i,t})$$

where I have assumed that, in this particular case, human capital has three dimensions (cognitive skills $c$, socio-emotional skills $s$ and health $h$). Given the available data and the specific context under study, preliminary factor analysis can be helpful in making the adequate modelling choices.
Parents are assumed to maximise:

$$\max_{C_{i,t},\, X_{i,t}} U(C_{i,t}, H_{i,t+1}) \qquad (2)$$

subject to the production function in equation (1) and the budget constraint

$$C_{i,t} + P^x_t X_{i,t} = Y_{i,t}$$

where $C_{i,t}$ is consumption, $P^x_t$ is the vector of prices of investments $X_{i,t}$ and $Y_{i,t}$ denotes the available resources. The production function $g_t$ in equation (1) is assumed to be time varying, so that its parameters or even its shape can be different at different points in time. Notice also that in this simple model there is no saving and only one child. Additional complications and meaningful dynamics could be added to this framework but do not add much to the main message I want to stress.
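To fix ideas, the static problem can be solved in closed form in a very special case. The sketch below assumes, purely for illustration, a single investment good, log utility over consumption and child human capital, and a Cobb-Douglas technology. These functional forms, the parameter names (`delta`, `a`, `b`, `h0`) and all numbers are hypothetical and are not those used in the papers cited. Under these assumptions optimal investment is $X^* = \frac{\delta b}{1+\delta b}\,\frac{Y}{P^x}$, and the code checks this expression against a brute-force grid search.

```python
import math

def utility(x, income, price, delta, a, b, h0):
    """Log utility over consumption and next-period human capital."""
    consumption = income - price * x      # budget constraint, no saving
    human_capital = h0 ** a * x ** b      # Cobb-Douglas technology (assumed)
    return math.log(consumption) + delta * math.log(human_capital)

def closed_form_investment(income, price, delta, b):
    """First-order condition: delta * b * (Y - P X) = P X."""
    share = delta * b / (1.0 + delta * b)
    return share * income / price

def grid_search_investment(income, price, delta, a, b, h0, points=10000):
    """Brute-force maximiser over strictly interior investment levels."""
    upper = income / price
    grid = [upper * k / points for k in range(1, points)]
    return max(grid, key=lambda x: utility(x, income, price, delta, a, b, h0))

# Hypothetical parameter values, chosen only for illustration.
params = dict(income=100.0, price=2.0, delta=0.8, b=0.5)
x_star = closed_form_investment(**params)
x_grid = grid_search_investment(a=0.3, h0=1.5, **params)
print(round(x_star, 3), round(x_grid, 3))
```

In this special case the optimal investment does not depend on the child's initial human capital, which is a consequence of the log and Cobb-Douglas assumptions; with a more general technology, such as a CES function, investment would respond to initial conditions.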
For the time being, I assume that parents know the production function in equation (1) and take it as a technology constraint to their maximisation problem. I will discuss how to relax this assumption in Section 6. Under this assumption, the problem in equation (2) can be solved to derive investment and consumption functions for the parents. Their choices will depend on their tastes, on the parameters of the production function, on prices $P^x_t$ and on the available resources $Y_{i,t}$. The investment functions could be written as:

shown to be important to explain certain correlations; 17 there is limited scope for heterogeneity of parameters and preferences. Most importantly, this structure assumes that all relevant inputs and factors are included and incorporated into the model and that those excluded are completely captured by the term $e^H_{i,t}$, which is assumed to be uncorrelated with other factors. If all the variables in equations (1) and (2), with the exception of the shocks $e^H_{i,t}$ and $e^x_{i,t}$, were observable, it would be possible to bring this model to the data in a relatively straightforward fashion, by specifying functional forms for the utility function and the production function. In that case, the main problem in estimating the production function that determines human capital at age $t+1$ would be the fact that one of the inputs, namely the investment, depends on the shock $e^H_{i,t}$. Parents might be reacting to shocks that affect children's development in a compensatory or reinforcing way (depending on their preferences, their resources, the nature of the shock and their beliefs about the technology). To obtain consistent estimates of the parameters of the production function, one would need to take this endogeneity issue into account.
An attractive approach to this problem is to use an Instrumental Variable or a control function strategy. In either case, identification stems from the availability of variables that affect investment choices and do not enter the production function directly. Prices P x conditions as the source of identification. Another important caveat to the use of prices as a source of identification for the role of investment in the production function of human capital is the availability of enough variation. Data from a single time period and a single location might not provide enough variability. However, in some situations one can use geographic variation in prices.
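The endogeneity problem and the role of prices as an excluded variable can be illustrated with a small simulation. The sketch below is a deliberately simplified, linear, one-dimensional version of the problem: parents are assumed to reinforce the shock, the price shifts investment but does not enter the production function, and the instrumental variable estimator is the simple ratio of covariances. All coefficients and names are hypothetical and are not estimates from any of the studies cited.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def simulate(n=20000, beta=0.5, seed=0):
    rng = random.Random(seed)
    price = [rng.gauss(0, 1) for _ in range(n)]   # excluded instrument
    shock = [rng.gauss(0, 1) for _ in range(n)]   # production function shock
    noise = [rng.gauss(0, 1) for _ in range(n)]   # other investment variation
    # Investment falls with prices and, by assumption, reinforces the shock.
    invest = [-0.8 * p + 0.5 * e + v for p, e, v in zip(price, shock, noise)]
    # Linear "production function": true return to investment is beta.
    human = [beta * x + e for x, e in zip(invest, shock)]
    ols = cov(invest, human) / cov(invest, invest)
    iv = cov(price, human) / cov(price, invest)
    return ols, iv

ols_hat, iv_hat = simulate()
print(round(ols_hat, 3), round(iv_hat, 3))
```

With reinforcing behaviour the least squares coefficient overstates the return to investment, while the estimator based on price variation is centred on the true value; with compensating behaviour (a negative coefficient on the shock in the investment equation) the least squares bias would go in the opposite direction.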
The other major issue to tackle in bringing the model in equations (1) and (2) to the data is the fact that most of its variables are not directly observable.
Instead, what researchers usually have is a collection of imprecise measures of the factors that constitute human capital and of the factors that enter its production function. In this respect the approach proposed by Cunha et al. (2010) is particularly useful. 18 They explicitly consider a measurement system that relates the factors of interest in the model to the available measures. In particular, they consider the following system:

$$m^{kj}_t = \alpha^{jk}_t \theta^j_t + \varepsilon^{kj}_t \qquad (5)$$

where $m^{kj}_t$ is measurement $k$ corresponding to factor $j$, $\theta^j_t$ denotes factor $j$ at age $t$, $\alpha^{jk}_t$ are the factor loadings that relate factor $j$ at age $t$ to measure $k$ at age $t$ and $\varepsilon^{kj}_t$ are measurement errors that make the observable variables $m^{kj}_t$ noisy signals of the factors. The way equation (5) is written implies that each measurement $k$ is affected only by a single factor. This assumption can be somewhat relaxed, but some exclusion restrictions (that is, some factors excluded from certain measurements) are necessary to achieve identification. I will discuss some of these issues in the application of this model in Section 5.2.
The approach proposed by Cunha et al. (2010) is particularly useful because it considers simultaneously the theoretical framework, with its conceptual issues (including the nature of the production function, the interaction between different inputs, the endogeneity of investment) and the measurement system, with its own set of issues. It also provides good discipline in the use of multiple measures and a good way to summarise the available information in a theoretically coherent fashion. Notice that an important step a researcher implementing this approach has to take is to map measures into factors. Cunha et al. (2010) use an old theorem by Kotlarski (1967) to establish the non-parametric identification of the joint distribution of the factors and of measurement error. In particular, what is required for the identification of these joint distributions from the empirical distributions of measurement is at least two measurements for each factor and three for at least one. It is also necessary that the measurement error is independent across measures for at least two measures. The intuition of this result is quite clear: to identify the distribution of the factors it is necessary to average out measurement error.

18 See Wolfe and Behrman (1984) for an earlier similar approach.
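The intuition that independent measurement errors can be 'averaged out' can be seen in a simple second-moment version of the argument, which is much weaker than the non-parametric result but conveys the idea. The sketch below assumes one latent factor, three linear measures with independent errors and the loading of the first measure normalised to one; all names and numerical values are hypothetical. Under these assumptions, covariances between measures are free of measurement error, so the variance of the factor can be recovered as cov(m1, m2) cov(m1, m3) / cov(m2, m3), whereas the variance of any single measure overstates it.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def simulate_measures(n=50000, seed=0):
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n)]      # latent factor, variance 1
    loadings = (1.0, 0.8, 1.2)                        # first loading normalised to 1
    measures = []
    for loading in loadings:
        # Each measure is a noisy signal with its own independent error.
        measures.append([loading * f + rng.gauss(0, 0.7) for f in factor])
    return measures

m1, m2, m3 = simulate_measures()
naive_variance = cov(m1, m1)                          # factor variance plus noise
factor_variance = cov(m1, m2) * cov(m1, m3) / cov(m2, m3)
print(round(naive_variance, 3), round(factor_variance, 3))
```

The same covariances also deliver the loadings of the second and third measures (for instance, cov(m2, m3) / cov(m1, m3) for the second), which is why at least three measures for one factor are useful when no loading is known a priori beyond the normalisation.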
Although the identification is non-parametric, in practice researchers often specify a (flexible) functional form for the joint distributions of the factors and proceed to the estimation accordingly. Once the joint distribution is identified, the estimation of structural relations, such as the production function and the investment function discussed above, is relatively straightforward. One possible approach, for instance, developed in Attanasio, Meghir, and Nix (2014) and used in Attanasio, Cattan, Fitzsimons, Meghir, and Rubio-Codina (2015), is to take draws from the joint distribution estimated in a first step and use these simulated data to estimate the structural relation of interest by standard techniques (such as non-linear least squares or non-linear instrumental variables).
Notice that such relations represent a restriction among the conditional means of several of the factors. As such, they have implications for the joint distribution of the factors one estimates in the first step of the procedure.
Normality, for instance, will imply a linear (or possibly log-linear) relationship between the means of the various factors. As such, for instance, it would be inconsistent with a non-linear production function that implies the presence of complementarities between the various inputs. Suppose, for instance, that one wants to allow the production function in equation (1) to be a CES function in which initial conditions, background variables and investments interact with a certain (finite) elasticity of substitution to generate human capital at age t + 1. Then the joint distribution of age t + 1 human capital and the age t human capital and investment factors is necessarily non-gaussian. It is therefore important, if one does not want to pre-empt answering questions about the nature of the production function, to work with a flexible specification of the joint distribution of the factors. These issues are discussed at length in Attanasio, Meghir, and Nix (2014).
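The link between the elasticity of substitution and complementarity can be made concrete with a small numerical example. The sketch below assumes a two-input CES function of initial human capital and investment, $H_{t+1} = A\,[\gamma H_t^{\rho} + (1-\gamma) X_t^{\rho}]^{1/\rho}$, whose elasticity of substitution is $1/(1-\rho)$. The share parameter, the values of $\rho$ and the input levels are hypothetical and are not estimates from the papers cited.

```python
def ces(h, x, gamma=0.5, rho=-0.5, scale=1.0):
    """Two-input CES production function; rho = 1 is the linear case."""
    return scale * (gamma * h ** rho + (1.0 - gamma) * x ** rho) ** (1.0 / rho)

def investment_return(h, x, rho, step=1e-4):
    """Finite-difference marginal product of investment at (h, x)."""
    return (ces(h, x + step, rho=rho) - ces(h, x, rho=rho)) / step

# Return to the same investment for a child with low and high initial skills.
low_h, high_h, x0 = 1.0, 2.0, 1.0
linear_gap = investment_return(high_h, x0, rho=1.0) - investment_return(low_h, x0, rho=1.0)
ces_gap = investment_return(high_h, x0, rho=-0.5) - investment_return(low_h, x0, rho=-0.5)
print(round(linear_gap, 6), round(ces_gap, 3))
```

In the linear case the return to investment is the same whatever the initial level of human capital, whereas with an elasticity of substitution below one the return is higher for children with higher initial human capital. A distributional assumption that forces linear conditional means would rule out the second pattern by construction.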
The issue of endogeneity of investment can also be dealt with easily within this approach. The instruments considered in the model above, such as prices $P^x_t$ and resources $Y_{i,t}$, can be added to the measurement system in equation (5) as additional factors (possibly observed without error) and their joint distribution can be estimated. In a second step, data for the instruments can be drawn from the joint distribution along with data for the factors, and a non-linear instrumental variable or a control function approach can be applied. Attanasio, Meghir, and Nix (2014) and Attanasio, Cattan, Fitzsimons, Meghir, and Rubio-Codina (2015) use the latter.

Using the model
Having set up a framework for the analysis of the accumulation of human capital, I will now show how it can be profitably used in the context of the evaluation of an intervention aimed at fostering the development of young disadvantaged children. I will start with the description of the intervention and its impacts, before moving on to the use of the evaluation data to estimate the production function within the framework laid out in Section 5.

An intervention and its impacts
As I mentioned in the introduction, one of the most successful interventions targeted to vulnerable young children in the early years in developing countries was the Jamaica study I referred to. Whilst the impact of that intervention

The intervention
The first step of the project was the adaptation of the Jamaica curriculum to the Colombian context. This involved not only the translation of the curriculum, but also its cultural adaptation. The Jamaica curriculum is delivered through weekly home visits, roughly one hour long, during which a trained visitor engages in a series of structured activities with the target child and their mother (or main care giver). The activities are designed to be appropriate for the developmental status of the child. They become progressively more complex as the child develops. The activities put much emphasis on language (through language games and a continuous encouragement of the mother to engage the child with language in everyday activities) and cognitive development (through stimulation games, including puzzles and other toys, books and so on). The visits are well structured, in that each visit is described in one page of the curriculum, which specifies what activities are to be performed and the rough order in which they should be performed. The activities are explained in the curriculum in fairly simple and direct language, so as to be accessible to visitors who are not necessarily well educated. The intervention also provides the visitors with some materials (including conversation scenes, books, and toys) and includes teaching mothers how to build a number of toys from recycled materials, such as plastic bottles, wooden blocks etc.
One important innovation relative to the Jamaica study was the use of the infrastructure of an existing welfare programme to deliver the intervention. In Colombia, as in many other Latin American countries, there is a large Conditional Cash Transfer Programme, called Familias en Acción (FeA), which is targeted to the poorest 20% of the population. Within this programme, households receive cash if they comply with certain conditions, which include sending children to school and, in the case of young children, taking them to growth and development checkups in the local health centres. The programme also has an important social component, in that beneficiary mothers meet periodically to discuss a variety of issues in what are called Encuentros de Cuidado.

Roughly every 50 or 60 beneficiaries of FeA elect a representative, called Madre Líder (ML), who is in charge of organising the Encuentros de Cuidado and of the relationship between the beneficiaries and the programme officials. Effectively, the MLs constitute the first port of call for any beneficiary that might have a problem with the programme. The MLs are not paid by the programme and perform their activities on a voluntary basis. Such a charge, however, is seen as a prestigious position that confers a status to the ML in the neighbourhood.
The MLs, although beneficiaries of FeA themselves, are typically more educated, more entrepreneurial, and, as their title would imply, show more leadership qualities than a typical beneficiary. We therefore had the idea of using them to deliver the intervention. In particular, with the help of the programme, in the towns where the study was conducted, we contacted some MLs, trained them and hired them for the duration of the intervention.
The use of local women, identified through an existing welfare programme, is key for the scalability of the intervention that is being investigated. First, obviously, there is the issue of cost. Local women are likely to be cheaper to hire than social workers. Second, we identify women that are likely to be effective in delivering the intervention through the network of a pre-existing welfare programme that is very widespread. Such an intervention, therefore, could be replicated throughout Colombia as the programme is present in every municipality of the country. Finally, and more subtly, an intervention that aims at changing parental practices and behaviour might be more effective if its key messages are delivered and channeled through women in the community. The MLs might be more attuned to, closer to and more trusted by the mothers, whose behaviour the intervention tries to change, than external social workers.
Of course this approach is not without problems. The MLs are typically much less educated than social workers and, therefore, the quality of the intervention could be considerably diluted. The MLs' commitment to the intervention might also not be complete. These issues imply that mentoring, monitoring and supervising might be key to the success of such an intervention. The necessity of mentors, supervisors and monitors increases the cost of the intervention. Moreover, the intervention itself has to be designed so that it can be delivered by visitors with a relatively low level of education and literacy.

III, Spanish-language short forms, and the Infant Characteristics Questionnaire (ICQ, which measures children's temperament), and others. We also measured mothers' and children's height, weight and haemoglobin levels to assess anaemia. The socio-economic survey, in addition to a wide variety of household level variables, contained detailed information on the home environment, including several components of the HOME index.
The intervention ran for 18 months. At the end of that period, we collected a follow-up survey, within which children were assessed again in several dimensions. We also collected information on mothers, home visitors and, more generally, the household. Attrition between the baseline and follow up was not large, as we managed to re-contact 1229 of the original children. Moreover, attrition was not different between the control group and the various treatment arms.
Impacts. The fact that we allocated the 96 towns in our sample randomly to the various types of intervention (stimulation, micro-nutrient supplementation and the combination of the two) and the control group allowed us to

tice some modest impacts on fine motor skills, which are often considered as a cognitive skill in children of that age.

21 The results are very slightly different from those in Attanasio, Fernández, Fitzsimons, Grantham-McGregor, Meghir, and Rubio-Codina (2014) because of small differences in the specification of the regression model.

22 The standardisation was performed considering the raw scores for the control group. We first estimated the mean of the z-score as a flexible function of age and gender. We then estimated a similar function for the standard deviation and obtained the z-score for each of the sub-scales considered by subtracting from the individual raw score the conditional mean and dividing the result by the estimated standard deviation. We report all the impacts in terms of standard deviations of these z-scores.
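The standardisation of raw scores on the control group can be sketched in a few lines. The example below is a simplified illustration, not the procedure actually used: it assumes a mean that is linear in age, a variance that is linear in age and no gender term, whereas the text refers to flexible functions of age and gender. The simulated data, function names and coefficients are all hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

def make_standardiser(ages, raw_scores):
    """Estimate age-conditional mean and variance on the control group."""
    m0, m1 = fit_line(ages, raw_scores)
    sq_resid = [(s - (m0 + m1 * a)) ** 2 for a, s in zip(ages, raw_scores)]
    v0, v1 = fit_line(ages, sq_resid)

    def z_score(age, raw):
        sd = max(v0 + v1 * age, 1e-8) ** 0.5   # guard against non-positive fits
        return (raw - (m0 + m1 * age)) / sd

    return z_score

# Simulated control group: scores rise with age (in months) and get noisier.
rng = random.Random(1)
control_age = [rng.uniform(12, 24) for _ in range(20000)]
control_raw = [30 + 1.5 * a + rng.gauss(0, (4 + 0.5 * a) ** 0.5) for a in control_age]

z = make_standardiser(control_age, control_raw)
control_z = [z(a, s) for a, s in zip(control_age, control_raw)]
mean_z = sum(control_z) / len(control_z)
var_z = sum(v ** 2 for v in control_z) / len(control_z)
print(round(mean_z, 3), round(var_z, 3))
```

The function returned by `make_standardiser` would then be applied to children in the treatment arms, so that an estimated impact can be read in units of the control group's age-specific standard deviation.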
The main point that should be taken from the table is that the stimulation intervention had a significant impact on cognitive development and on receptive language. The impacts on expressive language are smaller and not statistically different from zero. There are also some impacts on temperament (which might be an indicator of socio-emotional skills), as measured by the ICQ. There is no significant impact of the nutrition intervention, either on its own or in combination with the stimulation intervention. Remarkably, the nutrition intervention did not have an impact on physical growth and nutritional status, as discussed in Andrew et al. (2014).
One issue is whether the impact found on cognitive development is significant not only from a statistical, but also from a substantive and economic point of view. To interpret the size of the impact, in Figure 1, I report the standardised cognitive scale of the BSID in Bogotá, plotted against age. The two dotted lines refer to the cognitive development of children living in households in the bottom and top quartile of the wealth distribution in Bogotá. The gap between these two groups is equivalent to about 0.8 of a standard deviation.
The red line refers to the control group in the RCT. These children are similar to the bottom quartile of the Bogotá sample. The blue line is the cognitive development of the children in the treatment group of the intervention. As can be seen, the intervention fills about a third of the gap in cognition between the bottom and top quartile in Bogotá. If this impact is sustained over time, it is not negligible and its economic benefits in the long run could be substantial.
Having said that, it is difficult, given the available evidence, to convert a gain in cognition or some other developmental outcome for a Colombian child into a long run gain in, say, earnings or academic achievement. These 'anchoring' issues are discussed in Cunha et al. (2010).
As a first indication of the mechanism that might have given rise to the impacts we observe in Table 1, in Table 2, we report the impacts that the intervention had on various parental investments (see Attanasio et al. (2013)).
What is evident from this table is that the stimulation intervention considerably increased parental investment, as measured by several indicators in the data. As I will discuss below, for some reason, parents were convinced to invest more (in time and commodities) in their young children. This evidence constitutes a first hint of the way the impact of the stimulation intervention might have worked.

Estimating the model and interpreting the impacts
The next step in the study of the intervention that I have been describing is the estimation of the model discussed in Section 5. Here I will draw on Attanasio, Cattan, Fitzsimons, Meghir, and Rubio-Codina (2015), where my co-authors and I specify two production functions, one for cognitive development and one for socio-emotional skills, and two investment functions.
Notes: Cluster-robust standard errors in parentheses. Treatment effect significantly different from zero at a: *=10%, **=5%, ***=1% level, using a one-sided hypothesis test (for all outcomes the one-tailed alternative hypothesis was b>0). All effects are standardised using the estimated standard deviation of the control group. All estimates control for sex, age (second-order polynomial), tester effect, region effect, baseline level of all test outcomes. Treatment effects estimated on a homogeneous sample of 1260 children (318 controls, 318 stimulation only, 308 MNP only, 316 stimulation + MNP). Number of unique play materials refers to the last seven days. Number of unique play activities refers to the last three days.
We let the children outcomes we consider depend on initial conditions, parental investments, parental background variables and shocks. The specification we use is that of a CES production function.
where $\theta^{j}_{i,t}$ represents factor $j$ ($j = c, s$) at age $t$ for child $i$; $\theta^{T}_{i,t+1}$ and $\theta^{M}_{i,t+1}$ are investments in time and materials respectively; and $\theta^{m,c}_{i,t}$ and $\theta^{m,s}_{i,t}$ are maternal cognitive and socio-emotional skills.
Whilst such a specification might be considered restrictive, it allows for complementarities between the various inputs and nests several interesting special cases, such as that of separability (which would occur for $\rho_j = 1$) or that of Cobb-Douglas ($\rho_j = 0$). We also tried other specifications, such as a nested CES, which includes equation (6) as a special case, and could not reject the restrictions that would yield it.
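The way a CES aggregate nests the separable and Cobb-Douglas cases can be checked numerically. The sketch below is illustrative only: the function name, the input values and the share parameters are hypothetical, and the shares are assumed to sum to one.

```python
import numpy as np

def ces(inputs, shares, rho, tfp=1.0):
    """CES aggregate: tfp * (sum_k shares_k * inputs_k**rho) ** (1 / rho).

    rho = 1 gives a linear (additively separable) technology;
    rho -> 0 gives the Cobb-Douglas limit tfp * prod_k inputs_k**shares_k,
    provided the shares sum to one.
    """
    inputs = np.asarray(inputs, dtype=float)
    shares = np.asarray(shares, dtype=float)
    if abs(rho) < 1e-8:
        # Limit case, computed in logs to avoid dividing by zero.
        return tfp * np.exp(np.sum(shares * np.log(inputs)))
    return tfp * np.sum(shares * inputs**rho) ** (1.0 / rho)

x = [2.0, 0.5, 1.5]   # hypothetical input levels
g = [0.5, 0.3, 0.2]   # hypothetical share parameters, summing to one

linear = ces(x, g, rho=1.0)        # weighted sum of the inputs
cobb_douglas = ces(x, g, rho=0.0)  # weighted geometric mean of the inputs
near_zero = ces(x, g, rho=1e-4)    # approaches the Cobb-Douglas value
```

In this parameterisation the elasticity of substitution between any two inputs is $1/(1-\rho)$, so $\rho = 0$ corresponds to a unit elasticity and $\rho = 1$ to perfect substitutes.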
In equation (6), the term $A^{j}$ represents 'total factor productivity', while the random variables $\eta^{j}_{i,t}$ are shocks that affect the development of skill $j$ at age $t$. The subscript $\kappa$ in equation (6) allows the coefficients of the production function to be a function of the treatment ($d$ = treatment, $n$ = no treatment). Finally, we consider two investment factors, $\theta^{M}$ and $\theta^{T}$, the former representing commodities and the latter representing time investment. Both factors are allowed, in principle, to affect both cognitive and socio-emotional skills.
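For reference, a CES specification consistent with the symbols defined above could take the following form. This is a sketch rather than a transcription of equation (6): the share parameters $\gamma^{j}_{k,\kappa}$, the ordering of the inputs and the multiplicative way in which the shock enters are assumptions made here for illustration.

```latex
\theta^{j}_{i,t+1} = A^{j}_{\kappa}
\Big[
  \gamma^{j}_{1,\kappa}\big(\theta^{c}_{i,t}\big)^{\rho_j}
+ \gamma^{j}_{2,\kappa}\big(\theta^{s}_{i,t}\big)^{\rho_j}
+ \gamma^{j}_{3,\kappa}\big(\theta^{T}_{i,t+1}\big)^{\rho_j}
+ \gamma^{j}_{4,\kappa}\big(\theta^{M}_{i,t+1}\big)^{\rho_j}
+ \gamma^{j}_{5,\kappa}\big(\theta^{m,c}_{i,t}\big)^{\rho_j}
+ \gamma^{j}_{6,\kappa}\big(\theta^{m,s}_{i,t}\big)^{\rho_j}
\Big]^{1/\rho_j}
e^{\eta^{j}_{i,t}},
\qquad j = c, s, \quad \kappa = d, n.
```

Under this form, a treatment effect on the technology would show up as a difference between the $d$ and $n$ values of $A^{j}_{\kappa}$ or of the $\gamma^{j}_{k,\kappa}$.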
For the two investment factors, we take a linear approximation of equation (4), where the vector $W_t$ includes all the determinants of investment in that equation. Notice that we let the parameters of the investment functions $\psi_{s,\kappa}$ depend on the treatment status of the children, to reflect the possibility that the intervention changes the way parents approach the investment problem, as I discuss further down. As we allow the intervention, which is assigned randomly, to influence investment, one could argue that the assignment could be a good instrument to take into account the endogeneity of parental investment in the production function. However, this strategy is precluded if we consider the possibility that the intervention may also affect the production function directly. That is, despite being randomly allocated, the treatment is not a valid instrument as it can enter the production function directly.
Within this framework, we can see that the intervention can affect child development in three different ways. First, it can change the parameters of the production function, increasing either the productivity of specific inputs or total factor productivity. Second, it can change parental investment, inducing, for some reason, parents to invest more in their children. Table 2 presents some evidence of this second mechanism. Finally, it is possible that the intervention improves mothers' skills. By estimating the parameters of equation (6) and the distribution of factors, we can test these hypotheses explicitly.
In order to estimate the parameters of equation (6), we follow a two-step procedure, which is discussed extensively in Attanasio, Meghir, and Nix (2014).
In particular, we first estimate the joint distribution of the factors and measurement errors. We augment the measurement system in equation (5) to consider also the distribution of the instruments we use, which we estimate jointly with the distribution of factors and measurement errors. Although these distributions are non-parametrically identified, we make some flexible parametric assumptions to obtain them more precisely. In particular, we assume that the factors are jointly distributed as a mixture of two log-normal distributions, while the measurement errors are assumed to be jointly log-normal. We perform Maximum Likelihood estimation implementing an EM algorithm.
Having estimated distributions for the factors (including the instruments), we draw from them to create a data set and estimate both the investment function and the production function. This is performed by implementing a control function approach and non-linear GLS on the simulated data. To compute standard errors and confidence intervals, we bootstrap the whole procedure, taking into account the clustered nature of the data (that is, allowing for correlation within each municipality in the sample).
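The logic of the second step can be illustrated with a deliberately simplified sketch. It replaces the CES technology and non-linear GLS with a log-linear production function estimated by least squares, and it works on simulated data in which a town-level price acts as the instrument. The data-generating process, the parameter values and all names are hypothetical; only the two-step control function structure and the town-level resampling mirror the procedure described above.

```python
import numpy as np

rng = np.random.default_rng(1)

n_clusters, n_per = 96, 13
n = n_clusters * n_per
cluster = np.repeat(np.arange(n_clusters), n_per)

# Hypothetical data-generating process (all numbers are illustrative).
price = rng.normal(size=n_clusters)[cluster]   # instrument: varies across towns only
theta0 = rng.normal(size=n)                    # initial skill (in logs)
shock = rng.normal(size=n)                     # unobserved shock to development
# Parents compensate: they invest more when the shock is unfavourable.
invest = -0.8 * price + 0.3 * theta0 - 0.5 * shock + rng.normal(size=n)
TRUE_RETURN = 0.4
theta1 = 0.6 * theta0 + TRUE_RETURN * invest + shock

def ols(y, *regressors):
    """Least squares with an intercept; returns coefficients and residuals."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def control_function(theta1, theta0, invest, price):
    # Step 1: investment equation; its residual carries the endogenous variation.
    _, v_hat = ols(invest, theta0, price)
    # Step 2: production function with the residual added as a control.
    coef, _ = ols(theta1, theta0, invest, v_hat)
    return coef[2]                             # coefficient on investment

naive = ols(theta1, theta0, invest)[0][2]      # ignores the endogeneity of investment
cf = control_function(theta1, theta0, invest, price)

def cluster_bootstrap(n_boot=200):
    """Resample whole towns and repeat both steps on each bootstrap sample."""
    members = [np.flatnonzero(cluster == c) for c in range(n_clusters)]
    draws = []
    for _ in range(n_boot):
        picked = rng.integers(0, n_clusters, n_clusters)
        idx = np.concatenate([members[c] for c in picked])
        draws.append(control_function(theta1[idx], theta0[idx], invest[idx], price[idx]))
    return np.std(draws, ddof=1)

se = cluster_bootstrap()
```

In this simulated example the naive estimate lies below the control function estimate because investment is negatively correlated with the unobserved shock, which is the pattern one would expect when parents compensate for adverse shocks.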
From this procedure, the importance of using a flexible functional form assumption for the joint distribution of the factors is clear. The production function in equation (6) imposes some restrictions on the conditional means of the various factors (at age t and t + 1). In particular, it implies certain non linear relations between the mean of the factors at t + 1 and those at t. The non linear structure in equation (6) would be inconsistent with, say, joint normality of the factors distribution.
I will not report the tables of estimates of the investment functions and the production functions in Attanasio, Cattan, Fitzsimons, Meghir, and Rubio-Codina (2015). However, the main findings in that paper can be summarised as follows:
1. The production function seems to be well approximated by a Cobb-Douglas production function. The elasticity of substitution between the various inputs considered is not statistically different from 1. Additive separability, instead, is strongly rejected. This is true both for the production function of cognitive skills and that for socio-emotional skills.
2. Initial conditions matter. Initial cognition is a very important determinant of cognition in the second period and initial socio-emotional development is important for subsequent socio-emotional development.
Cross effects are also somewhat important: initial cognition (at ages 12-24 months) is important for socio-emotional development at ages 30-42 months. Initial socio-emotional development, however, does not seem to affect subsequent cognition. These last two results contrast with what Cunha et al. (2010) find on a US sample at much older ages. In particular, they find that early socio-emotional development seems to be important for subsequent cognition. It should be stressed that there is not much evidence on this issue for the age group considered here.
3. Parental investments also matter. Investment in 'materials' seems to matter for cognitive development, while investment in 'time' matters for socio-emotional development. This evidence is also consistent with the mediation analysis in Rubio-Codina et al. (2015) on data from Bogotá, where it is found that play materials seem to be more relevant for cognition and fine motor skills, while time investments relate more to language and socio-emotional development.

4. Parental background mainly has an effect through parental investment. Once we control for investment choices, maternal skills are not very important. Once again, this evidence is consistent with the results on the data from Bogotá in Rubio-Codina et al. (2015).

5. Allowing for endogenous investment is important. The coefficients on investment are estimated to be considerably lower when the production function is estimated by non-linear least squares ignoring the endogeneity of investment. This finding is important not only for the identification of the marginal product of investment in the production function of human capital but also because the direction of the bias is indicative of the nature of parental investment. A downward bias in the estimates of the coefficients when endogenous parental reactions are ignored probably indicates that parents tend to compensate rather than accentuate shocks. 23
6. The intervention shifts significantly the distribution of the two investment factors considered. Parental investment in time and material is considerably higher in treatment villages than in control ones. This is consistent with the simple mean comparisons reported in Table 2.
7. The parameters of the production function do not seem to be affected by the intervention. This is also true for the total factor productivity. This finding (and the one about investment) is important for interpreting the way in which the intervention obtained the effects reported in Table 1. Rather than making parents (or other factors) more productive, the intervention increased parental investment in child development.
The value of this exercise should be apparent from the list of main findings. First, the estimation of the production function of human capital allows the characterisation of the process through which young children develop and the role played by different factors. This is a first step towards filling some of the gaps in our knowledge of such a process. The fact that the nature of dynamic complementarities between different dimensions of human capital is different from what was found, for instance, by Cunha et al. (2010) at different ages indicates that the process of human capital formation is quite complex and that we are still far from a full understanding of its features. 24 Such an understanding is key for the design of policy. The nature and size of dynamic complementarities, for instance, is key to identify crucial periods and windows of opportunities to target interventions. Moreover, if certain dimensions at a certain age turn out to be particularly important, one might want to use interventions that target that specific dimension.
23 On this point see the discussion in Almond and Mazumder (2013). OLS would yield biased estimates if there is an omitted initial condition that is negatively correlated with the investment or in the presence of measurement error in investment. The factor structure takes the latter into consideration.
24 Of course there may be many other reasons, in addition to age, behind the difference in results between Cunha et al. (2010) and , such as the different contexts of a developing and developed country
Second, the findings above give a good idea of the way in which the intervention we have described worked. It seems that, for some reason, the home visits induced parents to invest more, both in terms of money and time, in their children. The next logical step in this research agenda is to understand why parents were not investing 'enough' before the intervention. (2014) discuss an information intervention in Malawi that improved children's nutritional status by increasing children's protein consumption (which was in turn financed by an increase in male labour supply). The questions these results (and others in similar areas) pose are therefore: why was this not happening before the intervention? Why did parents not invest before the intervention in Colombia? Why were parents not working harder to feed their offspring with more proteins before the intervention in Malawi? Several possibilities exist: it is possible that these interventions change parental tastes so as to make parents more altruistic towards their children, or change the value they place on children's outcomes. Or, in the case of the Colombian stimulation programme, it is possible that the intervention changed the psychic cost of interacting with the children. An alternative possible answer is that parents were not aware of the productivity of their investments.

Beliefs
Parents' choices, as in the model sketched above, depend on their perception of the production function. If they held a distorted view of the production function and, in particular, underestimated the marginal productivity of parental investment, an intervention that changed that view and moved them towards the 'correct' one would increase investment and improve outcomes.
The fact that disadvantaged children are exposed to much less stimulating environments is increasingly documented. 25 The view that the parents of disadvantaged children seem to underestimate the productivity of investment is consistent with some of the hypotheses discussed by Lareau (2003), who argues that middle-class families, in their parental investment strategies, use what she defines as 'concerted cultivation', while working class and poor families use parental strategies that rely on 'natural growth'. Unlike their better off counterparts, many poor parents do not think children need special inputs and believe that they develop naturally, unless they are affected by severe shocks.
An interesting research agenda, therefore, is to try to estimate parental beliefs on the nature of the production function of human capital. There are several possible approaches to the identification of parental perceptions of the production function. One possibility would be the direct elicitation of such beliefs. This is a good example of the design of innovative measurement tools that I discuss in Section 7.2. Cunha et al. (2013) implement such an approach in an innovative study that looks at the beliefs of pregnant disadvantaged mothers in a hospital in Philadelphia. In Attanasio, Cunha, and Jervis (2015), we have started the analysis of subjective beliefs elicited in the second follow up of the children in the Colombian experiment discussed above. Preliminary results indicate that subjective beliefs seem consistent with the idea that parents see investment as productive and necessary especially for children with some problems and delay. This is also consistent with the compensatory nature of parental investment identified in Attanasio, Cattan, Fitzsimons, Meghir, and Rubio-Codina (2015).
Obviously the elicitation of parental beliefs on the production function is not easy. This is a very promising research agenda, but much work is needed on validating different measures and on establishing what is the best way to structure the questions.
An alternative approach to the direct elicitation of beliefs is to try to infer them from investment choices. As I mentioned in Section 5, the parameters of the investment function (4) depend on individual preferences and on individual perception of the production function. To be able to disentangle them we need to impose some structure on the problem and to have some variation in the data that allows us to identify taste parameters independently from the parameters of the production functions as perceived by the parents.
In , we use the idea that an intervention, by providing information (but no resources) to parents, might be changing individual perceptions of the production function. If such an intervention is randomly allocated to different groups of individuals (as is the case, for instance, for the Colombian intervention above), one can assume that 'treated' parents have acquired knowledge of the actual production function and one can use data on child development and parental investment from this group to identify the taste and technology parameters in equations (1) and (4). Having obtained taste parameters, one can then use investment choices of the 'control' parents to identify the parameters of the production function as perceived by these parents and therefore assess the extent to which their beliefs are 'distorted'.

Research tools
In this section, I discuss two methodological issues that are relevant not only for what I have discussed so far, but also at a much more general level. First, I will briefly go over the debate between the proponents of structural models and those who prefer simpler approaches that make little or no use of economic and behavioural models in analysing data and, in particular, in evaluating the impacts of social policies. I will then move on to discuss the opportunities afforded by new measurement tools and how they should be constructed.

Structural models and Randomized Controlled Trials
When looking at data and at what can be learned from correlations, economists are trained to look at behavioural responses that might prevent the inference of causal relationships among certain variables. Over the last few decades, this set of issues has been taken extremely seriously by most applied researchers in economics. These are, of course, identification issues, which can be addressed either by the availability of exogenous variation (such as that induced by a controlled experiment) or by the imposition of some restrictions that might be derived from economic theory or other knowledge and that can achieve point or set identification. A part of the profession has taken the view that restrictions derived from theoretical models are essentially arbitrary and that reliable causal evidence can only come from the comparison of means of different samples exogenously exposed to different treatments. Another part of the profession, instead, is willing to impose restrictions justified by economic theory, and possibly functional form assumptions, to achieve identification. The approach taken by the first group is often identified (misleadingly, 26 in my opinion) as the 'reduced form' approach, as opposed to the 'structural' approach. The fact that the profession thinks very carefully about the source of variation in the data that is used to identify certain parameters of interest is an extremely positive development, which distinguishes economists from other social scientists. However, to reduce the empirical analysis to a simple comparison of means of different groups in a Randomized Controlled Trial is, in my opinion, very limiting and narrow. Experiments can be very useful because they introduce variation that is, if the experiment is constructed carefully, exogenous by construction. This variation can then be used to estimate behavioural models that are richer and use weaker assumptions than models estimated without the luxury of the experimental variation.
Inference from such models is crucial for the design and evaluation of public policies: without a model, it is impossible to extrapolate the results of an experiment to a different context or to estimate the impacts of a slightly different policy in the same context. More importantly, without a model of behaviour, it is not possible to understand the mechanisms behind the impacts that one observes in an experiment. I should also add that the exercise of thinking through the lens of a model of individual behaviour or, even better, a model that incorporates general equilibrium effects that take into account the aggregate consequences of a (large) intervention is where the comparative advantage of economists lies in this context. Randomized Controlled Trials have been around in many sciences for a long time and have been used in social sciences for a long time too. Moreover, there is no reason why economists should be running randomised trials in education, nutrition, child development or disease control. Many researchers in these disciplines have a much deeper understanding of the specifics of the interventions and of the problems that they try to address. What economists can offer, however, are models of individual behaviour that generate the responses that one observes in the data (including, in some situations, general equilibrium effects) and specific ways to model the selection and endogeneity issues that affect the working of most interventions in fundamental ways. These models can then be used to extrapolate the results of a specific evaluation to wider contexts. The work on the ECD intervention I have discussed in Section 5.2 should give an example of the approach I have in mind. In that context, the estimation of the production function for human capital helps to understand how the intervention had its impact.
As discussed in Attanasio, Cattan, Fitzsimons, Meghir, and Rubio-Codina (2015), while the experiment can be used to measure the impact of the intervention, further structure is necessary to estimate the production function and, in particular, the role that parental investment plays in explaining child development. In that context, we used variation in prices and family resources (rather than the experiment) to 'instrument' investment. This approach allowed us to consider the possibility that the intervention directly affected the production function.
Other examples are available in the literature. For instance, in the context of the conditional cash transfer programme PROGRESA, in Mexico, whose impacts have been estimated using a Cluster Randomized Controlled Trial, Todd and Wolpin (2006) and Attanasio et al. (2012) used the evaluation data to estimate a structural model of enrolment decisions in school, which they use, amongst other things, to infer the impact of versions of the program with a different grant structure. In the context of India, Duflo et al. (2012) used the data from a Randomized Controlled Trial of an intervention aimed at reducing absenteeism of school teachers by providing a system of incentives to estimate a structural model of labour supply in which effort depends on the non-linear structure implied by the programme. These exercises make a profitable use of the experimental variation to understand the mechanisms behind the impacts.
These instances indicate that RCTs and structural models are not substitutes, but complements. RCTs allow economists, social scientists and policy makers to estimate the impact of interventions in a rigorous and, at the same time, simple way. If these experiments are complemented with rich enough data, they can then allow researchers to estimate richer behavioural models that can be used to extrapolate the results of the experiment to different contexts or to slightly different interventions. These models can also be used to interpret the intervention impacts and understand the mechanisms that generate them. This understanding is useful both to perform welfare analysis and to design better interventions. Finally, the results of the experiment can and should be used to validate and test different models. Data should talk to theory and improve it.
What is central to this discussion is the availability of rich data that gather information not only on the outcomes of interest but also on many environmental variables. These data are necessary to estimate the structural models that can interpret the impacts.

Measurement
Many strong assumptions that are sometimes made to achieve identification of structural models are necessary because of the lack of information on certain variables that, while crucial to individual choices, are typically not observed in standard socio-economic surveys. A good example is that of subjective expectations about future and uncertain variables. In many dynamic models, Analogous considerations apply to a variety of other situations, such as individual beliefs on the nature of the returns to certain investments. In the case of the Colombian intervention we have discussed above, parental investment depends clearly on parents' perception of the production function. The standard practice when modelling investment choices is to assume that parents know the form and the parameters of the production function. And yet, as I discuss below, in many situations this is clearly not the case.
One attractive possibility, which has received considerable attention in recent years, is that of the direct elicitation of subjective perceptions, be it of subjective probability distributions or of the return to investments. This approach has a long history. Tom Juster and his colleagues in Michigan played a big role in developing alternative and innovative measurement tools. Juster (1966b), cited by Manski (2004), was probably one of the first researchers to try to collect subjective expectations data in a survey. The measurement of subjective expectations is one example, but others exist. Juster (1966a), for instance, studied liquidity constraints in consumption choices by eliciting consumers' elasticities of the demand for auto loans with respect to interest rates and maturity. The study cleverly allocated different hypothetical scenarios to randomly chosen groups of consumers.
This type of approach, however, where survey respondents are asked hypothetical questions, has faced much resistance, for a long time, in the economic profession. Economists have refrained from using information elicited through hypothetical questions that do not relate to actual choices individuals make.
Economic surveys typically focus on revealed preferences and give no space to subjective answers, or as Manski (2004)
Measuring the development of young children in different domains accurately is important both to evaluate the effectiveness of different interventions and to better understand the process of child development. As I mentioned above, the nature and size of dynamic complementarities between different dimensions of human capital are crucial for policy design: it is necessary to identify the key periods in child development and the role played by specific skills in each period in fostering further development in subsequent stages.
Without accurate measures this is not possible.
Analogous considerations are also relevant for measuring inputs in the process of human capital accumulation. Children are exposed to a variety of environmental stimuli that are likely to play important roles in their development. Modelling and understanding the process of child development and human capital growth in the early years requires good measures of inputs, including parental investments (in time and commodities), school or child care inputs, nutrition and so on. Measuring the quantity and quality of the inputs in the process of human capital formation is as hard as measuring children's outcomes.
Given Fernald et al. (2008) and Fernald et al. (2013)). Another interesting instrument to measure the quality of the home environment is the LENA software, which is used to decode day-long recordings to assess the quality of the language environment children are exposed to. 30 LENA has recently been used together with LWL to analyse pathways of language development in young children by Weisleder and Fernald (2013).

Measurement and Theory
New measurement tools, when properly validated, can obviously be very valuable for a variety of purposes. As I hint above, the development of such tools could yield some low-hanging fruit. The construction of new measurement tools, however, is far from trivial and, as I mentioned above, poses a number of challenges. Moreover, there are some important principles that should drive the construction of new tools.
Which tools are needed should be driven by theory and by the knowledge accumulated from previous empirical studies. In the case of human capital, the theory of child development should define what domains are relevant and should be subject to measurement. More generally, in different contexts, the relevant theory should inform the construction of new measurements. This has been the case in the past. For instance the development of the system of National Accounts was to a large extent induced by the macroeconomic theories that had been developed in previous years and by the necessity to bring those models to data.
As it becomes more common for researchers in economics to be involved in data collection and have the possibility of influencing the measurements deployed in field surveys, it is also important that the needs of proper econometric approaches inform data collection. For instance, in the case of the factor models I discussed in section 5.2.2, identification requires at least two measurements for each factor and that the errors associated with each measurement be uncorrelated. Data collection could be organised so that such assumptions are likely to be satisfied in the data. In the case of the Colombia study I discussed, some measures of child development (such as the Bayley-III) were collected by a psychologist working with the child, while others, such as the MacArthur-Bates inventories, were collected by an interviewer working with the mother. Assuming that the measurement error on these different measures, collected on different days, by different individuals and based on child observation or maternal report, is probably not very far-fetched. The other consideration to be made is that the 'perfect' measurement probably does not exist. Measurement error is always going to be present to an extent. Moreover, while certainly related to concepts of interest, often available measurement do not coincide with the theoretical concepts that researchers are interested in. In this sense, the factor model in Cunha et al. (2010) is particularly attractive because it makes explicit the presence of measurement error and keeps the theoretical structure and available measures on parallel levels related by the measurement system. The context of child development and human capital is not the only one in which this is relevant. Models of risk sharing and consumption smoothing typically studied in the literature can be interpreted as factor models where the theoretical framework poses some restrictions on the empirical measures. 
From a practical point of view, the point made by Browning and Crossley (2009) is certainly relevant: it might often be worthwhile to invest resources in the collection of two (or more) imperfect measures rather than pursuing the unachievable task of constructing a perfect measure (see also Schennach 2004).
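The value of collecting two imperfect measures can be illustrated with a small simulation. The sketch below is purely illustrative and is not taken from any of the studies cited: all function and variable names are hypothetical, the parameter values are arbitrary, and it assumes a single latent factor, two measures with unit loadings, and mutually independent normal errors. Under these assumptions, the covariance between the two measures recovers the variance of the latent factor, and using the second measure as an instrument for the first removes the attenuation bias that affects a regression on a single noisy measure.

```python
import random
import statistics


def simulate(n=20000, beta=1.0, seed=0):
    """Simulate a latent factor, two noisy measures of it, and an outcome.

    Illustrative assumptions: the factor has variance 1, each measure
    equals the factor plus an independent error with variance 1, and the
    outcome depends on the factor with slope beta.
    """
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(n)]   # latent factor
    m1 = [t + rng.gauss(0.0, 1.0) for t in theta]     # first noisy measure
    m2 = [t + rng.gauss(0.0, 1.0) for t in theta]     # second noisy measure
    y = [beta * t + rng.gauss(0.0, 0.5) for t in theta]  # outcome
    return m1, m2, y


def covariance(x, y):
    """Sample covariance (equals the sample variance when x is y)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


m1, m2, y = simulate()

# Variance of one measure mixes factor variance and error variance.
var_single = covariance(m1, m1)

# With uncorrelated errors, the cross-covariance isolates the factor variance.
factor_var = covariance(m1, m2)

# Share of the variance of one measure that is signal rather than noise.
signal_share = factor_var / var_single

# Regressing the outcome on a single noisy measure attenuates the slope.
ols_slope = covariance(y, m1) / var_single

# Instrumenting the first measure with the second undoes the attenuation.
iv_slope = covariance(y, m2) / factor_var

print(f"signal share: {signal_share:.2f}")
print(f"OLS slope:    {ols_slope:.2f}")
print(f"IV slope:     {iv_slope:.2f}")
```

In this design the slope from the single-measure regression should be close to the true slope multiplied by the signal share, while the instrumented slope should be close to the true slope. If the two errors were correlated, for instance because both measures came from the same respondent on the same day, the cross-covariance would no longer isolate the factor variance, which is why the organisation of data collection matters.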

Conclusions: a research agenda in child development
In this paper, I have discussed a large research agenda that has grown around the recent renewed interest in the accumulation of human capital during the early years. It has become increasingly clear that the early years are extremely important and that what happens to individuals early on has long-lasting consequences. Vulnerable children, living in adverse conditions, accumulate lags that might be difficult to remediate later in life. This mounting body of evidence indicates that the early years might be particularly salient for policy interventions, as strongly argued by Heckman (2008). Much work is still needed, however.
In Section 4, I have already suggested what I think are the main challenges for current research on early child development and the accumulation of human capital. It might, however, be useful to summarize them here. Again, the theoretical framework, whose components I sketched in Section 2, is useful to organize this discussion. The two big components of such a research agenda are, in my opinion, the characterization of the production function of human capital and the characterization of parental behaviour. Our understanding of the production function of human capital in the early years is still very incomplete. Human capital is now understood as a multidimensional object, where different domains, ranging from physical growth to cognition and language, to socio-emotional skills, develop in an intertwined fashion over time. The nature of these dynamic interactions is still not completely understood. We need to quantify the complementarities between different components of human capital and the various inputs that enter the production function and, crucially, how these complementarities change over the life course, as children develop. Parental investments (and the inputs from child care or schools) have different dimensions, and these different dimensions can affect different components of human capital differently. The pathways through which these investments translate into developmental outcomes need to be fully characterized. This evidence is key for the design of effective policies, as it is key for the identification of windows of opportunity and of the specific domains that should be targeted in specific periods by specific forms of investment. From a methodological point of view, a systematic use of flexible latent factor models can be useful. It is important to treat measurement error explicitly and to recognize that complete measurement of all relevant factors and inputs can be extremely difficult, if not impossible.
An analysis of the biases that can be introduced by ignoring certain domains of human capital or certain types of investment would be very useful. Many of the available studies make very strong assumptions about the dynamics of human capital. For instance, all the studies I am aware of assume a Markov structure, so that the current level of development is a sufficient statistic for the effect of past levels of human capital in the production function. It would be important, in particular for the identification of key stages, to check whether such an assumption is realistic or whether it is violated in practice. Similarly, for tractability, it is often assumed that the relevant periods in the development of human capital coincide with those for which developmental outcomes are available. Datasets containing good quality data for a long period and with a sufficiently high frequency could be used to investigate how robust inferences are when some of these assumptions are violated. Furthermore, additional theoretical and empirical research is needed to establish what types of biases are introduced in the study of the production function by the omission of important factors that might be unobserved in many data sets. Parental investment, which is crucial in shaping child development, depends on parents' objectives, on their resources and on their beliefs about the nature of the production function. And yet we have only a partial understanding of each of these components. Much work is needed in studying parental tastes and objectives, especially when considering the allocation of resources among several siblings of different gender and possibly different ability. And, as discussed above, gender issues can also be relevant, as mothers and fathers might differ in their preferences and in their attitudes towards children. We also have limited information about, and a limited understanding of, parental investment.
Parents can and do undertake many different activities to foster their children's development, ranging from spending time with them on different activities, to buying toys and books, to contracting services such as private lessons. Different inputs might be targeted at different domains of human capital. Better information on these items is needed to model parental behaviour empirically. Finally, parental choices will crucially depend on parental beliefs about the production function. A better understanding of these issues is, in my opinion, key to the characterization of parental investments in children.
A number of interventions, in both developed and developing countries, have proven to be effective in achieving sustainable impacts and, in some cases, have had large long-run effects on adult outcomes. However, the mechanisms through which these interventions work are not fully understood. Moreover, the biggest challenge probably lies in designing affordable interventions that are effective at scale.
In order to tackle these outstanding issues, research needs to combine a variety of methods. I have argued for the development of new measurement tools, which seem particularly important in this area, to measure both intermediate outcomes and inputs. Empirical analyses that exploit the variation induced by the introduction of new interventions seem particularly profitable. In this sense, evaluation studies can be particularly useful as they both evaluate the impact of new interventions and can be used to gain a better understanding of the mechanisms that generate these impacts. For the latter objective, however, it is key to use models of individual behaviour.