Subnational Sustainable Development: The Role of Vertical Intergovernmental Transfers in Reaching Multidimensional Goals

Achieving sustainable development hinges on two critical factors: the subnational implementation of public policies and the efficient allocation of resources across regions through vertical intergovernmental transfers. We introduce a framework that links these two mechanisms to analyze the impact of reallocating federal transfers in the presence of regional heterogeneity in development indicators, budget sizes, expenditure returns, and long-term structural factors. Our study focuses on the case of Mexico and its 32 states. Using an agent-based computational model, we estimate the development gaps that will remain by the year 2030, and characterize their sensitivity to changes in the states' budget sizes. Then, we estimate the optimal distribution of federal transfers to minimize these gaps. Crucially, these distributions depend on the specific development objectives set by the national government, and on various interdependencies between the heterogeneous qualities of the states. This work sheds new light on the complex problem of budgeting for the Sustainable Development Goals at the subnational level, and it is especially relevant for the study of fiscal decentralization from the expenditure point of view.


Introduction
[...] to SCGs improves the provision of public goods and services, and helps to reduce personal and [...] according to different analysts, relies heavily on population size (Arechederra and Urzúa, 2017).
The contributions, on the other hand, appear in the national budget as the so-called Ramo 33, a tranche of shared revenues comprising 8 different funds, each one aimed at equalizing regional disparities in specific dimensions of development (e.g., health and education). The procedure to allocate these transfers varies from one fund to the other. Approximately 29% of the STR is directly transferred to states and municipalities, while the residual is managed by the federal [...]

The waste of resources produced by the misalignment of incentives between the central authority and its functionaries can be ameliorated through regulatory and judicial schemes, which are exogenous parameters, imputed from empirical data. The model recognizes that, in reality, these schemes tend to be imperfect and may depend on the quality of the available tools [...]

where I_{i,t+1} is the level of indicator i at time t + 1; α_i is a growth factor that incorporates long-[...]

where β is a normalizing parameter capturing, among other things, the productivity of the resources invested in government programs associated with indicator i; 11 C_{i,t} is the amount of resources effectively used (after inefficiencies) by the i-th public servant in period t; and n is the [...] parameter β that minimizes the difference between the mean probability of success γ_{i,t} endogenously generated by the model (averaged across indicators, periods, and simulations) and the rate of positive growth periods across all the empirical indicators of a given state. 14

11 It is important to highlight that β helps to correct for the influence of the population size in the budget of a state. On the one hand, population effects are removed from the indicators by taking per-capita rates. On the other, budgets could also be normalized per capita, but it would be inconvenient for the type of analysis presented in section 5.3, in which resources are reallocated across states. Instead, the budgets are left in their total values, and the calibration seeks to find a β that normalizes these data so it can work well as part of equation 2 in terms of generating probabilities. In other words, if two states have similar indicator performances, but one has a larger budget due to its population, then the β of the more populous state has to be smaller to generate a probability γ ∈ [0, 1]. Thus, β reflects that it is not the same to spend one peso in a more populous state than in an uncrowded one. This can be confirmed in Appendix D, where all the parameters β are plotted against population data.
12 Consult Appendix C for more details on the interpretation, estimation, and implications of SDG networks.

13 While GDP-focused models make an effort in modeling consumer behavior, they need to sacrifice realism in terms of the number of development dimensions that they can incorporate, as fully specifying every micro-level mechanism of the SDGs is unfeasible. Our model sits on the other side of this trade-off. It focuses on the micro-mechanisms of the policymaking process and provides a more stylized macro view of the SDGs. This is mainly motivated by our need to establish a causal link between government expenditure and development while considering all the dimensions of the latter contained in the SDGs. But even with this stylized view, when the model is calibrated, we can capture the aggregate outcome of micro-level decisions through the network's links and weights as well as the free parameters of the SDG indicators' evolution. For example, a negative link (trade-off) between an indicator in an economic SDG and an environmental one reflects a negative externality produced by firms' erroneous incentives and the available polluting technology. On the other hand, a small α reflects structural long-term considerations that hamper the potential of public funds to attain development goals when government programs are based on ill-conceived consumers' and firms' incentives.
14 The reader may ask why one should use this specific calibration algorithm when many methods exist for this purpose. Due to the interdependencies between indicators through the network and the endogenous responses of the agents, the fitness landscape of this optimization problem is dynamic and rough. In fact, conventional numerical methods for convex optimization fail. Heuristic methods such as differential evolution, and even Bayesian approaches such as the Tree-structured Parzen estimator, also fail. The method developed in Guerrero and Castañeda (2021b) employs a multi-output gradient descent algorithm. The calibration is simultaneous for all parameters, so the method is efficient and minimizes indicator-specific errors; and Guerrero and Castañeda (2021b) show that it yields high fitness scores.

To define the numeric goals of each indicator, we apply a combination of methods. Firstly, whenever possible, we base the goals on those defined by OECD (2020), which uses regional-level data (in the case of Mexico, state-level). These goals are adopted whenever the indicator has a close counterpart in the OECD data. 16 [...] full list of indicators and the method used to assign a goal to each one.

15 The LNPP (https://www.lnpp.mx) is a think-tank located in Mexico City that, among other tasks, maintains a panel of hundreds of indicators at the state level.

16 Note that the OECD dataset has very short time series, often with only one or two observations for the indicator of a given state. Therefore, we do not include any of this information in our database.

17 We omit criterion 4, which takes into account values of countries that have already surpassed the official SDG target, as international values may not be comparable to state indicators.

• Criterion 1 sets the goal to an SDG target when the indicator describes a variable whose definition explicitly implies the existence of a quantitative threshold to be reached. For example, one of the SDG targets is to eradicate extreme poverty; thus the value assigned for the goal of indicator 'Percent of the population in extreme poverty' is set to 0. We also include some indicators that are not part of the SDG ensemble, but that have explicit targets in Mexico's National Development Plan (NDP). For example, for the indicator 'Fertility rate of women aged 15-19', we set the goal to half the most recent level, which is the stated target of the NDP. We use this criterion to set goals related to gender or ethnic inequality by using the value that represents full equality. We apply this criterion to 9 indicators.

• Criterion 2 sets the goal to an implied value based on the principle of 'Leave no one behind'. For example, we set a goal of 1 for the indicator 'Proportion of 1-year-old infants with a full basic vaccination record'. We apply this criterion to 11 indicators.

• Criterion 3 sets the goal to an optimum value defined by scientific evidence. For example, we set the optimum number of policemen per 100,000 inhabitants to the worldwide median. This criterion is applied to 1 indicator.

• Criterion 5 sets the goal to the average value of the top 5 states according to their best year of performance in the data. For a state that is above this average, we set the goal to its most recent value, assuming that its performance in that SDG indicator has already been successful and no gap needs to be closed. We apply this criterion in the majority of [...]

To explain Figure 4, let us define a retrospective gap as the total progress made by an [...]

The HDI was designed to measure multidimensional development (health, education, and standard of living) and to compare well-being between countries. To display differences within countries, a Subnational Human Development Database project was conducted by Smits and Permanyer (2019). This Subnational Human Development Index has been constructed for 161 countries and includes Mexico for the period 1990-2018.

[...] changes), both averaged across states within the corresponding cluster. The dots on the top of the translucent bars correspond to the development goals. Thus, a white dot indicates that, on average, the prospective gap is smaller than the retrospective one, i.e., that the indicator had a better historical performance than the one needed to reach its goal by 2030.

[...] indicator performance, and decentralized public spending.

[...] and presented in percentages. More specifically, we compute [...] where i denotes the indicator and T is the number of years. 20 Thus, the SDG gap for T = 12 can be interpreted as the gap (in percentage terms) that has not yet been closed in 2030 for indicator i in a given state. Appendix D provides plots with the parameters obtained through the calibration procedure.
[Figure: axis labels are the indicator short names, e.g. gdp_per_cap, emp_rate, gini, crime_rate.]

[...] differences as well. For instance, the SDG gap for 'Workers enrolled in IMSS as a percent of the total population' 21 is below the cluster average in the most developed states, but not in clusters 2 and 3. The same happens for 'Percent of young people aged 19-29 with income below the welfare line' and for 'Rate of informal employment'.

Note that the number of simulation periods is different from the number of years. When calibrated, the model must run for at least the number of years in the data. Guerrero and Castañeda (2021b) show that 50 or more simulation periods yield robust results (because the budget parameter β adjusts for the volatility of the indicators). Thus, once a specific number of simulation periods is chosen, a proportional equivalence between this amount and the number of years in the dataset is established. Therefore, when performing prospective simulations, the number of simulation periods to be used must follow the equivalence established in the calibration. In this application, we establish 50 simulation periods for 14 years of data. Thus, the number of simulation periods required to run 12 years forward in time is approximately 43.
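The proportional equivalence between simulation periods and calendar years described in the note on simulation periods can be checked with a line of arithmetic. The sketch below is only illustrative; the function name `periods_for_years` and its default arguments are hypothetical and not part of the authors' code.

```python
def periods_for_years(years_ahead, calib_periods=50, calib_years=14):
    """Scale the calibration periods proportionally to a forecast horizon.

    Assumes the simple proportional rule stated in the text:
    calib_periods simulation periods correspond to calib_years of data.
    """
    if calib_years <= 0:
        raise ValueError("calib_years must be positive")
    return round(years_ahead * calib_periods / calib_years)


# 12 years forward with 50 periods calibrated on 14 years of data.
print(periods_for_years(12))
```

With the values reported in the text (50 periods for 14 years), a 12-year horizon gives 12 × 50 / 14 ≈ 42.9, which rounds to the 43 periods mentioned above.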

Two implications are derived from these simulation results. Firstly, SDG gaps will hardly be closed by 2030, even for the richest states belonging to Cluster 1, assuming that their annual budgets remain at their historical levels in real terms. Secondly, we cannot expect regional convergence shortly if disruptive policies are not implemented, since SDG gaps tend to close more [...] (reaching a 50% reduction). We perform these two calculations for each state and present the disaggregated results for each indicator in Figure 6.

The first thing to notice when comparing panels (a) and (b) is that, on average, decrements in the budget exert a stronger impact than increments do. This asymmetry is a reflection of equations 1 and 2, which suggest that a better financial situation is a necessary condition for development (otherwise γ = 0) but not a sufficient one (since growth is always limited by α). Moreover, in clusters 1 and 2, but not in cluster 3, several indicators show extreme [...] from an enlarged budget represent years saved, while the ones originating from a budget shrink-[...]

Note: Each curve consists of the aggregated responses in convergence times of indicators in the same SDG and cluster to 1% budgetary changes. These curves can only be constructed for indicators that reached their goals in or before 2030. Thus, the specific quantities may result from a selection bias in the indicators. This is not an issue in Figure 6. Here, the reader should rather focus on the qualitative nature of the results. Source: Authors' own calculations.
where ω_j is the population weight for state j.
where Ω_k is the set of indicators belonging to SDG k. [...] the objective is to achieve more equitable development across the 32 entities and all the SDGs.

The outcome for Mexico City is particularly salient since the optimal distribution suggests an [...]

23 In Appendix F, we perform a constrained optimization using similar restrictions to those presented in official documents. This exercise shows that the method can be adapted to the circumstances of different empirical realities.

24 The linear relationship between the empirical and simulated shares (with a correlation of 0.804) is partly the consequence of population size. Yet, a strong linear relationship does not imply a close similarity in shares since, even in a high correlation setting, there can be numerous cases of over- or under-assignments.

In Figure 9, we display these distributions by SDG. As reference points, we also include the [...] The first thing to notice in Figure 9 is that the distributions of fiscal transfers vary with [...]

To provide quantitative insights into the different optimal distributions of federal transfers, we compute the coherence index developed by Guerrero and Castañeda (2020). This index compares the Euclidean distance between the empirical distribution and the optimal one with its distance to another distribution that is the opposite of the optimal (where the smallest allocation goes to the state that received the largest one in the optimal distribution and so forth). If the empirical distribution is closer to the optimal one, the index is positive, denoting a certain level of coherence (or similarity). If it is closer to the anti-optimal one, the index is negative and indicates that the allocation is incoherent. More specifically, the index is computed as [...] where P is the empirical distribution, P* is the optimal one, P• is the anti-optimal one, and d(·) is the Euclidean distance between two vectors. The index goes from -1 to 1, so a positive unit means that the empirical and the optimal allocations are identical.
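To make the verbal description of the coherence index concrete, the following sketch builds the anti-optimal distribution by reversing the ranking of the optimal one and then compares Euclidean distances. Because the exact formula is not shown in this version of the text, the normalization used here (dividing by the distance between the optimal and anti-optimal distributions) is an assumption chosen only because it reproduces the stated properties: a range of -1 to 1, a value of 1 when the empirical and optimal allocations coincide, and a negative value when the empirical allocation is closer to the anti-optimal one. It should not be read as the formula of Guerrero and Castañeda (2020); all function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def anti_optimal(optimal):
    """Reverse the ranking of the optimal shares.

    The state with the largest optimal share receives the smallest
    share, the second largest receives the second smallest, and so on.
    """
    order_desc = sorted(range(len(optimal)), key=lambda i: optimal[i], reverse=True)
    values_asc = sorted(optimal)
    anti = [0.0] * len(optimal)
    for rank, state in enumerate(order_desc):
        anti[state] = values_asc[rank]
    return anti


def coherence_index(empirical, optimal):
    """Assumed form: (d(P, P_anti) - d(P, P_opt)) / d(P_opt, P_anti)."""
    anti = anti_optimal(optimal)
    span = euclidean(optimal, anti)
    if span == 0:
        raise ValueError("optimal shares are all equal; index undefined")
    return (euclidean(empirical, anti) - euclidean(empirical, optimal)) / span


optimal = [0.5, 0.3, 0.2]
print(coherence_index([0.45, 0.35, 0.20], optimal))  # positive: closer to optimal
print(coherence_index([0.20, 0.30, 0.50], optimal))  # empirical equals anti-optimal
```

By the triangle inequality, this assumed normalization keeps the index within [-1, 1] for any empirical distribution.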
We report the coherence index in Table 1 [...] The remaining results in Table 1 are also interesting in so far as they quantify the extent of the improvement that can be attained when using the optimal allocations instead of the [...] states with similar progress, which means that every peso spent in that entity was substantially more productive than in any other state; hence the significantly lower SDG gap in panel (d).

Consequently, due to its structural factors, QUE can achieve more through sheer expenditure. [...] of an indicator approaches infinity when the budget tends to zero due to equation 2 (hence the asymmetric shape of the sensitivity curves presented in Figure 7). In contrast, MEX has substantial self-generated resources, so the costs (in SDG gaps) produced by removing transfers are less than the benefits from assigning them to other states. This lack of resources among states with weak fiscal capacities diminishes when focusing only on participations (see panel (c)). In this case, the costs of transferring resources from MEX to states from cluster 3 outweigh the benefits. In fact, since MEX presents better performance on the budgetary frontier than most states from cluster 3, the optimization algorithm finds that there are more potential gains in allocating further participations to MEX. [...]

26 There seem to exist critical points (in the allocation of transfers) at which a state's development collapses, but after which one cannot achieve substantially larger progress. Importantly, this critical point is potentially linked to various factors, including the quality of governance, long-term structural factors, and spillover effects.

Discussion and conclusion
Through a computational approach, we open a new window for the study of regional development and fiscal federalism from the perspective of public spending. The models used for these analyses, and others, should be sophisticated enough to describe the key elements behind the complexity of the phenomenon under study. This paper is a first attempt in this direction, focusing on the [...]

As a summary of the most important results of our study, we mention the following. (i) We find that most of the SDG indicators will attain levels below their goals in 2030, in spite of the fact that these were established with reasonable criteria. This outcome indicates that [...] (iii) We identify over- and under-allocations of federal transfers across specific states, noting that these inefficiencies may vary with the specific objective function that describes the nation's aspirations. For example, potential objectives may include having a wide set of goals like the whole range of SDGs, or focusing on a specific topic such as poverty or public health. (iv) With our coherence index, we show that some objective functions, when used as development modes (e.g., 'quality of education'), are more accurate than others (e.g., 'reduced inequalities' and 'sustainable cities and communities') in describing empirical budget shares. (v) We also find that, if a comprehensive objective function were to be used to allocate transfers, there could be substantial savings in terms of convergence time to reach the 2030 Agenda.

Finally, we would like to emphasize that in this paper we study problems associated with [...]

For a more disaggregated summary of the data, Table A.2 lists each indicator, the SDG it is assigned to, and the method used for imputing the numeric goal for that indicator, as described [...]

SDG    1     2     3      4      5      6     8      9      10    11     12    15     16    All
min    3     6     6      6      8      3     4      9      3     6      9     10     3     3
max    6     6     14     14     14     13    14     14     6     13     9     12     13    14
mean   5.17  6.00  10.54  12.50  12.83  8.14  11.00  11.20  3.75  10.00  9.00  10.80  9.43  9.[...]

[Table A.2, partial]
SDG | Indicator | Short name | Goal method
[...] | Natural disaster resilience index | disaster_resil_index | 5
9 | Energy intensity of the economy (MWh per 1,000,000 GDP per year) | energy_intensity | 5
9 | Debit cards per 10,000 inhabitants | debit_card_prev | 5
9 | Credit cards per 10,000 inhabitants | credit_card_prev | 5
9 | Paved highways as a percent of all highways | pct_highway_paved | 5
9 | Number of patents solicited per 1,000,000 inhabitants | patents_rate | OECD
10 | Index of equity in access to health services | equity_in_health | 2
10 | Index of educational equity in middle school | gender_equity_second | 5
10 | Index of equity in access to drainage | equity_in_drainage | 5
10 | State Gini coefficient | gini | OECD

There are n agents (or public officials), each in charge of a public policy that is specific to a single policy issue. To implement the mandated policy in a given period t, agent i receives P_{i,t} resources from the central authority (or government). With these resources, the public official tries to leverage two potential benefits: (1) the reputation from being a proficient public servant and (2) the utility derived from being inefficient according to [...] where F_{i,t+1} represents the benefit or utility obtained in the next period. The first summand in equation 7 captures the benefit of being proficient. ∆I*_{i,t} is the change in indicator i with respect to the previous period (its performance), relative to the changes of all other indicators.

More specifically, the relative change in indicator i is computed as [...] and it captures the idea that the central authority compares and evaluates the relative performance of each public official, and their implemented policies, through the corresponding development indicators.

Going back to the first summand in equation 7, we find that the relative change in the indicator is weighted by C_{i,t}/P_{i,t}. Here, C_{i,t} is the part of the allocated resources P_{i,t} that is effectively used towards the policy. We call it the contribution of agent i.

Next, let us focus on the second addend of equation 7, which corresponds to the utility derived from being inefficient. Here, P_{i,t} − C_{i,t} is the benefit extracted from not devoting resources to the policy. Thus, when dividing by P_{i,t}, it represents the level of inefficiency. Monitoring and penalties may hinder inefficiencies. This is captured by the factor (1 − θ_{i,t} τ). Variable θ_{i,t} is the binary outcome of monitoring inefficiencies. If θ_{i,t} = 1, it means that the government has spotted agent i in inefficient behavior. In that case, i is penalized by a factor τ, such that the benefit from these private gains is reduced.
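Since the displayed form of equation 7 is missing in this version of the text, the sketch below reconstructs the agent's benefit from the verbal description only: a proficiency term (the relative indicator change weighted by the share of resources actually used) plus an inefficiency term scaled by the penalty factor (1 − θτ). This functional form is an assumption inferred from the prose, not necessarily the authors' exact specification, and the function and argument names are hypothetical.

```python
def agent_benefit(rel_change, contribution, allocation, caught, tau):
    """Benefit of an agent for the next period (assumed reading of equation 7).

    rel_change   -- relative change of the agent's indicator (Delta I*)
    contribution -- resources effectively used, C, with 0 <= C <= P
    allocation   -- resources received, P (must be positive)
    caught       -- monitoring outcome theta, 0 or 1
    tau          -- penalty factor applied when the agent is caught
    """
    if allocation <= 0:
        raise ValueError("allocation must be positive")
    proficiency = rel_change * contribution / allocation
    inefficiency = (1 - caught * tau) * (allocation - contribution) / allocation
    return proficiency + inefficiency


# A fully diverting agent keeps the whole private gain unless spotted.
print(agent_benefit(0.0, 0.0, 100.0, caught=0, tau=0.5))
print(agent_benefit(0.0, 0.0, 100.0, caught=1, tau=0.5))
```

Under this reading, being spotted (θ = 1) scales the private gain by 1 − τ, which is what makes proficiency comparatively more attractive in the learning step.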

To model the binary outcomes of monitoring efforts, we assume that, every period, an independent realization of θ_{i,t} takes place for each indicator. This is nothing else than a Bernoulli process with a probability of success λ_{i,t} determined by [...] where P*_t is the largest allocation in period t. Parameter ϕ in equation 9 corresponds to the quality of the monitoring efforts.

If an agent becomes more inefficient and their benefits increase, then reinforcement learning takes place, and the agent becomes more inefficient in the next period. If, in contrast, the government is able to penalize the agent, then, according to the learning process, the agent becomes more proficient in the next period.

Formally, action X_{i,t} of agent i can be modeled as [...] where sgn(·) is the sign function. In order to map action X_{i,t} into the value of the effective resources, we define

C_{i,t} = P_{i,t} / (1 + e^{-X_{i,t}}).
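Reading the mapping above as the logistic form C_{i,t} = P_{i,t} / (1 + e^{-X_{i,t}}), the effective resources always lie between 0 and the allocation P_{i,t}. A minimal sketch follows; the function name is hypothetical, and the branch on the sign of the action is only a numerical safeguard against overflow, not part of the model.

```python
import math


def effective_resources(allocation, action):
    """Map an unbounded action X into a contribution C in [0, P]."""
    if action >= 0:
        return allocation / (1.0 + math.exp(-action))
    # Equivalent expression that avoids overflow for very negative actions.
    e = math.exp(action)
    return allocation * e / (1.0 + e)


for x in (-5.0, 0.0, 5.0):
    print(x, effective_resources(100.0, x))
```

An action of zero corresponds to using exactly half of the allocation, while large positive or negative actions approach full proficiency or full inefficiency, respectively.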
where q_{i,t} is the propensity to spend in policy issue i in time t, and B is the budget available in time t.

The evolution of the policy priorities takes place through the propensities. In the first period, [...]. Then, as time progresses, the propensities are updated according to [...] The previous equation is rather intuitive. The term U(0, 1) is a random draw from a uniform [...] This is why equation 15 conditions the efficiency bias in the allocation of the budget to successful outcomes of the monitoring random variable θ_{i,t}. Thus, the government tends to be more inquisitive with policymakers whose inefficiencies have been spotted in the past.

B.3 Indicator dynamics

We model indicator dynamics through a random growth process. Let γ_i denote a probability associated with the growth process experienced by indicator i. This probability depends on a combination of network effects (i.e., incoming spillovers) and budgetary allocations. Therefore, the growth process is modeled as independent Bernoulli trials with a probability of success [...] where β is a normalizing parameter and S_{i,t} is the net amount of spillovers received by indicator [...] where ξ(·) is the binary outcome (0 or 1) of a growth trial.
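The random growth process can be illustrated with a short simulation. Because the expressions for the success probability and the update rule are not shown in this version of the text, the sketch below makes two assumptions that should be read as such: the probability γ is supplied directly as an input rather than computed from budgets and spillovers, and each successful trial adds the growth factor α to the indicator, i.e. I_{i,t+1} = I_{i,t} + α_i ξ(γ_{i,t}). All names are hypothetical.

```python
import random


def growth_step(level, alpha, gamma, rng):
    """One Bernoulli growth trial: add alpha with probability gamma."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be a probability")
    success = 1 if rng.random() < gamma else 0  # xi: binary outcome of the trial
    return level + alpha * success


def simulate(level, alpha, gammas, seed=0):
    """Trajectory of one indicator given a sequence of success probabilities."""
    rng = random.Random(seed)
    path = [level]
    for gamma in gammas:
        level = growth_step(level, alpha, gamma, rng)
        path.append(level)
    return path


# With gamma = 0 the indicator never grows; with gamma = 1 it grows every period.
print(simulate(0.5, 0.25, [0.0, 0.0]))
print(simulate(0.5, 0.25, [1.0, 1.0]))
```

This makes the asymmetry discussed in the main text easy to see: a zero probability halts growth entirely, whereas even a probability of one cannot make the indicator grow faster than α per period.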

C SDG networks and their estimation

The adjacency matrix (A_{j,i}) in equation 2 captures the structure of interdependencies between SDG indicators (policy issues) and, hence, when estimated with historical data, reflects relationships that cannot be modified in the short term. In our ABM, an SDG (or spillover) network is considered an exogenous input that describes the context that prevails in a state. Consequently, [...]

It is important to recall that these networks should not be interpreted as causal relations, but as conditional probabilities. This means that a link A → B does not imply that ∆A guarantees ∆B. Accordingly, in this setting, a spillover affects the probability of success γ_i of a public policy in the model, but not necessarily the magnitude of the outcome. Although sparsebn assumes that no temporal dependence exists between observations, we reduce this possibility [...]

[Figure: forecasted gaps by state (AGU, BCN, BCS, ..., VER, ZAC) and by indicator (pct_below_welfare, pct_vulner_inc, ..., absorp_high_school, absorp_college).]
Note: Each dot size is proportional to the forecasted gap size. The background colors correspond to the three clusters. Source: Authors' own calculations.
As specified in the Fiscal Coordination Act, the contributions are tied to specific 'themes', while the participations follow a particular formula. With full information, such features could be easily coded into the optimization algorithm. For the purpose of this paper, and to avoid excessive specificities, we provide an example of how to restrict the fitness landscape by considering some of the aforementioned criteria. For this exercise, we assume that contributions remain the same as the empirical ones since they obey specific policy purposes. Thus, the redistribution takes place only in the participations.

The Fiscal Coordination Act defines a weight vector w_1, ..., w_32, where w_k = 0.6 C_{1,k} + 0.3 C_{2,k} + 0.1 C_{3,k} and w_k denotes the weight given to state k. In this formula, the terms C_{1,k}, C_{2,k} and C_{3,k} are called the 'incentive coefficients', as they build on information about economic performance (through the state-level GDP for C_{1,k}) and local tax-collection performance (in relative changes for C_{2,k} and magnitudes for C_{3,k}). These coefficients also consider the state