• Network analysis reveals structural patterns in Italian interprovincial migration flows (2002–2023).
• Backbone extraction isolates statistically significant links, enhancing signal-to-noise ratio in the network.
• Community detection identifies central provinces acting as migration hubs within the network structure.
• Policy insights target spatial asymmetries via HE quality, education-labor alignment, and remote work enablers.
Abstract
Internal migration plays a crucial role in shaping regional economies and governance. To gain insights into this phenomenon, we propose conducting a network analysis of migration flow data among Italian provinces from 2002 to 2023, with a focus on eliminating irrelevant connections. This approach combines network filtering and clustering techniques, which refine the network structure to highlight meaningful connections and make mobility trends easier to interpret. Unlike traditional methods, which often struggle with complex or fragmented migration data, our approach enables us to identify both persistent migration hubs and emerging shifts in regional mobility. Our findings confirm the central role of northern provinces such as Milan, Turin, and Bologna, but also reveal a strengthening of connections between Southern areas like Naples and Bari, suggesting a growing regional appeal. While South-North migration remains dominant, high-skilled individuals are increasingly choosing Rome and other central provinces, indicating a diversification in migration patterns. Similar trends emerge for low-skilled workers, although age-related differences influence distinct mobility routes. These results offer actionable insights for policymakers. By investing in research, innovation, and entrepreneurial development in the South, improving university quality and academic infrastructure, strengthening connections between education and the local labour market, and enhancing infrastructure for remote work, South-North migrations and inequalities could be mitigated. Through a clear and accessible analysis, this study provides a practical tool for designing policies that promote more balanced regional development.
Internal migration, which refers to the permanent relocation of individuals within a country's borders, plays a crucial role in shaping the geographic distribution of the population. This dynamic process has significant implications for various socio-economic factors, impacting communities and regional development [1,2]. The research literature on migration is concerned with the causes and the consequences of this phenomenon. The former is interested in the micro and macro elements (i.e., the individual characteristics like age, gender or educational level and the spatial characteristics, respectively) that affect the propensity to migrate of a single individual, a family or – by adopting a more aggregate perspective – migration flows. The latter, instead, addresses the different consequences of migration on places such as the effects on labour market and economic growth.
Furthermore, over the last decade, a new strand of literature has emerged from the successful application of network science to human migration. Several studies have applied network measures and models to investigate the structure and dynamics of migration [3,4]. However, the vast majority of these studies apply a network approach to international migration (i.e., relocation to a different country) [5]. Internal migration has received less attention, despite its greater relevance and scale, which are sustained by the absence of cultural, religious, or linguistic barriers and by transport innovations that make this kind of mobility easier and cheaper.
Hence, given the size of within-country migration, the structural and dynamic properties of internal migration deserve further investigation. The analysis of the network structure of such mobility should play a pivotal role in both academic discussions and policy debates aimed at formulating effective strategies for a specific economy, as it “enables the policymakers to get a straightforward insight on the holistic network structure, as well as a fine-grained view on the relative attractiveness of particular settlements and migration links.” ([5] p. 20). This is especially true for countries characterized by significant internal territorial imbalances, such as Italy, which is our context of study: a country marked by a sharp economic divide between the more developed Centre-North and the poorer South (Mezzogiorno), which risks being exacerbated by the consistent and persistent emigration flows from the lagging regions. In many developing countries, migration, primarily consisting of unskilled labour moving from low-income rural areas to wealthier urban centers, has been one of the main factors of convergence between regions. Internal migration in Italy presents a distinct pattern, as it is increasingly composed of graduates and high-skilled individuals in general. Furthermore, while in other developed countries with similar migration trends, such as Spain, the existing literature has documented the positive effects of migration in promoting regional convergence [6], in Italy the persistent emigration from disadvantaged regions is feeding a divergence process that exacerbates internal dualism rather than mitigating it.
These disparities, in fact, have long fueled persistent internal migration flows, predominantly from the South to the more industrialized North. Research has shown that the magnitude and direction of these internal flows are influenced by the better socio-economic (income levels, wages, and employment opportunities) and institutional characteristics of the North, which are inducing a population redistribution toward this area [7].
As for the economic characteristics, Furceri [8] found that, between 1985 and 2001, net migration patterns were mainly influenced by variations in regional income differences, whereas disparities in unemployment rates had no observable effect. Similarly, Basile and Causi [9] observed that while economic variables like unemployment rates, disposable income, and industrial development became more significant in explaining migration flows between 1996 and 2000, they were less influential in the earlier period (1991–1995). Etzo [7], using gross migration flows to distinguish between push and pull factors, and to account for spatial interactions between regions, provides evidence of a crucial role of GDP per capita as the main economic determinant of internal flows.
Further expanding on these results, Piras [10] highlights the central role of macroeconomic factors in shaping internal migration in Italy from 1970 to 2005. The study also underscores the importance of human capital, revealing that while at the destination this factor does not significantly influence migration, its presence at the origin acts as a constraining factor and this effect is particularly strong in migration from the Centre-North to the South.1
Among the key push factors, labour market disparities stand out [7]. For instance, Dotti et al. [11] highlighted how more dynamic local labour markets attract university students from other regions, particularly in science and technology fields. Furthermore, in Italy, the predominant concentration of high-tech small and medium-sized enterprises (SMEs) in the Centre-North suggests an industrial specialization advantage that further reinforces the attractiveness of this macroarea.
Beyond socio-economic and labour market aspects, institutions also play a fundamental role in migration decisions. Nifo and Vecchione [12] demonstrated that the quality of provincial institutions significantly influences the likelihood of graduate migration, with an impact comparable to that of income differences. Additionally, the quality of research and teaching at universities plays a crucial role in shaping migration patterns among Italy's youngest and highly skilled workers, influencing their decisions on where to relocate [13]. Furthermore, recent studies have increasingly focused on how non-economic factors influence migration choices, emphasizing the crucial role of social capital as a key intangible driver of relocation decisions [14].
Owing to the pronounced dualism between the country's two macroareas (Centre-North and South), recent decades have seen a growing age and skill selectivity of internal flows, which is contributing to the diffusion of skills and knowledge across regions, making internal migration a crucial growth enhancing mechanism in the host economy. The growing mobility of the youngest and brightest individuals, in fact, increases the human capital content of migration and lowers the average age of internal migrants, fostering human capital accumulation in more developed areas of the country, simultaneously marking the self-nourishment of the phenomenon and worsening territorial disparities.
Fratesi and Percoco [15], for instance, highlight how migration contributes to the accumulation of human capital in the receiving region. The core idea is that migration influences the destination place not only by expanding the workforce but also by enhancing the human capital base and, in turn, the economic performance. Piras [16] examines migration's effects by separating the quantity impact from the composition effect, demonstrating that the highly unbalanced and unidirectional Italian mobility of skilled and qualified workers from the Mezzogiorno offsets the quantity effect associated with labour force redistribution.
As regards the effects of regional mobility on Italian local labour markets, Basile et al. [17] suggest that the net migration of highly skilled workers from South to North (i.e., long-distance migration) has exacerbated regional unemployment gaps by increasing unemployment at origin and reducing it at destination. Conversely, the migration of low-skilled workers has had the opposite impact, helping to lower unemployment at the origin while contributing to its rise in receiving regions.
Di Berardino et al. [18], analyzing the impact of internal migration on the institutional quality of Italian provinces, reveal that migration enhances institutional quality only when the level of human capital is considered.
Pinate et al. [19] consider the possibility that migrants with different educational levels affect specific types of innovation in distinct ways. Specifically, they found that medium- and high-skilled migrants are positively associated with three types of intellectual property rights, namely patents, trademarks and design rights.
These findings suggest that, over the long run, migration processes may feed a vicious circle to the detriment of the already backward regions of the Mezzogiorno: northern regions that attract and retain highly skilled workers continue to benefit from enhanced human capital and innovation, while less developed southern regions that experience a brain drain face greater challenges in breaking out of their structural disadvantage. To counteract the persistent territorial imbalances observed in Italy, the literature has identified several targeted policy interventions that could contribute to a more equitable distribution of human capital and economic dynamism across regions. In particular, enhancing local university quality, fostering university–firm linkages, investing in quality-of-life improvements, and ensuring better dissemination of information on local opportunities are regarded, in addition to strengthening productive and innovative capacity, as vital strategies for mitigating brain drain and promoting more balanced regional development [13,20]. These policy interventions can also be part of a broader response to socio-political challenges, including the high youth unemployment rate and the rapidly aging population. According to the latest ISTAT data, Italy has one of the highest youth unemployment rates in Europe, particularly in the Mezzogiorno, where the unemployment rate of individuals aged between 15 and 24 is about 20 percentage points higher than in the central-northern regions (36.7 % and 16.8 %, respectively, in 2023)2. This lack of opportunities exacerbates the phenomenon of brain drain, further weakening local economies and contributing to demographic imbalances. The outflow of young individuals, in fact, accelerates the aging process in already demographically fragile areas, straining local welfare systems and public services.
In this context, analyzing migration between settlements using network analysis indicators and models can contribute to providing a more comprehensive picture of migration in Italy and a clearer understanding of the geographical structure of this phenomenon.
This approach complements the research literature on the causes of migration (who and why) and its consequences (how), by addressing the spatial dimension of migration (where), offering a holistic view of the entire system while accounting for the interdependencies within its structure [2]. Internal migration flows, in fact, besides being the product of different socio-economic incentives between different pairs of origins and destinations, also reflect the connectivity between locations and the intensity of functional linkages between regions [21].
Methodologically, an important contribution stems from an analytical tool largely adopted in several economic and non-economic fields: the network analysis. This approach offers a robust methodological framework for studying internal migration in Italy, enabling a comprehensive investigation of both the structural properties and dynamic interactions that underpin migration flows. Unlike traditional econometric models, which tend to concentrate on aggregate determinants and pairwise migration flows, network analysis provides a systemic perspective that uncovers the complex topology of migration patterns. This includes the hierarchical organization of territories, the pivotal role of key provinces as central hubs, and the emergence of community structures driven by mobility dynamics. Such a holistic approach is especially pertinent in the Italian context, where enduring regional economic disparities and historical migration trends create interdependencies that standard regression-based methodologies cannot fully capture. By utilizing network analysis, we transcend conventional estimations of migration determinants to examine how migration flows collectively shape the spatial structure of mobility. This perspective allows us to assess not only the volume of migration but also the intensity and significance of interregional linkages, highlighting the functional roles of various provinces within the national migration system. Specifically, network measures like centrality indices, clustering coefficients, and community detection algorithms enable us to categorize provinces based on their migratory influence and identify cohesive subgroups that exhibit similar mobility behaviors.
Building upon this, we enhance traditional network analysis methods by introducing a filtering technique that identifies both positive and negative migration links (i.e., a signed network) and by applying machine learning techniques to examine the low-dimensional representation of the migration system. This approach, as proposed by Gürsoy and Badur [2], provides a deeper and more nuanced understanding of migration patterns. Specifically, the backbone extraction method refines the network by filtering out statistically insignificant connections, ensuring that our analysis zeroes in on the most meaningful and structurally relevant migration pathways. Representation learning facilitates a low-dimensional visualization of the migration system, helping to reveal hidden structures and long-term migration trajectories that may not be immediately evident from raw flow data. These innovations allow us to untangle the fundamental mechanisms driving internal migration and provide a clearer understanding of the spatial redistribution of human capital within the country.
Beyond academic contributions, this approach has significant practical implications. Mapping migration flows through a network lens offers policymakers a nuanced understanding of regional mobility dynamics, enabling the design of more effective interventions to address economic imbalances and support local development strategies. By identifying key migration hubs and uncovering patterns of human capital movement, network analysis can inform Italian targeted policies aimed at reducing brain drain from disadvantaged regions, optimizing labour market integration, and fostering regional resilience. Consequently, our study not only addresses a critical gap in the literature on internal migration in Italy but also serves as a valuable tool for developing evidence-based migration and regional development policies.
In doing so, the study exploits a unique dataset compiled by the Italian National Institute of Statistics (ISTAT) in the Migratory movements of resident population – registrations and cancellations to the registry office, which consists of bilateral migration flows across Italian provinces (NUTS-3). The data, segmented by five educational levels (ranging from no formal education to tertiary education) and three age groups, enable a comprehensive analysis of the migration network, distinguishing between high-skilled and low-skilled young (15–34), adult (35–64), and elderly individuals (over 64).
The remainder of the article is structured as follows: Section 2 presents the literature background. Section 3 illustrates the data collection and the methodology, while Section 4 is devoted to the discussion of the results. Finally, the main findings and conclusions are summarized in Section 5.
2. Related literature
Migration is a complex phenomenon that has long fascinated scholars from different scientific fields, including sociologists, psychologists, economists, and geographers. Network scientists, instead, have paid greater attention to human migration only very recently. In the last few years, applications of network analysis to migration have grown significantly compared to the very limited number of studies available at the beginning of the decade [5]. However, although internal migration involves more people than international migration, network analyses of migration, besides being rarely performed, have mainly been conducted on international flows. It is commonly believed, in fact, that migration is a phenomenon that implies a transfer of residence between two different countries [5].
Among the first empirical attempts aimed at implementing a network analysis to migration, Fagiolo and Mastrorillo [22] utilize a complex network approach to characterize the international migration network. They conceptualize it as a weighted-directed graph, with nodes representing world countries and links denoting the volume of migrants moving between countries at a specific moment. The main findings highlight the modular structure of the international migration network, which exhibits a small-world binary pattern featuring disassortativity and significant clustering. The authors also show that a significant portion of the binary and weighted topological structure of the network can be explained by factors such as spatial distance, country population, relative GDP per capita, as well as bilateral dummy variables that account for common official language, common religion, and South-North linkages.
In a similar vein, Davis et al. [3] consider the global distribution of international migrants as a network of nodes (i.e., countries) connected by links given by migrants moving from one country to another, and show a rise in the interconnectedness of the human migration network, with migrants' destination choices influenced by factors such as language, religion, geographical distances, and colonial histories. More specifically, the analysis reveals a rising trend in network transitivity (i.e., the interconnectedness of nodes sharing a common connection), a reduction in average path length, and a shift toward higher values in degree distribution. These findings indicate a progressive reinforcement of the small-world effect within the migration network between 1960 and 2000.
A complex network perspective has been adopted also by Tranos et al. [4] in order to explore the network topology of migration flows. In particular, by using centrality measures and community detection to approach international migration as a global network, the main results reveal that physical distance, border effects and culture contribute to generate migration flows among OECD countries. Moreover, education seems to emerge as a crucial predictor of cross-country migration as a higher educational level represents both pull and push factor.
Peres et al. [23] found that top1 networks3 are interconnected in a stable manner, with western countries consistently attracting the vast majority of migrants from a growing diverse range of origin locations.
Danchev and Porter [24], by examining the spatial community structure of world migration, find not only a moderate increase in migration interconnectedness but also evidence of a heterogeneous connectivity pattern in the world migration network. While some communities have become increasingly globally integrated over time, other geographic regions remain as isolated from the rest of the network in 2000 as they were in 1960.
For intra-European migration, Baláž and Karasová [25] conducted a cross-country analysis of bilateral migration flows during the periods 1974–2004 and 2005–2013, highlighting the substantial stability of the network's topology, which remains dominated by a rich-club pattern, as most exchanges occur between a small number of large countries.
Considering the spatial level of the previous studies, the literature review clearly shows an important gap: a lack of analyses of internal migration, especially in European countries [5]. When the complex network of human mobility is considered, in fact, a great deal of attention is paid to US inter-county migration. Maier and Vyborny [26], for instance, in an early attempt to adopt a network analysis to study migration patterns at the state level in the US, highlight that migration at this aggregate level produces almost complete graphs, thus raising the need for filtering methods. When the threshold is set at the level that connects all states, the study identifies a tendency toward clustering, with sub-groups of typically adjacent states forming geographically bounded communities.
Xu [27] analyzes various structural properties of migration networks, including connectivity, clustering, assortativity, and centrality, emphasizing the role of migration dynamics and inter-area distance in shaping these properties. The analysis reveals that the US migration network exhibited a fundamental hub-and-spoke disassortative structure during the period 1990–2011 “in which a small number of highly connected areas exchange high volume and long distance migration while connecting to many less connected areas.” (Xu, 2017 p. 798).
Goldade et al. [28] analyze the network structure of migration in the US at the county level during the housing boom (i.e., 2004–2007) and the financial crisis (i.e., 2008–2011), and investigate whether and to what extent the political affiliation of a county correlates with migration patterns. The study finds a lack of high clustering for high-degree nodes, indicating the absence of mutual migration among all counties, while low-degree nodes tend to show high clustering. Moreover, an unexpected result is the stability of migration dynamics during a decade characterized by economic and political instability.
Charyyev and Gunes [29] also focus on different periods of economic prosperity (housing boom) and recession (housing bust) by analyzing the migration network between US counties from 2000 to 2015. The analysis shows that nodes are stable at the county and state level, as they tend to remain active, whereas links fluctuate considerably, which indicates changes in migration patterns over time.
Outside the US, among the studies that apply a network analysis approach to internal migration, Caudillo-Cos and Tapia-McClung [30], providing a comparison between the patterns of general migration (i.e., individuals aged 5 and over) and highly educated migrants during the period 2005–2010, highlight a weaker geographical component in the high-skilled population as the communities of graduates network are dispersed throughout the country.
Pitoski et al. [31], by implementing a network analysis of internal migration in Croatia, provide evidence of a hierarchy of importance among settlements in the network. They also find a high level of reciprocity, which indicates an important contribution of internal migration to urbanization and a systematic abandonment of large cities in the eastern part of the country. The community detection algorithms indicate how compatible the current administrative subdivision of the country is with the one implied by migration flows.
Chen et al. [32] provide a graphical visualization of internal mobility within England and Wales in 2019, revealing that a large part of internal migration tends to occur within groups of geographically close local authorities, and that London and other more urban areas exhibit positive net internal migration rates (i.e., the difference between inward and outward mobility).
Finally, Gürsoy and Badur [2] provide a novel toolset for investigating selected migration laws from a complex network perspective. The authors, in fact, “identify a set of classical migration laws and examine them via various methods for signed network analysis, ego network analysis, representation learning, temporal stability analysis, community detection, and network visualization” [2]. The findings show a general stability of the migration network between 2008 and 2020 and well-defined migration trajectories, with major flows counterbalanced by migrations in the opposite direction. Furthermore, although migration links tend in general to be geographically bounded, this migration law does not hold for more developed cities.
This literature review highlights several methodological and data-related limitations in previous studies. Many focus predominantly on international migration networks, often overlooking internal migration despite its higher prevalence [3,4]. Others conceptualize migration as a weighted-directed graph but fail to incorporate temporal dynamics, thereby limiting their applicability to evolving migration patterns [22]. The quality of network analysis outcomes is also highly dependent on the characteristics of the data employed. Notably, few studies differentiate migration flows based on socio-economic attributes such as education level or age, despite the critical importance of distinguishing between high-skilled and low-skilled migration in assessing regional disparities [15,33]. Additionally, community detection in migration networks presents further challenges. The results of community detection algorithms can vary significantly depending on the parameters chosen, complicating cross-study comparisons [25].
In the present paper, the methodology elaborated by Gürsoy and Badur [2] is implemented in order to analyze the patterns of internal migration among Italian provinces during the period 2002–2023, excluding 2020 due to the lockdown's disruptive effects.
The following section is completely devoted to the description of this novel toolset and the data on bilateral migration flows.
3. Data and methods
3.1. Data
The bilateral migration flows among the 103 Italian provinces for the years 2002–2023 are collected by the ISTAT in the Migratory movements of resident population, registrations, and cancellations to the registry office. For each year, the full set of interprovincial migration flows forms a 103 × 103 square matrix. By excluding the main diagonal—where values are zero by definition, as intra-provincial mobility is not considered—the number of observations per year amounts to 10,506. Furthermore, the dataset includes information on migrants' educational levels, allowing for an in-depth analysis of the overall migration network as well as distinct patterns based on skill level. Specifically, migrants are classified into two groups: high-skilled (those with at least secondary education, equivalent to more than eight years of schooling) and low-skilled (those with primary education or less, up to eight years of schooling). As a result, separate matrices of interprovincial movements are constructed for total migration and for each skill group. Additionally, the availability of information on migrants' age class has allowed us to further refine the analysis by identifying possible differences between high- and low-skilled young (aged 15–34), adult (aged 35–64), and elderly individuals (over 64) (see the Appendix provided as supplementary material).
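As a rough illustration of the data layout described above, the following Python sketch counts the ordered origin-destination pairs and aggregates toy records into total, high-skilled, and low-skilled flow matrices. The record format (origin index, destination index, years of schooling, number of migrants) and the function names are assumptions made for this example only; they do not reproduce the structure of the ISTAT files.

```python
def n_ordered_pairs(n_provinces):
    """Ordered origin-destination pairs once the main diagonal is dropped."""
    return n_provinces * (n_provinces - 1)


def build_flow_matrices(records, n_provinces):
    """Aggregate hypothetical (origin, destination, years_of_schooling, migrants)
    records into total, high-skilled and low-skilled flow matrices."""
    groups = ("total", "high", "low")
    flows = {g: [[0] * n_provinces for _ in range(n_provinces)] for g in groups}
    for origin, dest, years, migrants in records:
        if origin == dest:
            continue  # intra-provincial moves are not considered
        # Skill threshold as stated in the text: more than eight years = high-skilled.
        skill = "high" if years > 8 else "low"
        flows["total"][origin][dest] += migrants
        flows[skill][origin][dest] += migrants
    return flows


# Toy data for three provinces (illustrative values, not real flows).
toy_records = [(0, 1, 13, 40), (0, 1, 5, 10), (2, 0, 8, 7), (1, 1, 13, 99)]
toy_flows = build_flow_matrices(toy_records, 3)
```

With 103 provinces, `n_ordered_pairs(103)` gives the 10,506 yearly observations mentioned above.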
3.2. Methodology
In this section, we summarize the four-step methodology used to investigate the structure and dynamics of internal migration at the provincial level in Italy. First, we build a directed and weighted network representing the migration in-flows and out-flows between provinces. Second, we apply a backbone extraction method to reduce the network's size while retaining its essential characteristics. Third, we create a low-dimensional representation of the nodes (provinces, in our context). Finally, to examine the migration system through these reduced dimensions, we apply well-established clustering methods to detect community structures within the migration networks.
3.2.1. Definition of directed and weighted network
The primary abstraction of the Italian internal migration network from data is a weighted and directed network. For each year, we created a separate network. Within this network, the links represent individual mobility, that is, the transition of people from an origin province to a destination province. The weight assigned to each link quantifies the number of migrating individuals. Formally, we characterize this network as $G = (V, E, W)$, where $V$ denotes the set of nodes identified as $i$, $j$, and so forth, $E$ represents the set of directed edges, and $W$ symbolizes the weighted adjacency matrix. The network comprises $|V|$ nodes and $|E|$ edges, with the matrix element $W_{ij}$ giving the count of migrants moving from node $i$ to node $j$. We exclude self-loops, so the weight between a node and itself ($W_{ii}$) is always 0; in our application, a self-loop would denote migration within the same province. The total outflow from node $i$ is denoted as $W_{i\cdot} = \sum_{j} W_{ij}$, and the total inflow to node $j$ is denoted as $W_{\cdot j} = \sum_{i} W_{ij}$. The sum of all weights in the network is denoted as $W_{\cdot\cdot} = \sum_{i}\sum_{j} W_{ij}$. This systematic representation provides a comprehensive view of the migratory dynamics over distinct years, capturing the nuances in the flow of individuals between origin and destination provinces.
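The quantities defined in this subsection can be illustrated with a small Python sketch. It assumes the yearly network is stored as a plain list-of-lists weight matrix with a zero diagonal; the function name `strengths` and the toy values are hypothetical and serve only to show how the total outflow, total inflow, and overall weight are obtained.

```python
def strengths(weights):
    """Return (out_strength, in_strength, total_weight) for a weighted,
    directed network given as a square list-of-lists matrix.
    weights[i][j] is the flow from node i to node j; self-loops are ignored."""
    n = len(weights)
    out_strength = [sum(weights[i][j] for j in range(n) if j != i) for i in range(n)]
    in_strength = [sum(weights[i][j] for i in range(n) if i != j) for j in range(n)]
    total_weight = sum(out_strength)  # equals sum(in_strength)
    return out_strength, in_strength, total_weight


# Toy three-province network (illustrative values, not real flows).
toy_weights = [
    [0, 8, 2],
    [1, 0, 9],
    [9, 1, 0],
]
toy_out, toy_in, toy_total = strengths(toy_weights)
```

In the toy network every province sends 10 migrants, while the inflows differ (10, 9, and 11), which is the kind of asymmetry the later steps build on.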
3.2.2. Backbone extraction
In network analysis literature, there is an active field focused on extracting meaningful structures from dense networks, eliminating noise-like connections. This pursuit, known as backbone extraction, aims to pinpoint significant connections for each object within the network [34]. The implementation of this approach varies across dimensions, categorized into statistical and structural methods, with hybrid approaches combining both [35]. Statistical methods employ hypothesis testing and p-values, while structural methods work on the network topology to derive a backbone with specific characteristics.
This paper addresses backbone extraction in migration networks. Migration networks are inherently dense and characterized by heterogeneous node strengths and edge weights, indicating a multiscale nature. We derive a meaningful signed network by employing the method proposed by [34]. We denote the signed network by its adjacency matrix $B$, a sparse matrix with values in $\{-1, 0, +1\}$, indicating negative, absent, or positive links. The corresponding weight matrix is denoted as $W$.
To extract the signed edges in a network, it is crucial to establish a null model for comparing observed edge weights with expected values, so that chance occurrences can be distinguished from significant deviations in either a positive or a negative direction. In the null model, we assume that the in-strength and out-strength of each node are predetermined, meaning that each node possesses a specific number of stubs, or half edges. A unit-link is a connection between two stubs from different nodes. When a stub seeks a connection, it randomly selects another available stub, so the likelihood of connecting to a specific node is proportional to the number of available stubs of that node. The weight of the link between two nodes is the number of unit-links connecting them. This process is a sampling without replacement problem, akin to the well-known urn problem. To see this, consider node $i$: there are $W_{\cdot\cdot} - W_{\cdot i}$ marbles in the urn (excluding loops), $W_{i\cdot}$ marbles are drawn from the urn without replacement, and there are $W_{\cdot j}$ marbles in the urn for each $j \neq i$. This process follows the hypergeometric distribution, which allows statistical quantities to be computed for the links between all nodes $i$ and $j$. The mean of the hypergeometric distribution associated with the link from node $i$ to node $j$ is denoted as $\mu_{ij}$ and is determined by Eq. (1):
$$\mu_{ij} = \frac{W_{i\cdot}\, W_{\cdot j}}{W_{\cdot\cdot} - W_{\cdot i}} \tag{1}$$
However, this formulation does not preserve the sequences of in-strength and out-strength in the system, making it inappropriate for the intended purposes.
To ensure the preservation of in-strength and out-strength sequences, the conditions in Eq. (2) must be met, where $\hat{W}$ is the weight matrix under the null model and $\hat{W}_{ij}$ represents the expected weight from node $i$ to node $j$ in this null model:
$$\sum_{j} \hat{W}_{ij} = W_{i\cdot}, \qquad \sum_{i} \hat{W}_{ij} = W_{\cdot j} \tag{2}$$
The iterative proportional fitting procedure (IPFP) [36,37] is adopted to meet the above requirements. IPFP takes the out-strength and in-strength sequences ($W_{i\cdot}$ and $W_{\cdot j}$) as marginals, together with a prior matrix whose diagonal elements are set to 0. The objective of IPFP is to estimate the weight matrix $\hat{W}$ that minimizes the relative entropy, also known as the Kullback-Leibler divergence [38], with respect to the prior while satisfying the marginal conditions in Eq. (2).
IPFP iteratively adjusts the values of non-zero elements in the matrix through row-scaling and column-scaling operations until a desired level of precision is attained. It has been demonstrated that IPFP converges to an optimal solution [39]. Each element of the estimated weight matrix $\hat{W}$ represents the expected value of the corresponding link under the null model. To assess the reliability of the estimated values, it is necessary to establish confidence intervals. These intervals rely on a dispersion measure, with the variance of $\hat{W}_{ij}$ estimated using the associated hypergeometric distribution, that is, an urn of $W_{\cdot\cdot} - W_{\cdot i}$ marbles, of which $W_{\cdot j}$ belong to node $j$, with $W_{i\cdot}$ draws. The standard deviation is then obtained from this variance, as given by Eq. (3):
$$\hat{\sigma}_{ij} = \sqrt{\hat{W}_{ij} \cdot \frac{W_{\cdot\cdot} - W_{\cdot i} - W_{\cdot j}}{W_{\cdot\cdot} - W_{\cdot i}} \cdot \frac{W_{\cdot\cdot} - W_{\cdot i} - W_{i\cdot}}{W_{\cdot\cdot} - W_{\cdot i} - 1}} \tag{3}$$
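The null-model steps can be sketched as follows. This is an illustration rather than the authors' implementation: the toy matrix is invented, the prior is assumed to be a matrix of ones with a zero diagonal, the standard deviation assumes the hypergeometric variance with the mean replaced by the IPFP estimate, and all nodes are assumed to have positive in- and out-strength.

```python
import numpy as np

def hypergeometric_mean(W):
    """Null-model mean W_i. * W_.j / (W_.. - W_.i) for i != j, zero on the diagonal."""
    out_s, in_s, total = W.sum(axis=1), W.sum(axis=0), W.sum()
    mean = np.outer(out_s / (total - in_s), in_s)
    np.fill_diagonal(mean, 0.0)
    return mean

def ipfp(out_s, in_s, prior, tol=1e-9, max_iter=10_000):
    """Alternate row and column scaling of `prior` until both marginals are matched."""
    est = prior.astype(float).copy()
    for _ in range(max_iter):
        est *= (out_s / est.sum(axis=1))[:, None]  # row scaling: match out-strengths
        est *= (in_s / est.sum(axis=0))[None, :]   # column scaling: match in-strengths
        if np.abs(est.sum(axis=1) - out_s).max() < tol:
            break
    return est

def null_std(W, W_hat):
    """Standard deviation from the hypergeometric variance, using W_hat as the mean."""
    out_s, in_s, total = W.sum(axis=1), W.sum(axis=0), W.sum()
    urn = (total - in_s)[:, None]  # marbles in the urn seen from node i
    var = W_hat * (urn - in_s[None, :]) / urn * (urn - out_s[:, None]) / (urn - 1)
    return np.sqrt(np.clip(var, 0.0, None))

# Hypothetical 4-province flow matrix (rows: origins, columns: destinations).
W = np.array([[0, 40, 10, 5],
              [35, 0, 20, 10],
              [120, 90, 0, 15],
              [60, 30, 25, 0]], dtype=float)
out_s, in_s = W.sum(axis=1), W.sum(axis=0)

mu = hypergeometric_mean(W)          # row sums match, column sums generally do not
prior = 1.0 - np.eye(len(W))         # assumed prior: ones off the diagonal
W_hat = ipfp(out_s, in_s, prior)     # matches both marginals, zero diagonal
sigma = null_std(W, W_hat)
```

Comparing `mu.sum(axis=0)` with `in_s` shows why the plain hypergeometric mean is insufficient here, whereas `W_hat` reproduces both strength sequences.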
A combination of two filters, the significance filter and the vigor filter, is then applied to eliminate links whose weights do not deviate significantly from expected values. The significance filter is based on the standardized deviation $\alpha_{ij} = (W_{ij} - \hat{W}_{ij}) / \hat{\sigma}_{ij}$ and removes links that meet the criterion $\alpha^{-} < \alpha_{ij} < \alpha^{+}$, where $\alpha^{-}$ and $\alpha^{+}$ are user-defined hyperparameters specifying the desired significance thresholds for negative and positive signed edges. Since the significance filter might be too permissive, retaining very weak links, we also consider a vigor filter, expressed as $\beta_{ij} = (W_{ij} - \hat{W}_{ij}) / (W_{ij} + \hat{W}_{ij})$, which takes values in the range $[-1, 1]$, with $\beta_{ij} = 0$ when $W_{ij} = \hat{W}_{ij}$. The vigor filter removes links that fall within the range $\beta^{-} < \beta_{ij} < \beta^{+}$, where $\beta^{+}$ and $\beta^{-}$ are user-defined hyperparameters determining the desired thresholds for positive and negative signed edges.
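The two filters can be sketched as below, assuming the significance score is the deviation from the null expectation in units of its standard deviation and the vigor is the difference between observed and expected weight divided by their sum. The matrices and the significance thresholds of ±1.96 are hypothetical values chosen for illustration only.

```python
import numpy as np

def signed_backbone(W, W_hat, sigma, alpha_neg, alpha_pos, beta_neg, beta_pos):
    """Keep a link only if it passes both the significance and the vigor filter."""
    off_diag = ~np.eye(W.shape[0], dtype=bool)
    diff = W - W_hat
    # Significance: deviation from the null expectation in standard deviations.
    alpha = np.divide(diff, sigma, out=np.zeros_like(diff), where=sigma > 0)
    # Vigor: relative deviation, bounded in [-1, 1].
    total = W + W_hat
    beta = np.divide(diff, total, out=np.zeros_like(diff), where=total > 0)
    B = np.zeros(W.shape, dtype=int)
    B[off_diag & (alpha >= alpha_pos) & (beta >= beta_pos)] = 1
    B[off_diag & (alpha <= alpha_neg) & (beta <= beta_neg)] = -1
    return B, alpha, beta

# Hypothetical observed weights, null expectations, and standard deviations.
W = np.array([[0, 50, 5], [10, 0, 30], [40, 14, 0]], dtype=float)
W_hat = np.array([[0, 20, 35], [22, 0, 18], [28, 24, 0]], dtype=float)
sigma = 4.0 * (1.0 - np.eye(3))

B, alpha, beta = signed_backbone(
    W, W_hat, sigma,
    alpha_neg=-1.96, alpha_pos=1.96,  # assumed significance thresholds
    beta_neg=-1 / 3, beta_pos=1 / 3,  # vigor thresholds of about 0.33
)
```

In this toy case the link from node 1 to node 2 is significant (`alpha = 3`) but is still dropped because its vigor (0.25) is below the threshold, which is the situation the vigor filter is meant to handle.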
The resulting signed backbone, extracted using both filters, displays features commonly linked to signed networks, including reciprocity, structural balance, and community structure. It is worth noting that extracting negative links from networks with positive edge weights relies on the assumption that links with small edge weights, or the absence of links, may represent negative connections. While this assumption may not apply to many networks, it is deemed acceptable in intrinsically dense networks, such as migration networks.
3.2.3. Network representation learning
Efficient analysis of network data relies on the chosen representation [40]. Traditional forms, like adjacency matrices, only capture node relationships with neighbors, presenting computational challenges for large-scale networks. To overcome this, high-dimensional data is reduced to a more manageable lower-dimensional structure. Network representation learning (RL), a dimensionality reduction technique [41], achieves this by mapping high-dimensional network data to a lower-dimensional vector space while preserving network topology. It minimizes redundancy and noise, retaining essential structural details. The acquired vertex representations must meet three conditions:
1. they should be of low dimensionality, much smaller than the size of the vertex set;
2. they should be informative, maintaining the proximity of vertices through consideration of network structure, attributes, and labels;
3. they should be continuous, with real values, facilitating tasks such as vertex classification and clustering.
The essential goal is to construct an embedding matrix $Z$ with dimensions $n \times d$, where $n$ represents the number of nodes, $d$ is the embedding size, and $z_i$ denotes the embedding vector for node $i$. This ensures that the similarity within the embedding space closely mirrors the similarity observed in the network. Formally, the task involves estimating the embedding matrix $Z$ while satisfying the condition $s_G(i, j) \approx s_Z(z_i, z_j)$, where $s_G$ is a similarity function in the observed network, and $s_Z$ is a similarity function in the latent space.
In our setting, we begin by determining the underlying signed backbone network, with $B_{ij} \in \{-1, 0, +1\}$ representing the similarity in the observed network. For measuring similarity in the latent space, we opt for cosine similarity, whose natural range of $[-1, 1]$ aligns with the range of the input data. Once the similarity functions in both the observed space and the latent space are defined, we formulate the loss function $L$, given in Equation (4), which measures the discrepancy between $B_{ij}$ and the cosine similarity of the corresponding embedding vectors over all pairs of distinct nodes.
Stochastic gradient descent [42] is employed to find the embedding matrix $Z$ that minimizes $L$. The process involves initializing $Z$ with random values, iterating over the non-diagonal elements of $B$ in a randomized order, calculating the gradients for $z_i$ and $z_j$, and updating them using a learning rate $\eta$. Each iteration over all elements of $B$ is termed an epoch, and this process is repeated for 100 epochs for each network in a given year.
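The optimization loop can be illustrated with the sketch below. It assumes a squared-error loss between each backbone sign and the cosine similarity of the two embedding vectors, which is one natural choice but is not confirmed by the text. The toy backbone, the embedding size, and the learning rate are likewise hypothetical.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def loss(Z, B):
    """Assumed loss: squared error between B[i, j] and cos(z_i, z_j) over i != j."""
    n = B.shape[0]
    return sum((B[i, j] - cosine(Z[i], Z[j])) ** 2
               for i in range(n) for j in range(n) if i != j)

def fit_embedding(B, dim=2, epochs=100, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n = B.shape[0]
    Z = rng.normal(size=(n, dim))  # random initialization
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for _ in range(epochs):
        for k in rng.permutation(len(pairs)):  # randomized order within an epoch
            i, j = pairs[k]
            zi, zj = Z[i].copy(), Z[j].copy()
            ni, nj = np.linalg.norm(zi), np.linalg.norm(zj)
            c = zi @ zj / (ni * nj)
            err = c - B[i, j]
            # Gradients of (cos(z_i, z_j) - B_ij)^2 with respect to z_i and z_j.
            grad_i = 2 * err * (zj / (ni * nj) - c * zi / ni**2)
            grad_j = 2 * err * (zi / (ni * nj) - c * zj / nj**2)
            Z[i] -= lr * grad_i
            Z[j] -= lr * grad_j
    return Z

# Toy signed backbone: nodes {0, 1} and {2, 3} attract internally and repel each other.
B = np.array([[0, 1, -1, -1],
              [1, 0, -1, -1],
              [-1, -1, 0, 1],
              [-1, -1, 1, 0]])
Z = fit_embedding(B)
```

Because cosine similarity ignores vector length, only the directions of the rows of `Z` carry information, which is why the later clustering step works with cosine-based distances.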
The optimization process may converge to diverse yet equally plausible solutions, stemming from the random initialization of $Z$ or the intrinsic stochastic nature of the optimization procedure. Notably, any rotation or reflection of the determined solution yields identical information. Consequently, a direct comparison of embedding vectors between different time steps is not feasible. Optimal rotations and/or reflections might exist between embeddings of different years, preserving cosine similarity while altering the spatial orientation. This is known as the Procrustes problem [43], extensively discussed by Gursoy & Badur [34].
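As an illustration of the alignment issue, the sketch below solves the orthogonal Procrustes problem with a singular value decomposition. The embeddings are synthetic, and the year labels in the variable names are only suggestive; it is not stated in the text that the authors align embeddings this way.

```python
import numpy as np

def align_embeddings(Z_source, Z_target):
    """Find the orthogonal R minimising ||Z_source @ R - Z_target||_F and apply it."""
    U, _, Vt = np.linalg.svd(Z_source.T @ Z_target)
    R = U @ Vt  # rotation and/or reflection
    return Z_source @ R, R

def cosine_matrix(Z):
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    return Zn @ Zn.T

rng = np.random.default_rng(1)
Z_2002 = rng.normal(size=(6, 2))  # synthetic embedding for one year

# Same configuration, rotated by 70 degrees and reflected.
theta = np.deg2rad(70)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
Q = rotation @ np.diag([1.0, -1.0])
Z_2023 = Z_2002 @ Q

Z_aligned, R = align_embeddings(Z_2023, Z_2002)
```

The two embeddings have identical cosine similarity matrices even though their coordinates differ, and the alignment recovers the original coordinates exactly in this noise-free case.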
3.2.4. Community detection
In the last step, we address the objective of partitioning network vertices into clusters, where vertices within the same cluster display dense connections while having fewer connections to vertices from other clusters [44]. These cluster structures, often referred to as communities, are prevalent in a diverse range of networked systems and carry significant implications. In migration networks, clusters represent provinces with concentrated migration flows. Analyzing spatial interactions aids in understanding dependencies, cultural exchanges, and migration's impact on interconnected areas, guiding policy decisions and resource allocation while helping to anticipate future migration trends. The literature offers a variety of network clustering algorithms that use different metrics to measure similarity or the strength of connections between vertices [45,46,47,48]. In this context, to understand the community structures in the network, we applied two methods to the learned representations: agglomerative hierarchical clustering [49] and density-based spatial clustering of applications with noise (DBSCAN) [50]. DBSCAN groups data points based on density, forming clusters and marking outliers. Its effectiveness relies on two hyperparameters: eps, the neighborhood radius, and minSamples, the minimum number of objects required to form a dense region. Adjusting these parameters is crucial: smaller eps values yield tighter clusters and leave more points marked as noise, while larger values may merge distinct clusters. Similarly, raising minSamples demands denser regions, which generally yields fewer but more robust clusters and more noise points, whereas lowering it produces more clusters and greater sensitivity to noise.
In the hierarchical method, we measure the dissimilarity between two clusters by considering the maximum distance between their respective data points, using the complete linkage method [49].
The distance between two provinces, $i$ and $j$, is expressed as $d_{ij} = 1 - \cos(z_i, z_j)$, where cosine similarity quantifies the similarity in the latent space. Consequently, we derive distances for all province pairs within the $[0, 2]$ range. These distance values serve as inputs for our clustering analysis.
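A minimal sketch of the hierarchical step is given below, assuming the distance is one minus the cosine similarity of the embedding vectors and using SciPy's complete-linkage implementation. The six two-dimensional embeddings and the choice of two clusters are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cosine_distance_matrix(Z):
    """Pairwise distances 1 - cos(z_i, z_j), which lie in [0, 2]."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    D = np.clip(1.0 - Zn @ Zn.T, 0.0, 2.0)
    D = (D + D.T) / 2.0       # remove floating-point asymmetry
    np.fill_diagonal(D, 0.0)
    return D

# Hypothetical embeddings: two groups of directions with different vector lengths.
angles = np.deg2rad([10, 15, 20, 100, 105, 110])
radii = np.array([1.0, 2.0, 0.5, 1.5, 1.0, 3.0])
Z = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

D = cosine_distance_matrix(Z)
tree = linkage(squareform(D), method="complete")     # complete linkage
labels = fcluster(tree, t=2, criterion="maxclust")   # cut into two clusters
```

Vector length does not affect the result, since the distance depends only on direction. Cutting the tree at a different level (a different `t`) yields coarser or finer partitions.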
4. Results of internal migration among Italian provinces during the period 2002–2023
In this section, we summarize the key findings and insights obtained from the analysis of internal migration among Italian provinces during the period 2002–2023. We first present the outcomes of the overall network analysis, followed by a detailed exploration of the spatial dynamics characterizing the internal flows of both high-skilled and low-skilled migrants, further divided by age groups (young, adult, and elderly). The analyses were carried out with tools implemented in the R programming environment and Python.
4.1. Findings across the overall network
In delineating the characteristics of the overall original networks, we take into account the numerical representation of migrants entering and exiting a province, denoted as the in-strength and out-strength of a node, respectively. These metrics condense the total influx and outflow of migrants, portraying a comprehensive picture of the migratory trends within each province, as displayed in Fig. 1.
It can be argued that there is significant diversity in node strengths, with a small number of cities experiencing both large inflows and large outflows. These results align with the observation that the majority of real-world networks display a multiscale nature in their node strengths. Additionally, they are consistent with the migration law suggesting that the magnitudes of migration flows between sending and receiving locations within migration systems frequently follow a power-law distribution.
Next, we extract the signed backbone of the network, maintaining a desired level of sparsity based either on the statistical significance of the links (using the significance filter) or on the strength of the links (via the vigor filter), as specified in Section 3.2.2. We obtain the signed backbones using the significance filter, preserving 5–100 % of all potential links, and assess the backbone structure in relation to reciprocity and structural balance. The key features of migration backbones are depicted in Fig. 2. In particular, Fig. 2a highlights the percentage of reciprocated links, both positive–positive and negative–negative, and the proportion of conflicting links relative to all links. The visualization reveals that a majority of signed links are reciprocated by links of the same sign. Conflicting links are rare even when the backbone is considerably large. These results suggest that the signed links in our migration networks consistently adhere to the principle of reciprocity.
Fig. 2. Characteristics of migration backbones: Reciprocity (a) and Structural Balance (b).
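The reciprocity measures discussed above can be computed along the following lines. The signed backbone `B` is a made-up example, and the shares are taken relative to all links, which is one reading of the description in the text.

```python
import numpy as np

def reciprocity_shares(B):
    """Share of links reciprocated with the same sign, and share with the opposite sign."""
    links = B != 0
    n_links = links.sum()
    same_sign = (links & (B == B.T)).sum()   # positive-positive or negative-negative
    conflict = (links & (B == -B.T)).sum()   # positive one way, negative the other
    return same_sign / n_links, conflict / n_links

# Hypothetical signed backbone with entries in {-1, 0, +1}.
B = np.array([[0, 1, -1],
              [1, 0, 0],
              [1, -1, 0]])
same_share, conflict_share = reciprocity_shares(B)
```

Here two of the five links form a positive–positive pair, two form a conflicting pair, and one negative link is not reciprocated at all.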
It is worth noting that most empirical signed backbones generally exhibit some level of structural balance, succinctly captured by the phrases “friend of a friend is a friend” and “enemy of a friend is an enemy”. In a triadic relationship (three nodes connected by edges), structural balance occurs when the relationships are either all positive or two negatives and one positive [51]. This principle is based on the idea that balanced relationships contribute to stability within a network. Weak structural balance relaxes this requirement by also treating triads with three negative relationships as balanced, so that only triads with exactly one negative relationship (and two positive ones) count as unbalanced. This concept acknowledges that complete structural balance may be overly restrictive in certain real-world scenarios [52]. Therefore, structural balance and weak structural balance in a signed network are measured as the proportion of balanced triads over all triads under the respective definition.
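A small sketch of the balance computation follows. It assumes a symmetric (undirected) signed matrix and the usual convention in which a triad with exactly one negative link is the only unbalanced case under weak balance. How the directed backbone is symmetrized is not specified in the text, and the example matrix is hypothetical.

```python
import numpy as np
from itertools import combinations

def balance_ratios(S):
    """Return (structural balance, weak structural balance) over fully connected triads."""
    strong = weak = total = 0
    for i, j, k in combinations(range(S.shape[0]), 3):
        signs = (S[i, j], S[j, k], S[i, k])
        if 0 in signs:  # skip triples that are not fully connected
            continue
        total += 1
        negatives = sum(s < 0 for s in signs)
        strong += negatives in (0, 2)  # +++ or +-- triads
        weak += negatives != 1         # additionally accepts --- triads
    if total == 0:
        return float("nan"), float("nan")
    return strong / total, weak / total

# Hypothetical symmetric signed network on five nodes.
S = np.array([[0, 1, -1, -1, 1],
              [1, 0, -1, -1, -1],
              [-1, -1, 0, -1, 0],
              [-1, -1, -1, 0, 0],
              [1, -1, 0, 0, 0]])
strong_balance, weak_balance = balance_ratios(S)
```

Of the five connected triads in this example, two are balanced in the strict sense, two are all-negative (balanced only in the weak sense), and one has a single negative link and is unbalanced under both definitions.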
The findings presented in Fig. 2b provide compelling evidence of a predominant structural balance within our network nodes, signalling a noteworthy phenomenon attributed to positive relationships. This structural balance is not confined solely to districts in close geographical proximity; rather, it extends its influence across regions with shared migration destinations spanning considerable distances. This observation gains further significance when considering the intricate interplay of positive relationships among nodes. The interconnectedness of three neighboring districts serves as a prime example, where a dynamic exchange of migration occurs among them. Moreover, these districts collaboratively contribute to a specific well-developed province, establishing a network pattern that transcends local boundaries and embraces a broader regional perspective. The underlying dynamics that contribute to this structural balance are multifaceted. Beyond geographical proximity, the shared migration destinations over extended distances signify a network influenced by factors such as economic ties, employment opportunities, or cultural affinities.
We proceed by extracting the backbones that comprise the most significant and intense links. Specifically, we preserve the top 7.5 % of links in terms of significance and subsequently discard any links with absolute vigor values less than 0.33.
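One way to implement this two-stage selection is sketched below: the significance cutoff is set at the quantile that keeps the most significant 7.5 % of ordered node pairs, and links with absolute vigor below 0.33 are then discarded. The synthetic weights, the simplified significance score, and the use of a quantile cutoff are assumptions made for illustration.

```python
import numpy as np

def top_share_backbone(alpha, beta, share=0.075, min_vigor=0.33):
    """Keep the `share` most significant links, then drop those with low absolute vigor."""
    off_diag = ~np.eye(alpha.shape[0], dtype=bool)
    cutoff = np.quantile(np.abs(alpha[off_diag]), 1.0 - share)
    keep = off_diag & (np.abs(alpha) >= cutoff) & (np.abs(beta) >= min_vigor)
    return np.where(keep, np.sign(alpha), 0).astype(int)

# Synthetic example: expected weights and Poisson-distributed observed weights.
rng = np.random.default_rng(42)
n = 20
W_hat = rng.uniform(5.0, 50.0, size=(n, n))
W = rng.poisson(W_hat).astype(float)

alpha = (W - W_hat) / np.sqrt(W_hat)  # simplified significance score (assumed)
beta = (W - W_hat) / (W + W_hat)      # vigor

B = top_share_backbone(alpha, beta)
kept = B != 0
```

The diagonal is ignored by construction, so self-loops never enter the backbone regardless of the values stored there.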
We selected the vigor threshold to capture and emphasize strong positive relationships while limiting the influence of weak negative connections: with a threshold of 0.33, the weight of a retained positive link is at least twice the random expectation, while the weight of a retained negative link is at most half of the random expectation.
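The link between the 0.33 threshold and the "twice" and "half" statements can be checked with a short calculation, assuming the vigor is the difference between observed and expected weight divided by their sum. The symbol $\kappa_{ij}$ for the ratio of observed to expected weight is introduced here only for convenience.

```latex
\beta_{ij} = \frac{W_{ij} - \hat{W}_{ij}}{W_{ij} + \hat{W}_{ij}}
           = \frac{\kappa_{ij} - 1}{\kappa_{ij} + 1},
\qquad \kappa_{ij} = \frac{W_{ij}}{\hat{W}_{ij}}

\kappa_{ij} = 2 \;\Rightarrow\; \beta_{ij} = \frac{2 - 1}{2 + 1} = \frac{1}{3} \approx 0.33,
\qquad
\kappa_{ij} = \tfrac{1}{2} \;\Rightarrow\; \beta_{ij} = \frac{\tfrac{1}{2} - 1}{\tfrac{1}{2} + 1} = -\frac{1}{3} \approx -0.33
```

Since $\beta_{ij}$ increases with $\kappa_{ij}$, requiring $|\beta_{ij}| \geq 0.33$ keeps positive links with $\kappa_{ij} \geq 2$ and negative links with $\kappa_{ij} \leq 1/2$.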
Fig. 3 and Fig. 4a depict the migration backbones for the years 2002 and 2023, respectively. The node sizes in these graphs are scaled in proportion to the population of each province in the respective year, giving a visual impression of the shifts in population distribution over the period.
Fig. 4. Spatial overall network of migration backbone in 2023 (all educational levels and age classes).
Within both figures, we observe a network of positive connections at the local level, accompanied by links that span geographically distant locations. Notably, these positive connections tend to converge towards districts distinguished by significant economic and social activities, exhibiting a discernible South-North directional trend.
These findings substantiate the existence of clearly defined migration routes: migration occurs predominantly at a local level, and long-range migrations typically revolve around major areas of substantial economic activity.
Given that not all nodes in a network contribute equally to determining the network structure, we gain insight into the diverse significance of nodes in shaping the network structure through the estimation of centrality indices (Fig. 5). Specifically, for each node, we calculated the degree, betweenness, and closeness metrics. The degree metric denotes the number of direct connections that a node has within the network. In our specific analysis, Milan, Rome, Turin, Naples, and Florence emerge as key nodes in the migration network, highlighting their significant connectivity with other provinces. These crucial urban centers stand out for their high degree values, indicating a broad network of direct connections with other locations. This phenomenon suggests a strong allure and socio-economic relevance in terms of migration flows. Rome, as the capital, stands as a political and cultural epicenter, while Milan reaffirms its status as a prominent national and international economic and industrial hub. Naples, with its strategic position in the South, and Florence, rich in history and culture, play key roles in the country's social and economic fabric. Turin, with its industrial history and position in the northwest, significantly contributes to the migration network, consolidating its status as a prominent urban centre in the Italian economy.
Fig. 5. Centrality measures of the overall network (year 2023).
Regarding betweenness, which measures how often a node lies on the shortest paths between pairs of other nodes, Milan, Rome, Naples, and Turin emerge as crucial nodes within the migration network. These cities play a pivotal role in connecting diverse provinces, facilitating the movement of people across geographical boundaries. Their strategic positioning and extensive connectivity underscore their contribution to shaping migration patterns and fostering socio-economic integration across Italy. Moreover, the examination of closeness confirms the importance of Rome, Milan, and Naples in the overall migration network: they have the highest values for this metric, meaning that they are separated from the other provinces by the shortest paths on average. This suggests that these urban centers are well positioned within the network to reach a wide range of other nodes efficiently. On the other hand, provinces like Vibo-Valentia, Verbania-Cusio-Ossola, and Nuoro occupy positions at the periphery of the network in terms of the above-mentioned metrics.
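The three centrality indices can be computed with NetworkX as in the sketch below. The edge list is a made-up stand-in for the positive links of a backbone and does not reproduce the actual network. Note also that for directed graphs NetworkX measures closeness using incoming distances.

```python
import networkx as nx

# Hypothetical positive backbone links (origin -> destination).
edges = [
    ("Naples", "Milan"),
    ("Bari", "Milan"),
    ("Palermo", "Milan"),
    ("Milan", "Rome"),
    ("Bari", "Naples"),
]
G = nx.DiGraph()
G.add_edges_from(edges)

degree = dict(G.degree())                   # number of direct connections (in + out)
betweenness = nx.betweenness_centrality(G)  # share of shortest paths through a node
closeness = nx.closeness_centrality(G)      # inverse average distance to a node

top_by_metric = {
    "degree": max(degree, key=degree.get),
    "betweenness": max(betweenness, key=betweenness.get),
    "closeness": max(closeness, key=closeness.get),
}
```

In this toy graph Milan ranks first on all three indices, while a province with a single outgoing link, such as Palermo, sits at the periphery.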
For a more targeted analysis, we also investigate the ego networks of the most prominent provinces.
In the year 2002, we depict the ego networks of Milan, Rome, and Naples in Fig. 3b, c, and d, respectively.
The three districts under analysis generally exhibit negative connections (absence of links or small edge weights) to their geographically farther neighbors; by contrast, there are mostly local positive links with the provinces of the region. For example, Rome's ego network draws long-range links from the southern regions of the country, primarily from Basilicata and Calabria. Simultaneously, Naples' individual network establishes long-range positive connections with areas in the North-East and North-West of Italy.
These connections persist unchanged in the ego networks related to the year 2023, indicating continuity in the patterns of connections observed during the analysis period.
Further unreported investigations suggest that the overall migration backbone has not notably changed from 2002 to 2023, since the results remain consistent and stable over this extended timeframe.
Proceeding to our examination of clustering outcomes, we utilize the DBSCAN algorithm, configuring its hyperparameters, with eps and min_samples set to 0.14 and 3, respectively. Initial experiments revealed that this configuration, along with similar setups, produces satisfactory clustering results. The identified communities from the DBSCAN partition are visualized in Fig. 6, with distinct colours representing various clusters. This visualization offers a comprehensive illustration of the spatial distribution and relationships identified within the network. Provinces not affiliated with any cluster are depicted in light blue and labeled as −1 in the legend.
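A sketch of this configuration with scikit-learn is shown below, using the reported values eps = 0.14 and min_samples = 3 on a precomputed distance matrix. The embeddings are synthetic, and the use of one minus cosine similarity as the distance is an assumption carried over from the methodology.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic unit-length embeddings: two tight groups of directions plus one isolated point.
angles = np.deg2rad([10, 15, 20, 100, 105, 110, 200])
Z = np.column_stack([np.cos(angles), np.sin(angles)])

# Cosine distance matrix with values in [0, 2].
Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
D = np.clip(1.0 - Zn @ Zn.T, 0.0, 2.0)
np.fill_diagonal(D, 0.0)

model = DBSCAN(eps=0.14, min_samples=3, metric="precomputed")
labels = model.fit_predict(D)  # -1 marks points not assigned to any cluster
```

The isolated point receives the label -1, mirroring the provinces that are not affiliated with any cluster in the figure legend.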
In 2023, a total of 12 clusters were identified, marking an increase of 2 clusters compared to the year 2002. Upon closer scrutiny of the two sets of clusters, it is apparent that the clusters associated with Sicily, Sardinia, and the Adriatic spine exhibit consistency and stability. Overall, our analysis underscores that the formation of communities is significantly shaped by geographical proximity. The findings imply that districts in close proximity tend to cluster together, irrespective of variables such as population density or economic activity.
The outcomes of hierarchical clustering are presented in Fig. 7. Cutting the dendrogram at an appropriate height, we observe that the findings from both clustering methods show a certain degree of agreement. Once again, geographical proximity emerges as a significant factor in delineating communities. Consequently, we can assert that the conclusions drawn from the clustering methods are robust.
These findings allow us to explore in greater depth how different areas cluster based on similar socio-economic and cultural characteristics. The analysis of community structures reveals that the clustering observed in both years highlights significant regional patterns in internal migration flows, providing insights into population movement dynamics across Italy. In the northern provinces of Piemonte, Lombardy, and Veneto, a high degree of interconnectedness emerges, indicating robust migration patterns driven by economic opportunities and a thriving job market. This interconnectedness not only facilitates the movement of individuals seeking better employment but also strengthens regional economies by diversifying the labour pool. In central Italy, particularly around Rome, communities exhibit a mix of local connections and long-distance migrations. The capital's cultural and economic significance plays a crucial role in shaping migration dynamics, attracting individuals from various provinces drawn by educational opportunities, cultural institutions, and a dynamic job market. This interplay between local and external influences fosters a unique social fabric where traditional ties coexist with new, diverse interactions.
Conversely, clustering patterns in the southern regions, including Calabria and Sicily, reveal substantial outflows, underscoring persistent challenges such as economic disadvantage, limited job prospects, and inadequate infrastructure. The emigration of skilled labour to more prosperous northern areas not only depletes local talent but also exacerbates socio-economic inequalities, hindering these communities' growth. This ongoing trend underscores the need for targeted policies aimed at revitalizing these regions and retaining local talent.
Additionally, the results highlight the unique status of the islands, such as Sardinia and Sicily, which display patterns of isolation from the mainland. This geographic separation may reflect specific socio-economic conditions, such as restricted access to resources and opportunities, which influence migration flows differently from the rest of Italy. The islands often struggle with transportation challenges, economic development constraints, and limited employment prospects, leading to distinctive migration patterns that require tailored policy interventions.
4.2. Internal flows of high-skilled and low-skilled migrants
The previous section offers a comprehensive overview of the structure of migration flows across Italian provinces, highlighting their stability over the past few decades. Unlike many other high-income countries, which have experienced a decline in internal mobility [53], Italy's migration flows have remained relatively constant [54].
However, this stability results from two opposing trends that emerge when internal migrants are categorized by educational level. While the share of high-skilled migrants within total migration has been steadily increasing, the share of low-skilled individuals has been decreasing. Specifically, the proportion of migrants with at least a secondary education reached 66.5 % in 2023 (up from 43.5 % in 2002), while the proportion of low-skilled migrants declined from 56.5 % to 33.5 %. The increasing skill intensity of Italian interprovincial migration, marked by a growing share of high-skilled individuals, modifies the spatial distribution of skills and knowledge, with significant economic consequences for both origin and destination provinces. When these migration flows, which embody higher human capital, are directed towards wealthier areas, the resulting dynamics contribute to a rise in territorial disparities.
Therefore, given the importance of considering migrants' educational levels, we focus on the network structure of two subgroups of internal migrants, categorized by their education levels into high-skilled and low-skilled. In other words, the same empirical approach used to analyze total migration flows is applied to investigate the migration patterns of these two groups.
The migration flow networks in Italy for high-skilled and low-skilled migrants in the years 2002 and 2023 are displayed in the Appendix. The networks of high-skilled and low-skilled individuals largely overlap. When comparing the networks from 2002 to 2023, northern provinces such as Milan, Turin, and Bologna continue to serve as key hubs, maintaining strong intra-provincial connections. Specifically, positive local connections (short-distance mobility) emerge across the country, particularly between neighboring provinces often within the same regional (NUTS-2) context. Non-local positive links, however, are detected for long-distance migration from the South to the Centre-North. This latter trend highlights the dramatic direction of Italian internal flows, which clearly follow an asymmetric and unbalanced pattern with a heavy concentration in the central-northern provinces. This indicates both a loss of people with lower levels of education and a constant outflow of human capital from the Mezzogiorno.
According to Biagi et al. [55], long-distance migration in Italy is mainly driven by labor market conditions and economic imbalances, following a disequilibrium model where initial differences in wages and unemployment push people to move to more developed areas, ultimately helping to restore regional balance. In contrast, short-distance migration, particularly from large northern cities to their hinterlands or nearby provinces, is motivated by quality-of-life factors, supporting an equilibrium model in which wage differences partially compensate for spatial variations in non-economic factors [55].
As anticipated, the loss of human capital has significant economic consequences for the Italian dualism. According to established literature, the substantial share of high- and medium-skilled individuals (i.e., human capital) who choose to relocate annually from the South of Italy to the Centre-North contributes to widening the economic disparities between the two macroregions [15,56]. It is important to highlight that, from 2002 to 2023, the South has experienced a loss of 1,295,763 individuals with at least a secondary education, corresponding to an average annual outflow of 58,898 high-skilled individuals. In contrast, during the same period, the opposite flow (from the central-northern provinces to the South) amounted to 557,777 individuals (25,353 per year), indicating, as expected, a negative balance for the Mezzogiorno, which continues to lose a significant share of tertiary and secondary educated individuals in favor of the Centre-North.
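The annual averages quoted above follow from dividing the cumulative flows by the 22 annual observations from 2002 to 2023. The net balance below is derived from the two reported totals and is not a figure stated in the text.

```latex
\frac{1{,}295{,}763}{22} \approx 58{,}898 \ \text{per year (South to Centre-North)},
\qquad
\frac{557{,}777}{22} \approx 25{,}353 \ \text{per year (Centre-North to South)}

1{,}295{,}763 - 557{,}777 = 737{,}986,
\qquad
\frac{737{,}986}{22} \approx 33{,}545 \ \text{per year (net loss for the South)}
```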
In summary, the socio-economic backwardness of the southern provinces and the resulting out-migration of high-skilled individuals contribute to the impoverishment of the Mezzogiorno in terms of its human capital base. This, in turn, slows down the economic development of the region and exacerbates the North-South divide. This situation represents a crucial challenge for Italian policymakers, who must address and potentially reverse these detrimental dynamics. Among the current regional development policies in Italy, measures specifically targeted at the labour market and entrepreneurship focus on strengthening active labour market policies through the Garanzia di Occupabilità dei Lavoratori (GOL) Program, which involves significant investments in training, upskilling, and reskilling for unemployed individuals and young people, thereby ensuring uniform service delivery across the country. In addition to this program, initiatives such as Garanzia Giovani have been introduced to enhance employability by supporting young people at risk of inactivity, while also countering those firms that misuse the scheme to hire qualified personnel at extremely low costs. Additional instruments, such as the ‘Fondo per il credito ai giovani’ (also known as ‘Fondo studio’), have been established to promote access to credit, sustainability, and transparency for Italian families, thereby enabling deserving young people with insufficient financial resources to pursue further education or complete their studies.
Furthermore, it is important to consider potential counterarguments to our findings. Recent theoretical contributions suggest that the prospect of emigration, particularly for highly educated individuals, can act as an incentive to invest in education, as the potential for higher wages abroad increases the returns on education [57,58]. If only a small portion of those who invest in education ultimately decide to emigrate, the sending region would benefit from a more skilled workforce, potentially driving productivity, innovation and long-term development. Additional positive feedbacks of migration on the origin economy, as highlighted by Rapoport [59], include remittances (whose amount tends to decrease as the level of education increases [60,61]) and the return of migrants who have acquired new skills abroad, even though it is generally acknowledged that return migration is not particularly significant among highly skilled individuals. Moreover, migration facilitates the creation of networks that can enhance trade, investment flows, and the diffusion of knowledge.
In addition to these potential positive effects for the sending regions, the in-flows of high-skilled individuals to the Centre-North produce several beneficial consequences. One key benefit is the promotion of innovation and the creation of new knowledge. According to endogenous growth theory, higher high-skilled immigration, by enhancing local human capital, represents a crucial engine of knowledge, technology and innovation, with positive consequences for long-run economic growth [62,63]. Additionally, by contributing to the productivity of existing workers, high-skilled migrants lead to wage increases for both skilled and less-skilled labour in the region [64,65]. They also enhance the region's competitiveness by stimulating investment in new technologies and fostering knowledge transfer through professional networks [66].
By 2023, however, connections involving southern provinces, particularly between Naples and Bari, have strengthened significantly, indicating a growing attractiveness of these regions. Moreover, the overall network appears more dispersed, reflecting a diversification in migration patterns. To better understand these dynamics, we also analyze migration patterns by skill level and age group, highlighting the specific trends that shape Italy's migration landscape. The migration of both high-skilled and low-skilled young and adult individuals continues to show a trend of movement from southern provinces to the North. However, while migration in 2002 was predominantly directed northward, by 2023 the data suggest a shift toward more diversified destination choices, with Rome emerging as a key hub for both groups of migrants. Notably, the movement of older individuals follows distinct routes, rather than a simple northward trajectory. This indicates that Italy's migration network is more interconnected than previously recognized, influenced not only by skill level but also by age-specific factors and changing destination preferences. Overall, the interconnectedness of these migration networks reflects broader socio-economic dynamics, underscoring the importance of understanding the motivations behind migration across various demographic groups. The persistence of these patterns highlights the ongoing challenges and opportunities faced by regions such as Sicily and Sardinia in retaining young talent, while also revealing the evolving nature of migration in Italy. The network representations provided in the Appendix offer a detailed perspective on these trends, enabling a more comprehensive visualization of the shifts observed over time.
5. Conclusions
This study examined the structure and dynamics of internal migration at the provincial level in Italy from 2002 to 2023, using network analysis to investigate and compare migration flows over time. A directed, weighted network was built to map migration flows, with backbone extraction highlighting key connections. Network embedding simplified migration patterns, while clustering identified migration communities. The analysis confirms a power-law distribution in migration, with only a few cities experiencing both high inflows and outflows. The extracted backbone reveals stable migration relationships, characterized by reciprocity and structural balance. Migration flows show strong local connections extending toward economically active districts, reinforcing a South-to-North trend. Rome, Milan, Naples, Turin, Bologna, and Florence act as key migration hubs, linking regions and shaping national mobility. Policies should target these nodes to reduce regional disparities, fostering university-business collaborations to drive innovation and job creation. Southern regions, particularly Calabria and Sicily, face persistent population losses, especially among highly educated individuals, due to limited job opportunities and infrastructure deficits, fueling migration to the Centre-North.
To mitigate this ‘brain drain’ phenomenon, it is crucial to invest in education and professional training, ensuring that local workers develop skills aligned with market demands. Providing tax incentives to businesses that hire locally can stimulate job creation and encourage talent retention. At the same time, improving infrastructure and public services would enhance the quality of life, making these regions more attractive for residents. Supporting startups and leveraging local resources can further drive economic growth, creating new opportunities for young professionals. Strengthening collaborations between universities and industries would also help graduates secure internships and job placements, fostering a more balanced regional development and reducing the gap between northern and southern Italy.
The analysis reveals that migration communities are strongly influenced by geographic proximity, with nearby districts tending to cluster together. Both hierarchical clustering and DBSCAN support this finding, highlighting the significance of proximity. These results are important for migration policy design, as backbone extraction and clustering techniques offer valuable insights. Backbone extraction helps identify key migration routes, which is useful for addressing uneven human capital distribution across regions. Targeted interventions can support high-emigration areas, such as investing in labor markets, education, and infrastructure. Clustering analysis further refines this by revealing migration communities that share persistent ties, offering a foundation for coordinated regional policies. For instance, the North forms distinct clusters, suggesting economic integration policies should be supra-regional. In the South, migration flows towards the Centre-North highlight the need to address regional disparities, promote local growth, and improve employment. Monitoring these patterns over time helps evaluate policy effectiveness and identify emerging migration hubs, ensuring more informed and proactive migration strategies.
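The backbone-extraction step discussed above can be illustrated with a minimal sketch. The version below uses the disparity filter, one widely used backbone method, chosen here only for illustration; it is not necessarily the method applied in this study. The function name `disparity_backbone`, the significance level of 0.05, and the toy flows between the placeholder provinces P1 to P5 are all assumptions introduced for the example.

```python
from collections import defaultdict


def disparity_backbone(flows, alpha=0.05):
    """Keep the links of a directed, weighted network that pass the disparity filter.

    flows: dict mapping (origin, destination) -> flow weight (hypothetical input format).
    A link is kept if it is significant from the origin's point of view (its share
    of the origin's out-flow) or from the destination's point of view (its share
    of the destination's in-flow).
    """
    out_strength, in_strength = defaultdict(float), defaultdict(float)
    out_degree, in_degree = defaultdict(int), defaultdict(int)
    for (origin, dest), weight in flows.items():
        out_strength[origin] += weight
        in_strength[dest] += weight
        out_degree[origin] += 1
        in_degree[dest] += 1

    def p_value(weight, strength, degree):
        # Null model: a node's strength is split uniformly at random over its links.
        # A node with a single link carries no information, so it is never significant.
        if degree <= 1:
            return 1.0
        return (1.0 - weight / strength) ** (degree - 1)

    backbone = {}
    for (origin, dest), weight in flows.items():
        p_out = p_value(weight, out_strength[origin], out_degree[origin])
        p_in = p_value(weight, in_strength[dest], in_degree[dest])
        if min(p_out, p_in) < alpha:
            backbone[(origin, dest)] = weight
    return backbone


# Toy, made-up flows: P1 sends most of its migrants to P2.
toy_flows = {
    ("P1", "P2"): 900,
    ("P1", "P3"): 10,
    ("P1", "P4"): 10,
    ("P1", "P5"): 10,
    ("P3", "P1"): 5,
    ("P4", "P1"): 5,
    ("P5", "P1"): 5,
}
backbone = disparity_backbone(toy_flows, alpha=0.05)
print(backbone)
```

In this toy case only the dominant P1 to P2 link survives, because it accounts for a share of P1's out-flow that the uniform null model is very unlikely to produce. Lowering `alpha` makes the filter stricter, so the resulting backbone is sparser.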
We further refined the analysis by categorizing internal migrants according to their education level and age group. Both high and low-qualification groups showed similar migration patterns, with strong local connections and long-distance movements from the South to the Centre-North. This trend leads to a concentration of migration in the central-northern provinces, contributing to economic disparities, particularly the South's loss of human capital. Additionally, older age groups followed migration routes that diverged from the typical northward trend, suggesting a more integrated migration network influenced by education, age, and changing preferences.
In any case, the persistent outflow of human capital from the South calls for policies that encourage the retention of talent in the region through investments in research, innovation, and entrepreneurial development. Moreover, the quality of life in southern regions plays a crucial role in attracting and retaining human capital. Policies aimed at reducing crime, improving infrastructure, and strengthening public services (e.g., education, healthcare, transportation) can make these areas more attractive to skilled workers and investors [12,20].
Given the strong relationship between university choice and subsequent job location, improving the quality of universities in the South could help reduce the out-flows of students and graduates, fostering greater local human capital accumulation [13]. In this regard, enhancing academic and research infrastructures, establishing technological hubs, diversifying university offerings and promoting excellence in specific fields could increase the competitiveness of the Mezzogiorno. These interventions could also incentivize the return of skilled workers (or reduce the need to migrate) and attract foreign investments.
Strengthening the connections between universities and the local labour market is essential to ensure that acquired skills translate into professional opportunities within the region. Encouraging collaboration between universities and businesses, particularly in innovative and high-tech sectors, could create qualified job opportunities and promote talent retention in the territory [11,13].
Stimulating the demand for skilled labour in the Mezzogiorno through tax incentives for companies hiring graduates could help mitigate brain drain. Furthermore, support measures for startups and innovation could foster an ecosystem more conducive to the growth of knowledge-intensive sectors [67].
Investments in transportation infrastructure and digital connectivity can facilitate South working and reduce the disadvantages of working in peripheral regions. Additionally, measures to improve access to information on local opportunities could help counteract the forced migration of talent [20,68].
Migration not only reshapes labour markets but also plays a crucial role in knowledge transfer and regional development. The movements of communities (i.e., diasporas) intersect with labour migration and act as transnational knowledge bridges, connecting host and home regions and allowing them to obtain benefits ranging from international trade and foreign direct investment (FDI) to international innovation flows [69]. In the Italian context, policies aimed at fostering co-inventorship, international mobility and intercultural networks could enhance and strengthen the innovation ecosystem in underdeveloped areas [69]. Additionally, the bridging role of social capital, both within emigrant communities and in their interactions with host regions, could be strengthened through targeted initiatives, such as professional networking platforms, startup incubators, and incentives for return migration.
These targeted policy strategies are especially crucial given the evolving preferences of workers following the COVID-19 pandemic. Indeed, the crisis has spurred an important debate on the future geography of jobs, raising critical questions about where work will be performed in the years to come. In Italy, this issue is particularly pertinent due to recent internal migration trends that defy historical patterns, with flows now observed from the traditionally prosperous North to the South, facilitated by the rise of remote working opportunities [70].
Our examination of Italy's migration network not only demonstrates the applicability of network analysis to understanding migration factors and planning targeted policies, but also offers insights that can be extrapolated to other nations facing similar economic divides. For instance, the persistent northward flow, marked by the concentration of high-skilled migration into economically vibrant regions, highlights how structural imbalances can contribute to widening regional disparities. By identifying key nodes such as Rome, Milan, Naples, and Turin through centrality measures and community detection techniques, our study demonstrates that urban hubs can act as both attractors and catalysts for further economic concentration. These findings suggest that other countries experiencing analogous regional divides might benefit from adopting network-based analytical frameworks to pinpoint critical migration corridors and vulnerable peripheries. In turn, targeted policy interventions (such as bolstering local educational and economic infrastructures, promoting balanced regional development, and mitigating brain drain) could be designed to foster more equitable growth. Integrating these insights into broader migration management strategies may thus help to alleviate long-standing economic disparities and promote sustainable urbanization across diverse socio-economic contexts.
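As a hedged illustration of how hubs might be flagged in a directed, weighted flow network, the sketch below computes node strength (total in-flow and out-flow) and the net balance for each node. Strength is only one simple centrality measure, used here as an assumption; the function name `strength_table` and the toy flows between the placeholder nodes S1, S2 and N1 are invented for the example and do not reproduce the study's data or its exact measures.

```python
from collections import defaultdict


def strength_table(flows):
    """Return in-strength, out-strength and net balance for every node.

    flows: dict mapping (origin, destination) -> flow weight (hypothetical input format).
    """
    totals = defaultdict(lambda: {"in": 0.0, "out": 0.0})
    for (origin, dest), weight in flows.items():
        totals[origin]["out"] += weight
        totals[dest]["in"] += weight
    return {
        node: {"in": t["in"], "out": t["out"], "net": t["in"] - t["out"]}
        for node, t in totals.items()
    }


# Toy, made-up flows between two sending nodes (S1, S2) and one receiving node (N1).
toy_flows = {
    ("S1", "N1"): 120,
    ("S2", "N1"): 80,
    ("N1", "S1"): 30,
    ("S1", "S2"): 10,
}
table = strength_table(toy_flows)

# Rank nodes by total strength (in-flow plus out-flow), largest first.
ranking = sorted(table, key=lambda n: table[n]["in"] + table[n]["out"], reverse=True)
print(ranking)
print(table["N1"])
```

A node with high total strength and a large positive net balance behaves as an attractor, while a large negative balance marks a sending area. Tracking both quantities over time is one simple way to monitor emerging hubs.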
CRediT authorship contribution statement
A. Sarra: Writing – review & editing, Writing – original draft, Methodology, Conceptualization. D. D'Ingiullo: Writing – review & editing, Writing – original draft, Data curation, Conceptualization. A. Evangelista: Investigation, Data curation. E. Nissi: Supervision, Conceptualization. D. Quaglione: Supervision, Conceptualization. T. Di Battista: Supervision, Conceptualization.
A long-run analysis of push and pull factors of internal migration in Italy. Estimation of a gravity model with human capital using homogeneous and heterogeneous approaches
Patterns of internal migration of Mexican highly qualified population through network analysis
B. Murgante, S. Misra, A.M.A.C. Rocha, C. Torre, J.G. Rocha, M.I. Falcão, D. Taniar, B.O. Apduhan, O. Gervasi (Eds.), Computational science and its applications -- ICCSA 2014, Springer International Publishing (2014), pp. 169-184
Methods for reconstructing interbank networks from limited information: a comparison, in: F. Abergel, H. Aoyama, B.K. Chakrabarti, A. Chakraborti, N. Deo, D. Raina, I. Vodenska (Eds.), Econophysics and sociophysics: recent progress and future directions, Springer International Publishing (2017), pp. 201-215
M. Bellandi, I. Mariotti, R. Nisticò (Eds.), Città nel COVID: Centri urbani, periferie e territori alle prese con la pandemia, Donzelli Editore (2021), pp. 263-290
Annalina Sarra is Associate Professor of Statistics at the Department of Socio-Economic, Managerial and Statistical Studies, University “G. d'Annunzio” of Chieti-Pescara, Italy. Her research interests span spatial epidemiology, spatial statistics, IRT models, and Functional Data Analysis. Currently, her research predominantly focuses on Text Mining, Probabilistic Topic Models, and Social Network Analysis, with practical applications concerning issues such as Hate Speech and Online Misogyny.
Dario D'Ingiullo is Senior Researcher in Applied Economics at the Department of Socio-Economic, Managerial and Statistical Studies of the University “G. d’Annunzio” of Chieti-Pescara. His research interests encompass regional economics, labour economics, and economics of innovation. His scientific production includes research works on social and human capital, institutional quality, innovative capacity, export performance, skill-selective migration, digitalization, structural change, and regional economic growth.
Adelia Evangelista is a research technician in Statistics at the Department of Socio-Economic, Managerial and Statistical Studies, University “G. d'Annunzio” of Chieti-Pescara, Italy. She specializes in conducting research in the fields of Functional Data Analysis and cluster analysis, with a particular emphasis on their practical applications within the realm of air pollution.
Eugenia Nissi is full professor of Statistics at the Department of Economic Studies, University of Chieti-Pescara “Gabriele d’Annunzio”. She is the Coordinator of the Master of Science in Business Economics at the University of Chieti-Pescara “Gabriele d’Annunzio”. Her main research interests are devoted to spatial statistics, efficiency analysis, composite indicators, and well-being analysis. She is the author of over 100 publications in the statistical field.
Davide Quaglione is Full Professor in Applied Economics at the Department of Socio-Economic, Managerial and Statistical Studies of the University “G. d’Annunzio” of Chieti-Pescara and Research Fellow at GRIF, Gruppo di Ricerche Industriali e Finanziarie, LUISS Guido Carli University, Rome. His research interests span industrial economics, urban and regional economics, labour economics, and economics of innovation. His research work covers a diverse range of topics, including cultural and human capital, digitalization and big data economics, migration, antitrust and economic regulation, regional development, infrastructure investment, and economic impact assessment.
Tonio Di Battista is a Full Professor of Statistics at the Department of Socio-Economic, Managerial and Statistical Studies, University “G. d'Annunzio” of Chieti-Pescara, Italy. His research areas focus particularly on Functional Data Analysis, Item Response Theory, and cluster analysis. His work involves studying and analyzing data in order to gain insights and draw meaningful conclusions that contribute to our understanding of ecological systems and the preservation of biodiversity.
This article is part of a special issue entitled: Statistical methods for evaluating services published in Socio-Economic Planning Sciences.
1 The introduction of network effects (i.e., migrant stock) further refines the understanding of migration dynamics, as highlighted by Ref. [33], who finds that, once the role of networks is controlled for, the impact of human capital on migration at the destination, albeit negative, turns out to be statistically significant, while its influence at the origin continues to be positive.
3 The authors propose “the extraction of international migration network based on each country's topmost migration stock with other countries (that is, country i is linked to country j only if j is i's most important migration country; otherwise, there is no link between i and j). Specifically, we built the top1 destination network Top1D (that is, country i is linked to country j only if j is i's top1 destination country for emigrants) and the top1 origin network Top1O (that is, country j is linked to country i only if i is j's top1 origin country for immigrants).” ([23], p. 2).
4 A user-friendly and uncomplicated framework for evaluating and selecting the most suitable backbone method is available in Netbone, a Python package accessible at https://gitlab.liris.cnrs.fr/coregraphie/netbone.