1. Introduction
In 2015, the eight Millennium Development Goals (MDGs), which ran from 2000 to 2015, were replaced by the 2030 Agenda and the Sustainable Development Goals (SDGs). The final 2015 MDG Report stated that a new and more ambitious plan was required for the next 15 years to transform the world if the globally desired future was to be achieved, one in which human needs and the requirements of economic transformation could be balanced with the protection of the environment and the realization of peace and human rights for all [1]. This new universal agenda comprised 17 goals and 169 targets and aimed not only to complete the unfinished elements of the MDG era, but also to ensure peace and prosperity for the people of the planet.
The concept of sustainable development emerged in the late 1980s as one of the development catchphrases of the era [2]. From the debates on the contradictions between economic growth and environmental sustainability, through an emerging consensus on the three significant dimensions of sustainable development (economic growth, social inclusion, and environmental protection), to the 5Ps (people, prosperity, planet, partnership, and peace) that form the core of the 2030 Agenda [3], numerous scholars, practitioners, and policy makers have contributed to the advancement of sustainable development. Figure 1 summarizes the growth in searches for a few sustainable development-related phrases (using an index of their monthly Google trend) since 2014. Searches for the term “sustainable development” show stable growth overall, with a significant worldwide increase in 2015 (the year the 17 SDGs were adopted), when the index for the term rose from approximately 40 to 70. This rapid growth continued after 2015, and all relevant phrases have now converged close to the maximum value of 100, indicating the highest level of search interest worldwide.
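To make the 0 to 100 scale of the trend index concrete, the following is a minimal sketch of a peak-relative index: each value is expressed as a percentage of the highest value in the period, so the peak always reads 100. The function name `trend_index` and the raw counts are hypothetical, and the sketch ignores the sampling and volume normalization that Google Trends applies before scaling, so it illustrates only the scaling step.

```python
def trend_index(counts):
    """Rescale raw search counts so that the peak value equals 100.

    Assumption: `counts` holds non-negative search volumes for one term
    over consecutive periods. This mirrors only the peak-relative scaling
    of a trend index, not the provider's full normalization.
    """
    peak = max(counts)
    if peak <= 0:
        raise ValueError("at least one count must be positive")
    # Express every period as a rounded percentage of the peak.
    return [round(100 * c / peak) for c in counts]


# Hypothetical raw counts for three periods (illustrative only).
print(trend_index([20, 35, 50]))
```

Under this scaling, a rise in the index from 40 to 70 means that search interest moved from 40% to 70% of its peak level within the chosen window; it says nothing about absolute search volumes.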
In addition to the greater ambition of the SDGs and the broadening definition of development [4], one of the biggest differences between sustainable development today and development in its early iterations is undoubtedly that it is accompanied by rapidly evolving technology and data science. Big Data, as a concept, emerged in the early 1990s, at about the same time as sustainable development. Big Data and the technological revolution that spawned it have arguably touched every aspect of human life, as evidenced by the stable and high level of the Google trend index (see Figure 1). Although numerous studies have contributed to our understanding of Big Data and its contribution to sustainable development, and although there is a UN Working Group dedicated to this issue, to the best of our knowledge there is no research exclusively devoted to investigating the most up-to-date connections between the SDGs and Big Data. Nor is there any research dedicated to the related fields of Data Mining techniques and analytics platforms/services, both influential domains where, despite impressive progress, challenges are faced continuously. This paper seeks to summarize the impact of Big Data on sustainable development in the context of the UN’s 17 SDGs, giving readers an overview of where things currently stand vis à vis the potential of Big Data in realizing the 2030 Agenda, as well as the challenges faced. In doing so, it should open up new avenues for future studies and research and further promote the integration of technology and sustainable development towards their shared goal.
The remainder of the paper is organized as follows: Section 2 introduces Big Data in the context of the current rapid technological revolution; Section 3 investigates the impact, potential values, and challenges of Big Data for each of the 17 SDGs; the paper is concluded in Section 4, where potential directions for future research are also outlined.
2. Big Data under the Technological Revolution
In recent years, rapid and continuous technological advancement has produced almost limitless volumes of information. It is estimated that every person on Earth will create 1.7 MB of data every second by 2020, generating over 2.5 quintillion bytes of data daily [5]. This data deluge or information overload has resulted in Big Data. The resultant analytics form a crucial key to every aspect of the digital economy and the potential and increasing significance of data-driven decision making is widely acknowledged. Numerous investigations have been conducted to reveal and illustrate the good use to which these data can be applied.
Like the concept of sustainable development, after decades of rapid development, Big Data no longer represent a new term or concept. Even so, Big Data have not stopped developing, expanding and integrating with other emerging and trending products of the digital era. In fact, this data-digitalization revolution promises to keep moving forward as technology evolves [6]. Nowadays, rather than being thought of as an independent solution or in terms of the size of the dataset being created, Big Data are considered a phenomenon closely linked to the broader digital revolution [7]. Few existing studies attempting to link sustainable development with Big Data have adequately reflected the most up-to-date nature of the Big Data phenomenon. Instead, existing work has addressed the impact of accessing larger datasets on sustainable development [8] or specific types of Big Data (e.g., urban Big Data in [9], household Big Data in [10], big Earth data in [11,12]). As Hassani et al. (2019) summarize in [13], the digitalization journey started from data management and warehousing before spreading to web-based intelligence and analytics, which in turn led to third generation mobile and sensor-based systems, followed by a fourth generation promoted by the evolving Internet of Things (IoT), machine learning (ML) and artificial intelligence (AI) technologies, and more recently 5G. Thus, it has been a long journey to the current modern digital world [13,14]. Advances and benefits are unfolding in parallel with these rapidly developing digital technologies and intelligent data analytics, including a greatly enhanced capability to gather, obtain and process data. There is no doubt that the Big Data phenomenon and relevant technologies have penetrated almost every aspect of our lives. The expansion of Big Data is contributing to productive growth across industries, enabling new business in Big Data infrastructure, analytic platforms, and services to develop [13].
In doing so, Big Data can make an important contribution to sustainable development. For example, Big Data can contribute to the better measurement of the progress of the SDGs [15]. Big Data and related Data Mining techniques and analytics platforms/services also hold immeasurable potential for the better integration and implementation of sustainable development.
3. Values of Big Data to the United Nations Sustainable Development Goals
The integration of sustainable development will play an increasingly significant part in economic growth and social progress given the existing unbalanced relationship between humans and the planet, especially in urbanized cities [9]. In 2015, all United Nations member states adopted the 2030 Agenda for Sustainable Development with 17 SDGs at its core. Following the MDGs, the UN issued an urgent call for action by all countries in a global partnership [16]. The UN understood the importance of leveraging Big Data and its analytics from the very beginning, establishing the UN Global Pulse in 2009 to serve as an innovation hub for developing, scaling and promoting Big Data research for sustainable development and humanitarian action [17]. Furthermore, there is the UN Big Data Programme [18], which includes a Task Team dedicated to Big Data and the SDGs [19]. UN ESCAP has also shown strong interest in this area and recently published a blog and working paper dealing with Big Data and the SDGs [20,21]. Moreover, in a noteworthy ongoing project, a metadata repository of all SDG indicators is being assembled by the UN system and other international organizations. This metadata repository can be accessed via the UN Statistics Division website [22].
The following subsections investigate the potential benefits of Big Data and their relevant technologies for each of the SDGs. In doing so, the improvements to measurement of the SDG indicators using Big Data are illustrated. The contributions of the broader Big Data phenomenon, empowered by today’s technologies, are also exemplified.
3.1. Big Data and No Poverty
SDG 1, “no poverty”, sets out to “end poverty in all its forms everywhere” [16]. This goal is a continuation from the MDG era, during which the global number of people experiencing extreme poverty was roughly halved, from 1,751 million (about 16% of the world) in 1999 to 836 million (about 10% of the world) in 2015. The 2030 Agenda aims to further reduce that percentage to less than 3% by 2030 [1,16]. The reduction achieved during the MDG era was greatly assisted by a rapidly developing China, which dramatically improved global aggregates [23]. This brisk development could not continue indefinitely, and the decline in global extreme poverty has slowed. The slowdown has been exacerbated by outbreaks of violent conflict, natural disasters and, more recently, a global pandemic, which is predicted to lead to an increase in extreme poverty for the first time in over a decade (2021 [24]). The need for action is clear from the Google Trend “no poverty” index (see Figure 2), where a structural change in 2015 is followed by a stable growth trend approaching the maximum.
According to the World Bank Group survey of SDG-related Big Data projects in 2015 [25], SDG 1 “no poverty” attracted the most Big Data projects of all the SDGs, with mobile phone data and satellite imagery data and geodata identified as the top two data sources. UN Global Pulse identified that spending patterns extracted from mobile phone data can provide proxy indicators for income levels [17]. This has been confirmed in [26,27], where the authors used historical mobile phone use data or call detail records to map the socioeconomic status of individuals and provide an accurate prediction of poverty, even within micro-regions. Satellite imagery data and geodata have also been playing a significant role in predicting and identifying poverty. Jean et al. (2016) [28] applied ML techniques to daytime satellite images and night-time maps to locate areas experiencing poverty. Similarly, in [29], ML techniques were applied to high spatial resolution satellite images to identify features such as building density, car counts, road density, pavement, road width and roof materials, which can all be used to estimate poverty within small areas. The authors of [30] promote the integration of poverty geography with digitalization trends, where modern Big Data empowered technologies (e.g., data platforms, cloud computing, remote sensing, AI) can be used to help fight poverty. As noted above, developments in China made a significant contribution to reducing global poverty; over 70% of the poverty reduction globally over the past 40 years can be attributed to China [31,32]. According to [33], the successful implementation of Big Data empowered technologies, which made a significant contribution to poverty reduction in Guizhou Province of China, serves as a good illustration.
For example, the “Poverty Alleviation Cloud”, launched in 2015, facilitated the integration of data from different government departments, helping to identify people experiencing extreme poverty, and making available relevant policies and welfare programmes via the same platform. Moreover, it also enabled customized poverty alleviation, such as matching skilled labor with suitable employment or local products with buyers worldwide. By promoting e-commerce and helping people to find employment, this program helps people to get back on their feet by widening their sources of income and, in doing so, preventing poverty.
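The satellite-based studies discussed in this subsection relate image-derived features to survey-based poverty measures. As a minimal sketch of that idea, assuming a single feature per area (for instance, a building density score) and a known consumption value for a few surveyed areas, a straight line can be fitted by ordinary least squares and used to estimate consumption in unsurveyed areas. Studies such as [28,29] use far richer ML models; ordinary least squares is swapped in here purely for illustration, and the function names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Assumption: `xs` are values of one image-derived feature per area and
    `ys` are survey-based consumption values for the same areas.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of the feature around its mean.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature values must not all be identical")
    # Sum of cross-deviations between feature and outcome.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Estimate the outcome for an area with feature value `x`."""
    slope, intercept = model
    return slope * x + intercept
```

With synthetic toy inputs such as `fit_line([0, 1, 2, 3], [1, 3, 5, 7])`, the fitted model can then be passed to `predict` for an unsurveyed area. In practice, many features and a held-out validation set would be needed before such estimates could inform policy.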
3.2. Big Data and Zero Hunger
SDG 2, “zero hunger”, aims to “end hunger, achieve food security and improved nutrition and promote sustainable agriculture” [16]. As can be seen in Figure 3, although the “zero hunger” Google Trend index does not reach the same level as the “no poverty” index, worldwide attention and interest have still grown significantly since the inception of the SDGs in 2015. The growing trend in recent years is especially evident.
As the negative impacts of climate change have become clear in recent decades, agriculture is a sector that is increasingly under scrutiny. It faces significant challenges owing to its dependency on natural resources. It is central to SDG 2 as it is critical to the supply of food, but it is also one of the main sources of Greenhouse Gas (GHG) emissions [34]. Researchers have been integrating Big Data technologies and sustainable agriculture using a diverse range of approaches [35,36,37], e.g., smart farming [38] and precision agriculture [39], both of which aim to assist data-driven decision making in agricultural management in order to improve the volume and quality of production. Big Data empowered technologies, such as smart sensors, IoT, cloud computing, big Earth data, and data mining techniques, are just some of the techniques being employed.
Apart from the mainstream focus on agriculture, it is noteworthy that Big Data analytics have also been used to investigate possible applications in the food supply chain [40,41], ensuring food security [42,43], food safety [44], personalized nutrition [45], as well as the reduction in food waste [46]. All of this work can help to achieve the goal of zero hunger.
3.3. Big Data and Good Health and Well-Being
SDG 3, “good health and well-being”, aims to achieve “healthy lives and promote well-being for all at all ages” [16]. As evident in Figure 4, the importance of health and well-being can be assessed thanks to the wide adoption of big health data platforms and health monitoring smart devices [47]. A large number of studies have examined the close integration of Big Data-related technologies and health and well-being [48,49,50,51,52,53,54,55]. In the early stages of Big Data development, Ginsberg et al. (2009) [56] discovered that early detection of seasonal influenza epidemics could be improved by monitoring the volume of relevant queries on search engines, although subsequent reports asserted that such predictions can be very inaccurate. Pioneering research such as this further extended the applications of Big Data-related technologies to the prediction and monitoring of infectious diseases [57]. Moreover, the availability of social media Big Data and rapidly advancing Big Data analytics has further improved care regarding mental health [58,59].
The big clinical data available nowadays have facilitated the development of precision medicine [60], where the information extracted from the data collected on each individual patient can help to improve patient profiling and achieve more accurate diagnoses. It can even be used to enable personalized medicine [61,62] or genomic medicine [63]. Although the availability of big medicine and healthcare data has empowered the advancement of Big Data analytics implementations in health and medicine, challenges for data integration and synchronization remain, not least ethical, privacy and security considerations [49,64].
3.4. Big Data and Quality Education
SDG 4, “quality education”, sets out to “ensure inclusive and equitable quality education and promote lifelong learning opportunities for all” [16]. Quality education has always been a crucial pillar in the development of society worldwide, as is evident from the Google Trend index in Figure 5, which averages above 80. The era of digitalization has made diverse and accessible learning resources available to the majority of people via different and still rapidly evolving smart devices [65]. Big Data and its analytics have also made significant contributions to the transformation of education [66,67,68], with a particular focus on exploiting its rich and complex data [69,70].
The education sector has direct access to big education data, including information regarding student admissions, learning resources, teaching and learning performance, research progress and impacts, etc. These big education data provide information full of potential for educational research. In recent years, research has mostly addressed the strategic development of intelligent tools/platforms in the implementation of learning analytics systems [71,72]. Real-time analytics empowered by Big Data, ML and AI technologies have enabled the possibility of a personalized education experience, where individualized learning [73] and evaluation [74,75,76,77] can help to eliminate unconscious biases and offer students a more encouraging learning environment. Data-driven learning analytics platforms promise to advance the efficiency of learning and teaching by providing informed bespoke recommendations tailored to both teachers and students based on their strengths and weaknesses [78,79]. They may also facilitate the monitoring of student behavior on campus to aid student safety [80], expand accessible learning resources [81] and help to satisfy the ever-changing demand for new courses/skills and curriculum development in a timely fashion. Moreover, Big Data empowered technologies have also been developed to predict the risks of school dropout and so make early warning systems available [82,83,84].
3.5. Big Data and Gender Equality
SDG 5, “gender equality”, sets out to “achieve gender equality and empower all women and girls” [16]. According to UN Women [85], gender inequalities exist in each of the 17 SDGs; although women play a crucial role in food production and in ensuring better nutrition and education for children, there are substantial inequalities in terms of educational opportunities, employment and welfare, access to healthcare, etc. SDG 5 is central to the achievement of all 17 SDGs, and key to delivering the transformative vision of the 2030 Agenda [86]. As can be seen in Figure 6, there has been growing attention given to gender equality since 2014; the Google Trend index, in general, nearly doubled between the periods before and after the launch of the SDGs in 2015.
The debate often tends to focus on the negative impacts of AI on employment, especially regarding the unbalanced vulnerabilities affecting gender as well as the fact that nearly 80% of AI professionals worldwide are male [87]. Meanwhile, integrating Big Data with traditional data sources still offers tremendous opportunities to reduce gender inequality and social bias regardless of its form [88]. It will be crucial to collect more evidence on the impact of the technological revolution to better understand the evolving nature of gender inequality [89,90]. The Data2x organization, which aims to promote the visibility of gender sensitive data and advocates for the extraction of valuable insights from Big Data empowered technologies to support gender equality, published a report in 2019 summarizing a selection of case studies [91]. For example, Big Data such as mobile phone data, geospatial data and social media data have been used to identify gender gaps in education, employment, safety, social and financial status and credibility, etc. Ongoing projects in recent years have addressed the significance of women and girls obtaining equal access to, control of and capability with Big Data in order to design more precise policies, particularly in regions or areas that are experiencing severe gender inequality [92]. Moreover, the expanding scale of social media data and mobile data, along with the rise of Big Data analytics, now enables real-time monitoring of gender discrimination or gender inequality concerns worldwide [93].
3.6. Big Data and Clean Water and Sanitation
SDG 6, “clean water and sanitation”, aims to “ensure availability and sustainable management of water and sanitation for all” [16]. Access to clean and safe drinking water is a basic human right and is of paramount socio-economic importance. According to 2019 WHO/UNICEF statistics, despite ongoing worldwide efforts, there are still 2.2 billion people around the world who do not have access to safe drinking water and 4.2 billion people who lack access to safe sanitation [94]. Despite the growing attention on SDG 6 (as evidenced by Figure 7) and the significant improvements that have been achieved since 2015 (see [94]), achieving the ambitions of this goal remains a huge challenge for many countries in Africa [95]. Big Data techniques will most likely have a more important role to play in this area in the future.
WASH is the collective term used by WHO/UNICEF for Water, Sanitation and Hygiene. To improve access to and the sustainability of WASH, WHO/UNICEF has set out eight practical steps to tackle the problem [94]. In summary, these guidelines address policy domains such as resources, infrastructure, workforce, monitoring and management. The means and applications of Big Data are investigated below to assess whether the aforementioned domains can be assisted by Big Data to improve WASH access and achieve sustainable development.
Before any specific measures can be implemented, a first, fundamental step is to undertake an accurate evaluation of the current status and problems, followed by designing a standardized approach to monitor and evaluate progress. This was an essential step at the beginning of the WHO and UNICEF Big Data project, starting with the Joint Monitoring Programme (JMP) for WASH in 1990. Over the years, through partnerships at country, regional and global levels, this programme has established about 5,000 national datasets covering more than 200 countries/regions worldwide. Considering the close relations of WASH to climate, agriculture and healthcare, the UN has also brought together other integrated data collection and monitoring schemes. According to the official list on the UN-Water website, these include: the Global Environment Monitoring System for Water (GEMS/Water); FAO’s Global Information System on Water and Agriculture (AQUASTAT); and the UN-Water Global Analysis and Assessment of Sanitation and Drinking-Water (GLAAS). Together, these form the comprehensive and high quality Big Data of WASH. This continuously monitored database reflects the progression of SDG 6, brings insights for regional and global level policy making, and enables substantial parallel data analyses and research, as is evident from the regular progress reports and publications made available via the UN-Water official website.
In addition to household surveys, smart sensors/meters, IoT and cloud computing technologies have been playing increasingly important roles in real-time data collection, visualization and analytics [96,97,98,99,100] for WASH, and also for the climate and agriculture sectors.
Some remarkable developments include: the remote monitoring of reservoirs and water supply using satellite data, e.g., the Sentinel-1 Program [101]; predicting flood risk and water balance using Earth observation and hydrological data (e.g., rainfall and temperature) [102]; using chemical sensors for real-time water quality monitoring and pollution tracing [103,104]; and applying advanced machine learning techniques to improve water quality forecasting and urban water management [105,106]. The water quality sector was examined by [107] and the water treatment industry by [108]. Both papers reviewed the functionalities and challenges of implementing Big Data technologies in their respective sectors, and their findings will therefore not be reproduced here. In addition to the direct technical applications of Big Data to the water sector outlined above, researchers in China have also investigated water sustainability from the perspective of the public’s attitude in order to collect more accurate information on recycled water use [109].
3.7. Big Data and Affordable and Clean Energy
SDG 7, “affordable and clean energy”, aims to “ensure access to affordable, reliable, sustainable and modern energy for all” [16]. Ensuring widely accessible and sustainable energy worldwide is a challenging goal of increasing relevance (as can be seen in Figure 8). The UN reports that about 13% of the global population still lacks access to modern electricity.
A recent paper by Hassani et al. (2019) [110] investigated the impact of Big Data on energy poverty, in which the authors discuss the issues of data collection, data standardization, Big Data merging, processing and advanced data analytics. They also review the practical implementations of Big Data-related technologies to fight energy poverty. One of the key domains where Big Data techniques have been applied to achieve SDG 7 is the identification and prediction of energy poverty using satellite imagery, especially for regions/countries where access to energy is limited. According to [110], where a comprehensive review of relevant research can be found, research projects that investigate satellite imagery and energy poverty have been conducted worldwide. Researchers have also been advancing these techniques by incorporating the most recent machine learning and AI technologies.
Apart from identifying regions/countries suffering the most from energy poverty, Big Data technologies also play an important role in grid planning and energy management [111,112,113,114], improving energy efficiency and preparedness for peak demand. Smart grids with smart meter sensors [115,116] allow real-time observations of energy demand and supply capacity, a better understanding of energy consumption patterns, timely prediction of peak demand and reductions in potential energy waste. Smart networking aims to optimize energy efficiency and sustainability [117] and will play an important role in the construction of smart cities [118,119,120,121]. It is noteworthy that a few SDGs are directly relevant to the establishment of smart cities, not least SDG 9 and SDG 11. The improvement of energy sustainability is of great importance for climate change as well. Although the importance of Big Data for each SDG is addressed separately in this paper, there are many interconnections. It is important to recognize these interconnections, along with cross-cutting synergies, applications and functionalities across the SDGs as a whole, when promoting a sustainable environment at the global scale.
3.8. Big Data and Decent Work and Economic Growth
SDG 8, “decent work and economic growth”, aims to “promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all” [16]. Economic well-being has always been of enormous interest and has been closely monitored, but the focus on production (as measured by GDP) has raised considerable debate and interest in the risks of a “growth trap”, perhaps best encapsulated by Richard Layard’s (2011) [122] quip “anyone who believes in indefinite growth on a physically finite planet is either mad, or an economist”. Equally, with mounting fears regarding the impact of AI on employment, there is also growing interest in the future of work (see
Economic growth is a broad subject covering various economic activities and sectors. The contribution of Big Data in assisting with the achievement of this SDG is mainly through data analytics, as well as the compilation and monitoring of indicators. As previously mentioned, the UN network has been putting together a metadata repository [22] of all SDG indicators. The author of a recent work [123] has summarized the existing and potential contribution or significance of Big Data for the compilation of SDG indicators. He notes that Big Data could offer more efficient ways of compiling indicators, enabling more segmented and bespoke analyses, generating more granular or even completely new statistics and allowing dynamic analyses by using linkable data, etc.
To discover efficient and novel indicators, researchers explore social media or other relevant news data to reveal economic performance information that is not typically discoverable from traditional data sources or existing indicators [124,125]. Considering the close correlation between access to energy and economic status, there are obvious links to SDG 7, where researchers try to better understand regional energy poverty status using satellite imagery datasets. Complex and enormous economic activity or networks also contain valuable information on economic growth. For instance, the level of logistical network development and trade activity traffic may reflect the economic development of a certain region. Recent research [126] investigated transportation network Big Data and found valuable indicators of regional economic development status.
Moreover, Big Data can benefit economic modeling as they may facilitate the use of more advanced econometric tools [127]. Researchers can apply advanced data mining and machine learning techniques to improve their analyses of existing data. There are some examples where Big Data have contributed to improving macroeconomic forecasts and indicators, for example, Google data [128,129], economic news [125], web data [130,131] and individual bank card transaction data [132]. Others have exploited innovative Big Data methodologies or techniques to improve the practical functionality of economic modeling, for instance, early warnings of economic crisis using artificial neural networks [133] or trade nowcasting [134]. Other examples include [135], where the authors outline the methodological details of the New York Fed Staff Nowcast, or [136], where a digital AI decision tree is used to improve predictions of Russian GDP. Given its data-rich environment, the financial sector was also one of the first to embrace Big Data technologies. A recent review of machine learning techniques adopted for financial market forecasting can be found in [137].
3.9. Big Data and Industry, Innovation and Infrastructure
SDG 9, “industry, innovation and infrastructure”, aims to “build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation” [16]. Thanks to the supportive policies and beneficial packages offered across different countries and regions to achieve sustainable transformation, industries have adopted better supply chain management to enable sustainable infrastructure and operations [138,139,140]. Big Data and predictive analytics have been used to significantly improve social and environmental performance [141]. Moreover, as evident in Figure 10, there has been growing attention given to SDG 9 since 2016. Note that a recent review [142] has evaluated relevant studies on Big Data in sustainability and supply chain management for the past decade. Moreover, a recent paper [143] has summarized a detailed list of drivers for industries to enable and implement sustainable supply chain management.
In regard to the sustainability of industrial infrastructure, popular practices and developments include sustainable building, efficient energy consumption and green energy, sustainable logistics and pollution management. SDG 9 is closely connected to SDG 11 in the context of sustainable cities, SDG 12 in regard to sustainable production and SDG 13 in terms of managing pollution. These will be separately addressed later in their respective sections. Of course, data can also be considered infrastructure, arguably a centrally important infrastructure in the era of data-driven decisions, and also of relevance to SDG 17 [144,145]. The focus here is to list some specific Big Data implementations that closely reflect efforts to achieve SDG 9. The authors of [146] investigated industrial IoT and its benefits for promoting industrial sustainability in China. Other researchers have addressed the functionalities and advancements of Big Data in certain aspects of the supply chain, e.g., disaster resilience [147], risk prediction and mitigation [148,149], maintenance optimization [150], cyber security [151], green product innovation [152] and efficient logistics [153,154,155]. There is also considerable literature investigating the interactions between Big Data and specific industries to achieve industrial sustainability, such as the service industry in [156], the agri-food industry in [157], and the manufacturing industry in [158], to name a few. Similarly, authors across the world have attempted to present the status of the fourth industrial revolution by country and region, for instance, Korea in [159], Germany in [160], China in [161] and India in [162].
3.10. Big Data and Reduced Inequality
SDG 10, “reduced inequality”, aims to “reduce inequality within and among countries” [16]. Inequality within and across countries has been a persistent concern, and although huge progress has been achieved worldwide, it remains one of the most problematic challenges facing countries. Interest in this area is evident from the relatively consistent, high and growing interest in reducing inequality in Figure 11.
Inequality exists not only in terms of gender and race but across all domains. Thus, SDG 10 refers to all types of discrimination and lack of access/opportunity in the broadest sense. The increasing digital inequality (the “digital divide”) and the “data divide” arising from Big Data and relevant technologies have been well flagged (see [163] and UNCTAD (2019) [164]), where researchers have highlighted the negative effects usually caused by the lack of knowledge or skill sets due to education inequality or limited access to information and advanced technologies. Meanwhile, Big Data and Big Data technologies have also been contributing in many domains to reduce inequality.
As outlined in SDG 1, satellite imagery data and geodata have been assisting researchers and governments to identify the most economically deprived regions. The practice of identifying poverty and lack of access to energy could also be adapted to reflect levels of inequality and promote corresponding policy adjustments. Similarly, the potential of social media data has been explored for better mapping socioeconomic status [165], helping to reveal insights to support policy response. Advanced data mining techniques and analytics have also been applied to social media data to monitor and extract hidden discrimination and inequalities [166]. Despite concerns regarding the negative effects of social media surveillance or “dataveillance”, there is a noticeable rise in the use of Big Data policing, which is shaping the future of law enforcement [167]. Apart from social media platforms, mobile phone data, internet data, and other transactional data have also been exploited by researchers for better mapping and forecasting of economic status [26,168,169].
Although data science can be a challenging subject to master, it is becoming more accessible via online training, videos, e-books, apps and distance learning. Most of these resources are free and open to the public. The deluge of data availability corresponds with a growing awareness of accessing and sharing information. In today’s Big Data era, accessing educational resources and sharing knowledge is easier than ever. Advanced machine learning techniques have been applied to education Big Data to detect early warning signs of school dropout, contributing to a reduction in educational inequality [82,170]. Equality of opportunity and chances of success have been investigated by [171] with the assistance of modern Big Data, where researchers developed a better understanding of inequalities in upward mobility to inform policies regarding tax, housing and education, as well as policies designed to provide support to those from disadvantaged backgrounds.
3.11. Big Data and Sustainable Cities and Communities
SDG 11, “sustainable cities and communities”, aims to “make cities and human settlements inclusive, safe, resilient and sustainable” [16]. As the concept of smart/green cities emerges [172,173,174,175], this SDG is attracting great interest worldwide, as can be seen in Figure 12. A very recent paper [34] has presented research investigating the use of Big Data in sustainable urban planning and infrastructure. In summary, it shows that recent research mainly addresses aspects of urban informatics [176], Big Data architectures [177], smart/green cities, smart transportation, as well as smart grids and smart buildings, which are closely linked to sustainable energy and SDG 7. The purpose of this paper is not to provide a complete or comprehensive literature review of this extensive field, but rather to address some of the more iconic applications in order to illustrate how Big Data can be helpful in achieving SDG 11.
Data-driven technologies have penetrated almost every aspect of human life. Thus, establishing smart cities will require the sustainable transformation of almost everything we encounter in our daily activities. Real-time transportation network data can be used to improve transport efficiency, report traffic congestion and conditions, monitor risks and optimize maintenance and operation [178,179,180]. In recent years, smart bike sharing networks have sprung up across the world thanks to the advancements of IoT, GPS sensors and awareness of sustainable transportation [181]. The logistics service sector (e.g., parcel and food delivery services) also relies heavily on the efficiency of transportation networks. Big Data analytics have been widely used for sustainable route planning and the optimization of daily operations [182,183,184]. Along with the development of real-time Big Data monitoring and processing techniques, urban Big Data have also contributed to crime prevention, consolidating the safety aspects of smart cities. Specifically, Big Data can assist the police to identify and predict potential crime hotspots, understand crime-associated issues, and optimize police force resources [185,186,187]. Moreover, the interactions of Big Data and smart cities can also be extended to energy management [188], water supply and monitoring [189], waste management [190,191], etc.
3.12. Big Data and Responsible Consumption and Production
SDG 12, “responsible consumption and production”, aims to “ensure sustainable consumption and production patterns” [16]. Growing income disparities and a growing general awareness regarding sustainability have put this SDG under public scrutiny worldwide (see Figure 13). Food waste is an example: almost 9% of the world’s population suffers from hunger, while the UN Food and Agriculture Organization estimates that approximately a third of all food produced (an estimated 1.3 billion tonnes) is wasted. There is also growing debate and concern regarding the increasing volumes of electronic waste due to the rapid pace of technological development. These global issues are ringing an alarm bell, signaling future food and other resource shortages. On the other hand, Big Data and associated technologies offer promising novel solutions and improvements to combat irresponsible consumption and production worldwide.
As highlighted in SDG 9, sustainable supply chain management has been the priority for many industries. Such improvements will also have a positive impact on SDG 12 [192,193]. For example, Big Data empowered supply chain management helps to reduce agricultural waste [157], and machine learning has been used to detect defective horticultural products [194]. The authors of [195] applied machine learning to make production planning more sustainable in a food company in Spain. Other researchers focused on product life-cycle management and studied the advancements and influences of Big Data and relevant technologies on each stage, from supply and production to maintenance, recycling and waste disposal [196]. Big Data analytics, in general, also benefit the sustainable consumption of energy [197], which directly links to SDG 7, SDG 9 and the green city aspect of SDG 11. Energy, transportation, consumption and waste recovery are also identified in [198] as the new fields for sustainable consumption research. To date, however, the existing literature dealing with retail, marketing or e-commerce Big Data investigations has focused on the search for a better understanding of consumption behavior, and this has mainly been used to bring insights to further promote consumption. There are concerns that the search for economic prosperity has paid insufficient attention to irresponsible and unsustainable production and consumption. For instance, there is a substantial collection of literature regarding sustainable consumption [199], which mainly deals with the problem from a consumer behavior perspective but seems to lack any research that incorporates Big Data analytics and relevant advanced technologies.
3.13. Big Data and Climate Action
SDG 13, “climate action”, aims to “take urgent action to combat climate change and its impacts” [16]. As the world has been experiencing more and more extreme environmental crises over the past decade, climate change has become a priority global scale policy issue, but general public awareness on such a crucial issue, while improving, has remained relatively low until recently (see Figure 14). That said, searching for the term “climate change” rather than “climate action”, the growing public interest is clear—especially since 2019. A recent paper [34] has reviewed and summarized the current status of Big Data applications in climate change-related studies; therefore, it is not necessary to reproduce this work. Nevertheless, some of the most important applications and functionalities of Big Data and relevant technologies in the field of climate change are presented below.
Hassani et al. [34] summarize the main functions of Big Data enabled techniques in climate change studies, such as: observing, monitoring, understanding, predicting, and optimizing. Almost all relevant use cases have employed Big Data for one or a combination of these functions. Moreover, the authors of [34] also identify the most established applications to be energy efficiency and intelligence (can also link to SDG 7 and SDG 11), smart farming and agriculture (this also relates to SDG 2) and forestry, sustainable urban planning and infrastructure (see also SDG 11), natural disaster and disease assessment, and other advanced supports, i.e., supply chain management and product life-cycle management (with connections to SDG 9). Some of these applications have already been addressed in other SDGs. The focus here, therefore, will concentrate on those applications that have not yet been mentioned, also noting recent novel applications that have been introduced since [34].
Similar to applications noted in SDGs 1 and 7, satellite imagery data have also been explored for sustainable forest management [200,201,202,203,204], i.e., identifying forest fire risk, monitoring deforestation, regional forest development planning and assisting forest management policy decision making, etc. Of relevance, the authors of [203] specifically focus on China, investigating the uses of Big Data in sustainable forest management, where they identify the relevant applications. The field of smart forestry is also addressed in [204], where the practical realization of Big Data analytics is examined.
There has also been a branch of research that studies meteorological Big Data, examining air pollution monitoring and prediction. These studies usually develop along similar lines to research on smart cities or urban Big Data [205,206,207]. Examples include the high impact factor of air pollution identified using advanced Big Data mining techniques in [208]. Honarvar et al. [209] used digitalized urban Big Data to predict particulate matter without having recourse to expensive air pollution sensor networks. Some researchers attempt to combat climate change by identifying and dealing with the source, namely, greenhouse gas emissions. Machine learning techniques were applied in [210] to promote greenhouse gas reduction technologies, and to better forecast greenhouse gas emissions [211].
3.14. Big Data and Life below Water
SDG 14, “life below water”, aims to “conserve and sustainably use the oceans, seas and marine resources for sustainable development” [16], meaning the sustainable development of marine and coastal ecosystems, reducing excessive usage of ocean resources and pollution and protecting marine species and coastal biodiversity.
Concerns regarding climate change, ocean acidification and marine pollution have raised awareness of life below water and the crucial role that our oceans play in the Earth’s ecosystem. Google search trends in Figure 15 indicate a growing interest in these topics. Moreover, more and more countries have mapped out “Blue Economy” strategies to combat environmental problems (i.e., resource scarcity, water crises, ocean acidification, etc.) while promoting maritime or blue growth [212,213]. The authors of [212] note the close interconnections between the Blue Economy and SDGs 14, 17, 16, 15 and 12.
A review of the literature highlights that the application of Big Data empowered technologies towards this goal mainly focuses on ocean/marine Big Data observation and monitoring [214,215], ocean and freshwater ecosystem diagnostics [216], sustainable fishery management [217], as well as marine life tracking and research. Remote sensing technologies enable real-time observation of ocean Big Data [218], i.e., sea temperature, sea level pressure, swells, salinity, humidity, surface winds, wind waves, etc. Satellite imagery data and monitoring of meteorological observations also contribute to better mapping the status of ocean ecosystems. Big Data analytics were applied to improve essential fish habitat designation in [219], discover marine habitats in [220], and track global fishing activities in [221]. Probst (2020) [222] illustrates the benefits of using advanced data technologies to improve the transparency of fishing activities.
A project funded by the European Union Horizon 2020 programme, datAcron, addressed how to monitor fishing activities. The findings were reported in [223]. The authors in [224] presented insights on how marine animal tracking data could inform the design and formulation of conservation policy. Similarly, to improve the protection of our ocean ecosystems, other researchers have investigated the role of marine predators [225]. These studies focused on the Southern oceans [226] and Antarctic Ocean [227].
3.15. Big Data and Life on Land
SDG 15, “life on land”, aims to “protect, restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, combat desertification, and halt and reverse land degradation and halt biodiversity loss” [16]. It is focused on global public good issues and attracts relatively high public interest, as is evident from Figure 16. Considering the broad scope of “life on land”, this SDG also shares common goals with some of the other SDGs. For instance, there is a close interconnection between forest management and SDG 13 climate action, where various applications for Big Data empowered technologies to achieve sustainable forestry were identified; SDG 6 included sustainable water ecosystems; and sustainable farming and agriculture is crucial to ending hunger (SDG 2). These arguments will not be presented again here. Rather, this section will address applications benefiting “life on land” that have not yet been addressed.
As with sustainable forest management, Big Data empowered technologies have also been applied to assess and classify desertification levels [228], monitor desertification via satellite imagery [229], and assess the effectiveness of sustainable land management policies [230]. Another related study [231] proposed Big Data empowered water resource management, agricultural planning and desert geoengineering as a future solution to combat desertification. Similar research investigating land degradation has also embraced Big Data empowered technologies [232], e.g., sensor-based soil monitoring and assessment [233], monitoring via satellite data [234], land classification [235], etc. It is of note that some smart farming and smart agriculture research projects have also attempted to address the sustainability of land management by combating desertification and land degradation [236].
In addition to Earth observation, satellite imagery, and remote sensor IoT Big Data, an emerging trend is the exploitation of social media Big Data to monitor natural disasters in real time [237], as well as post-disaster management [238,239]. The review paper [240] in 2018 presents a summary of Big Data applications being used to assist natural disaster management along with other Big Data sources that could be explored. Thanks to the wide availability of mobile devices and their associated “tailpipe” data, many research projects have incorporated geographic information systems and global positioning systems to obtain real-time location information, which can support more efficient disaster responses and victim rescue, as well as provide more accurate damage assessment [241].
Biodiversity of life on land is also important for maintaining sustainable ecosystems. Overexploitation and climate change pose significant risks of biodiversity loss. Big Data empowered ecology studies can provide richer information, quickly providing lessons for the evolution and dynamics of biodiversity [242]. Researchers in [243] explored using Big Data for ecology and species distribution modeling, to better understand the effects of climate change on biodiversity. Given the explosion in the availability of biodiversity data and more developed techniques for processing data at scale, König et al. [244] address the importance of biodiversity data integration and present some effective solutions.
3.16. Big Data and Peace, Justice and Strong Institutions
SDG 16, “peace, justice and strong institutions”, aims to “promote peaceful and inclusive societies for sustainable development, provide access to justice for all and build effective, accountable and inclusive institutions at all levels” [16]. This SDG has attracted relatively limited public attention with noticeable changes only becoming evident quite recently (see Figure 17). This is reflected by the relative scarcity of reliable academic research available for a literature review.
Big Data and its relevant technologies have been used to combat crime, as reviewed in [245]; however, the use of such technologies in the justice system is still limited. Simmons (2018) [246] examines the use of predictive algorithms in justice systems to improve accuracy, efficiency and fairness. Similar research, investigating the uses of Big Data and its relevant technologies in criminal justice settings, can be found in [247]. A field receiving relatively more attention, and supporting SDG 16, is smart governance [248]. The concept of a “responsive city” is examined in [249], which summarizes the advantages of data-smart governance for promoting inclusive and engaging communities. Researchers in [250] evaluated the merits as well as challenges of information and algorithm empowered governance in the Big Data era. Some research has focused on infrastructure and platforms for smart governance [251,252,253], whereas Lin (2018) [254] presents a comparison of smart governance applications in selected Western countries and China. There are close connections between smart governance and smart city research (covered in Section 3.11 on SDG 11) [255,256], much of which has addressed data intelligent governance decision making in the context of smart cities. For instance, a citizen-centered Big Data governance framework for smart cities was discussed in [257] and validated using a blood donation governance case in China.
Widely available social media Big Data have also been explored for potential use in governance and politics—for instance, event detection using sentiment analysis [258,259,260,261]. These techniques can also reveal valuable information on public opinion regarding governance, providing real-time predictions of political elections [262,263,264,265]. This is, of course, predicated on the assumption that social media Big Data truly reflect public opinion. Considering the prevalence of fake news, mendacity, and other attempts to manipulate or even redirect Big Data, this is a very strong assumption. Analysts must treat data cautiously and be aware that conclusions emerging from Big Data analytics may be misleading.
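The caution above can be made concrete with a toy calculation. The following Python sketch is purely illustrative and is not drawn from any of the cited studies: the account names, the sentiment scores and the per-account aggregation rule are all hypothetical assumptions. It shows how a single high-volume account can shift a naive post-level average of sentiment, and how averaging within each account first dampens that effect.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (account, sentiment score in [-1, 1]) pairs; not real data.
# Four ordinary accounts post once each; one automated account posts 12 times.
posts = [
    ("user_a", 0.5), ("user_b", 0.5), ("user_c", 0.5), ("user_d", 0.5),
] + [("bot_1", -1.0)] * 12


def naive_mean(posts):
    """Average over every post, so prolific accounts dominate the result."""
    return mean(score for _, score in posts)


def per_account_mean(posts):
    """Average each account's posts first, then average across accounts."""
    by_account = defaultdict(list)
    for account, score in posts:
        by_account[account].append(score)
    return mean(mean(scores) for scores in by_account.values())


naive = naive_mean(posts)           # dominated by the 12 automated posts
balanced = per_account_mean(posts)  # each account counts once
print(f"post-level mean: {naive:.3f}, account-level mean: {balanced:.3f}")
```

In this toy setting the post-level average is negative while the account-level average is positive, even though four of the five accounts express positive sentiment. Per-account averaging is only one possible mitigation and does not address coordinated networks of many fake accounts.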
3.17. Big Data and Partnerships for the Goals
SDG 17, “partnerships for the goals”, aims to “strengthen the means of implementation and revitalize the global partnership for sustainable development” [16]. Since the MDGs, global partnerships have been the common approach to reaching a consensus and a shared understanding of issues in order to achieve complex MDGs and SDGs. It is no surprise that institutions and governments worldwide increasingly appreciate the need to join forces to combat the complex and interconnected challenges posed by economic, social and environmental progress (see Figure 18).
Cloud computing and greater data storage have enabled the merging and sharing of Big Data from various resources. Several challenges arising from the data deluge are explored in [13], including the different channels of data collection, differences in regulated data specification and standards, asymmetric access to information, knowledge based data pollution, and barriers to data sharing driven by hidden profit/interest/business secrets. The potential of Big Data, in the context of sustainable development, could be realized by close global partnerships dealing with the same SDGs, advanced cloud computing and the use of Big Data processing platforms [266]. These partnerships could be built upon mutually consenting policies or agreements and could address concerns regarding the fair exchange of data and data ownership. Blockchain technology may offer some advantages here, as it was fundamentally invented to assist and record transactions of value without the authorization of a central authority. Only later was it used to enable the digital world of cryptocurrency [13]. Using blockchain technology would facilitate Big Data sharing and processing in a secure manner, which would assist the process of building global partnerships. The fusion of Big Data and blockchain technologies is comprehensively discussed in [13]. Examples of innovative applications empowered by blockchain technology in the context of SDGs can be found in [267,268].
4. Conclusions and Future Research
This paper contributes to the existing literature by comprehensively investigating the interactions of two rather broad yet emerging subjects: Big Data and sustainable development. This is, to the best of our knowledge, the first academic paper that seeks to summarize the value of Big Data and Big Data technologies to the UN SDGs. The intention is not to present all existing studies, but rather to share the most up-to-date knowledge and bring insights for future research. Since the adoption of the 2030 Agenda for Sustainable Development in 2015, one-third of the time has passed and many remarkable achievements have already been made. While it is encouraging to witness the growing attention being given to the SDGs around the world, it is more important than ever to promote and encourage knowledge sharing via all reliable academic channels. With this in mind, this paper has explored the fruitful applications of Big Data and associated technologies that can assist with the realization of sustainable development, as defined by the UN 2030 Agenda and the 17 SDGs. As noted above, several of the applications in fact cross-cut, cover or are applicable to several SDGs simultaneously. This is appropriate as the SDGs themselves are all closely interconnected, and only by addressing all of them can we create a sustainable, safe, prosperous and equitable environment where humanity and nature can live together.
While many practical applications and technological advancements have emerged in recent decades, challenges remain. Some SDGs (i.e., SDGs 12, 14, 16, 17) receive relatively less attention than others. This is reflected by the relative scarcity of reliable information evident during our searches for relevant use cases and technological advancements. Despite the rapid advancements in affordable Big Data technologies, they remain prohibitively expensive for some. Perhaps this explains why we observed that more applications tend to focus on profit or value generating aspects of development. Specifically, more use cases address the potential for economic and industrial growth by promoting greater consumption, rather than highlighting solutions to curb the negative consequences of overexploitation, pollution and irresponsible or unsustainable consumption.
Although some regions or countries have realized the importance of long-term sustainable development and are willing to make short-term sacrifices to embrace advanced technologies, this remains challenging for regions or countries in a digitally disadvantaged position. Such unbalanced and asymmetric development is closely related to the digital divide debate. Specifically, access to information and knowledge may be much easier for certain cohorts than for others, placing some at an advantage and others in a relatively disadvantaged position. Reducing digital inequality will require a global collective effort, for instance, the sharing of information, knowledge, technologies and essential infrastructure. Moreover, there should be greater focus globally on addressing dimensions of sustainable development and applications of Big Data that are not necessarily profit or value generating but tackle the consequences of unsustainable activities. For instance, how do we find a sustainable balance between the technological revolution and the benefits we receive from it whilst managing the rapid growth of electronic waste? Ignoring these questions will almost certainly lead to greater problems in the long term and may eliminate the benefits that these technological advancements brought us in the first place.
As more and more people understand the value of Big Data and its associated technologies and begin to use it, the overexploitation of Big Data, or overlooking of Big Data ethics, could have serious consequences. Those who have more advanced Big Data skills may use data analytics for their own advantage, perhaps violating the human rights of others in the process. As noted above, fake news on social media may distort analytics, but could also be intentionally disseminated to alter public opinion. Individual data could be collected and sold for non-consensual marketing or for worse reasons, such as identity theft or blackmail. It will only be possible to achieve sustainable development if the use of Big Data empowered technologies is accompanied by consideration of privacy, ethics, human rights, and legislation, to find a balance between the common good and individual preferences.
Finally, our investigations suggest that Big Data on their own are no longer viewed as a solution but rather as a contributing element. Thus, data integration (integrating Big Data with traditional sources) is seen as the key issue. We also observe a move away from the purist view of Big Data towards a broader view, often described as “non-traditional sources”, which include, for example, citizen science, where everyone shares and contributes to data collection, integration, monitoring and analyses.
Author Contributions
All authors contributed to the paper equally. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Publicly available datasets were analyzed in this study. This data can be found here:
Conflicts of Interest
The authors declare no conflict of interest.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors.
Abstract
The launch of the United Nations (UN) 17 Sustainable Development Goals (SDGs) in 2015 was a historic event, uniting countries around the world around the shared agenda of sustainable development with a more balanced relationship between human beings and the planet. The SDGs affect or impact almost all aspects of life, as indeed does the technological revolution, empowered by Big Data and their related technologies. It is inevitable that these two significant domains and their integration will play central roles in achieving the 2030 Agenda. This research aims to provide a comprehensive overview of how these domains are currently interacting, by illustrating the impact of Big Data on sustainable development in the context of each of the 17 UN SDGs.
Details


1 Research Institute of Energy Management and Planning, University of Tehran, Tehran 1417466191, Iran; Department of Business and Management, Webster Vienna Private University, 1020 Vienna, Austria
2 Faculty of Business and Law, De Montfort University, Leicester LE1 9BH, UK
3 United Nations Conference on Trade and Development (UNCTAD), 1211 Geneva, Switzerland
4 School of Mathematics, Statistics and Computer Science, Tehran 1417935840, Iran