The article "Big Data: An Institutional Perspective on Opportunities and Challenges" at http://libguides.uhv.edu/Reserves under my name gives two examples of using Big Data. Can you give an example of

Baban Hasnat is a professor of international business and economics in the College at Brockport, State University of New York. The author expresses his appreciation to Steve Breslawski, Mustafa Canbolat, and Barry Hettler for their help in improving this article.

©2018, Journal of Economic Issues / Association for Evolutionary Economics
JOURNAL OF ECONOMIC ISSUES Vol. LII No. 2 June 2018
DOI 10.1080/00213624.2018.1469938

Big Data: An Institutional Perspective on Opportunities and Challenges

Baban Hasnat

Abstract: The data revolution is already reshaping how knowledge is produced, business conducted, humanitarian assistance handled, public officials elected, and governance enacted. Economists rely on data to describe, interpret, and forecast economic activity. Despite the rich tradition of using large datasets, institutional economists have shied away from big data. This article describes, reviews, and reflects on big data, with a particular focus on economic development. It illustrates the vast opportunities and challenges for big data as an important tool for the benefit of the public. It suggests that big data and data analytics, if used properly, can provide real-time actionable information that can be used to identify problems and needs, offer services, and provide feedback on the effectiveness of policy action.

Keywords: big data, Google trends, humanitarian assistance

JEL Classification Codes: O1, O4

The world is undergoing a data revolution. The revolution is already reshaping how knowledge is produced, business conducted, humanitarian assistance handled, public officials elected, and governance enacted (Kitchin 2014). Data now pours in from nearly everywhere at all times and from every device; this is undeniably an era of big data. Big data is produced anyway (data exhaust), it is often accessible in real time, and it arises from the merging of different sources. It is an endless source of data for the economic and social world. Its importance to the economy has led to it being called “the new oil” (Pringle 2017).

Government agencies, international organizations, and private institutions have been collecting economic and social data for a long time. Economists have relied on these sources to describe, interpret, and forecast economic activity. Macroeconomists, in particular, have been at the forefront of exploiting large datasets. For example, Arthur F. Burns and Wesley C. Mitchell’s (1946) pioneering search for patterns and regularities in the data led to the identification of the business cycle. Similar work by Simon Kuznets (1941) led to the creation of the National Income and Product Accounts. Unfortunately, current institutional economists have shown very little interest in big data. A review of the tables of contents and abstracts of the Journal of Economic Issues, the Journal of Institutional and Theoretical Economics, and the Journal of Institutional Economics found no articles on big data. This is surprising because early institutionalists displayed a particular penchant for using data to understand economic issues and to make policy recommendations.

My objective in this article is to describe, review, and reflect on big data, with a particular focus on economic development. I illustrate the vast opportunities and challenges that big data presents as an important tool for the public good. I also show that big data and data analytics, if used properly, can provide real-time actionable information that can be used to identify problems and needs, offer services, and provide feedback on the effectiveness of policy action. My inspiration for this study comes from the work of Wesley C. Mitchell, who believed that acquiring the facts and “detailed sifting of data outside the context of a worked out model” (Hirsch 1976, 206) is the correct approach to understanding economic issues.

What Is Big Data?

The term “big data” emerged in the 1990s and gained momentum in the early 2000s.

Similar to many new concepts, big data has been variously defined and operationalized. Clearly, size often comes to mind when referring to big data. It is commonly defined as the astonishing amount of structured and unstructured data that are being generated, captured, and stored at an amazing speed. An example of big data would be Walmart’s customer transaction data. Every hour, Walmart handles over one million transactions, which are captured into its databases that are estimated to contain over 2,560 terabytes of data (1 terabyte = 1024^4 bytes), equivalent to 167 times the information contained in all the books in the Library of Congress (Economist 2010). In a single day, there are about 5.2 billion Google searches, twenty-two billion texts sent, and more than four million hours of content uploaded to YouTube, with users watching 5.97 billion hours of YouTube videos (Schultz 2017).

In regard to hardware and software, big data is often defined as data that is too large and complex for processing with traditional database management tools. Paradoxically, what is considered big data today may become small data in five years due to advances in technologies, platforms, and analytical capabilities. The data science community concentrates on its characteristics and defines big data in terms of the 3V model: volume (amount of data), velocity (speed of data flow), and variety (range of data types and sources). Other dimensions, such as variability (highly inconsistent with periodic peaks) and veracity (trust and uncertainty), are also added to the 3Vs to characterize big data (Gandomi and Haider 2015).

The United Nations’ (UN) Department of Economic and Social Affairs (2015) classifies big data into three categories: (i) social networks (human-sourced information, such as Facebook, Twitter, blogs, Instagram, YouTube, Internet searches, text messages, etc.), (ii) traditional business systems (process-mediated data, such as data generated in the context of business transactions, e-commerce, credit cards, and medical records), and (iii) Internet of Things (machine-generated data, such as data produced by weather, pollution, and traffic sensors, in addition to mobile phone tracking, satellite images, and logs registered by computer systems). Danah Boyd and Kate Crawford (2012) describe big data as a cultural, technological, and scholarly phenomenon that rests on the interplay of technology (tools and algorithms to gather, store, and otherwise process data), analysis (identifying patterns to understand economic, social, political, technical, and legal issues), and mythology (the widespread belief that large data sets offer a higher form of intelligence and knowledge).
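The storage figures quoted for Walmart earlier in this section can be checked with simple arithmetic. The short Python sketch below assumes the binary convention stated in the text (1 terabyte = 1024^4 bytes) and uses only the numbers attributed to the Economist (2010). The variable names are illustrative, and the Library of Congress figure it prints is merely what the two quoted numbers imply together, not an independent estimate.

```python
# Back-of-the-envelope check of the Walmart figures quoted in the text.
# Assumption: the binary convention used there, 1 terabyte = 1024**4 bytes.

BYTES_PER_TERABYTE = 1024 ** 4

walmart_terabytes = 2_560            # database size quoted in the text
library_of_congress_multiple = 167   # "167 times" the Library of Congress books
transactions_per_hour = 1_000_000    # "over one million transactions" per hour

# Convert the database size to bytes and to petabytes (1 PB = 1024 TB).
walmart_bytes = walmart_terabytes * BYTES_PER_TERABYTE
walmart_petabytes = walmart_terabytes / 1024

# Size of the Library of Congress book collection implied by the two figures.
implied_loc_terabytes = walmart_terabytes / library_of_congress_multiple

# "Over one million" per hour gives only a lower bound on the daily volume.
transactions_per_day = transactions_per_hour * 24

print(f"Walmart databases: {walmart_bytes:,} bytes ({walmart_petabytes} petabytes)")
print(f"Implied Library of Congress book collection: {implied_loc_terabytes:.1f} terabytes")
print(f"Lower bound on daily transactions: {transactions_per_day:,}")
```

Under the decimal convention (1 terabyte = 10^12 bytes) the byte count would be roughly nine percent smaller, which is one reason the convention is worth stating alongside such estimates.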

The Use of Big Data for Development and Humanitarian Assistance

Big data increasingly concerns people’s real behavior, not just the topics on which people seek information through searching Google or through posting on Facebook. Posts on social media may or may not represent a person, but how that person spends time, whom he/she associates with, what he/she buys, where he/she goes, and so on, can reveal an enormous amount about that person. Data scientists can predict, with reasonable accuracy, if the person will pay back a loan, develop diabetes, or buy tickets (Pentland 2018). Thus, the growth of new technologies and new sources of data, often available in real time, offers a number of important dividends for development. It can improve the efficiency of low-income people because they can access a wide range of information on price and cost, thereby allowing them to save money and time. Development programs can be inclusive as socially and economically excluded groups increasingly voice their positions in defining development priorities. This gives people access, empowerment, voice, opportunity, and security, which Amartya Sen (1999) has been advocating as the goal of development.

Highlighting the importance of big data, the United Nations declares: “It is time for the development community and policymakers around the world to recognize and seize this historical opportunity to address twenty-first century challenges, including the effects of global volatility, climate change, and demographic shifts, with twenty-first century tools” (United Nations Global Pulse 2012, 6).

Big data and data analytics have appeared on policymakers’ radars only in the last few years. Policymakers are still in the early years of understanding big data and its application in international development. Data analytics can be used to predict the characteristics of sub-groups, for example in relation to school dropout rates and social welfare programs. An analysis of Twitter, Google trends, and other social media can be used to assess the attitudes of different groups to social problems and issues or their response to different prevention strategies. Big data can allow the integration of multiple sources of data into a data platform (UN Food and Agriculture Organization’s AQUASTAT n.d.), mapping (Ebola outbreaks, the spread of crop diseases, the location of victims in an earthquake, etc.), monitoring trends (rural poverty in China), and real-time early-warning signals (hunger, drought, and ethnic conflict). These tools are now starting to be used in development programs and emergency management. Below I highlight some successful cases in the use of big data in economic development and humanitarian assistance:

• No census has been possible in Afghanistan since 1979 due to security concerns. By combing through satellite imagery, remote sensing data, geographic information system modeling, and demographic surveys, the United Nations Fund for Population Activities was able to generate population maps for Afghanistan.

• Combining satellite and other sources of data, the Food and Agriculture Organization has developed AQUASTAT, which is a global water information system that collects, analyses, and disseminates data and information on water resources, water use, agricultural water management, and other information related to water (FAO).

• As mobile phones are becoming ever-present in the developing world, it is now possible to turn mobile phone-generated data into an economic development tool. For example, when mobile operators see airtime top-off amounts decreasing in a certain region, it is a sign of loss of income in the region. Policymakers can take action based on such information before the information appears in official indicators (World Economic Forum 2012). Mobile payments for agricultural products, input purchases, and subsidies, combined with satellite images, may improve predictions of food production trends and incentives. Early detection of production trends can help governments provide targeted assistance. By mining mobile phone data, proxies for poverty indicators have been developed, which give policymakers a much more economical and continuous source of data on poverty trends (United Nations Global Pulse 2016).

• Policymakers are increasingly resorting to big data to manage epidemics and healthcare. For example, human population movement is a challenge for eliminating malaria in developing countries. Amy Wesolowski et al. (2012) analyzed the travel patterns of fifteen million mobile phone owners in Kenya over a period of twelve months. Combining travel data with census and survey data, together with spatially referenced malaria data, the geographic information system, and network analysis tools, the authors were able to identify, map, and quantify malaria risk areas. People’s lifestyles can be analyzed from the data generated by the use of smartphones and apps, which offer opportunities for primary prevention. In Iceland’s capital, Reykjavik, a combination of behavioral economics, big data, and mobile technology has helped identify individuals at increased risk of lifestyle-related diseases (e.g., diabetes) and reverse their condition (Thorgeirsson 2017). Global Viral, a non-profit organization based in San Francisco, uses big data to identify the locations, sources, and drivers of local outbreaks of global epidemics up to a week ahead of global bodies, such as the World Health Organization, that depend on traditional techniques and indicators.

• Big data shows particular promise in emergency management. Immediately after the April 2015 earthquake in Nepal, Flowminder/WorldPop used mobile phone data to create a report on population displacement, which the UN used to coordinate humanitarian assistance. When a devastating earthquake struck Haiti in 2010, a group of volunteers took it upon themselves to analyze informational content on Facebook, Twitter, and text messages to locate affected areas and victims of the earthquake. The information was quickly loaded (with more than 1.4 million edits) onto street maps to construct a crisis street map to assist humanitarian action.

• Big data and data analytics can be used to gain insight into how firms respond to trade reforms or economic shocks. For example, the US-based company Panjiva collects customs transaction information (e.g., source, destination, types of goods) via a machine-learning algorithm that covers data for eight countries, with 190 partner countries comprising 450 million records. The data can convey anticipated action from the US, China, and Europe in terms of trade policies in 2017, the prospects for the shipping industry, and the industries that have the most to win and lose from trade.

• Combining real-time traffic conditions with past traffic patterns and weather forecasts, urban planners are better able to manage public transportation and the police and fire departments, and to save time and gasoline for citizens and businesses.

Applications of Big Data: Two Case Studies

Several sectors of the economy that are important for development are also quite data-intensive. I present two case studies to show the use of big data. The first case shows the tracking of words. Figure 1 combines the actual unemployment data from the U.S. Bureau of Labor Statistics in October 2017 with simple Google searches for the word “unemployment” in the fifty U.S. states and Washington, D.C., at the same time. The figure clearly shows that the Google Trend data correlates very closely with the actual unemployment statistics. The potential for development is straightforward.
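The closeness of fit between search interest and official unemployment rates is the kind of claim that is usually summarized with a correlation coefficient. As a minimal sketch, the Python code below computes the Pearson correlation between two series. The article does not say how the relationship in Figure 1 was measured, so the choice of the Pearson coefficient is an assumption, and the five data points are hypothetical values invented for illustration; they are not the state-level data behind Figure 1.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# HYPOTHETICAL values for five unnamed states; NOT the data behind Figure 1.
unemployment_rate = [2.8, 3.5, 4.1, 4.9, 5.6]  # percent, official-style statistic
search_interest = [31, 45, 52, 70, 88]          # search-interest index on a 0-100 scale

r = pearson_correlation(unemployment_rate, search_interest)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 indicates only that the two series move together across states. It says nothing about why they do, which is consistent with the caution later in the article that correlation is not causation.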

Each month, the Bureau of Labor Statistics’ employees survey 60,000 households (approximately 110,000 individuals) over the phone or in person and inquire about labor force activities. The survey results are published with a time lag of one month. Google search trend data are available for free and can be accessed with a simple computer in real time.

Figure 2 shows two indexes for China’s manufacturing capacity. The PMI index provides an overall view of activity in the manufacturing sector. It is calculated from a monthly survey of approximately 430 purchasing managers in China. The SMI index was created by SpaceKnow, a company that specializes in geospatial analysis. SpaceKnow has taken over two billion satellite photos in China over the last fifteen years. By analyzing changes in images across 6,000 industrial sites and incorporating the number of trucks in industrial parks and the frequency of turnovers, the company is able to measure the manufacturing sector and competitive capacity. The PMI index comes with a four-week time lag, while the SpaceKnow index can be received in real time.

Figure 1. State Unemployment Rate and Google Trend (October 2017)

Figure 2. Index for China’s Manufacturing Sector Activity Based on Actual Survey and Satellite Image

The Challenge

Despite its availability and advances in technological and analytical capacity, big data has not been widely adopted as a tool for economic development because of a number of challenges. One of the most sensitive issues for anyone wishing to explore the use of big data for economic development and policymaking is privacy. Safety, diversity, pluralism, and democracy are compromised without privacy. Recent research has shown that it is possible to “de-anonymize” previously anonymized datasets. Much of the big data belongs to private companies, and they may not have any incentive to share proprietary data because of security and privacy concerns. Convincing private companies to allow economists to access business data is difficult because there are important privacy and competitive issues that a private company must consider before it allows a researcher to access company data (Hilbert 2016).

Access to big data is a major challenge. Economists traditionally rely on their own survey data or government survey data for their research. Just because a government entity (e.g., the IRS or the Social Security Administration) collects data does not mean that economists will be able to access it easily. Certain protocols must be followed, which is generally time-consuming.
For example, a Harvard researcher needed very high-level security clearance, which took months to obtain, and he also had to submit information on all his places of residence in the last ten years and could only access the IRS data set in secure data rooms authorized by the central office (Einav and Levin 2013; Taylor, Schroeder and Meyer 2014). In addition, the process could favor researchers who have the resources, influence, and network to gain access to the data, which may lead to “data haves” and “data have-nots” (Boyd and Crawford 2012).

Big data is worthless unless it is used for improved decision-making. To do this, organizations must resort to managing data (acquisition and recording; extraction, cleaning, and annotation; integration, aggregation, and representation) and data analytics (modeling, analysis, and interpretation). Data management for computation may be a challenge for developing countries and will require major investments in information and communication technology. Accurate and actionable data mining and analysis, particularly in real time, requires extensive technical skills. Developing countries may not be able to afford the data scientists and infrastructure.

A significant share of big data is generated from people’s perceptions, intentions, and desires. Policymakers have to be careful before drawing conclusions about what the data is really conveying because perceptions, intentions, and desires can change rapidly. Additionally, combining data from multiple sources may also mean magnifying the data flaws (Bollier 2010). Thus, theory and context matter even more for extremely large data sets. A case in point is how Google Trend data failed to predict flu trends. Google Flu Trends (GFT) is a big data tool that claimed to accurately predict flu epidemics in the US. Because GFT could predict an increase in cases of flu before the Centers for Disease Control, it was trumpeted as the beginning of the big data era. Unfortunately, the GFT’s prediction did not match reality.

Despite improving its model, Google has been persistently overestimating the flu since at least 2011 (Fung 2014).

Economists typically look for a particular dataset to answer an unsettled question, but data mining leads to searches for the unsettled question. Noting that big data often involves billions of observations, Hal Varian (2014) argued that the concept of statistical significance, a mainstay in hypothesis testing, may be useless in certain situations. Others worry that a substantial project that uses big data is essentially descriptive because the data will reveal correlations rather than causality.

Conclusion

It is clear that the size, speed, and nature of big data are extremely valuable in certain situations and can be a powerful tool to address various social ills and development efforts by providing early warnings, real-time awareness, and real-time feedback. Nevertheless, we cannot ignore the data context and cultural context. We must not forget that big data has its limitations and biases. We need to consider these and use caution in interpreting the data. Correlation is not causation, and big data should not replace or act as a proxy for official statistics. In fact, big data should complement the existing data.

At present, some motivated persons and non-profit organizations are spearheading the use of big data for public benefit. The prerequisites for making big data effective for development are extensive technological infrastructure, generic software services, and human capacities and skills. Developing countries have a long way to go before big data becomes an everyday tool.

References

Bollier, David. The Promise and Peril of Big Data. Communications and Society Program. The Aspen Institute, 2010.

Boyd, Danah and Kate Crawford. “Critical Questions for Big Data.” Information, Communication & Society 15, 5 (2012): 662-679.

Burns, Arthur F. and Wesley C. Mitchell. Measuring Business Cycles. New York, NY: Columbia University Press, 1946.

Economist. “Data, Data Everywhere.” Special report. The Economist, February 25, 2010. Available at http://www.economist.com/node/15557443. Accessed November 1, 2017.

Einav, Liran and Jonathan D. Levin. “The Data Revolution and Economic Analysis.” Working Paper No. 19035. NBER, May 2013. Available at http://www.nber.org/papers/w19035.pdf. Accessed August 1, 2017.

Fung, Kaiser. “Google Flu Trends’ Failure Shows Good Data > Big Data.” Harvard Business Review, March 25, 2014.

Gandomi, Amir and Murtaza Haider. “Beyond the Hype: Big Data Concepts, Methods, and Analytics.” International Journal of Information Management 35, 2 (2015): 137-144.

Hilbert, Martin. “Big Data for Development: A Review of Promises and Challenges.” Development Policy Review 34, 1 (2016): 135-174.

Hirsch, Abraham. “The A Posteriori Method and the Creation of New Theory: W.C. Mitchell as a Case Study.” History of Political Economy 8, 2 (1976): 195-206.

Kitchin, Rob. The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. Thousand Oaks, CA: Sage Publishing, 2014.

Kuznets, Simon. National Income and Its Composition, 1919–1938. New York, NY: National Bureau of Economic Research, 1941.

Pentland, Alex Sandy. “Reinventing Society in the Wake of Big Data: A Conversation with Alex ‘Sandy’ Pentland.” Edge, August 30, 2018. Available at https://www.edge.org/conversation/alex_sandy_pentland-reinventing-society-in-the-wake-of-big-data. Accessed November 19, 2018.

Pringle, Ramona. “Data Is the New Oil.” CBC News, August 25, 2017. Available at http://www.cbc.ca/news/technology/data-is-the-new-oil-1.4259677. Accessed November 27, 2018.

Schultz, Jeff. “How Much Data Is Created on the Internet Each Day?” Micro Focus Blog, August 10, 2017. Available at https://blog.microfocus.com/how-much-data-is-created-on-the-internet-each-day. Accessed December 10, 2018.

Sen, Amartya. Development as Freedom. New York, NY: Oxford University Press, 1999.

Taylor, Linnet, Ralph Schroeder and Eric Meyer. “Emerging Practices and Perspectives on Big Data Analysis in Economics: Bigger and Better or More of the Same?” Big Data & Society, July-December 2014, pp. 1-10.

Thorgeirsson, Tryggvi. “Hospital Impact: Behavioral Economics and Big Data May Improve Health and Reduce Healthcare Costs.” FierceHealthcare, September 26, 2017. Available at https://www.fiercehealthcare.com/hospitals/hospital-impact-behavioral-economics-may-improve-health-and-reduce-healthcare-costs. Accessed December 8, 2017.

United Nations. Department of Economic and Social Affairs Statistics Division. Classification of Types of Big Data, ESA/STAT/AC.289/26 11. UNSTAT, May 2015. Available at https://unstats.un.org/unsd/class/intercop/expertgroup/2015/AC289-26.PDF. Accessed December 1, 2017.

United Nations. Food and Agriculture Organization. AQUASTAT, n.d. Available at http://www.fao.org/nr/water/aquastat/main/index.stm. Accessed December 5, 2017.

United Nations Global Pulse. Big Data for Development: Challenges and Opportunities. UN Global Pulse, May 2012. Available at http://www.unglobalpulse.org/sites/default/files/BigDataforDevelopment-UNGlobalPulseMay2012.pdf. Accessed November 15, 2017.

United Nations Global Pulse. Integrating Big Data into the Monitoring and Evaluation of Development Programs. UN Global Pulse, 2016. Available at http://unglobalpulse.org/sites/default/files/IntegratingBigData_intoMEDP_web_UNGP.pdf. Accessed November 16, 2017.

Varian, Hal. “Big Data: New Tricks for Econometrics.” Journal of Economic Perspectives 28, 2 (2014): 3-28.

Wesolowski, Amy, Nathan Eagle, Andrew J. Tatem, David L. Smith, Abdisalan M. Noor, Robert W. Snow and Caroline O. Buckee. “Quantifying the Impact of Human Mobility on Malaria.” Science 338, 6104 (2012): 267-270.

World Economic Forum. Big Data, Big Impact: New Possibilities for International Development. World Economic Forum, 2012. Available at http://www3.weforum.org/docs/WEF_TC_MFS_BigDataBigImpact_Briefing_2012.pdf. Accessed December 10, 2017.

Copyright of Journal of Economic Issues (Taylor & Francis Ltd) is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.