Although we face an increasing scarcity of valuable resources, one area of growing abundance is data. From information-producing activities such as the global supply chain to our own personal behaviors, the digital world is producing data on a mind-boggling scale. Eric Schmidt, the chairman of Google (an organization that knows a thing or two about data), stated that every two days we are creating as much information as we did from the dawn of civilization to 2003.
On this scale, we no longer refer to it simply as data; we call it big data.
As devices and behaviors produce increasing volumes of data, a new visibility is emerging. For example, look at the insight that Google searches give us into the spread of seasonal flu. We are able to see formerly hidden patterns and make more-informed decisions. Organizations know more and more about you. Privacy is quickly becoming irrevocably passé. Mass production turns into mass personalization. Data at our fingertips is changing the way we live.
We’re moving from a post-industrial economy to a data economy.
Understanding Big Data
Regardless of whether your agency has 10 or 10,000 people, it’s a safe bet that you’re producing and storing data; it’s the one area where there is no deficit and no future likelihood of one. If we consider data a valuable resource, which we should, then we’re all in a surplus position. That’s happy news for a sector so beset by negativity. Sadly, though, that surplus has not yet been broadly converted into value for the communities those agencies serve.
In short, governments are mostly sitting on an abundant resource, neglecting opportunities that could — if leveraged correctly — produce enormous benefits for their communities.
What is this government data that I’m talking about? On the federal site, Data.gov, there are almost 400,000 sets of data. These cover every type of subject one could imagine. For example, there is the visitor log for the White House; the register of all federal government contractors; and unemployment statistics. There’s data on energy, health, manufacturing, and education. And these are only the datasets that have been posted for easy consumption; there are many more that still need to be posted.
And this phenomenon is not restricted to the federal level. On the city and county data website for San Francisco, for example, there are local crime statistics, and the location of every movie made there since 1924. My own city, Palo Alto, posts a variety of data that includes demographics and details on all our trees (a most revered Palo Alto resource). In addition, we recently posted five years of financial information, which is data that taxpayers care deeply about.
Realizing the Value
But what’s so novel about posting government data? Many will point out that we’ve been doing that since the first public Web sites arrived back in the 1990s.
There is truth in that statement; however, the current trend has a distinctive advantage. This data is being posted in a form that can be more easily used by Web and mobile applications. That makes it more accessible, which is no small point. The approach is called Open Data. If the data is available to software engineers, data scientists, and other interested stakeholders, then all manner of new solutions can be built.
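To make the point about machine-readable formats concrete, here is a minimal sketch in Python of how an application might consume a dataset published as CSV. The sample data, its column names, and the `count_by_column` helper are all invented for illustration; they are not taken from any actual city portal.

```python
import csv
import io
from collections import Counter

# Hypothetical sample of an open dataset published as CSV.
# The columns and values are made up for illustration only.
SAMPLE_CSV = """tree_id,species,street
1,Coast Live Oak,University Ave
2,London Plane,Hamilton Ave
3,Coast Live Oak,Waverley St
4,Southern Magnolia,University Ave
"""


def count_by_column(csv_text, column):
    """Count how many rows share each value in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Summarize the dataset by species, most common first.
species_counts = count_by_column(SAMPLE_CSV, "species")
for species, count in species_counts.most_common():
    print(f"{species}: {count}")
```

Because the data arrives as structured rows rather than as a scanned report or a page of prose, a few lines of code are enough to summarize it. That is the practical difference that makes new applications cheap to build.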
These solutions won’t get built by cash-strapped public agencies; rather, they will be created by the private sector, activists, residents, and other interested stakeholders. Already, citizens from across the nation, including here in Palo Alto, are applying their skills to build useful applications, such as smartphone apps that have exceptional utility for their communities. It’s a win-win: public agencies incur little or no cost, and the community receives the benefits.
Many communities host “hackathons” to promote their Open Data initiatives. These are events at which software developers focus on spinning up new applications—sometimes in a matter of hours—using a variety of datasets made available by the city. In Palo Alto earlier this year, we shut down a city block and 2,000 people turned up to build applications, create art, and network with one another.
We’re only at the very start of realizing the value of Open Data. One could easily imagine a time in the not-too-distant future when data is available to citizens at the moment of its creation. For example, an agency makes a payment for a product, and that transaction is immediately published and available to interested parties. Not only does real-time publishing create unprecedented transparency and accountability, it also makes the consuming applications vastly more useful.
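As a rough sketch of what publishing at the moment of creation could look like, the Python below records a payment and immediately hands a JSON version of it to any subscribed application. Everything here is an assumption made for illustration: the `Payment` fields, the `OpenLedger` class, and the vendor are hypothetical and do not describe any agency’s actual system.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class Payment:
    # Hypothetical fields; a real agency would define its own schema.
    vendor: str
    amount_usd: float
    description: str


class OpenLedger:
    """Toy ledger that publishes each payment as soon as it is recorded."""

    def __init__(self):
        self._subscribers = []
        self.published = []

    def subscribe(self, callback):
        # Register a consumer (for example, a civic app) to be notified.
        self._subscribers.append(callback)

    def record_payment(self, payment):
        record = asdict(payment)
        record["published_at"] = datetime.now(timezone.utc).isoformat()
        message = json.dumps(record, sort_keys=True)
        self.published.append(message)
        # Notify every subscriber in the same step as the recording.
        for callback in self._subscribers:
            callback(message)
        return message


received = []
ledger = OpenLedger()
ledger.subscribe(received.append)
ledger.record_payment(Payment("Acme Paving", 12500.0, "Street repair"))
print(received[0])
```

The design point is that publication is part of recording the transaction rather than a later export, so interested parties see the record as soon as it exists.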
I believe Open Data is foundational to building and enabling a digital city. Open Data drives the development of useful applications; it is a convener of public-private partnerships; and it is a prerequisite to open government. And if your goal is simply to deliver your public agency’s services at lower cost and more efficiently, then Open Data is still foundational.
Making It Happen
I’m often asked if Open Data is purely a product of Silicon Valley and its technically proficient community: “Isn’t Open Data only within the reach of tech-savvy communities like Palo Alto?”
The belief that Open Data requires significant technical expertise could not be further from reality. The biggest hurdle to enabling Open Data is recognizing it as an important part of your agency’s future, and then acting on it. The focus should then be on the value of the data, not the volume of datasets.
There are many vendors ready to help an agency of any size, and the costs can be low enough for most to afford. In fact, with a little technical help, either from within your organization or from a willing volunteer in your community, there are Open Source solutions that can be deployed at negligible cost. Open Source is not the solution for everyone, but it’s certainly an option.
I’ll concede that this is a complex space, and any discussion here can only be superficial. While the dialogue is underway in some niche circles, I think it’s time for a broader national movement. We have to get the data topic on the table and start talking about how we can make it work for our citizens.
That’s my goal here: raising awareness to provoke you to learn more.
Let there be no doubt: Managing data and its value represents a core competency for both private enterprises and public agencies, now and into the foreseeable future. Those that recognize this and assign priority to a data strategy will soon see benefits.
Are you ready to make data a priority?
Editor’s Note: Published October 17, 2012 and updated in 2014.
About the Author
Dr. Jonathan Reichental is the Chief Information Officer for the City of Palo Alto, where he is focusing on modernizing the existing technology environment and pushing the boundaries of innovation in local government in areas such as open data and broader civic participation through mobile devices. Prior to joining the City, Jonathan served as the CIO of O’Reilly Media, an integrated media company. He also spent over 15 years at PricewaterhouseCoopers in a variety of technology-related roles. Dr. Reichental holds several degrees, including a Ph.D. in Information Systems, and informally advises several technology start-ups. He is a highly sought-after public speaker at forums such as TEDx, and has been featured in media outlets such as National Public Radio (NPR), Forbes, CIO magazine, InformationWeek, Computerworld, and Government Technology magazine. His TV appearances include a segment on CNBC. You can follow him or begin a conversation through Twitter via @reichental.