Wednesday, November 19, 2008

Chemical security accident report released

This is a copy of an old post from my personal blog, placed here so that data posts will be all in one place.

The Center for American Progress has released Chemical Security 101: What You Don’t Have Can’t Leak, or Be Blown Up by Terrorists. Whatever the awkwardness of the title, the report is excellent, identifying the 100+ most hazardous chemical facilities in the U.S. and listing specific actions they could take to change their operations to eliminate the hazard, rather than treating the problem as one for gates and guards. I'm familiar with the report because I spent a significant amount of time crunching numbers for part of it.

If and when I get through global warming databases on this blog, I'll write about chemical accident ones. The database used for this report, the Risk Management Plan database, has a particularly interesting history. The chemical industry and the Bush administration crippled what was supposed to be a publicly accessible database by restricting access to it to reading rooms where you could only get information on ten facilities at a time. Otherwise, they said, terrorists would use the data for targeting, even though all the actual incidents so far have been straightforward industrial accidents. And then they proceeded to block one law after another that would have required industry to actually do anything to protect people from these hazards. Computer people like to talk about Security Through Obscurity -- well, this was year after year of Security Theater For Obscurity.

Monday, November 17, 2008

eGRID

This is a copy of an old post from my personal blog, placed here so that data posts will be all in one place.

This is the second post in a series on global warming data, about the basics of U.S. EPA's eGRID database.

eGRID's home page claims that it is “the preeminent source of air emissions data for the electric power sector,” and as far as the U.S. is concerned, that is probably true. It contains air emissions data for nitrogen oxides (NOx) and sulfur dioxide (SO2), which are of concern because they contribute to ground-level smog and acid rain. It contains data on emissions of mercury, a persistent bioaccumulative toxic. And it contains data on emissions of the greenhouse gases carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O). It also has information on how much power is generated, and how much fuel of each type is used, so that you can see how efficient each plant is.

eGRID is an odd database in that it's not a data collection; no one ever fills out a form to report their emissions to eGRID. Instead, it's a combination of data from various data collections, together with model estimates. Most of the data that go into eGRID were originally collected through a scatter of databases held by EPA and the Department of Energy. For the last decade or more, it's been very difficult for EPA to get any new, major data collections, so information has to be cobbled together from a number of sources, none of them designed to exactly address the problem.

One of the advantages of a yearly data collection is that it has to be released every year. The primary disadvantage of eGRID, in the past, was that it came out irregularly, and by the time it came out it sometimes used old versions of the data sources that it drew from. For instance, it's been released about once a year since 1998, except that no version came out between May 2003 and Dec 2006. The Dept. of Energy databases that it draws from currently seem to be available up through 2006, and eGRID only has data through 2005. Still, a version has just been released – as of October 2008 – and that makes it up-to-date enough for all but the most picky and expert uses.

One of the large advantages to using eGRID is that some data quality work has been done to match the various databases together. I had to do that once, for a report for an environmental group that we couldn't use eGRID data for, and it's something that you don't want to do unless you have no other choice. Even more important, it updates all plant ownership, parent company, merger data and so on to a single date: December 31, 2007 in this case. Electric utilities try all sorts of tricks to confuse their paper trail or to take advantage of regulatory exemptions or make financial maneuvers; there has been a lot of buying and selling of power plants among various entities. Making sure that all of that is updated to a single date is a significant advance. What this means is that, for instance, a power plant that last reported in 2005 will be listed in eGRID as being owned by whichever company owned it on Dec 31, 2007, not by whatever company owned it in 2005.

eGRID is used in all sorts of regulatory initiatives, for environmental disclosure, and in governmental and nonprofit electricity-information Web sites such as Power Profiler, Power Scorecard, or CARMA. If you have a casual interest in your local electric power, you're probably better off with one of those. But it's good for some people to look at eGRID, because more information is available through it directly, and because it sets the baseline that so many people work from.

There are a couple of reasons why eGRID may not be the best source for generally tracking electricity, as opposed to tracking sources of emissions due to electricity generation. For one thing, it doesn't include any purchases of power, e.g. from Canada. For another, the net generation amounts that it reports subtract generation used by the power plant itself, but don't take transmission and distribution losses into account, so the electricity that people actually use will have a lower efficiency with respect to emissions than is reported in eGRID.
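The second point can be put as a rough Python sketch. If some fraction of net generation is lost between the plant and the customer, the emissions per delivered megawatt-hour go up by a factor of 1 / (1 - loss). The function name, the 2,000 lb/MWh rate, and the 7% loss figure below are all placeholders made up for illustration; eGRID itself doesn't report a loss figure.

```python
def delivered_emission_rate(rate_per_net_mwh, loss_fraction):
    """Emissions per MWh that reaches users, given emissions per MWh of net generation.

    loss_fraction is the assumed share of net generation lost in transmission
    and distribution (a hypothetical input, not an eGRID field).
    """
    if not 0 <= loss_fraction < 1:
        raise ValueError("loss_fraction must be at least 0 and less than 1")
    # Delivered electricity = net generation * (1 - loss), so the same
    # emissions are spread over fewer MWh.
    return rate_per_net_mwh / (1 - loss_fraction)


# Hypothetical numbers for illustration only.
plant_rate = 2000.0   # lb CO2 per MWh of net generation
assumed_loss = 0.07   # 7% lost between plant and customer
print(round(delivered_emission_rate(plant_rate, assumed_loss), 1))
```

With no losses the two rates are the same; any loss at all makes the delivered rate higher than the one eGRID reports.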

So how do you use eGRID? It's really just a set of three Excel files, so all you do is download them and open them on your computer – you can use OpenOffice. The most basic file holds information for each generating plant, and for subunits within plants. A second file, the aggregation file, adds things up – it combines individual plants into totals by state, owner, operator, parent company, grid, and for the whole U.S. That has almost all of the same data fields as the plant file, so once you learn one of them, you learn the other. The third of the files is for state imports and exports, and you can probably ignore it.

(Note, though, that the aggregation file handles parent companies badly, in my opinion. The people who made eGRID considered a parent company to be a holding company, not whatever company ultimately controls the plant, including the plant itself if there is no other owner. Therefore, some plants in eGRID don't have parent companies. That means that the parent company file, unlike the other aggregations, doesn't add up to the total of the individual plants. I may try to get the people who make eGRID to change this, put in a parent company for every plant, and indicate whether a parent company is a holding company or not with some kind of data field.)
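To see why the parent company totals fall short, here's a minimal Python sketch. The three plants, their tonnages, and the field names are invented for illustration, and I'm assuming the aggregation simply skips plants with no parent company listed; the sketch is not taken from how eGRID is actually built.

```python
from collections import defaultdict

# Hypothetical plants; names and tonnages are made up for illustration.
plants = [
    {"plant": "Plant A", "parent": "Holding Co X", "co2_tons": 5_000_000},
    {"plant": "Plant B", "parent": "Holding Co X", "co2_tons": 3_000_000},
    {"plant": "Plant C", "parent": None, "co2_tons": 2_000_000},  # no holding company
]


def totals_by_parent(rows):
    """Sum CO2 by parent company, skipping plants with no parent listed."""
    totals = defaultdict(float)
    for row in rows:
        if row["parent"] is not None:
            totals[row["parent"]] += row["co2_tons"]
    return dict(totals)


plant_total = sum(row["co2_tons"] for row in plants)
parent_total = sum(totals_by_parent(plants).values())
missing = plant_total - parent_total  # emissions that no parent company total covers
print(plant_total, parent_total, missing)
```

Giving every plant a parent entry, even if it's just the plant's own owner, would close that gap.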

But the plant file is probably the most useful. EPA doesn't like to release information about individual plants, or companies, within its general summary documents which are all that most people see if they see anything. It likes to release numbers about states, regions, industries, and so on, but saying that specific company ABC is responsible for x percent of pollution? You'll very rarely see that from EPA. So you'll have to dig it out for yourself.

The plant file contains sheets on generators and boilers: components of plants. Most users will probably skip those, although it's worth noting that they include years when the equipment went in service, which can be important for some things. But you'll probably want the information on plants themselves. There are about 5,000 of them. You can look at the eGRID technical docs to explain the data elements.

What are some of the more useful data elements? Well, for the purpose of global warming, I'll look at CO2, ignoring methane and N2O for now. That's “plant annual CO2 emissions (tons)”, or PLCO2AN. A quick descending sort of the sheet by that field, and the top plant is the Scherer plant in Georgia, whose parent company is Southern Co. With 26 million tons of CO2, that's one percent of the total CO2 emissions for the whole database right there. There are only 68 plants that emitted more than 10 million tons. Those 68 plants account for 36% of the total emissions from electricity generation. That's about 12% of the total U.S. CO2 emissions from all sources, including cars, industry, houses, and residential electricity used by those light bulbs that people are always telling you to change whenever you say that we need to do something about global warming.
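If you'd rather script the sort than do it in a spreadsheet, something like this Python sketch would work. The rows are just the six largest plants listed at the end of this post, so the counts only mean something for thresholds of 20 million tons or more; a real run would load the whole plant sheet. PLCO2AN is the eGRID field name, while the other dictionary keys and the function names are made up for the sketch.

```python
# Six rows copied from the top-emitters table in this post (2005 CO2, tons).
# "state" and "name" are hypothetical keys; PLCO2AN is the eGRID field.
plants = [
    {"state": "TX", "name": "W A Parish", "PLCO2AN": 20_703_129.9},
    {"state": "GA", "name": "Bowen", "PLCO2AN": 22_156_373.7},
    {"state": "IN", "name": "Gibson", "PLCO2AN": 21_746_394.3},
    {"state": "GA", "name": "Scherer", "PLCO2AN": 26_040_793.5},
    {"state": "TX", "name": "Martin Lake", "PLCO2AN": 21_593_119.5},
    {"state": "AL", "name": "James H Miller Jr", "PLCO2AN": 22_509_466.8},
]


def rank_by_co2(rows):
    """Return the rows sorted by annual CO2 emissions, largest first."""
    return sorted(rows, key=lambda row: row["PLCO2AN"], reverse=True)


def above_threshold(rows, min_tons):
    """Return the rows that emitted more than min_tons of CO2."""
    return [row for row in rows if row["PLCO2AN"] > min_tons]


ranked = rank_by_co2(plants)
print(ranked[0]["name"], ranked[0]["state"])
print(len(above_threshold(plants, 20_000_000)))
```

Run against the full plant sheet, the same two functions with a 10 million ton threshold should give the 68-plant list.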

But those plants generate electricity too, of course. How much? Well, there the whole thing is complicated by the fact that a single power plant might generate electricity from a wide range of fuels. So just totaling up all the electricity from those plants is going to be a bit off. But I can total up the net generation from combustion sources for them. It's 31% of total U.S. generation from combustion sources – we're getting 36% of the CO2 for 31% of the power from combustion. It's 22% of our power from all sources.
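One way to read those two percentages together, as a small Python sketch: dividing the CO2 share by the combustion generation share gives a rough measure of how these plants compare with the average combustion plant. This assumes that essentially all CO2 from electricity generation comes from combustion sources, so that the two shares have comparable denominators; the variable names are mine.

```python
# Shares for the 68 largest plants, taken from the text above.
co2_share = 0.36          # share of CO2 emissions from electricity generation
combustion_share = 0.31   # share of net generation from combustion sources

# Above 1.0 means more CO2 per unit of combustion generation than average.
relative_intensity = co2_share / combustion_share
print(f"{relative_intensity:.2f}")
```

On that reading, the biggest emitters put out roughly 16% more CO2 per unit of combustion power than combustion plants as a whole.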

What I'd like to see for these top plants is how efficient they are in burning coal. Coal is worse, from a CO2 standpoint, than natural gas, and coal burning efficiency varies by the equipment and the grade of coal used. But I can't quite see how to do it. The database includes an efficiency number that divides emissions of CO2 by the net generation from all combustion sources, but that includes oil and gas as well as coal. There's a net generation only from coal number, but there doesn't appear to be a CO2 emissions only from coal number, so I don't see how to figure out an emissions rate that includes only coal in both the numerator and denominator. Perhaps I could get it by digging into the boiler and generator data – but this post is too long as it is.

So, finally, here's a table of the 6 largest plants for 2005 for CO2 emissions, those with more than 20 million tons. You could get these yourself through the eGRID tables, but I might as well list them here for Google indexing purposes:

Top U.S. CO2 Emitting Electric Power Plants, 2005

State | Plant name | Plant operator | Parent company | 2005 CO2 tons
GA | Scherer | Georgia Power Co | Southern Co | 26,040,793.5
AL | James H Miller Jr | Alabama Power Co | Southern Co | 22,509,466.8
GA | Bowen | Georgia Power Co | Southern Co | 22,156,373.7
IN | Gibson | Duke Indiana Inc | Duke Energy | 21,746,394.3
TX | Martin Lake | TXU Generation Co LP | Energy Future Holdings (TXU) | 21,593,119.5
TX | W A Parish | | NRG Energy | 20,703,129.9

Wednesday, November 12, 2008

Global warming -- U.S. sources

This is a copy of an old post from my personal blog, placed here so that data posts will be all in one place.

Global warming -- or anthropogenic global climate change, to be more exact -- is one of the most critical contemporary environmental problems. It's also one that the Obama administration has promised to do something about. It's safe to assume that in a couple months, various proposals are going to begin to fly. What data do we have that would bear on these proposals? Over those months, I'm going to go over some of the material here. It's a good excuse to refamiliarize myself with it, since the last time I worked with it was in 2003.

I'm not going to address the science at all, or engage in any way with global warming denialists. The evidence that this is a real and important problem is unequivocal at this point, and anyone wanting more information on it should check out the IPCC, or if they prefer a group blog, RealClimate, or if they prefer more chatty, individual blogs: Deltoid, Stoat, Rabett Run, Only In It For the Gold, or More Grumbine Science.

The questions I'm going to look at bear more on politics and infrastructure. Where are the largest sources of the problem? Who owns them? How can people get information that helps them figure out their local power structure, if it comes down to local or state politics rather than national politics?

Global warming is caused by releases of greenhouse gases, primarily carbon dioxide, CO2. The overall U.S. estimates of human sources of these gases are in the U.S. Greenhouse Gas Inventory. Looking at its Executive Summary, the total sources for 2006, the latest year available, were about 7000 Tg CO2 equivalents. (Don't worry about the units for now; just think of it as 7000 something.) Where did that come from? 2,300 was from electricity generation. 1,850 was from fossil fuels burned for transportation. 860 was from fossil fuels burned for industrial use, 330 residential, and all other types of sources were smaller. That means that roughly a third of the problem is from electric power plants, a quarter is from cars and other vehicles, and about a tenth from large industrial uses. Those three together make up more than 70% of the problem.
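For anyone who wants to check the fractions, here's the arithmetic as a short Python sketch. The numbers are the rounded ones quoted above from the Executive Summary; the dictionary labels are mine.

```python
total = 7000  # Tg CO2 equivalents, approximate 2006 total quoted above

sources = {
    "electricity generation": 2300,
    "transportation": 1850,
    "industrial": 860,
    "residential": 330,
}

# Each source as a fraction of the total.
shares = {name: amount / total for name, amount in sources.items()}
for name, share in shares.items():
    print(f"{name}: {share:.0%}")

# The three largest categories combined.
top_three = (
    shares["electricity generation"]
    + shares["transportation"]
    + shares["industrial"]
)
print(f"top three combined: {top_three:.0%}")
```

The top three come to 5,010 out of 7,000, which is where the "more than 70%" comes from.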

And those three types of sources are susceptible to infrastructural / political intervention. Affected industries' preferred defenses involve either saying that the market should decide, or diffusing responsibility to consumers -- as if individual volunteerism like replacing light bulbs or turning down the thermostat a few degrees or driving a few fewer miles could really have enough of a cumulative effect to matter. (These actions can help, yes, but in the end you need to change infrastructure. I may get into that in a future post.) But no one builds a large power plant without governmental involvement; it's not really a market decision. The miles per gallon of car fleets is already regulated. And individual, large sources respond to pressure from organized communities.

Electricity generation is clearly the largest single piece. What is the picture for current sources? Here's the best map I could find, for 2005:

That map is from eGRID, one of the best U.S. databases available when it's up-to-date. You may not be able to read the legend, but the black color is coal, the worst fuel from a greenhouse gas perspective. There are a few major things to notice. First, large hydro, the blue color, already dominates the areas where it's available. Nuclear, in red, has a substantial presence, but no more is going to be built any time soon. California and New England are already starting to diversify. The Mountain West and Midwest aren't, but the emissions are comparatively small there in any case. The most immediate problem areas are Texas and the Illinois/Indiana/Ohio/Pennsylvania corridor.

The political situation in Texas may not be the greatest, but Texas has abundant potential solar and wind energy resources, and my guess is that it's going to take advantage of them. The corridor is where I think local or state action might be most important.

What kind of information might assist in that action? Well, with a database like eGRID, people can identify which actual plants, owned by which companies, are producing the majority of the problem. And then there are a number of different outcomes people can push for -- shutting down coal plants and building renewable energy plants are only the most obvious ones. One type of early intervention can be made through efficiency improvements at existing power plants.

Imagine a set of ten coal-burning power plants, all alike. If you want to reduce their greenhouse gas emissions by 10% and keep the same electricity generation, one way to do it is to shut one of them down and build a renewable power plant with the same output. But another way is to increase the efficiency at which the plants convert coal into electricity by about 10% so that you can shut one of them down and not build anything. We're going to want to eventually shut the coal plants down and replace them, of course. But putting in new equipment, such as more efficient turbines, can be cheaper and quicker for the initial stages.
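The "about 10%" can be pinned down with a quick Python sketch. It assumes that emissions are proportional to the coal burned and that efficiency means electricity out per unit of coal in; the function name is made up for the example.

```python
def efficiency_gain_needed(emissions_cut):
    """Fractional efficiency increase needed to cut coal burned (and so
    emissions) by emissions_cut while keeping electricity output the same."""
    if not 0 <= emissions_cut < 1:
        raise ValueError("emissions_cut must be at least 0 and less than 1")
    # Same output from (1 - cut) of the coal means efficiency must rise
    # by a factor of 1 / (1 - cut).
    return 1 / (1 - emissions_cut) - 1


print(f"{efficiency_gain_needed(0.10):.1%}")
```

So nine plants doing the work of ten need to be about 11% more efficient, a little more than the 10% cut in emissions.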

A database like eGRID has information on every individual electric generating plant in the U.S. -- power generated, greenhouse gas emissions, and even some information on how up-to-date the equipment is. Using it, people can change the problem from a big, fuzzy one involving "large power companies" into one in which they know where their power is being generated, where the greenhouse gas sources are, and which source contributes what. That suggests points of potential pressure.

Perhaps that pressure won't be necessary -- perhaps a national cap-and-trade program will be implemented, and the problem will magically be solved by pseudo-market means. (I have my doubts about that, too.) Perhaps the data won't really be useful to local or state groups, or will be insufficient. Perhaps they will be useful to national policy people, although they have their own researchers for summarizing this kind of thing. But the particular tool of public access to data is the area that I know something about, so I'm going to assume that it's going to be useful to someone.

The next post in this series will be about eGRID.