Category Archives: Measures, Statistics & Technicalities

ISO country codes for USITC DataWeb output

The US International Trade Commission’s Interactive Tariff and Trade DataWeb provides detailed data describing aggregate trade flows between the United States and other economies. It reports country names but no country codes. Many other data sources (e.g. the IMF’s World Economic Outlook database) report both a country name and a standardized ISO three-letter country code. Because DataWeb uses unofficial names (Burma vs Myanmar, East Timor vs Timor-Leste, etc.) and inconsistent formatting (“Grenada Island” vs “Grenada”, “Saint” vs “St” vs “St.”, etc.), merging on country names rather than standardized country codes is unreliable.

I’m making available a correspondence between USITC DataWeb country names and ISO country codes that I built in the course of my research. You can download it as a tab-delimited text file and a Stata data file from my website. The (very brief) documentation is here.
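
As a rough illustration of how such a correspondence might be used, here is a minimal Python sketch that attaches ISO codes to DataWeb-style rows and reports any names left unmatched. The column names (usitc_name, iso3, country, imports_usd), the trade values, and the "Atlantis" row are hypothetical placeholders, not the layout of the actual files; only the three name-to-code pairs reflect real ISO codes.

```python
import csv
import io

# Hypothetical excerpt of a name-to-code correspondence (tab-delimited).
# The header names are assumptions, not the actual file's headers.
CORRESPONDENCE_TSV = (
    "usitc_name\tiso3\n"
    "Burma\tMMR\n"
    "East Timor\tTLS\n"
    "Grenada Island\tGRD\n"
)

# Hypothetical DataWeb-style rows: a country name plus a made-up trade value.
TRADE_TSV = (
    "country\timports_usd\n"
    "Burma\t100\n"
    "Grenada Island\t250\n"
    "Atlantis\t5\n"
)


def read_tsv(text):
    """Parse tab-delimited text into a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attach_iso3(trade_rows, correspondence_rows):
    """Add an iso3 field to each trade row; collect names with no match."""
    lookup = {
        row["usitc_name"].strip(): row["iso3"].strip()
        for row in correspondence_rows
    }
    matched, unmatched = [], []
    for row in trade_rows:
        code = lookup.get(row["country"].strip())
        if code is None:
            unmatched.append(row["country"])
        else:
            matched.append({**row, "iso3": code})
    return matched, unmatched


matched, unmatched = attach_iso3(
    read_tsv(TRADE_TSV), read_tsv(CORRESPONDENCE_TSV)
)
print(matched)
print(unmatched)
```

Listing the unmatched names explicitly, rather than silently dropping them, makes it easy to spot a spelling that the correspondence does not yet cover.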

The schoolboy error that will not die

In a special report on managing information, the Economist writes that Wal-Mart’s “revenue last year, around $400 billion, is more than the GDP of many entire countries.”

This is an apples-to-oranges comparison that means nothing. GDP measures value-added. Revenue measures gross value. Please never print such a comparison again.
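
To make the distinction concrete, here is a minimal sketch with made-up numbers (they are not any real company's accounts). A firm's contribution to GDP is roughly its sales minus what it buys from other firms, so a retailer that resells purchased goods can have large revenue and a much smaller value added.

```python
def value_added(revenue, intermediate_purchases):
    """Value added = gross sales minus inputs bought from other firms."""
    return revenue - intermediate_purchases


# Hypothetical retailer; both figures are illustrative, in billions of dollars.
revenue = 100.0
intermediate_purchases = 75.0  # assumed cost of goods and services from suppliers

retailer_value_added = value_added(revenue, intermediate_purchases)
print(retailer_value_added)  # 25.0
```

Only the value-added figure is comparable to a country's GDP; the revenue figure double-counts everything the suppliers produced.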

Martin Wolf tackled this in an FT column in 2002. Jagdish Bhagwati took it on in In Defense of Globalization in 2004. And Paul De Grauwe and Filip Camerman even devoted 15 pages to measuring the size of companies correctly. Yet this “elementary howler” keeps rearing its head, time after time.

Addendum (22 March): My very brief letter to the Economist on this point appeared online.

Ravallion on Pinkovskiy and Sala-i-Martin

Martin Ravallion is open to the idea that African poverty has been declining over the last 15 years, but he is cautious about the quality of our data and methods:

Maxim Pinkovskiy and Xavier Sala-i-Martin (PSiM hereafter) have confidently claimed that “The conventional wisdom that Africa is not reducing poverty is wrong” and that “African poverty is falling and is falling rapidly.” This sounds like good news. But is it right?

We must first be clear about what we mean when we say “poverty is falling”. What many people mean is falling numbers of poor. However, PSiM refer solely to the poverty rate—the percentage of people who are poor. (There is no mention of this important distinction in their paper.)…

Here we agree: aggregate poverty rates have fallen in Sub-Saharan Africa (SSA) since the mid-1990s. Shaohua Chen and I came to exactly the same conclusion in our research, for the World Bank’s global poverty monitoring effort, although our methods differ considerably and (no surprise) I prefer our methods.

However, Chen and I also point out that the decline in the aggregate poverty rate has not been sufficient to reduce the number of poor, given population growth…

Two points to note here: (i) Chen and I show that the poverty decline in SSA tends to be larger for lower poverty lines (in the region $1-$2.50 a day) and (ii) PSiM’s method attributes the entire difference between GDP and household consumption to the current consumption of households, and they assume that its distribution is the same as in the surveys. These assumptions are very unlikely to hold, and they give an overly optimistic picture.

In effect, PSiM are using a lower poverty line than us…

PSiM do not tell readers just how few survey data points they have actually used after 1995. Indeed, readers of their paper may be surprised to hear that there is any uncertainty about the trend decline since the mid-1990s; their main graph has 30 annual data points since 1995. But these are not real data points in any obvious sense; rather they are synthetic (model-based) extrapolations based on national accounts and growth forecasts.

We have national household surveys for all but 10 of the 48 countries in SSA since 1995. However, for only 18 countries do we have more than one survey since 1995; for 30 countries, there is at most one survey since 1995.

As we warn explicitly in our paper, this is not yet sufficient survey data to be confident about the (promising) downward trend for Africa’s aggregate poverty rate that PSiM have announced with such confidence.

Hopefully we will see a confirmation of the emerging downward trend for Africa in the years ahead, as more (genuine) data emerge.
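
Ravallion's distinction between the poverty rate and the number of poor comes down to simple arithmetic: the headcount is the rate times the population, so it falls only if the rate declines proportionally faster than the population grows. The sketch below uses hypothetical round numbers, not actual estimates for Sub-Saharan Africa.

```python
def number_of_poor(poverty_rate, population):
    """Headcount of poor = share of people who are poor x population."""
    return poverty_rate * population


# Hypothetical figures chosen only to illustrate the mechanism.
rate_start, pop_start = 0.58, 600_000_000
rate_end, pop_end = 0.50, 800_000_000

poor_start = number_of_poor(rate_start, pop_start)
poor_end = number_of_poor(rate_end, pop_end)

# The headcount falls only if population grows by less than this factor.
break_even_growth = rate_start / rate_end
actual_growth = pop_end / pop_start

print(f"poor at start: {poor_start:,.0f}")
print(f"poor at end:   {poor_end:,.0f}")
print(f"population growth {actual_growth:.2f}x vs break-even {break_even_growth:.2f}x")
```

In this made-up case the rate drops by eight percentage points, yet the number of poor rises because population growth exceeds the break-even factor.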

HT: Larry W-S.

Addendum: Blattman beat me to it and has more thoughts.

Measuring protectionist actions during the crisis: What’s the counterfactual?

Dani Rodrik:

The GTA’s latest report identifies no fewer than 192 separate protectionist actions since November 2008, with China as the most common target. This number has been widely quoted in the financial press. Taken at face value, it seems to suggest that governments have all but abandoned their commitments to the World Trade Organization and the multilateral trade regime.

But look more closely at those numbers and you will find much less cause for alarm. Few of those 192 measures are in fact more than a nuisance. The most common among them are the indirect (and often unintended) consequences of the bailouts that governments mounted as a consequence of the crisis. The most frequently affected sector is the financial industry.

Moreover, we do not even know whether these numbers are unusually high when compared to pre-crisis trends. The GTA report tells us how many measures have been imposed since November 2008, but says nothing about the analogous numbers prior to that date. In the absence of a benchmark for comparative assessment, we do not really know whether 192 “protectionist” measures is a big or small number.

Finicelli, Pagano, and Sbracia: “Trade-revealed TFP”

To the extent that you’re willing to believe in a particular model, you can pull off some interesting exercises, such as “trade-revealed TFP”:

We introduce a novel methodology to measure the relative TFP of the tradeable sector across countries, based on the relationship between trade and TFP in the model of Eaton and Kortum (2002). The logic of our approach is to measure TFP not from its “primitive” (the production function) but from its observed implications. In particular, we estimate TFPs as the productivities that best fit data on trade, production, and wages. Applying this methodology to a sample of 19 OECD countries, we estimate the TFP of each country’s manufacturing sector from 1985 to 2002. Our measures are easy to compute and, with respect to the standard development-accounting approach, are no longer mere residuals. Nor do they yield common “anomalies”, such as the higher TFP of Italy relative to the US.

Via Agent Continuum.

Pierce & Schott: A Concordance Between HTS & SIC/NAICS

This paper looks like it may be helpful to applied empirical researchers:

This paper provides and describes concordances between the ten-digit Harmonized System (HS) categories used to classify products in U.S. international trade and the four-digit SIC and six-digit NAICS industries that cover the years 1989 to 2006. We also provide concordances between ten-digit HS codes and the five-digit SIC and seven-digit NAICS product classes used to classify U.S. manufacturing production. Finally, we briefly describe how these concordances might be applied in current empirical international trade research.

Measuring distance

Neat:

The CEPII has built and made available two datasets providing useful data for empirical economic research including geographical elements and variables. A common use of these files is the estimation by trade economists of gravity equations describing bilateral patterns of trade flows…

Distance calculation requires information on geographical coordinates of at least one city in each country. The simplest measure of geodesic distance considers only the main city of the country, reported here with the English and French names, latitude and longitude. In most cases, the main city is the capital of the country. However, for 13 out of the 225 countries, we considered that the capital was not populated enough to represent the “economic center” of the country. For these countries, we propose the distances data calculated for both the capital city and the economic center…

There are two kinds of distance measures: simple distances, for which only one city is necessary to calculate international distances; and weighted distances, for which we need data on the principal cities in each country. The simple distances are calculated following the great circle formula, which uses latitudes and longitudes of the most important city (in terms of population) or of its official capital. These two variables incorporate internal distances based on areas provided in the geo_cepii.xls file. The two weighted distance measures use city-level data to assess the geographic distribution of population inside each nation. The idea is to calculate distance between two countries based on bilateral distances between the largest cities of those two countries, those inter-city distances being weighted by the share of the city in the overall country’s population.
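
The two kinds of measure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CEPII's actual code: it uses the haversine form of the great circle formula with a mean Earth radius of 6371 km, and it weights inter-city distances by a plain arithmetic average of population shares, which may differ from the exact weighting CEPII applies. The city coordinates are approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption, not CEPII's constant


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Guard against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def weighted_distance_km(cities_a, cities_b):
    """Population-share-weighted average of inter-city distances.

    Each city is (latitude, longitude, share of its country's population);
    the shares within each country are assumed to sum to 1.
    """
    total = 0.0
    for lat_a, lon_a, share_a in cities_a:
        for lat_b, lon_b, share_b in cities_b:
            total += share_a * share_b * great_circle_km(lat_a, lon_a, lat_b, lon_b)
    return total


# Simple distance: one main city per country (approximate coordinates).
paris = (48.8566, 2.3522)
london = (51.5074, -0.1278)
simple_km = great_circle_km(*paris, *london)
print(f"Paris-London: {simple_km:.0f} km")
```

With a single city per country and a share of 1.0, the weighted measure collapses to the simple one, which is a handy sanity check.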

More on Penn World Table data revisions

Highlights from NBER WP 15455, which I flagged last week:

How fast did Equatorial Guinea grow over the two and a half decades beginning in 1975? The natural place to turn to answer such a question is data from the Penn World Table (PWT), which is the most widely used source for cross-country comparisons for the level and growth rate of GDP. According to its latest available version (PWT 6.2) Equatorial Guinea is the second-fastest growing country among 40 African countries. However, according to its previous version (PWT 6.1), which was released four years before, Equatorial Guinea was the slowest growing country. Indeed, as table 1 shows, if one were to compile the list of the 10 fastest and slowest growing countries in Africa between 1975 and 1999, PWT 6.1 and PWT 6.2 would produce almost disjoint lists…

The variability of growth data has implications for the cross-country growth literature. Results based on annual data prove to be less robust across versions of the PWT than are results based on 10-year averages and/or levels of GDP. And results are sensitive to sample, especially the inclusion of small countries…

Plotting the data suggests that data quality might matter for revisions. The left-hand panel in figure 6 shows differences in 29-year annual average growth rates (1970–1999) for countries with data quality grades of A or B. The right-hand panel shows the same for countries with grades of C or D. All the major variation across versions of the table occurs in the countries with lower grades…

We have examined many of the leading papers in the growth literature based on PWT 5.6 or 6.1. In each case, we attempted to run exactly the same specifications and samples, but using version 6.2 of the table instead. This approach cannot prove that a particular set of results is right or wrong, but it may illustrate patterns in terms of what kind of results are more or less robust…

In all, we tested the robustness of 13 papers in the growth literature. Note that we did not check all specifications in all papers. Rather we concentrated on what appeared to us–or to others citing the work–as the “main” results. The lower part of Appendix table 2 lists nine papers for which we found basically no or small changes in results. In addition, there were more substantial changes for four papers: Ramey and Ramey (1995), Jones and Olken (2005), Hausmann, Pritchett, and Rodrik (2005), and Aghion, Howitt, and Mayer-Foulkes (2005).

The Penn World Table and growth regressions

This abstract caught my eye, though I haven’t looked at the paper:

This paper sheds light on two problems in the Penn World Table (PWT) GDP estimates. First, we show that these estimates vary substantially across different versions of the PWT despite being derived from very similar underlying data and using almost identical methodologies; that this variability is systematic; and that it is intrinsic to the methodology deployed by the PWT to estimate growth rates. Moreover, this variability matters for the cross-country growth literature. While growth studies that use low frequency data remain robust to data revisions, studies that use annual data are less robust. Second, the PWT methodology leads to GDP estimates that are not valued at purchasing power parity (PPP) prices. This is surprising because the raison d’être of the PWT is to adjust national estimates of GDP by valuing output at common international (purchasing power parity [PPP]) prices so that the resulting PPP-adjusted estimates of GDP are comparable across countries. We propose an approach to address these two problems of variability and valuation.