Data and Programs

Some of the materials at this website can also be accessed from within Stata using the following command:
net from

  • ACLP data set in Stata format (compressed in a .zip file). This is the data used in Adam Przeworski, Michael Alvarez, José Antonio Cheibub and Fernando Limongi's book Democracy and Development: Political Institutions and Material Well-Being in the World, 1950-1990. Cambridge: Cambridge University Press, 2000. You can download the codebook from José Antonio Cheibub's website at Penn. I have used the codebook to add variable and value labels, and I have recoded the -9 codes as Stata missing values. Otherwise, the data is unchanged.

  • 1999 Memphis municipal election returns and registration data, per precinct (cityonly.dta: Stata format). This data was used in my paper “Beyond the Crossroads” and may be a useful dataset for trying various ecological inference algorithms.

  • epcp is a neat little Stata routine that implements Michael Herron's "expected percentage correctly predicted" measure for limited dependent variable models; it also provides an "expected proportional reduction in error" measure. (Provided in ZIP format; also available as a TAR.GZ archive.) See Political Analysis 8(1) if you are particularly interested in the theory behind the measure. Harvey Palmer helped out with the extensions to ordered logit/probit and multinomial logit.

    An R version of "epcp" is in development as part of the cnlmisc package; it will (eventually) work with either the basic R estimators (lm, glm, polr, the estimators in VGAM, etc.) or via the Zelig package. In the meantime, "hitmiss" in the pscl package does some of what epcp does. cnlmisc now also includes a simplified frontend to the separationplot package.

  • Data for other papers are available upon request; one of these decades, I'll get them all online.

  • Projects: Debian (packages maintained); Quantian; GNU R.
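
For readers unfamiliar with the "expected percentage correctly predicted" measure that epcp reports, the following is a minimal Python sketch of the binary-outcome case only. It is not the epcp routine itself: the function names (expected_pcp, classic_pcp) and the toy data are hypothetical, and the sketch assumes the usual binary-model definition of the measure, namely the average predicted probability that the model assigns to the outcome actually observed. The extensions to ordered and multinomial models, and the "expected proportional reduction in error" measure, are not shown.

```python
def expected_pcp(outcomes, probs):
    """Binary-case ePCP: mean probability assigned to the observed outcome.

    outcomes: list of 0/1 observed values
    probs:    list of predicted Pr(y = 1) from a fitted model
    """
    if len(outcomes) != len(probs) or not outcomes:
        raise ValueError("outcomes and probs must be non-empty and equal length")
    total = 0.0
    for y, p in zip(outcomes, probs):
        # Credit the model with the probability it gave to what happened.
        total += p if y == 1 else 1.0 - p
    return total / len(outcomes)


def classic_pcp(outcomes, probs, threshold=0.5):
    """Conventional PCP for comparison: share of cases classified correctly
    when predicting 1 whenever Pr(y = 1) >= threshold."""
    if len(outcomes) != len(probs) or not outcomes:
        raise ValueError("outcomes and probs must be non-empty and equal length")
    hits = sum(1 for y, p in zip(outcomes, probs) if (p >= threshold) == (y == 1))
    return hits / len(outcomes)


# Toy data (made up for illustration only).
y = [1, 0, 1, 1, 0]
p_hat = [0.9, 0.2, 0.6, 0.4, 0.3]

print(round(expected_pcp(y, p_hat), 2))  # 0.68
print(classic_pcp(y, p_hat))             # 0.8
```

The two numbers differ because the conventional measure treats a predicted probability of 0.6 the same as one of 0.9, while the expected version gives the model only as much credit as the probability it actually assigned.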