Category Archives: Computation

Why I encourage econ PhD students to learn Julia

Julia is a scientific computing language that an increasing number of economists are adopting (e.g., Tom Sargent, the NY FRB). It is a close substitute for Matlab, and the cost of switching from Matlab to Julia is somewhat modest since Julia syntax is quite similar to Matlab syntax after you change array references from parentheses to square brackets (e.g., “A(2, 2)” in Matlab is “A[2, 2]” in Julia and most other languages), though there are important differences. Julia also competes with Python, R, and C++, among other languages, as a computational tool.

I am now encouraging students to try Julia, which recently released version 1.0. I first installed Julia in the spring of 2016, when it was version 0.4. Julia’s advantages are that it is modern, elegant, open source, and often faster than Matlab. Its downside is that it is a young language, so its syntax is evolving, its user community is smaller, and some features are still in development.

A proper computer scientist would discuss Julia’s computational advantages in terms of concepts like multiple dispatch and typing of variables. For an unsophisticated economist like me, the proof of the pudding is in the eating. My story is quite similar to that of Bradley Setzler, whose structural model that took more than 24 hours to solve in Python took only 15 minutes using Julia. After hearing two of my computationally savvy Booth colleagues praise Julia, I tried it out when doing the numerical simulations in our “A Spatial Knowledge Economy” paper. I took my Matlab code, made a few modest syntax changes, and found that my Julia code solved for equilibrium in only one-sixth of the time that my Matlab code did. My code was likely inefficient in both cases, but that speed improvement persuaded me to use Julia for that project.

For a proper comparison of computational performance, you should look at papers by S. Boragan Aruoba and Jesus Fernandez-Villaverde and by Jon Danielsson and Jia Rong Fan. Aruoba and Fernandez-Villaverde have solved the stochastic neoclassical growth model in a dozen languages. Their 2018 update says “C++ is the fastest alternative, Julia offers a great balance of speed and ease of use, and Python is too slow.” Danielsson and Fan compared Matlab, R, Julia, and Python when implementing financial risk forecasting methods. While you should read their rich comparison, a brief summary of their assessment is that Julia excels in language features and speed but has considerable room for improvement in terms of data handling and libraries.

While I like Julia a lot, it is a young language, which comes at a cost. In March, I had to painfully convert a couple research projects written in Julia 0.5 to version 0.6 after an upgrade of GitHub’s security standards meant that Julia 0.5 users could no longer easily install packages. My computations were fine, of course, but a replication package that required artisanally-installed packages in a no-longer-supported environment wouldn’t have been very helpful to everyone else. I hope that Julia’s 1.0 release means that those who adopt the language now are less likely to face such growing pains, though it might be a couple of months before most packages support 1.0.

At this point, you probably should not use Julia for data cleaning. To be brief, Danielsson and Fan say that Julia is the worst of the four languages they considered for data handling. In our “How Segregated is Urban Consumption?” code, we did our data cleaning in Stata and our computation in Julia. Similarly, Michael Stepner’s health inequality code relies on Julia rather than Stata for a computation-intensive step and Tom Wollmann split his JMP code between Stata and Julia. At this point, I think most users would tell you to use Julia for computation, not data prep. (Caveat: I haven’t tried the JuliaDB package yet.)

If you want to get started in Julia, I found the “Lectures in Quantitative Economics” introduction to Julia by Tom Sargent and John Stachurski very helpful. Also look at Bradley Setzler’s Julia economics tutorials.

Trade economists might be interested in the Julia package FixedEffectModels.jl. It claims to be an order of magnitude faster than Stata when estimating two-way high-dimensional fixed-effects models, which is a bread-and-butter gravity regression. I plan to ask PhD students to explore these issues this fall and will report back after learning more.