TR-2013-07

Mathematically Informed Linear Algebra Codes Through Term Rewriting

Matthew Rocklin. 3 September, 2013.
Communicated by L. Ridgway Scott.

Abstract

To motivate modularity in scientific software development we build and study a system to generate mathematically informed linear algebra codes as a case study. We stress the effects of modularity on verification, flexibility, extensibility, and distributed development, each of which is particularly important in scientific contexts.

Development in computational science is both accelerated and burdened by changing hardware and diffusion into new disciplines. Hardware development expands the scale of feasible problems. This same development also brings challenging programming models that are unfamiliar and that reflect complex memory and communication architectures. The adoption of computational methods by new fields multiplies both the potential and the burden of this growth. Old techniques can be reapplied to fresh problems in new fields such as biology or within smaller-scale research groups. Unfortunately these new communities bring a population of novice scientific programmers without a strong tradition of software engineering. The progress of scientific computing is limited by scientists’ ability to develop software solutions in these new fields for this new hardware.

This dissertation discusses the health of the current scientific computing ecosystem and the resulting costs and benefits for scientific discovery. It promotes software modularity within the scientific context for the optimization of global efficiency. To support this argument it considers a case study in automated linear algebra, a well-studied problem with mature practitioners.

We produce and analyze a prototype software system that adheres strictly to the principles of modularity. This system automatically generates numerical linear algebra programs from mathematical inputs. It consists of loosely coupled modules that draw from computer algebra, compilers, logic programming, and static scheduling. Each domain is implemented in isolation. We find that this separation eases development by single-field experts, is robust to obsolescence, enables reuse, and is easily extensible.

Original Document

The original document is available in PDF (uploaded 3 September, 2013 by L. Ridgway Scott).