Robot Scientists Project for IMSM 2011

Dr. Cammey Cole Manning
Department of Mathematics and Computer Science
Meredith College
Raleigh, NC

Dr. John Peach
MIT Lincoln Laboratory
Group 38 - Optimal Systems Technology
Lexington, MA

Project Description

Ever wanted to have a robot to do your research for you? If you are a scientist, you have almost certainly had this dream. Now it’s a real option: Eureqa, a program that distills scientific laws from raw data, is freely available to researchers. - Wired Science, Dec 3 2009 [8]

Many techniques have been developed for solving optimization problems, but among the most difficult problems are those with many local maxima and little known structure. Genetic algorithms were developed to solve such problems: a population of n vectors is chosen randomly and allowed to "evolve" according to a set of simple rules that mimic biological evolution. While the resulting maximum is not guaranteed to be the global maximum, the problem to be solved is often so difficult that a sub-optimal solution is considered sufficient. More recently, the technique of genetic algorithms has itself evolved into genetic programming and symbolic regression [2]. In genetic programming, the vectors are components of a computer program that solves a pre-defined task, while symbolic regression uses evolutionary algorithms to fit mathematical formulae to measured data. The basic components of a symbolic regression algorithm might be the operators plus, minus, times and divide, together with a few user-selected basic functions such as sin and cos; these are the "chromosomes" that are evolved to produce a function representing the data. Symbolic regression has been used successfully to derive conservation laws from measured data, and software for performing such experiments is now available [4, 5, 11].
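As a rough illustration of the genetic algorithm described above, the following Python sketch evolves a population of n random vectors toward a maximum of a function with many local maxima. Everything specific here is an assumption made for the example and is not taken from the project or the cited software: the test function, the population size, the selection rule (keep the fitter half), the uniform crossover, and the Gaussian mutation.

```python
import math
import random

def fitness(v):
    # Hypothetical multimodal test function (assumption for this sketch):
    # many local maxima, global maximum 0 at v = (0, ..., 0).
    return -sum(x * x - 10.0 * math.cos(2.0 * math.pi * x) + 10.0 for x in v)

def evolve(n=60, dim=2, generations=200, mutation_scale=0.3, seed=0):
    rng = random.Random(seed)
    # A population of n vectors chosen randomly.
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n)]
    for _ in range(generations):
        # Selection: the fitter half survives unchanged and becomes the parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: n // 2]
        children = []
        while len(parents) + len(children) < n:
            a, b = rng.sample(parents, 2)
            # Crossover: each component is copied from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: occasionally perturb a component with Gaussian noise.
            child = [
                x + rng.gauss(0.0, mutation_scale) if rng.random() < 0.2 else x
                for x in child
            ]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print(best, fitness(best))
```

Because the parents are carried over unchanged, the best fitness in the population never decreases, but nothing guarantees that the final vector is the global maximum.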

Schmidt and Lipson [10, 1] report success in using symbolic regression to determine conservation laws from measurements of a double pendulum. For this project, similar experiments will be performed on a device called a Swinging Atwood's Machine [12], using open-source modeling code [6] to generate the "measured" data.

A drawback of genetic algorithms is that they are slow. Schmidt and Lipson report that finding the conservation laws required 30 hours of computation time. This time could be reduced substantially by running the methods on a parallel processing cluster [9]. Another possible approach would be to allow the fitness function to migrate from a relatively easy solution space towards the desired solution, mimicking changes in evolutionary pressure.

A new application of symbolic regression is to solve classification problems through evolved neural networks. If the symbol set is restricted to the operators +, -, ×, ÷ and a sigmoidal function such as the hyperbolic tangent, the evolved functions form a family of neural networks [7, 3]. A comparison between standard neural network techniques and this new symbolic regression method will be part of the project.
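To make the link between the restricted symbol set and neural networks concrete, the sketch below evaluates an expression tree built only from +, -, *, / and tanh, and uses its sign as a two-class label. The tuple representation, the evaluate and classify helpers, and the example expression are assumptions for illustration. The expression is written by hand, not evolved, and has the shape of a small network with two tanh hidden units that separates the XOR pattern.

```python
import math

# Symbol set from the text: +, -, *, / and a sigmoidal function (tanh).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    # Protected division (returns 1.0 for a zero divisor), a common
    # genetic programming convention; an assumption of this sketch.
    "/": lambda a, b: a / b if b != 0 else 1.0,
    "tanh": math.tanh,
}

def evaluate(expr, inputs):
    """Evaluate a nested-tuple expression tree for the given input values."""
    if isinstance(expr, (int, float)):
        return float(expr)          # numeric constant
    if isinstance(expr, str):
        return inputs[expr]         # named input variable
    op, *args = expr
    return OPS[op](*(evaluate(a, inputs) for a in args))

# Hand-written example expression (hypothetical, not an evolved result):
# two tanh "hidden units" feeding one tanh "output unit".
s = ("+", "x1", "x2")
h1 = ("tanh", ("-", ("*", 5.0, s), 2.5))   # roughly "x1 OR x2"
h2 = ("tanh", ("-", ("*", 5.0, s), 7.5))   # roughly "x1 AND x2"
xor_expr = ("tanh", ("-", ("-", h1, h2), 1.0))

def classify(x1, x2):
    # Positive output means class True, negative means class False.
    return evaluate(xor_expr, {"x1": x1, "x2": x2}) > 0

for point in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(point, classify(*point))
```

In a symbolic regression run, trees of this kind would be generated and recombined by the evolutionary search rather than written by hand.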

The project will involve taking "measurements" of the motion of a simulated Swinging Atwood's Machine and applying the computer programs Eureqa and GPTips to extract the underlying conservation laws. Next, we will use symbolic regression to find a function that classifies data, and compare its results with those of a standard neural network. Finally, we will look for ways to improve the speed and efficiency of the symbolic regression search.
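The simulated "measurements" could be generated along the following lines. This is a minimal Python sketch, not the OpenModelica model [6] the project refers to. It assumes the standard idealized Swinging Atwood's Machine (massless string, point pulleys, no friction) with counterweight mass M, swinging mass m, string length r on the swinging side, and angle theta from the vertical; these symbols, the parameter values, and the fixed-step Runge-Kutta integrator are choices made for the example. The total energy is computed along the trajectory as one conserved quantity that a symbolic regression search might be expected to recover.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def derivatives(state, M, m):
    # Equations of motion from the Lagrangian
    # L = 0.5*(M+m)*r'^2 + 0.5*m*r^2*theta'^2 - G*r*(M - m*cos(theta)).
    r, theta, r_dot, theta_dot = state
    r_ddot = (m * r * theta_dot**2 - G * (M - m * math.cos(theta))) / (M + m)
    theta_ddot = -(2.0 * r_dot * theta_dot + G * math.sin(theta)) / r
    return (r_dot, theta_dot, r_ddot, theta_ddot)

def rk4_step(state, dt, M, m):
    # One classical fourth-order Runge-Kutta step.
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = derivatives(state, M, m)
    k2 = derivatives(shifted(k1, dt / 2), M, m)
    k3 = derivatives(shifted(k2, dt / 2), M, m)
    k4 = derivatives(shifted(k3, dt), M, m)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def energy(state, M, m):
    # Total energy: kinetic plus gravitational potential.
    r, theta, r_dot, theta_dot = state
    kinetic = 0.5 * (M + m) * r_dot**2 + 0.5 * m * (r * theta_dot) ** 2
    potential = G * r * (M - m * math.cos(theta))
    return kinetic + potential

def simulate(M=3.0, m=1.0, state=(2.0, math.pi / 2, 0.0, 0.0), dt=1e-3, steps=500):
    # Returns the sampled states (r, theta, r_dot, theta_dot), one per time step.
    samples = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, M, m)
        samples.append(state)
    return samples

samples = simulate()
energies = [energy(s, 3.0, 1.0) for s in samples]
drift = max(abs(e - energies[0]) for e in energies) / abs(energies[0])
print("relative energy drift:", drift)
```

The run is kept short (0.5 s) so that r stays well away from the singular point r = 0, where the equation for theta breaks down. Longer runs would need a check on r or an adaptive integrator.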

References

  1. Josh Bongard and Hod Lipson, Automated reverse engineering of nonlinear dynamical systems,
    Proceedings of the National Academy of Sciences, vol. 104, pp 9943-9948, June 2007
  2. Kenneth Chang, Hal, call your office: Computers that act like physicists,
    The New York Times, April 2, 2009
  3. C. Gregory Doherty, Fundamental analysis using genetic programming for classification rule induction,
    Technical report, Oracle Corporation, 2003
  4. Dominic P. Searson, David E. Leahy, and Mark J. Willis, GPTips: An open source genetic programming toolbox for multigene symbolic regression,
    IMECS Proceedings, vol. I, pp 17-19, 2010
  5. Lin Edwards, Eureqa, the Robot Scientist, December 2009
  6. Peter Fritzson, OpenModelica
  7. Jean-Yves Potvin, Patrick Soriano and Maxime Vallee, Generating trading rules on the stock markets with genetic programming,
    Computers and Operations Research, vol. 31, pp 1033-1047, 2004
  8. Brandon Keim, Download Your Own Robot Scientist, Wired Science, December 2009
  9. Jeremy Kepner, MIT Lincoln Laboratory activates 1500-processor interactive parallel computing system
  10. Michael Schmidt and Hod Lipson, Distilling free-form natural laws from experimental data,
    Science, vol. 324, no. 5923, pp 81-85, 2009
  11. Dominic Searson, Genetic programming and symbolic regression for Matlab, April 2010
  12. Swinging Atwood's Machine

 
