'Shrinking Bull's-eye' Algorithm Speeds Up Complex Modeling From Days To Hours (mit.edu) 48
rtoz sends word of a new algorithm that dramatically reduces the computation time for complex processes. Scientists from MIT say it conceptually resembles a shrinking bull's-eye, incrementally narrowing in on its target. "With this method, the researchers were able to arrive at the same answer as a classic computational approach, but 200 times faster." Their full academic paper is available on arXiv. "The algorithm can be applied to any complex model to quickly determine the probability distribution, or the most likely values, for an unknown parameter. Like the MCMC analysis, the algorithm runs a given model with various inputs — though sparingly, as this process can be quite time-consuming. To speed the process up, the algorithm also uses relevant data to help narrow in on approximate values for unknown parameters."
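For the curious, here is a minimal sketch of the general "call the expensive model sparingly" idea: screen Metropolis proposals with a cheap approximation and only pay for the true model when the screen passes. This is a generic delayed-acceptance sampler, not the authors' actual algorithm (theirs builds and refines local approximations adaptively); `expensive_log_post` and `cheap_log_post` are made-up stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def expensive_log_post(theta):
    # Stand-in for an expensive forward model plus likelihood (hypothetical).
    return -0.5 * np.sum((theta - 1.0) ** 2)

def cheap_log_post(theta):
    # Stand-in for a cheap, slightly-off approximation of the same quantity.
    return -0.55 * np.sum((theta - 0.9) ** 2)

def delayed_acceptance_mh(theta0, n_steps=5000, step=0.5):
    theta = np.asarray(theta0, dtype=float)
    lp, lp_cheap = expensive_log_post(theta), cheap_log_post(theta)
    samples, expensive_calls = [], 1
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        prop_cheap = cheap_log_post(prop)
        # Stage 1: screen the proposal using only the cheap approximation.
        if np.log(rng.uniform()) >= prop_cheap - lp_cheap:
            samples.append(theta.copy())
            continue
        # Stage 2: pay for the expensive model only on promising proposals;
        # the extra cheap-approximation terms correct for the stage-1 screening.
        lp_prop = expensive_log_post(prop)
        expensive_calls += 1
        if np.log(rng.uniform()) < (lp_prop - lp) + (lp_cheap - prop_cheap):
            theta, lp, lp_cheap = prop, lp_prop, prop_cheap
        samples.append(theta.copy())
    return np.array(samples), expensive_calls

samples, n_calls = delayed_acceptance_mh(np.zeros(2))
print("expensive model calls:", n_calls, "posterior mean:", samples.mean(axis=0))
```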
Re:This sounds familiar... (Score:4, Informative)
Not really; Zeno's paradox is about an infinite sum giving a finite result.
The technique in the article looks more like a typical Monte Carlo method (that is, generating a lot of random values to find a result) combined with some localized search around local maxima. As such, it does not seem to be a new idea. I haven't read the whole paper yet, but the innovation seems to be that they managed to apply this idea to a new class of problems.
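To make that description concrete, here is a toy sketch of the "lots of random values plus localized search around the best one" recipe; `objective` is a made-up function, and real methods are considerably more careful about when and where to refine.

```python
import numpy as np

rng = np.random.default_rng(1)

def objective(x):
    # Made-up multimodal objective standing in for "the model".
    return np.sin(3 * x) + 0.5 * np.cos(5 * x) - 0.1 * x ** 2

def random_then_local(n_random=200, n_refine=100, step=0.05, lo=-5.0, hi=5.0):
    # Monte Carlo stage: evaluate the objective at many random points.
    xs = rng.uniform(lo, hi, n_random)
    best = xs[np.argmax(objective(xs))]
    # Localized search stage: small random perturbations around the current best.
    for _ in range(n_refine):
        cand = best + step * rng.standard_normal()
        if objective(cand) > objective(best):
            best = cand
    return best, objective(best)

print(random_then_local())
```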
Re: (Score:2)
How is this faster than a gradient method? Or does this replace a simplex method?
Estimation of distribution algorithm (Score:3)
Uh, isn't this a toy version of the entire field of Estimation of Distribution Algorithms for parameter optimization? Anyone want to point out what is new here, since the MIT article sure didn't point it out?
Re:Estimation of distribution algorithm (Score:5, Informative)
Indeed, it looks much like this. I happened to have a dozen EDA papers from the past couple of decades open on my desktop at this moment! The foremost of the three hard problems in estimation of distribution is how one describes a probability distribution over a high-dimensional space in a compact, low-dimensional way. An efficient, if naive, way to do this is to fit the distribution to a function like a Gaussian and then just change the axes and width of the Gaussian to evolve it. One can then easily sample and update this Gaussian to narrow down the parameter guesses. This may be what their "bull's-eye" approach is doing, judging from the description. Another, more general approach I just happened to be re-reading today is Probability Collectives, which sort of divides the high-dimensional space into a low-dimensional product distribution. This can handle non-unimodal surfaces and, by adapting the basis for the product-space coordinates, adapt itself to the local topology. There are a lot of papers on this, including this one: http://www.worldscientific.com... [worldscientific.com] (perhaps not the best starting place, but I had the link handy!).
The other two hard problems are these. First, how does one optimally re-use old data that was taken in what is now the low-probability region? One has to down-weight it relative to fresher samples, but how is unclear. Finally, there is the annealing schedule, which reflects the trade-off between exploration of the surface and exploitation of the available data. Anneal quickly and you converge to a local minimum and miss the global minimum. Usually these choices are arbitrary, but Probability Collectives has some clever techniques to make them more principled. In general, setting the annealing rate by cross-validation is a good way to decide whether you should anneal faster or slower.
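For anyone who wants to see the naive Gaussian-fit idea from the first paragraph in code, here is a bare-bones sketch (essentially a cross-entropy-style EDA): sample from a Gaussian, keep the best candidates, and refit the Gaussian to them so it shrinks onto the good region. The quadratic `loss` is a placeholder for a real model-versus-data misfit.

```python
import numpy as np

rng = np.random.default_rng(2)

def loss(theta):
    # Placeholder for the misfit between model output and data.
    return np.sum((theta - np.array([2.0, -1.0, 0.5])) ** 2, axis=1)

def gaussian_eda(dim=3, pop=200, elite=20, iters=50):
    mean, cov = np.zeros(dim), np.eye(dim) * 4.0
    for _ in range(iters):
        # Sample candidate parameter vectors from the current Gaussian.
        cand = rng.multivariate_normal(mean, cov, size=pop)
        # Keep the best-scoring candidates ("shrink the bull's-eye").
        elites = cand[np.argsort(loss(cand))[:elite]]
        # Refit the Gaussian to the elites; both center and spread adapt.
        mean = elites.mean(axis=0)
        cov = np.cov(elites, rowvar=False) + 1e-6 * np.eye(dim)
    return mean

print(gaussian_eda())
```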
Re: (Score:2)
I'm curious about the claim "any complex model". Really? ANY complex model? Is this the silver bullet? The Holy Grail? After testing it on two "complex" (definition, please) models?
So yeah, I was curious and it looks like they are doing sparse parameter estimation of sorts. I wonder how they deal with non-uniqueness and convergence on local maxima.
Just as important is knowing parameter sensitivity. If a parameter does not add much to the accuracy of the model, throw it out. This has its own pitfalls.
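As a concrete (and deliberately crude) illustration of that kind of sensitivity check, here is a one-at-a-time perturbation sketch; `model` and the base parameter values are made up, and this says nothing about interactions between parameters.

```python
import numpy as np

def model(params):
    # Made-up model: output depends strongly on params[0], weakly on params[2].
    return 3.0 * params[0] + 0.5 * params[1] ** 2 + 1e-4 * params[2]

def one_at_a_time_sensitivity(model, base, rel_step=0.05):
    base = np.asarray(base, dtype=float)
    y0 = model(base)
    sens = []
    for i in range(base.size):
        perturbed = base.copy()
        perturbed[i] *= (1.0 + rel_step)          # nudge one parameter at a time
        sens.append(abs(model(perturbed) - y0))   # size of the response
    return np.array(sens)

base = np.array([1.0, 2.0, 3.0])
print(one_at_a_time_sensitivity(model, base))  # tiny values flag candidates to drop
```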
Somewhat hyped (Score:5, Informative)
Huh (Score:3)
Huh. You'd think they would have realized that more quickly.
Re: (Score:2)
Huh. You'd think they would have realized that more quickly.
I'm sure the authors are well aware of it. It's the press hype that I'm pointing out.
Re: (Score:2)
How is this relevant? The algorithm is for speeding up Markov Chain Monte Carlo (MCMC) analyses, which are very commonly used in science to examine posterior probability distributions for a particular model. I run MCMC codes most weeks, and they're becoming increasingly common in my field of science. Being able to speed up the analysis by a large factor by avoiding model likelihood evaluations would be very valuable.
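For readers who haven't run one: a bare-bones Metropolis sampler looks like the sketch below, and the key point is that every step costs one full likelihood (model) evaluation, which is exactly the cost the paper tries to cut. The Gaussian `log_posterior` here is just a stand-in for a real, expensive model.

```python
import numpy as np

rng = np.random.default_rng(3)

def log_posterior(theta):
    # Stand-in for log prior + log likelihood of an expensive model.
    return -0.5 * np.sum(theta ** 2)

def metropolis(theta0, n_steps=10000, step=0.3):
    theta = np.asarray(theta0, dtype=float)
    lp = log_posterior(theta)
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        lp_prop = log_posterior(prop)          # one model evaluation per step
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain[i] = theta
    return chain

chain = metropolis(np.zeros(3))
print(chain.mean(axis=0), chain.std(axis=0))
```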
Re: (Score:2)
How is this relevant? The algorithm is for speeding up Markov Chain Monte Carlo (MCMC) analyses
The paper is phrasing it in terms of MCMC but it's more generally applicable if you think of it as an optimizer.
Re: (Score:2)
Parameter estimation.
RBF (Score:2)
As far as I know the state of the art for nonlinear optimization when the objective is a very expensive process to run (or simulation of a process) is Radial Basis Function modeling/estimation. Much of the work on this was done at Cornell University - see for example Rommel Regis's papers [sju.edu]. These algorithms do some random sampling, then build a model of the objective function, then based on that pick one or more new points to sample, and repeat.
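Here is a minimal sketch of that sample/fit/pick/repeat loop, using SciPy's RBFInterpolator as the surrogate; `expensive_objective` is a placeholder, and real RBF methods (Gutmann; Regis and Shoemaker) use much more careful rules for choosing the next point than simply minimizing the surrogate.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(4)

def expensive_objective(x):
    # Placeholder for a costly simulation; here just a cheap analytic function.
    return np.sum((x - 0.3) ** 2) + 0.1 * np.sin(10 * x[0])

def rbf_surrogate_search(dim=2, n_init=10, n_iter=30, n_cand=2000):
    X = rng.uniform(0, 1, (n_init, dim))                 # initial random sample
    y = np.array([expensive_objective(x) for x in X])
    for _ in range(n_iter):
        surrogate = RBFInterpolator(X, y)                # fit RBF model to data so far
        cand = rng.uniform(0, 1, (n_cand, dim))          # cheap candidate points
        x_new = cand[np.argmin(surrogate(cand))]         # pick the most promising one
        y_new = expensive_objective(x_new)               # single expensive evaluation
        X = np.vstack([X, x_new])
        y = np.append(y, y_new)
    return X[np.argmin(y)], y.min()

print(rbf_surrogate_search())
```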
Re: (Score:2)
I think the original paper on this is by H.-M. Gutmann [springer.com]. For an intro to more modern methods see Regis and Shoemaker's 2007 paper.
Juliane Mueller [lbl.gov] has some working code available.
Re: (Score:2)
It's hard to say without doing all the implementation work, but the paper does say that the algo is "...general enough to describe both local polynomial and Gaussian process approximations..." and there is a section called "Local Gaussian process surrogates". So, they do in fact incorporate this in the larger framework of their algo.
In fact, they claim "...that the accuracy is nearly identical for all the cases, but the approximate chains use fewer evaluations of the true model, reducing costs by more than an order of magnitude for quadratic or Gaussian process approximations (Figure 10b)."
Re: (Score:2)
It's hard to say without doing all the implementation work, but the paper does say that the algo is "...general enough to describe both local polynomial and Gaussian process approximations..." and there is a section called "Local Gaussian process surrogates". So, they do in fact incorporate this in the larger framework of their algo.
In fact, they claim "...that the accuracy is nearly identical for all the cases, but the approximate chains use fewer evaluations of the true model, reducing costs by more than an order of magnitude for quadratic or Gaussian process approximations (Figure 10b)."
Indeed, though that quote is simply pointing out that the relative performance of their algorithm is at its best with its mode set to local Gaussian approximations.
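For a sense of what a local Gaussian process surrogate involves, here is a tiny hand-rolled GP regression with a squared-exponential kernel and fixed hyperparameters, used to predict a stand-in log-likelihood at new points from a handful of expensive evaluations. This is only illustrative and is not the paper's implementation.

```python
import numpy as np

def sq_exp_kernel(A, B, length=0.5, var=1.0):
    # Squared-exponential covariance between two sets of points.
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return var * np.exp(-0.5 * d2 / length ** 2)

def gp_fit_predict(X_train, y_train, X_test, noise=1e-6):
    # Standard GP regression equations with fixed hyperparameters.
    K = sq_exp_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = sq_exp_kernel(X_test, X_train)
    alpha = np.linalg.solve(K, y_train)
    mean = Ks @ alpha
    var = sq_exp_kernel(X_test, X_test).diagonal() - np.einsum(
        "ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, var

rng = np.random.default_rng(5)

def log_like(theta):
    # Stand-in for an expensive log-likelihood.
    return -0.5 * np.sum(theta ** 2, axis=-1)

X = rng.uniform(-1, 1, (15, 2))           # a handful of expensive evaluations
y = log_like(X)
X_new = rng.uniform(-1, 1, (5, 2))        # nearby points to approximate cheaply
mean, var = gp_fit_predict(X, y, X_new)
print(mean, log_like(X_new))              # surrogate mean vs. true values
```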
Looks familiar ... (Score:3)
Is this not hill-climbing on an assumed-flat-ish response surface? There is history and a whole forest of approaches using approximation when the "cost" function is expensive to evaluate. Solving a honking big set of PDEs certainly qualifies for that.
A big part of that history is work on checking whether the assumed degree of flatness actually matches reality. That check can itself become computationally expensive.
Fun stuff though!
IANACS (Score:2, Interesting)
So I'm a lay-idiot, but what's to stop this from getting stuck on a local maximum or minimum for a given parameter if it's not doing a comprehensive evaluation?
Re: IANACS (Score:1)
Probably some amount of random retries...
Enigma / bomba (Score:3)
Obligatory "Star Trek II" Reference (Score:1)
Algorithm Speeds Up Complex Modeling From Days To Hours . . .
Kirk: Kirk to Enterprise.
Spock: Spock here.
Kirk: Captain Spock, damage report.
Spock: Admiral, if we go "by the book," like Lieutenant Saavik, hours could seem like days.
Kirk: I read you captain. Let's have it.
Spock: The situation is grave, Admiral. We won't have main power for six "days". Auxiliary power has temporarily failed. Restoration may be possible in two "days". By the book, Admiral.
. . .
Saavik: You lied!
Spock: I exaggerated.
Kirk: Hours instead of days! Now we have minutes instead of hours!
This would help just about every couple I know (Score:2)
"What do you want to eat?"
"I dunno, what do you want to eat?"
Sounds like a perfect fit!
Sounds like a branch and bound algorithm (Score:1)
Nothing to see here