SAMSI Astrostat SaFDe group, 2:30pm
Informal notes by AC

Present: Jeff Scargle, Jim Chiang, in CA; Vinay Kashyap, Taeyoung Park, Aneta Siemiginowska, David van Dyk, Alanna Connors at CfA; Alex Young at GSFC. (The CfA group was late due to technical difficulties with phones.)

We came in to a discussion led by J. Chiang.

1/ Jim Chiang: using EMC2 on GLAST data
* Jim C. has his own code, taken from Esch et al. 2004 and Nowak and Kolaczyk 2004. He wants to use radio etc. data to set -prior- levels, at each resolution level, for the gamma-ray data, ahead of time.
* He thinks this is complementary to the Strong et al. GALPROP approach: we can't predict the complicated gamma-ray flux exactly, because of guessing the H2 molecular hydrogen data from the HI data, plus the uncertainties in the cosmic rays, etc. He thinks the models are not good enough at small scales. Plus, how do you parametrize the uncertainties in the Strong et al. model?
* One of his other problems: variation in the PSF; variation in the "metric" or projection effect across the sky (as GLAST sees so much of the sky at once). Can Adam and David include it in EMC2?
* Aneta: different from Chandra, where the projection doesn't vary so much (only the smaller area being pointed at is visible). BUT the PSF does vary with energy, like GLAST.
* Jim will supply a test dataset and a test point-spread function, 256 x 256: one array which is the data, plus one which is the PSF. It will probably be EGRET data, which is public.
* Adam and DvD have a second-generation EMC2 running. They are ready to test it. They would like both FITS and ASCII formats for the test files.
* Eric F. would also like to see the test datasets, hopefully in ASCII. They are trying to post sample data-sets for all kinds of challenge problems on PSU's CASt website.

2/ Alex Y. on RHESSI imaging. (Some delay with receiving the PDF.) He sent out a PDF file:
* Intro on instruments - 1s of keV up to 10s of MeV. Offset, spinning germanium detector - essentially a modulation collimator.
* He describes how the count profile is essentially Fourier components, i.e. the "light-curve" is part of the representation of the generalized "PSF".
* The challenge: doing image reconstruction.
* Traditional techniques:
- Richardson-Lucy + penalty; Maximum Entropy.
- Some people have parametrized the PSF and done 'chi-squared' forward fitting with a goodness-of-fit measure.
- The R-L or ME approaches seem to give more robust results.
- Other techniques include the "CLEAN" technique.
* Total counts in an image depend on the time resolution one wants. One revolution is 4s (pretty long).
* Is there public code for penalized R-L? Alex: Yes, but it's in IDL. BUT scientists would like error bars on their results.
* DvD: If you already have a penalized-likelihood code, it's not too hard to turn it into a full MCMC to get the posterior and error bars. The hardest component is often handling the PSF. But we want to handle it the same way an R-L code does. Code is mainly in IDL. Vinay is happy!
* What about computation time? Alex: Pixons, at least, take 1/2 day for an image. Probably typical for penalized R-L. DvD: MCMC won't be that much slower than R-L - maybe 20%. Alex: people will be happy with that, if they can get error bars.
* Tomorrow (2-7) Alex will be meeting with image-processing specialists. He will have more details then.
* Alex: RHESSI has a lot of simulated test data. He will put together a little more detailed PDF that says more about what is going on. DvD wants to have simple test datasets.
* General laughs over how learning a new language (IDL) is easier than rewriting the PSF-handling codes.
* We have a task on writing an R FITS reader.
* What is the price for IDL?
-------
3/ Eric F. on a list of non-parametric methods for Poisson "smoothing" or multiscale analysis. E.g. Kashyap, Freeman, etc.; Starck et al. for XMM; the Ebeling paper on adaptive smoothing - all for Poisson data. Appendix C of one paper also has a more rigorous treatment. Also Voronoi tessellation methods. Also Damiani et al.; PWDetect doesn't perform quite as well.
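As an aside on the penalized R-L discussion above: the group's codes are in IDL, but the multiplicative update itself is simple. Below is a minimal, illustrative pure-Python sketch of the plain (unpenalized) Richardson-Lucy iteration on a 1-D toy signal - not the RHESSI code, and a real image code would use 2-D FFT convolutions plus a penalty/prior term. All names here are illustrative.

```python
# Minimal 1-D Richardson-Lucy deconvolution sketch (pure Python, no penalty).
# est <- est * A^T( data / A(est) ), where A blurs with the PSF.

def correlate_same(x, k):
    """'Same'-size correlation with an odd-length, centered kernel;
    values falling off the ends are treated as zero."""
    h = len(k) // 2
    n = len(x)
    return [sum(k[j] * x[i + j - h] for j in range(len(k))
                if 0 <= i + j - h < n)
            for i in range(n)]

def richardson_lucy(data, psf, n_iter=100, eps=1e-12):
    """Multiplicative R-L update from a flat starting image."""
    est = [sum(data) / float(len(data))] * len(data)
    for _ in range(n_iter):
        blurred = correlate_same(est, psf[::-1])   # convolution with the PSF
        ratio = [d / max(b, eps) for d, b in zip(data, blurred)]
        correction = correlate_same(ratio, psf)    # adjoint (correlation) step
        est = [e * c for e, c in zip(est, correction)]
    return est
```

A penalized version modifies the update with a smoothness prior; per DvD's comment, the same likelihood-plus-prior structure is what an MCMC sampler would explore to attach error bars to the reconstruction.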
Caution: some of the methods on Eric's list are geared to visualization; some to "source detection".
Discussions with Eric and Vinay on the Starck method for multiresolution vs. wavdetect. Alex and Vinay: they use an "a trous" transform, which is a discrete relative of the Mexican Hat. Damiani et al. use a theoretical form for the wavelet coefficients. Wavdetect sets thresholds based on simulations. Eric's resource list will be posted on-line.
---------------------
4/ Vinay and Peter start discussions of "Upper Limits"
* DvD on the "one-sided upper confidence interval": he would use a "fit" to the data, with a source or line, and see what the posterior gives. "The confidence interval will tell you what values of the intensity (or appropriate parameter) you couldn't reject" if you were doing a test.
* Maybe the way to proceed is to choose a simple problem. Some discussion of terms. Discussion of the Pilla et al. paper. How do you mix up Type II errors, source detection, etc.?
* DvD: So maybe the thing to do is to go through a simple problem, and have a statistician go through what they would do. "I think you wouldn't like it", but then you can tell us more specifically why. Inverting hypothesis tests to get confidence intervals. VK can send us a list of papers.
-------------
* J. Chiang on Pilla: a new 'detection' algorithm, a loosened version of a likelihood test.
* J. Scargle on upper limits: Is the posterior distribution good enough?
* VK: It would simply be a posterior for the source intensity - the probability of finding a source with xxx intensity.
* PF: For wavdetect use, that would be perfectly good.
* VK: The one thing I am fairly sure we should not do is marginalize over the background, because we lose the information about the background.
* PF on detectability in wavdetect: broad, fainter sources are seen differently than a spiky source of the same total intensity.
* VK: That's sort of a related problem, but for now let's stick to point sources.
* JS: Various problems: Is there a point source anywhere in the field of view? vs. What are the limits at this particular position?
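The "a trous" transform mentioned above is easy to illustrate. This is a minimal 1-D sketch in pure Python - not the Starck et al. code - using the standard B3-spline kernel sampled with holes (spacing 2^j) at each scale; the wavelet plane at each scale is the difference of successive smoothings, and the planes plus the final smooth array sum back to the input exactly. Boundary handling is simple mirroring, which assumes the kernel reach stays under the array length.

```python
# Minimal 1-D "a trous" (with holes) wavelet transform sketch.
H = [1/16.0, 1/4.0, 3/8.0, 1/4.0, 1/16.0]   # B3-spline smoothing kernel

def smooth(c, step):
    """Smooth with the B3-spline kernel sampled at spacing `step`,
    mirroring indices at both boundaries."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k, hk in enumerate(H):
            j = i + (k - 2) * step
            j = abs(j)                   # mirror at the left edge
            if j >= n:
                j = 2 * (n - 1) - j      # mirror at the right edge
            s += hk * c[j]
        out.append(s)
    return out

def a_trous(data, n_scales=3):
    """Return wavelet planes w_1..w_J plus the final smooth array;
    their pointwise sum reconstructs the input exactly."""
    c = [float(x) for x in data]
    planes = []
    for j in range(n_scales):
        c_next = smooth(c, 2 ** j)
        planes.append([a - b for a, b in zip(c, c_next)])  # wavelet plane
        c = c_next
    planes.append(c)                     # final smooth residual
    return planes
```

Detection schemes like the ones discussed then threshold the coefficients in each plane, with thresholds set either from a theoretical coefficient distribution (Damiani et al.) or from simulations (wavdetect).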
* VK: No, I am interested in just one point source, i.e. in one pixel.
* DvD: Let's talk with the students in NC. Again, hypothesis testing can be inverted to get a confidence interval. It has subtleties - PHYSTAT has been going on for years talking about this problem. They do not like that good Bayesian intervals do not always have good frequency coverage, for example.
-----------
5/ Keith Arnaud's problem
* DvD: I feel these are related. All of these become more challenging in the low-count regime, because priors matter, essentially, and one is not in the asymptotic case.
* DvD ran a simulation: equal-tail Poisson interval, using the Jeffreys prior. For low counts, the frequency properties are not always what one expects. Often the Bayes intervals are too conservative. This simulation study will be posted. BUT as soon as you leave Gaussian problems (plus have a nuisance parameter), frequency properties are based on asymptotic results. That's true for M-L as well as Bayes.
* JS/DvD: I think that's exactly right. The Bayesian approach is more flexible. The prior contains uncertainties one could consider as a degree of freedom. How do you summarize a posterior into an interval? One can choose it so it has good frequency properties. But there is freedom/flexibility in the frequency formulation, too - e.g. how to handle nuisance parameters and so forth.
* VK: Question on frequency coverage properties: deviations, after fitting some particular model, should comply with what the frequency coverage is supposed to be. Then you do the same thing for a different model. Suppose the distribution of the residuals is larger for this model. Can you say this is a worse model?
* DvD: Yes, well, as a consistency check. One can sort of get a handle on over-fitting and under-fitting, to see if the residuals are consistent. But you want to be a little careful. We all believe the photons coming from a star are Poisson. But there can be some systematic component one is missing.
---------------
Schedule: another meeting next week. Alex will be here on Monday.
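The coverage question DvD's simulation addresses can be computed exactly for the Poisson case, without Monte Carlo. This is an illustrative sketch, not DvD's actual study: under the Jeffreys prior p(mu) ~ mu^(-1/2), the posterior after n counts is Gamma(n + 1/2, 1), so the true mean mu falls inside the equal-tail (1 - alpha) interval exactly when alpha/2 <= F_n(mu) <= 1 - alpha/2, where F_n is that posterior CDF. Summing the Poisson probabilities of the covered n gives the exact frequentist coverage at mu (for mu > 0).

```python
import math

def gamma_cdf_half(x, n):
    """Regularized lower incomplete gamma P(n + 1/2, x) for integer n >= 0
    and x > 0: the Gamma(n + 1/2, 1) CDF, i.e. the Jeffreys posterior CDF
    of a Poisson mean after observing n counts.  Built from
    P(1/2, x) = erf(sqrt(x)) via the standard recurrence
    P(a + 1, x) = P(a, x) - x^a e^(-x) / Gamma(a + 1)."""
    p = math.erf(math.sqrt(x))
    a = 0.5
    for _ in range(n):
        p -= math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return p

def coverage(mu, alpha=0.1, n_max=200):
    """Exact frequentist coverage at true mean mu > 0 of the equal-tail
    (1 - alpha) Jeffreys interval: sum Poisson(n; mu) over the counts n
    whose interval contains mu."""
    total = 0.0
    pmf = math.exp(-mu)              # Poisson pmf at n = 0
    for n in range(n_max + 1):
        if alpha / 2.0 <= gamma_cdf_half(mu, n) <= 1.0 - alpha / 2.0:
            total += pmf
        pmf *= mu / (n + 1.0)        # step to the pmf at n + 1
    return total
```

Scanning `coverage(mu)` over a grid of mu shows the behavior discussed: the coverage oscillates around the nominal level rather than matching it, and at very small mu the equal-tail interval (which excludes zero) can fail badly - the kind of low-count surprise DvD described.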
Can Becca and Thomas be on-line? Maybe we will have some of these data on-line by then.
----
Announcement: Vinay and Aneta will give a lunch-time talk. We will also look at the Pilla et al. paper. See Aneta's email.