SAMSI Astrostat SaFDe group, 2:30pm
Informal notes by AC

Present: Jeff Scargle, Jim Chiang, in CA; Vinay Kashyap, Taeyoung Park, Aneta Siemiginowska, David van Dyk, Alanna Connors at CfA; Alex Young at GSFC. (The CfA group was late due to technical difficulties with phones.)

We came in to a discussion led by J. Chiang.

1/ Jim Chiang: using EMC2 on GLAST data
* Jim C. has his own code, taken from Esch et al. 2004 and Nowak and Kolaczyk 2004. He wants to use radio etc. data to set -prior- levels, at each resolution level, for the gamma-ray data, ahead of time.
* He thinks this is complementary to the Strong et al. GALPROP approach: we can't predict the complicated gamma-ray flux exactly, because of guessing the H2 molecular hydrogen data from the HI data, plus the uncertainties in the cosmic rays, etc. He thinks the models are not good enough at small scales. Plus, how do you parametrize the uncertainties in the Strong et al. model?
* One of his other problems: variation in the PSF; variation in the "metric" or projection effect across the sky (as GLAST sees so much of the sky at once). Can Adam and David include it in EMC2?
* Aneta: different from Chandra, where the projection doesn't vary so much (only the smaller area being pointed at is visible). BUT the PSF does vary with energy, like GLAST.
* Jim will supply a test dataset and a test point-spread function, 256 x 256: one array which is the data, plus one which is the PSF. It will probably be EGRET data, which is public.
* Adam and DvD have a second-generation EMC2 running. They are ready to test it. They would like both FITS and ASCII formats for the test files.
* Eric F. would also like to see the test datasets, hopefully in ASCII. They are trying to post sample data-sets for all kinds of challenge problems on PSU's CASt website.

2/ Alex Y. on RHESSI imaging. (Some delay with receiving the PDF.) He sent out a PDF file:
* Intro on instruments - 1s of keV up to 10s of MeV. Offset, spinning germanium detector - essentially a modulation collimator.
* He describes how the count profile is essentially Fourier components, i.e. the "light-curve" is part of the representation of the generalized "PSF".
* The challenge: doing image reconstruction.
* Traditional techniques:
- Richardson-Lucy + penalty; Maximum Entropy.
- Some people have parametrized the PSF and done 'chi-squared' forward fitting with a goodness-of-fit measure.
- The R-L or ME approaches seem to give more robust results.
- Other techniques include the "CLEAN" technique.
* Total counts in an image depend on the time resolution one wants. One revolution is 4s (pretty long).
* Is there public code for penalized R-L? Alex: Yes, but it's in IDL. BUT scientists would like error bars on their results.
* DvD: If you already have a penalized-likelihood code, it's not too hard to turn it into a full MCMC to get the posterior and error bars. The hardest component is often handling the PSF. But we want to handle it the same way an R-L code does. Code is mainly in IDL. Vinay is happy!
* What about computation time? Alex: Pixons, at least, take 1/2 day for an image. Probably typical for penalized R-L. DvD: MCMC won't be that much slower than R-L - maybe 20%. Alex: people will be happy with that, if they can get error bars.
* Tomorrow (2-7) Alex will be meeting with image-processing specialists. He will have more details then.
* Alex: RHESSI has a lot of simulated test data. He will put together a little more detailed PDF that says more about what is going on. DvD wants to have simple test datasets.
* General laughs over how learning a new language (IDL) is easier than rewriting the PSF-handling codes.
* We have a task on writing an R FITS reader.
* What is the price for IDL?
-------
3/ Eric F. on a list of non-parametric methods for Poisson "smoothing" or multiscale analysis. E.g. Kashyap, Freeman, etc.; Starck et al. for XMM; the Ebeling paper on adaptive smoothing - all for Poisson data. Appendix C of one paper also has a more rigorous treatment. Also Voronoi tessellation methods. Also Damiani et al.; PWDetect doesn't perform quite as well.
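As an aside on the penalized R-L discussion above: the group's codes are in IDL, but the multiplicative update itself is simple. Below is a minimal, illustrative pure-Python sketch of the plain (unpenalized) Richardson-Lucy iteration on a 1-D toy signal - not the RHESSI code, and a real image code would use 2-D FFT convolutions plus a penalty/prior term. All names here are illustrative.

```python
# Minimal 1-D Richardson-Lucy deconvolution sketch (pure Python, no penalty).
# est <- est * A^T( data / A(est) ), where A blurs with the PSF.

def correlate_same(x, k):
    """'Same'-size correlation with an odd-length, centered kernel;
    values falling off the ends are treated as zero."""
    h = len(k) // 2
    n = len(x)
    return [sum(k[j] * x[i + j - h] for j in range(len(k))
                if 0 <= i + j - h < n)
            for i in range(n)]

def richardson_lucy(data, psf, n_iter=100, eps=1e-12):
    """Multiplicative R-L update from a flat starting image."""
    est = [sum(data) / float(len(data))] * len(data)
    for _ in range(n_iter):
        blurred = correlate_same(est, psf[::-1])   # convolution with the PSF
        ratio = [d / max(b, eps) for d, b in zip(data, blurred)]
        correction = correlate_same(ratio, psf)    # adjoint (correlation) step
        est = [e * c for e, c in zip(est, correction)]
    return est
```

A penalized version modifies the update with a smoothness prior; per DvD's comment, the same likelihood-plus-prior structure is what an MCMC sampler would explore to attach error bars to the reconstruction.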
Caution: some of the methods on Eric's list are geared to visualization; some to "source detection".
Discussions with Eric and Vinay on the Starck method for multiresolution vs. wavdetect. Alex and Vinay: they use an "a trous" transform, which is a discrete relative of the Mexican Hat. Damiani et al. use a theoretical form for the wavelet coefficients. Wavdetect sets thresholds based on simulations. Eric's resource list will be posted on-line.
---------------------
4/ Vinay and Peter start discussions of "Upper Limits"
* DvD on the "one-sided upper confidence interval": he would use a "fit" to the data, with a source or line, and see what the posterior gives. "The confidence interval will tell you what values of the intensity (or appropriate parameter) you couldn't reject" if you were doing a test.
* Maybe the way to proceed is to choose a simple problem. Some discussion of terms. Discussion of the Pilla et al. paper. How do you mix up Type II errors, source detection, etc.?
* DvD: So maybe the thing to do is to go through a simple problem, and have a statistician go through what they would do. "I think you wouldn't like it", but then you can tell us more specifically why. Inverting hypothesis tests to get confidence intervals. VK can send us a list of papers.
-------------
* J. Chiang on Pilla: a new 'detection' algorithm, a loosened version of a likelihood test.
* J. Scargle on upper limits: Is the posterior distribution good enough?
* VK: It would simply be a posterior for the source intensity - the probability of finding a source with xxx intensity.
* PF: For wavdetect use, that would be perfectly good.
* VK: The one thing I am fairly sure we should not do is marginalize over the background, because we lose the information about the background.
* PF on detectability in wavdetect: broad, fainter sources are seen differently than a spiky source of the same total intensity.
* VK: That's sort of a related problem, but for now let's stick to point sources.
* JS: Various problems: Is there a point source anywhere in the field of view? vs. What are the limits at this particular position?
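The "a trous" transform mentioned above is easy to illustrate. This is a minimal 1-D sketch in pure Python - not the Starck et al. code - using the standard B3-spline kernel sampled with holes (spacing 2^j) at each scale; the wavelet plane at each scale is the difference of successive smoothings, and the planes plus the final smooth array sum back to the input exactly. Boundary handling is simple mirroring, which assumes the kernel reach stays under the array length.

```python
# Minimal 1-D "a trous" (with holes) wavelet transform sketch.
H = [1/16.0, 1/4.0, 3/8.0, 1/4.0, 1/16.0]   # B3-spline smoothing kernel

def smooth(c, step):
    """Smooth with the B3-spline kernel sampled at spacing `step`,
    mirroring indices at both boundaries."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k, hk in enumerate(H):
            j = i + (k - 2) * step
            j = abs(j)                   # mirror at the left edge
            if j >= n:
                j = 2 * (n - 1) - j      # mirror at the right edge
            s += hk * c[j]
        out.append(s)
    return out

def a_trous(data, n_scales=3):
    """Return wavelet planes w_1..w_J plus the final smooth array;
    their pointwise sum reconstructs the input exactly."""
    c = [float(x) for x in data]
    planes = []
    for j in range(n_scales):
        c_next = smooth(c, 2 ** j)
        planes.append([a - b for a, b in zip(c, c_next)])  # wavelet plane
        c = c_next
    planes.append(c)                     # final smooth residual
    return planes
```

Detection schemes like the ones discussed then threshold the coefficients in each plane, with thresholds set either from a theoretical coefficient distribution (Damiani et al.) or from simulations (wavdetect).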
* VK: No, I am interested in just one point source, i.e. in one pixel.
* DvD: Let's talk with the students in NC. Again, hypothesis testing can be inverted to get a confidence interval. It has subtleties - PHYSTAT has been going on for years talking about this problem. They do not like that good Bayesian intervals do not always have good frequency coverage, for example.
-----------
5/ Keith Arnaud's problem
* DvD: I feel these are related. All of these become more challenging in the low-count regime, because priors matter, essentially, and one is not in the asymptotic case.
* DvD ran a simulation: equal-tail Poisson interval, using the Jeffreys prior. For low counts, the frequency properties are not always what one expects. Often the Bayes intervals are too conservative. This simulation study will be posted. BUT as soon as you leave Gaussian problems (plus have a nuisance parameter), frequency properties are based on asymptotic results. That's true for M-L as well as Bayes.
* JS/DvD: I think that's exactly right. The Bayesian approach is more flexible. The prior contains uncertainties one could consider as a degree of freedom. How do you summarize a posterior into an interval? One can choose it so it has good frequency properties. But there is freedom/flexibility in the frequency formulation, too - e.g. how to handle nuisance parameters and so forth.
* VK: Question on frequency coverage properties: deviations, after fitting some particular model, should comply with what the frequency coverage is supposed to be. Then you do the same thing for a different model. Suppose the distribution of the residuals is larger for this model. Can you say this is a worse model?
* DvD: Yes, well, as a consistency check. One can sort of get a handle on over-fitting and under-fitting, to see if the residuals are consistent. But you want to be a little careful. We all believe the photons coming from a star are Poisson. But there can be some systematic component one is missing.
---------------
Schedule: another meeting next week. Alex will be here on Monday.
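The coverage question DvD's simulation addresses can be computed exactly for the Poisson case, without Monte Carlo. This is an illustrative sketch, not DvD's actual study: under the Jeffreys prior p(mu) ~ mu^(-1/2), the posterior after n counts is Gamma(n + 1/2, 1), so the true mean mu falls inside the equal-tail (1 - alpha) interval exactly when alpha/2 <= F_n(mu) <= 1 - alpha/2, where F_n is that posterior CDF. Summing the Poisson probabilities of the covered n gives the exact frequentist coverage at mu (for mu > 0).

```python
import math

def gamma_cdf_half(x, n):
    """Regularized lower incomplete gamma P(n + 1/2, x) for integer n >= 0
    and x > 0: the Gamma(n + 1/2, 1) CDF, i.e. the Jeffreys posterior CDF
    of a Poisson mean after observing n counts.  Built from
    P(1/2, x) = erf(sqrt(x)) via the standard recurrence
    P(a + 1, x) = P(a, x) - x^a e^(-x) / Gamma(a + 1)."""
    p = math.erf(math.sqrt(x))
    a = 0.5
    for _ in range(n):
        p -= math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return p

def coverage(mu, alpha=0.1, n_max=200):
    """Exact frequentist coverage at true mean mu > 0 of the equal-tail
    (1 - alpha) Jeffreys interval: sum Poisson(n; mu) over the counts n
    whose interval contains mu."""
    total = 0.0
    pmf = math.exp(-mu)              # Poisson pmf at n = 0
    for n in range(n_max + 1):
        if alpha / 2.0 <= gamma_cdf_half(mu, n) <= 1.0 - alpha / 2.0:
            total += pmf
        pmf *= mu / (n + 1.0)        # step to the pmf at n + 1
    return total
```

Scanning `coverage(mu)` over a grid of mu shows the behavior discussed: the coverage oscillates around the nominal level rather than matching it, and at very small mu the equal-tail interval (which excludes zero) can fail badly - the kind of low-count surprise DvD described.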
Can Becca and Thomas be on-line? Maybe we will have some of these data on-line by then.
----
Announcement: Vinay and Aneta will give a lunch-time talk. We will also look at the Pilla et al. paper. See Aneta's email.