19 T.W. Alexander Drive, P.O. Box 14006, Research Triangle Park, NC 27709-4006. Tel: 919.685.9350. Fax: 919.685.9360. info@samsi.info
2010-11 Program on Analysis of Object Data
Research Foci
Introduction
The 12-month SAMSI program will focus on the analysis of complex data types. It extends Functional Data Analysis to methods for analyzing data samples of complex objects. Modern science is generating a need to understand, and statistically analyze, populations of increasingly complex types. The term "Analysis of Object Data" (AOOD) is aimed at encompassing a broad array of such methods. The proposed SAMSI program seeks to bring together a diverse group of researchers (from statistics, other parts of mathematics, and related sciences) to explore the common structure that underlies such methodologies, and to use this knowledge in turn to motivate and synthesize new approaches.
Registration for the Opening Workshop will open on March 1, 2010.
Organizing Committee:
Research Foci
AOOD extends the very active research area of Functional Data Analysis (FDA) and generalizes the fundamental FDA concept of curves as data points to the more general concept of objects as data points. Examples include images, shapes of objects in 3D, points on a manifold, tree-structured objects, and various types of movies. Specific AOOD contexts can be grouped in a number of interesting ways. The grouping considered first, perhaps the one of greatest mathematical interest, is by the type of space in which the data objects lie:
Euclidean Objects
Euclidean data objects are ubiquitous in a variety of AOOD contexts. One focus will be on Functional Data Analysis (FDA), which views curves as data. These curves are commonly either simply digitized or decomposed by a basis expansion, either of which gives a vector that represents each data curve. Evolutionary biology and longitudinal applications will be important drivers of the FDA and shape analysis considered in this program.
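As an illustration of how a basis expansion turns a digitized curve into a vector, the following is a minimal sketch that fits a small Fourier basis by least squares. The choice of a Fourier basis, the function names `fourier_basis` and `curve_to_coefficients`, and the synthetic curve are all assumptions made for this example; they are not taken from the program description.

```python
import numpy as np

def fourier_basis(t, n_harmonics):
    """Design matrix: a constant column plus sine/cosine pairs on [0, 1)."""
    columns = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        columns.append(np.sin(2 * np.pi * k * t))
        columns.append(np.cos(2 * np.pi * k * t))
    return np.column_stack(columns)

def curve_to_coefficients(t, y, n_harmonics=3):
    """Least-squares basis coefficients that represent one digitized curve."""
    basis = fourier_basis(t, n_harmonics)
    coefficients, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return coefficients

# Hypothetical curve digitized at 200 points: 2 + 3 sin(2 pi t) - cos(4 pi t)
t = np.linspace(0.0, 1.0, 200, endpoint=False)
y = 2.0 + 3.0 * np.sin(2 * np.pi * t) - np.cos(4 * np.pi * t)

# Coefficient order: [constant, sin1, cos1, sin2, cos2, sin3, cos3]
coefficients = curve_to_coefficients(t, y)
```

Each curve in a sample would be mapped to one such coefficient vector, after which standard multivariate methods can be applied to the vectors.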
A second focus is Time Dynamics Data, with an emphasis on differential equations and dynamic systems as drivers of fully or incompletely observed samples of stochastic processes. This will also include point processes and marked point processes as data objects. Applications arise in control, engineering, biological modeling of growth or cell kinetics, and e-commerce, where the analysis of auction dynamics is of great interest. Further examples include repeated events in the social sciences, such as the successive childbirths of a woman, and, in medical studies, the dynamics of HIV infections and the dynamics of gene expression and their relations with gene networks.
Mildly non-Euclidean Objects
One research focus will be Shape Analysis and Manifold Data, where, for example, the 2- or 3-dimensional locations of a set of common landmarks are collected into vectors that represent shapes. While these vectors are just standard multivariate data, they frequently violate standard multivariate assumptions, such as the sample size being (usually much) larger than the dimension. Research on High Dimension Low Sample Size (HDLSS) issues will be a major emphasis of the proposed SAMSI program. In addition, the shapes of interest may be invariant under certain transformations of the landmarks, such as changes of location, rotation and scale, and Kendall's shape analysis of such objects leads to non-Euclidean distances being the most natural. Further recent examples include the analysis of shapes of unlabeled points, especially on curves, surfaces and images. The closely related manifold data are also based on non-Euclidean distances.
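To make the invariance to location, rotation and scale concrete, here is a minimal sketch of a partial Procrustes distance between two landmark configurations, computed by centering, scaling to unit size, and finding the best rotation with a singular value decomposition. The function names and the example triangles are hypothetical, and this is one common textbook construction rather than a method prescribed by the program.

```python
import numpy as np

def preshape(landmarks):
    """Remove location and scale: center at the origin, scale to unit norm."""
    centered = landmarks - landmarks.mean(axis=0)
    return centered / np.linalg.norm(centered)

def partial_procrustes_distance(x, y):
    """Distance between two (k x m) landmark sets after removing
    location, scale and rotation (reflections are not allowed)."""
    a, b = preshape(x), preshape(y)
    u, s, vt = np.linalg.svd(a.T @ b)
    # Restrict to proper rotations: flip the smallest singular value
    # when the unconstrained optimum would be a reflection.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    # min over rotations R of ||a R - b||^2 equals 2 - 2 * sum(s)
    return np.sqrt(max(0.0, 2.0 - 2.0 * s.sum()))

# Hypothetical landmark data: a triangle, a moved copy, and a stretched copy
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
moved = 2.5 * triangle @ rotation + np.array([4.0, -1.0])
stretched = triangle * np.array([3.0, 1.0])

same_shape = partial_procrustes_distance(triangle, moved)      # close to 0
new_shape = partial_procrustes_distance(triangle, stretched)   # clearly positive
```

The moved copy differs from the original only by scale, rotation and translation, so its distance is zero up to rounding, while the stretched triangle is a genuinely different shape.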
Data that naturally lie on a manifold have appeared in the statistical literature for some time in the form of directional data (data points that are circular or spherical angles), and they play an increasingly important role in the analysis of landmarks.
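A small example shows why directional data call for non-Euclidean treatment. The sketch below compares the ordinary arithmetic mean of two angles with the mean direction obtained by averaging unit vectors. The function name `circular_mean` and the sample angles are assumptions for illustration, and the unit-vector average is only one of several possible definitions of a mean on the circle (it is undefined when the averaged vector is zero).

```python
import numpy as np

def circular_mean(angles):
    """Mean direction (radians) from the average of the unit vectors."""
    return np.arctan2(np.sin(angles).mean(), np.cos(angles).mean())

# Two hypothetical compass directions just either side of 0 degrees
angles = np.deg2rad([350.0, 10.0])

naive = np.mean(angles)            # 180 degrees: points the opposite way
direction = circular_mean(angles)  # approximately 0 degrees
```

The arithmetic mean treats the angles as points on a line and lands on the far side of the circle, whereas the mean direction respects the circular geometry.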
A second research focus will be Modern Image Analysis, that is, applications where the data consist of a sample of images. Such data can often be understood as being located on manifolds. Examples include medial representations for shape objects (involving a mix of real numbers and angles as parameters), diffusion tensor imaging (a branch of magnetic resonance imaging, which represents directionality of fluid flow using tensors), and diffeomorphisms (a powerful mathematical approach to studying warpings of space that address non-affine registration challenges). While manifold data present major statistical challenges (because most statistical methods are Euclidean in nature), they are termed "mildly non-Euclidean" because manifolds admit tangent plane approximation, so that (at least when the data are sufficiently concentrated near the point of tangency) approximate Euclidean methods have been employed to good effect. A wide-open research area, which will be a major focus of the SAMSI program, is the development of "intrinsic" methodologies, in which the statistical analysis is carried out inside the manifold itself, thereby avoiding distortion problems for manifold data that are not concentrated in a small area.
Strongly non-Euclidean Objects
Objects such as Tree and Graph Structured Data are "strongly non-Euclidean", because the data space admits no tangent plane approximation. Thus, there is no apparent approach to adapting even approximate Euclidean methodologies, and statistical analysis must be invented from the ground up. The first workable methodology of this type appears in Aydin et al. (2008). The field is still in its infancy, however, with large potential as a context for the development of new ideas. It will therefore be another focus of the SAMSI program.
Statistical and Mathematical Challenges
The mathematical areas involved in AOOD highlight the potential synergies that are possible through this SAMSI program. These include:
Potential Applications
The applications areas to be emphasized will depend upon the program participants themselves. The following list suggests potential areas of interest, but it remains to be revised or expanded as the program develops following the Opening Workshop.
Description of Activities
Workshops: The Opening Workshop will be held September 12-15, 2010 at the Radisson RTP. This workshop will aim to engage as large a part of the statistics, mathematics, and relevant scientific communities as possible, with representative sessions from all of the main program topics. The Transition Workshop at the end of the program will disseminate program results and chart a path for future research in the area. There will also be mid-program workshops focused on each of the three key research areas mentioned above.
Working Groups: Working groups will meet throughout the program to pursue particular research topics identified in the Opening Workshop (or subsequently chosen by the working group participants). The working groups consist of SAMSI visitors, postdoctoral fellows, graduate students, and local faculty and scientists.
Further Information
Additional information about the program and opportunities to participate in it is available:
Entire site © 2001-2010, Statistical and Applied Mathematical Sciences Institute. All Rights Reserved.