SAMSI Deputy Director to Deliver Helen Barton Lecture Series at UNC-G

Dr. Sujit Ghosh, Deputy Director of the Statistical and Applied Mathematical Sciences Institute (SAMSI), has been invited by the University of North Carolina-Greensboro’s Department of Mathematics and Statistics to present a series of three lectures this fall as part of the Helen Barton Lecture Series in Mathematical Sciences.

The lecture series has been a fixture in the academic community since 2012, and the target audience for these talks is graduate and upper-level undergraduate students and faculty members. Dr. Ghosh is one of many distinguished mathematicians and statisticians who have been invited to speak in the series.

Ghosh’s three-part series, entitled “Statistical Inference Subject to Shape Constraint,” will take place on the UNC-G campus from Monday, November 14 through Wednesday, November 16.

Dr. Ghosh’s lectures will present an introductory overview of statistical inference methods for estimating density and regression functions that preserve a set of shape constraints. Some popular applications include:

  • utility functions, cost functions, and profit functions in economics
  • the analysis of growth rates as a function of various environmental factors
  • the study of dose-response curves in phase I clinical trials
  • the estimation of monotone hazard rates and mean residual life functions in reliability and survival analysis, and many more

In addition to theoretical results and applications, the lectures will also feature demonstrations of R software packages that can be used to compute various statistical summaries and produce graphics.
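
To make the idea of a shape constraint concrete, here is a minimal sketch of monotone (isotonic) regression using the pool-adjacent-violators approach. It is an illustration only, not material from the lectures; the function name `isotonic_fit` is hypothetical, and Python is used here purely for illustration even though the lectures feature R packages.

```python
def isotonic_fit(y):
    """Least-squares fit to y under a non-decreasing shape constraint.

    Illustrative pool-adjacent-violators sketch; assumes equal weights
    and observations already ordered by the covariate.
    """
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one's,
        # because that pair violates the monotonicity constraint.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)  # every point gets its block mean
    return fitted


if __name__ == "__main__":
    # A hypothetical dose-response series with one dip in the middle.
    print(isotonic_fit([1, 3, 2, 4]))
```

In this sketch the dip between the second and third observations is pooled into a single level, so the fitted curve never decreases.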

Ghosh has served as the Deputy Director at SAMSI since 2014. He has served as the Co-Director of Graduate Programs in Statistics at North Carolina State University, where he managed over 150 students annually from 2010 – 2013. Before serving in his current role at SAMSI, Ghosh served as the Program Director in the Division of Mathematical Sciences within the Directorate of Mathematical and Physical Sciences at the National Science Foundation from 2013 – 2014.

Prof. Ghosh has more than 20 years’ experience in conducting, researching and applying statistical analysis of biomedical and environmental information in a wide variety of capacities and subjects. In addition to these professional accomplishments, he has an extensive academic record, which includes giving over 125 invited lectures at seminars and national and international meetings; serving as a statistical investigator and consultant for over 40 research projects funded by private industry leaders and federal agencies; and publishing over 95 refereed journal articles in areas including the biomedical sciences, econometrics and the environmental sciences. Dr. Ghosh has also co-edited a popular book entitled “Generalized Linear Models: A Bayesian Perspective.”

To see more information on Dr. Ghosh’s lecture or other upcoming events visit the web page for the Helen Barton Lecture series.

IMSM 2016 Prepares Graduate Students for ‘Real World’ Research

The sun set on a hot July day across the street from North Carolina State University, signaling the end of another positive experience in research.

Nearly 40 graduate students from various science, applied mathematics and statistics backgrounds celebrated their accomplishments and experiences after attending the 2016 Industrial Modeling Workshop (IMSM) for Graduate Students in Raleigh, N.C., July 18-27.

This year marked the 22nd anniversary of the IMSM workshop, a major educational outreach component of the Statistical and Applied Mathematical Sciences Institute (SAMSI). Each year, SAMSI invites graduate students from across the country to attend a 10-day workshop, where various industrial and government agencies partner with academia to solve “real world” problems that impact our lives.

This year, SAMSI was pleased to have representatives from Sandia National Laboratories; Rho, Inc.; the US Army Corps of Engineers (USACE); the Environmental Protection Agency (EPA); Pfizer; and the Cooperative Institute for Climate and Satellites (CICS). The IMSM workshop is sponsored by SAMSI as well as the Department of Mathematics and the Center for Research in Scientific Computation (CRSC) at N.C. State University.

Graduate students were split into six teams and presented with six different projects from the various industry and lab partners. Subjects of these problems ranged from climate and health to environmental issues. Each team was guided by at least one industry mentor and one faculty mentor, who offered support and helpful hints to make sure the team could develop workable solutions within the allotted time frame.

The IMSM workshop introduces graduate students to the effective application of academic knowledge towards solving “real world” problems. Students also learned valuable skills about time management and team-based research in a time-constrained environment – a practice that is key to achieving results in industry and government labs. The group of students was dynamic, representing such disciplines as Geophysics, Engineering, Biology and of course Applied Mathematics and Statistics. The diversity of students played a pivotal role in helping the teams to develop synergy through their collective strengths and experience in order to reach a common goal. Most students were excited about the opportunity to attend and collectively looked forward to the challenges presented in the IMSM workshop. In the end, industry and lab partners as well as the students benefitted from the experience of producing research results that have the potential to advance “real world” applications.


One highlight of this year’s projects was a problem set directed at ways to identify elements of various allergens in order to develop therapies against food allergies. This important issue was posed by Rho, Inc.

Based on research from the Centers for Disease Control (CDC), food allergies are particularly prevalent in children ages 5 and above. Their prevalence increased by 18% from 1997 to 2007, and food allergies affect nearly 5% of adults and 8% of children. Eight foods account for 90% of all food allergy reactions: milk, eggs, peanuts, tree nuts, wheat, soy, fish and shellfish.

The students’ focus was nut allergies. Nut allergies make up more than 25% of the most common foods associated with severe allergic reactions. The research developed here could also be readily applied to the study of other food allergies. Allergies are caused by a person’s immune system overreacting to harmless proteins in food or the environment. One tool for analyzing these proteins is a peptide microarray. These microarrays help to identify the parts of certain proteins that trigger allergic reactions. Fragments of allergy-triggering proteins are arranged on small plates or “chips” and exposed to a patient’s blood. Antibodies from the patient’s immune system found in the blood will react with some of the fragments. These interactions can be detected by microscopes or scanning machines. The data from these experiments, however, tend to be “noisy” when researchers try to determine accurately which protein fragments react with the patient’s antibodies. The students’ aim was to identify a more effective way to clear up the noise in these samples, which in turn makes the researchers’ analysis more reliable.

Nut Allergies

The data from samples presented by Rho, Inc., had positive markers for a specific nut allergen. The students analyzed these samples and created an algorithm that could identify these patterns more quickly. The students identified the outliers in each sample, which helped to clear up the noisy data. Correctly identifying these outliers made the predictions from the data more reliable and accurate. Applying this approach identified 96% of the noise or “bad spots” on a microarray. By identifying these bad spots with a high degree of certainty, researchers gain a more effective tool for seeing which protein fragments are triggering allergies.
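
The article does not describe the students’ algorithm, so the following is only a hedged sketch of one common way to flag outlying spots: a robust z-score based on the median absolute deviation (MAD). The function name `flag_bad_spots`, the threshold of 3.5 and the example intensities are all assumptions made for illustration.

```python
import statistics


def flag_bad_spots(intensities, threshold=3.5):
    """Flag readings that sit far from the median of replicate spots.

    Illustrative MAD-based rule, not the method used at the workshop.
    Returns one boolean per reading; True marks a suspected bad spot.
    """
    center = statistics.median(intensities)
    deviations = [abs(x - center) for x in intensities]
    mad = statistics.median(deviations)
    if mad == 0:
        # At least half the readings equal the median; nothing can be scaled.
        return [False] * len(intensities)
    # 0.6745 rescales the MAD so scores are comparable to standard deviations
    # under an assumed normal distribution.
    return [0.6745 * d / mad > threshold for d in deviations]


if __name__ == "__main__":
    # Hypothetical replicate intensities for a single peptide.
    readings = [10, 11, 9, 10, 12, 95]
    print(flag_bad_spots(readings))
```

A median-based rule is used in this sketch because a single extreme reading barely moves the median, whereas it can inflate a mean and standard deviation enough to hide itself.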

Though this algorithm was a big breakthrough, much research still needs to be done. The students’ assistance was a positive step forward on this problem. With these new findings, Rho, Inc., can now apply some of the same techniques to its ongoing research on this problem. It is work like this that further justifies bringing great minds together to tackle some of life’s puzzles and help us all live with fewer problems.

USACE presented two problems: one on habitat quality assessments in the Columbia River and the second on using surface wave properties to predict nearshore bathymetry. Bathymetry is a measurement of submarine topography and can be used to indicate changes in the ocean floor. This nearshore analysis could prove vital for predicting damage to coastal environments due to major storms or significant erosion. Storm surge and erosion also negatively impact transportation routes and civil infrastructure. Collectively, these factors would hinder the efforts of support agencies to assist the civilian populace with critical needs in an emergency.

The group used USACE data from Duck, N.C., compiled from various resources to determine coastal depths within 500 m of the coastline. This distance is crucial when large vessels are providing logistical aid support. Support agencies want to ensure adequate water depth to keep these large vessels from running aground in poor conditions. The data could also help in understanding the various impacts of erosion on coastal structures and transportation routes. Studies like this have been used in other situations as well, such as saving the historic lighthouse at Cape Hatteras.

Accurate measurements of bathymetry in nearshore regions using conventional means are difficult to obtain.  Direct measurements are costly and sparse, and the underlying topography is constantly changing.

Currently, obtaining accurate data related to this research requires many man hours and often costly equipment. The students used USACE data on wave height, wave number and ocean depth to understand how information on the wave mechanics can be used to generate a map of the underlying bathymetry. They used mathematical representations of the connections between measurable wave properties and bathymetry to develop a statistical algorithm for estimating the water depths along a one-dimensional profile.


The students used data provided by remote sensing platforms, compiled from airborne, satellite and onshore sensors. They studied the dispersion relationship connecting water depth to surface properties, including wavelength and period, and discovered that using these factors as inputs provided a relatively accurate estimate of the bathymetry.
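
As a rough illustration of how a dispersion relationship ties depth to surface measurements, the sketch below assumes the standard linear wave theory relation, omega^2 = g k tanh(k h), where omega is the angular frequency, k the wave number and h the water depth. The article does not say which form the students used, and the function names here are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def wave_period(wavelength_m, depth_m):
    """Period (s) of a wave of given length over a given depth, linear theory."""
    k = 2 * math.pi / wavelength_m
    omega = math.sqrt(G * k * math.tanh(k * depth_m))
    return 2 * math.pi / omega


def depth_from_wave(wavelength_m, period_s):
    """Invert omega^2 = g k tanh(k h) for the depth h in metres.

    Returns None when the wave is effectively in deep water, where the
    surface signal no longer carries information about the bottom.
    """
    k = 2 * math.pi / wavelength_m
    omega = 2 * math.pi / period_s
    ratio = omega ** 2 / (G * k)  # equals tanh(k h) under the assumed relation
    if ratio >= 1.0:
        return None
    return math.atanh(ratio) / k


if __name__ == "__main__":
    # Round trip on made-up numbers: a 60 m wave over 5 m of water.
    period = wave_period(60.0, 5.0)
    print(period, depth_from_wave(60.0, period))
```

The deep-water check reflects a limitation of this kind of inversion: once the depth is large relative to the wavelength, tanh(k h) is close to 1 and small measurement errors produce large depth errors.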

Using three different inversion methods, the students accurately determined ocean floor topography up to 900 m away from shore. The students found that using multiple measurement types helped to reduce the amount of “noise” in a given variable. In addition, the students determined which inversion method was the best algorithm for estimating the bathymetry accurately.

Though the group was successful in finding a solution, more work is still needed. The researchers suggested further refinement of their selected inversion method to account for more parameters, such as beach profile, and more access to wave number profiles throughout a given year. These factors could help to isolate trends in the shifting of the ocean floor, which could make mitigation efforts easier.

The group’s final recommendation was to apply this information to a higher fidelity model in order to assess bathymetry in multiple dimensions. The USACE industry mentor looked upon the results favorably. The students’ findings have the potential for numerous applications in keeping with the USACE mission at home and abroad.

Dining Out


Overall, the consensus of the graduate students was that this workshop was helpful in preparing them for their future contributions in research. The IMSM is a valuable tool for industry as well. Industries actively seek qualified up-and-coming researchers by taking part in workshops like this, and the research produced also has the potential to advance their own work. As the workshop closed, the students spent their last night dining together and reflecting on the experiences they shared over the previous week and a half with peers and with faculty and industry mentors in the program.


Planning and scheduling by SAMSI has begun for the 2017 IMSM; applications for the workshop next year will be accepted in January. To find out more and apply, interested graduate students should visit the SAMSI website at:

SAMSI Poised to Help Hone Gravitational Wave Astronomy, Astronomers’ New Sense

February 24, 2016

(Written by the ASTRO program organizing committee)

A long time ago in a galaxy far, far away, two large black holes, each with a mass of about 30 suns, reached the end of an aeons-long orbital dance. In the final second of their separate existence, they spiraled toward each other, whirling with a frequency that quickly rose from tens to hundreds of cycles per second. At last they touched, then violently merged in the space of about twenty milliseconds, producing a single black hole that quickly settled down to a bloated, lone existence. Had a video camera been present in the vicinity, it would likely have seen little; black holes are black, after all, regions where gravity is so strong that not even light can escape. Yet during that final merger, the power emitted by this event was larger than all of the power being emitted in light by all of the stars in all of the galaxies in the observable universe. The merger shone, not in electromagnetic waves, but in gravitational waves. The black hole binary’s dance continually sloshed the fabric of space and time in its vicinity, sending out waves carrying news of the invisible event as fluctuations in the spatial separations of objects, and in the flow of time. The waves began as gentle ripples during the long inspiral, steadily climbing in frequency and amplitude; they roiled and crashed during the merger; and finally, they decayed away like the ring of a bell. They followed paths outward from the merger in all directions at the speed of light, diminishing in amplitude but maintaining their shape, an encoding of the story of the merger in the dynamics of spacetime. After a billion-year journey, the waves reached Earth.

This is not the start of a science fiction tale. An international team of over a thousand scientists has observed this merger, the culmination of over four decades of effort sponsored by the National Science Foundation (NSF) and international sources. And NSF’s Statistical and Applied Mathematical Sciences Institute (SAMSI) will soon help astronomers to take the next steps to make the most of this and future gravitational wave discoveries.


Image Credit: SXS, the Simulating eXtreme Spacetimes (SXS) project

Gravitational waves and LIGO

In 1916, Einstein realized that the theory of gravity he had proposed a year before (general relativity, a revolutionary reframing of gravitational interaction, not as the consequence of long-range forces, but rather as a consequence of curvature of spacetime) implied the existence of a new type of radiation, gravitational waves. But the theory revealed space to be incredibly stiff, so resistant to changes in curvature that even violent motions of large masses would produce what seemed to be immeasurably small waves. By the late 1970s, scientists in the U.S. and Europe had converged on a vision for how to make the immeasurable measurable. The Laser Interferometer Gravitational-Wave Observatory (LIGO) is the realization of this vision.



LIGO’s time series data from the binary black hole merger event GW150914; see the LIGO project page, Gravitational Waves, As Einstein Predicted, for details. Image Credit: Caltech/MIT/LIGO Lab

On September 14, 2015, the waves from that distant merger met LIGO and produced a signal. Months of analysis by many dozens of scientists confirmed its reality, and enabled detailed measurement of the properties of the merging black holes, and of the final hole. The LIGO project announced the discovery to the world on February 11, 2016, dubbing the event GW150914. The discovery marked the confirmation of Einstein’s century-old prediction. But more than that, it marked the opening of a new sense with which astronomers could examine the universe.


SAMSI is poised to help astronomers hone their new sense. In November 2014, SAMSI sought input from the astronomical community for a year-long program that would gather astronomers, statisticians, and applied mathematicians to address challenging interdisciplinary problems in astronomy. Led by statistician G. Jogesh Babu (Penn State University), a team of scientists identified a set of timely and compelling research directions, under the overarching and overlapping themes of time-domain astronomy and survey-based astronomy. With renovations to LIGO nearing completion, gravitational wave data analysis was quickly identified as a focus area, along with exoplanets (which are detected via time series measurements), synoptic surveys (an emerging mode of large-scale automated time-domain observing), and cosmology. In September 2015, scientists gathered at SAMSI to plan the 2016-17 Program on Statistical, Mathematical and Computational Methods for Astronomy (ASTRO). The planning team included LIGO scientists who had only just learned of the candidate detection, and had to keep it secret until confirmed.

Of five working groups planned for the ASTRO program, four will address LIGO data analysis challenges, in concert with related challenges in other areas of time-domain astronomy (a fifth working group will focus on statistical problems in cosmology). One working group will study the potential role of new stochastic process models for analysis of time series data from LIGO and exoplanet surveys, particularly models that abandon the simplifying assumptions of stationarity and Gaussianity underlying most currently-used methods. Another working group will focus on gravitational wave and exoplanet signal detection, and how best to use detected signals for demographic studies (for example, to infer the prevalence and diversity of binary black hole systems, and other sources of LIGO signals). A third working group will address the data-theory interface in the regime of computationally expensive theoretical calculations, where it is impossible to directly compute detailed predictions for every candidate model for the data. Numerical general relativity calculations of binary black hole mergers are a motivating example; similar challenges arise in cosmology.

Finally, a working group on synoptic time-domain surveys will address how to find electromagnetic counterparts to gravitational wave sources. Black hole binary mergers, by their very nature, are essentially invisible electromagnetically. But astronomers expect LIGO to detect other types of events that synoptic surveys could capture electromagnetically, providing opportunities for synergistic multimessenger astronomy. These include such exotic phenomena as merging binary neutron stars, and mergers between black holes and ordinary stars, neutron stars, or white dwarf stars. In addition, gigantic stellar explosions, such as those producing supernovae or gamma-ray bursts, may produce detectable gravitational waves. In a tantalizing twist of fate, astronomers have observed all of these types of objects, and presumed that the first LIGO events would come from such already-known systems. Instead, the first LIGO signal was from a type of system hitherto undetected. What other surprises might this new ear on the sky reveal to us?

SAMSI and Astronomy

The ASTRO program is just the latest of several productive programs SAMSI has hosted to build interdisciplinary partnerships between astronomers, statisticians, and mathematicians. The first such program was the 2006 Spring Program on Astrostatistics (also led by Babu). It, too, included working groups addressing problems in gravitational wave and exoplanet astronomy. Many participants built long-lived collaborations at SAMSI; several are helping to organize the forthcoming ASTRO program. SAMSI’s 2012-13 Program on Statistical and Computational Methodology for Massive Datasets included a week-long Workshop on Astrostatistics, organized by Babu, exploring the intersection of astronomy and “big data.” In the summer of 2013, exoplanet astronomer Eric Ford (Penn State University) led a three-week program, Modern Statistical and Computational Methods for Analysis of Kepler Data. It spawned an independent ExoStats2014 workshop, and one of that program’s working groups continues to meet two and a half years later. Finally, the ASTRO program’s working group on inference with computationally expensive models will build on expertise gained from the 2006-07 Program on Development, Assessment and Utilization of Complex Computer Models, and the 2011-12 Programs on Uncertainty Quantification; participants from both of those programs are on the ASTRO planning team.

More information can be found at: Contact directorate liaison: Sujit Ghosh at

Ghosh Receives Honorary Degree from Thammasat University

February 2, 2016

SAMSI’s Deputy Director and Professor of Statistics at North Carolina State University (NCSU), Sujit Ghosh, received an honorary doctoral degree in statistics from Thammasat University (TU) in Thailand.

This is one of the highest forms of recognition a university can offer. Thammasat University primarily gives honorary doctorates to people from Thailand.

Ghosh has been visiting the Department of Mathematics and Statistics at TU since the summer of 2005. “I have offered several short courses (e.g., Bayesian methods, Monte Carlo Statistics, Spatial Statistics, etc.) which have now been incorporated into their doctoral curriculum,” said Ghosh.

In addition to graduate students, the courses were attended by local faculty from TU, who are now trained to offer such courses on their own.

Ghosh also co-supervised at least 4 doctoral students from TU who initially attended his courses and then worked with him on completing their doctoral dissertations. Three of them visited him at NCSU during the last six months of their doctoral programs to complete their theses. All of them are currently serving as lecturers at renowned universities in Thailand.

“I am truly honored to receive this recognition from Thammasat University. I hope to continue our wonderful relationship,” said Ghosh.

The graduation ceremony took place on November 16, 2015.

Emergency Department Simulator Uses Analytics to Help Administrators Make Data-Driven Decisions

July 25, 2014

Emergency departments (EDs) are under growing pressure; while the number of ED visits has sharply increased, the number of EDs serving this need has actually decreased. According to a report from Rand Corporation, ED doctors are increasingly becoming the decision-makers regarding hospital admissions. Today, nearly half of all non-obstetrical hospital admissions occur through the ED. With the adoption of the Affordable Care Act, it is expected the number of ED visits will continue to rise. ED staffs are, therefore, looking for ways to make effective decisions to make their departments more efficient.

A group of researchers from the University of Florida and the Statistical and Applied Mathematical Sciences Institute (SAMSI) has created an online simulator to help hospital ED administrators understand how analytics and simulation can be used to inform decisions in the ED. In particular, the simulator reveals how various factors or decisions affect the flow of patients through the ED. The group includes Kenneth Lopiano, SAMSI; Joshua Hurwitz, Jo Ann Lee, Scott McKinley and James Keesling, University of Florida Department of Mathematics; and Joseph Tyndall, University of Florida Department of Emergency Medicine.

The simulator is freely available on the web. On the website, doctors or administrators can change several different variables to best mimic the conditions in their particular ED. For example, one can change the number of beds, the number of doctors, the number of nurses for various hours of the day, or the number of patients entering the ED at different times of the day.

Lopiano, who was a postdoctoral fellow at SAMSI during this past year’s Data-Driven Decisions in Healthcare research program, learned about the power of simulation in healthcare through SAMSI-sponsored working groups. It was during a visit to his alma mater, the University of Florida, to discuss his SAMSI experiences that Lopiano learned of lead author Joshua Hurwitz’s efforts. There Lopiano connected with former SAMSI postdoctoral fellow and assistant professor Scott McKinley, who introduced Lopiano to Hurwitz. Realizing their common research interests, they formed the core research group, which ultimately led to the online simulator, principally developed by Lopiano and Hurwitz. The online simulator has seen substantial increases in traffic since the publication of their research paper in BMC Medical Informatics and Decision Making.

The simulator recognizes that the causes of ED crowding are variable and require site-specific solutions. For example, in a nationally average ED, provider availability can cause bottlenecks in patient flow while investments in other resources may not have the positive impact an administrator would expect. Further, the simulator recognizes that by reallocating resources and creating alternate care pathways, some EDs can dramatically expedite care for lower acuity patients without delaying care for higher acuity patients.

Lopiano, co-founder and principal collaborator of Roundtable Analytics, a healthcare analytics company based in Raleigh, North Carolina, said, “A simulator is very effective because it is risky for health systems to implement overhauls in their care-delivery systems. By using a simulator, administrators are able to evaluate many different scenarios without making these costly and time-consuming changes. Most importantly, administrators can understand the consequences of operational decisions, both intended and unintended.”

The paper published in BMC Medical Informatics and Decision Making is available at: Kenneth Lopiano may be contacted at


The Statistical and Applied Mathematical Sciences Institute (SAMSI) is one of eight mathematical institutes funded by the NSF’s Division of Mathematical Sciences, but is the only one that focuses on statistics and applied mathematics. Its mission is to forge a new synthesis of the statistical and applied mathematical sciences with disciplinary sciences to confront important data- and model-driven scientific challenges. It is based in Research Triangle Park, North Carolina. SAMSI was founded in 2002.

SAMSI is a partnership of the National Science Foundation with a consortium of Duke University, North Carolina State University, the University of North Carolina at Chapel Hill, and the National Institute of Statistical Sciences. You can find more information at, @NISSSAMSI.

SAMSI Appoints New Directorate Members

June 2, 2014

The Statistical and Applied Mathematical Sciences Institute (SAMSI) is pleased to announce the appointments of three new members of the Directorate.

Sujit Ghosh, Professor of Statistics at NC State University (NCSU) and currently a Program Director in the NSF Division of Mathematical Sciences, will become Deputy Director of SAMSI beginning September 8, 2014. Sujit’s research interests are in the area of Bayesian statistical methods for analyzing biomedical, econometric and environmental models. Ghosh previously participated in several SAMSI programs, including as Faculty Fellow representing NCSU in the 2011/12 program on Uncertainty Quantification. Ghosh received his Ph.D. in statistics from the University of Connecticut in 1996 and is actively involved in teaching, supervising and mentoring graduate students at the doctoral and master’s levels. He has supervised over 30 doctoral students and 3 postdoctoral fellows, and he has also served as a statistical investigator and consultant for over 40 different research projects funded by various leading private industries and federal agencies. In addition to his time at NCSU, he has been a visiting professor at Thammasat University in Thailand, Bocconi University in Italy, Middle East Technical University in Turkey, the Technical University of Crete in Greece and the National University of Singapore. He is an elected fellow of the American Statistical Association and the recipient of the 2008 IISA Young Investigator Award. He was also elected President of the NC Chapter of the ASA in 2013, served as the Co-Director of Graduate Programs in Statistics at NCSU, managing over 150 students annually during 2010-2013, and served as the Project Director of a training program for undergraduates funded by the NSF during 2007-2013.

“Sujit brings to SAMSI a mature understanding of SAMSI’s research mission, as well as administrative and grant management experience which will be invaluable as we plan for our next funding cycle,” noted Richard Smith, Director of SAMSI.

Thomas Witelski, Professor of Mathematics at Duke University, specializing in nonlinear partial differential equations and fluid dynamics, will become Associate Director of SAMSI for a three-year term beginning July 1, 2014. His expertise will be valuable on the applied mathematics side of SAMSI’s activities, and he will also act as SAMSI’s liaison with Duke University during this period. Witelski received his Ph.D. in Applied Mathematics from California Institute of Technology in 1995. Before working at Duke, he was an NSF Postdoctoral Fellow and an Applied Mathematics Instructor at the Massachusetts Institute of Technology (MIT). He is a member of the Society for Industrial and Applied Mathematics, the American Mathematical Society and Tau Beta Pi. He is also the co-Editor-in-Chief of the Journal of Engineering Mathematics and a Division Editor of the Journal of Mathematical Analysis and Applications. He also serves on the editorial boards of the European Journal of Applied Mathematics and Discrete and Continuous Dynamical Systems Series B.

Ghosh and Witelski will replace Snehalata Huzurbazar from the University of Wyoming, whose term as Deputy Director ends June 30, 2014, and Ezra Miller from Duke University, whose term as Associate Director also ends June 30, 2014. To fill the gap between Snehalata and Sujit, SAMSI is delighted to welcome back Pierre Gremaud, Professor of Mathematics at NCSU, as Interim Deputy Director for July and August, 2014. During this period, Pierre will be primarily responsible for the education and outreach side of SAMSI’s activities. Pierre previously served as Associate Director of SAMSI from July 2008 through December 2009, and as Deputy Director from January 2009 through June 2012.



Researchers Help Boston Marathon Organizers Plan for 2014 Race

April 14, 2014

After experiencing a tragic and truncated end to the 2013 Boston Marathon, race organizers were faced not only with grief but with hundreds of administrative decisions, including plans for the 2014 race – an event beloved by Bostonians and people around the world.

One of the issues they faced was what to do about the nearly 6,000 runners who were unable to complete the 2013 race. The Boston Athletic Association, the event’s organizers, quickly pledged to provide official finish times for these runners. Thinking ahead, they also had to consider how to provide these runners with an opportunity to qualify for the 2014 race.

To seek advice on these issues, they contacted Richard Smith, a statistician and marathon runner at the University of North Carolina at Chapel Hill, and director of the Statistical and Applied Mathematical Sciences Institute (SAMSI) based in Research Triangle Park, N.C. They asked Smith to come up with a statistical procedure for predicting each runner’s likely finish time based on their pace up to the last checkpoint before they had to stop.

“Once I got their email,” said Smith, “of course I knew I had to help them.” Smith already knew the organizers from a previous occasion when he provided advice related to the event’s qualifying times.

Smith quickly assembled a team of fellow analysts that included Francesca Dominici and Giovanni Parmigiani at Harvard School of Public Health, and Dorit Hammerling, postdoctoral fellow at SAMSI, who were in the 2013 race and finished uninjured. The team also included Matthew Cefalu, Harvard School of Public Health; Jessi Cisewski, Carnegie Mellon University; and Charles Paulson, Puffinware LLC.

The results, and the method the researchers developed, were published in the April 11 edition of PLOS ONE.

With the help of the Boston Athletic Association, the researchers created a dataset consisting of all the runners in the 2013 race who reached the halfway point but failed to finish, and all the runners from the 2010 and 2011 Boston Marathons. The data consist of “split times” from each of the 5 km sections of the course (from the start up to 40 km), and the final 2.2 km. The research team was tasked with predicting the missing split times for the runners who failed to finish in 2013.

The researchers adapted techniques used in such contexts as computing missing data in DNA microarray experiments and estimating ratings that Netflix subscribers would have given to movies they had not seen. They proposed five prediction methods and created a validation dataset to measure the methods’ performance by mean squared error and other measures. Of the five, the method that worked best used local regression based on a K-nearest-neighbors algorithm (KNN method), though several other methods produced results of similar quality.

The KNN method looks at each of the runners who did not complete the race (DNF) and finds a set of comparison runners who finished the race in 2010 and 2011 whose split times were similar to the DNF runner up to the point where he or she left the race. These runners are called “nearest neighbors.”

“We had to come up with a method to compare the runners based on the split points up to a certain point of the race and then had to decide how many of the nearest neighbors to examine in order to develop a prediction for the DNF runner that would be based on the different finishing times of these nearest neighbors,” said Smith, who has run the Boston Marathon in the past and will run this year’s race. “We decided to choose 200 nearest neighbors. We also tried 100 and 300 nearest neighbors, but the results changed only slightly and didn’t make them better.”
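
The nearest-neighbor idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' published procedure: the function name knn_predict_finish, the Euclidean distance on split times, and the plain average of the neighbors' finish times are all choices made here for simplicity (the article says the best-performing method used local regression on the neighbors), and the reference data are synthetic.

```python
import numpy as np


def knn_predict_finish(dnf_splits, ref_splits, ref_finish, k=200):
    """Predict a finish time from the k reference runners with the most similar splits.

    dnf_splits : splits (minutes) the runner recorded before leaving the race.
    ref_splits : one row per finisher from earlier races, all splits (minutes).
    ref_finish : total finish times (minutes) of those finishers.
    """
    dnf_splits = np.asarray(dnf_splits, dtype=float)
    ref_splits = np.asarray(ref_splits, dtype=float)
    ref_finish = np.asarray(ref_finish, dtype=float)

    m = dnf_splits.size                  # number of splits actually recorded
    k = min(k, ref_splits.shape[0])      # cannot use more neighbors than exist

    # Compare runners only on the splits recorded before the DNF runner stopped.
    # Assumption: Euclidean distance; the article does not name the metric.
    dists = np.linalg.norm(ref_splits[:, :m] - dnf_splits, axis=1)
    nearest = np.argsort(dists)[:k]

    # Assumption: a simple mean of the neighbors' finish times stands in for
    # the local regression mentioned in the article.
    return float(ref_finish[nearest].mean())


# Synthetic reference data: eight 5 km sections plus the final 2.2 km.
rng = np.random.default_rng(0)
n_ref = 1000
seg_km = np.array([5.0] * 8 + [2.2])
pace = rng.uniform(20.0, 35.0, size=n_ref)           # minutes per 5 km
splits = pace[:, None] * seg_km / 5.0 + rng.normal(0.0, 0.5, size=(n_ref, 9))
finish = splits.sum(axis=1)

# A hypothetical runner who stopped after 30 km at a steady 25 minutes per 5 km.
stopped_runner = np.full(6, 25.0)
predicted = knn_predict_finish(stopped_runner, splits, finish, k=200)
print(f"Predicted finish time: {predicted:.1f} minutes")
```

Rerunning the last call with k=100 or k=300 is a quick way to check, as the team did, how sensitive the prediction is to the number of neighbors.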

The Boston Athletic Association decided to grant entry to the 2014 race to anyone who was stopped from completing the 2013 event, so they will have a chance to complete the Boston Marathon after all. But in the course of developing the method, Smith and his colleagues realized there were other uses for the technique.

“We have found that using the KNN method looking at a runner’s intermediate split-time will also be useful in predicting the person’s completion time while the race is in progress,” said Smith. “This can be helpful for relatives and friends to be able to meet the person at the finish line.”

Link to the paper:

From UNC News Services

Researchers Receive IJERPH Best Paper Award 2014

April 9, 2014

What are the human health implications of climate change? There is by now a well-established body of evidence about the direct effects of increasing temperature, for example, heat stroke. But is that the full story? It is also possible that air pollution patterns may change as a result of the changing climate, especially ozone, whose production is stimulated by hot weather. In work started at the Statistical and Applied Mathematical Sciences Institute (SAMSI) and later completed with colleagues at North Carolina State University, Howard Chang studied the effect of simultaneous changes in temperature and ozone, using simulations from climate models. Rather than run the model multiple times under different scenarios (a very time-consuming process), Chang and his colleagues devised a statistical approach that saves computation time and also allows them to estimate the uncertainty in their projections. As a result, they find significant increases in projected mortality in the southeastern U.S. during the period 2041-2050 compared with 2000 levels.

The resulting paper, written by Chang; Jingwen Zhou, North Carolina State University (NCSU); and Montserrat Fuentes, NCSU, was awarded the International Journal of Environmental Research and Public Health (IJERPH) Best Paper Award 2014. Their paper, “Impact of Climate Change on Ambient Ozone Level and Mortality in Southeastern United States,” received the 3rd prize in the category “Articles.”

On an annual basis, the IJERPH Best Paper Award recognizes outstanding papers in the area of environmental health sciences and public health that meet the aims, scope and high standards of the IJERPH journal.

Article link:
Award link:



Air Pollution, Climate Change and Their Effect on Human Health Focus of Simons Public Lecture

April 9, 2013

As our climate changes, there are important impacts to consider that may affect human health. It is important to understand how various populations adapt to changes in the environment and who is most vulnerable to these changes. As part of the Simons Public Lecture Series celebrating MPE2013 (Mathematics of Planet Earth 2013), SAMSI is hosting Dr. Francesca Dominici, Professor of Biostatistics in the Harvard School of Public Health and Associate Dean of Information Technology, in a public lecture to be held at the UNC Friday Center in Chapel Hill on Wednesday, April 24 at 7 p.m. Her talk, “The Public Health Impact of Air Pollution and Climate Change,” is sponsored by the Simons Foundation. You must register for this event at:

Linda S. Birnbaum, Ph.D., director of the National Institute of Environmental Health Sciences (NIEHS) of the National Institutes of Health, will introduce Dr. Dominici. Dr. Birnbaum is a board certified toxicologist, and has served as a federal scientist for nearly 33 years. Prior to NIEHS, she was with the Environmental Protection Agency (EPA), where she directed the largest division focusing on environmental health research.

This talk will review statistical modeling approaches and epidemiological evidence regarding the public health impact of air pollution and extreme heat under a changing climate. Dominici will draw upon massive, heterogeneous, and nationally representative data on weather, air pollution, health outcomes, and socioeconomic and demographic variables to: 1) quantify health risks based on historical data; 2) predict future risks under different scenarios for climate change; and 3) quantify the many sources of uncertainty in these predictions. She will also address a key challenge: how individuals and their communities will adapt to increasing temperatures and ambient air pollution levels. The health effects of combined exposure to degraded air quality and heat could be more severe than expected based on the individual exposures.

The World Health Organization has estimated that fine particulate matter (PM2.5) contributes to approximately 800,000 premature deaths per year, ranking it as the 13th leading cause of worldwide mortality. Over the next century, climate change is expected to lead to an increase in global average temperature by more than 2 degrees F and to an increase in the intensity, frequency, and duration of extreme weather events such as heat waves. It is important to understand and quantify the health risks associated with these anticipated changes. Estimating the health impact of climate change cannot be done without a comprehensive understanding of how populations will adapt to these changes and which of these populations are most vulnerable.

Dominici received her Ph.D. in Statistics at the University of Padua in 1997 and was also a visiting graduate student at Duke University. From 1997 to 2009 she was at the Bloomberg School of Public Health at Johns Hopkins University and in 2009 moved to the School of Public Health at Harvard University.

Dominici’s research has focused on the development of statistical methods for the integration of large data to assess and monitor health risks associated with air pollution and climate change. She has developed statistical methods for the analysis of large databases on air pollution and health. She has extensive experience with the analysis of Medicare data and their linkage by geography and time to other data sources, such as air pollution, weather, and socioeconomic status. She has developed statistical methods for the adjustment of measured and unmeasured confounders, Bayesian hierarchical models, causal inference methods, and missing data methods.

Dominici was the recipient of the first Walter A. Rosenblith Young Investigator Award from the Health Effects Institute, Boston, MA; the Diversity Recognition Award from Johns Hopkins University, 2009; the Myrto Lefkopoulou Distinguished Lectureship Award from the Department of Biostatistics, School of Public Health, Harvard University, 2007; and the Mortimer Spiegelman Award from the Statistics Section of the American Public Health Association, 2006. Dominici is also a Fellow of the American Statistical Association.

Dominici’s lecture is a part of the Simons Foundation sponsored lecture series and is being coordinated by the Statistical and Applied Mathematical Sciences Institute (SAMSI). The MPE2013 Simons Public Lecture Series is taking place in nine locations around the world. Each lecture features a leading expert explaining how the mathematical sciences play a significant role in understanding and solving some of Planet Earth’s important problems. Over 100 scientific societies, universities, research institutes and organizations have banded together for MPE2013, including SAMSI. It is also part of the 2013 International Year of Statistics events.


About SAMSI

SAMSI is a nonprofit organization made up of the three major universities in the Research Triangle region, the National Institute of Statistical Sciences in collaboration with the National Science Foundation (NSF), and the William R. Kenan, Jr. Institute for Engineering, Technology and Science. SAMSI is located in Research Triangle Park and focuses on statistical sciences and applied mathematics.

About Simons Foundation

The Simons Foundation is a private foundation based in New York City, incorporated in 1994 by Jim and Marilyn Simons. The Simons Foundation’s mission is to advance the frontiers of research in mathematics and the basic sciences.

About MPE2013

Mathematics of Planet Earth 2013 is an initiative of over 120 scientific societies, research institutes, universities, and organizations all over the world. The mission of the project is to encourage research in identifying and solving fundamental questions about planet Earth, encourage educators at all levels to communicate the issues related to planet Earth, inform the public about the essential role of the mathematical sciences in facing the challenges to our planet, and encourage young people interested in sustainability and global issues to consider mathematics as an exciting career choice. The Simons Foundation is a major supporter of MPE2013 and the sponsor of the international “MPE2013 Simons Public Lecture Series” at nine locations throughout the world in 2013.

About 2013 International Year of Statistics

The International Year of Statistics (“Statistics2013”) is a worldwide celebration and recognition of the contributions of statistical science. Over 1,700 organizations are participating in this endeavor.

New Media