19 T.W. Alexander Drive
P.O.Box 14006
Research Triangle Park, NC 27709-4006
Tel: 919.685.9350
Fax: 919.685.9360

 

Poster Session
Multiplicity and Reproducibility Workshop
Monday, July 10

Song Zhang, M.D. Anderson Cancer Center

Title: A CAR-BART Model to Merge Two Datasets

Abstract: One problem frequently encountered by public health researchers and health planners is the absence of socioeconomic data in many widely used and routinely collected sources of health and disease information. A common way to address this problem is to supplement individual-level records with the socioeconomic profile of the immediate neighborhood of the individual's residence. In this study, we are interested in the relationship between self-perceived health status and income, but the two variables are available only in two different datasets. A Bayesian hierarchical model is built to merge the two datasets. We extend the Bayesian additive regression trees (BART) model by incorporating additional spatial effects. A simulation study and a real data analysis are presented.

 

Delong Liu and Kevin W. Gaido, CIIT

Title: Modeling effect of nucleotide compositions on expression levels of probes due to non-specific binding on Affychips

Abstract:
Background: Affy GeneChips (Affychips) have been widely used in genome research. Large signal intensity variation of genes with low expression levels across a probe set on an Affychip has raised concerns about quality control in analysis of gene expression data. One way to address this issue is to study the effect of nucleotide composition of a probe on non-specific binding measured from its background signal intensity (at 0 pM concentration of its target cRNA).

Results: We proposed a linear model to approximate the contribution of the nucleotide composition of a probe to its signal intensity due to non-specific binding. The model includes three sets of predictors: the contribution of each nucleotide on a probe; a contribution that depends linearly on the position of each nucleotide; and the contribution from each pair of adjacent nucleotides on a probe. We applied the linear model to the measured background signal intensities of 498 perfect match (PM) probes of 42 spike-in genes at 0 pM concentration of target cRNAs on Affy HG-U133 chips, and fitted the model using the Least Angle Regression (LARS) method. We then used the trained model to predict the background intensities of all the PM probes on the spike-in chip. The predicted high background intensities correlate with the observed high signal intensities for the PM probes clustered in one direction on the spike-in chips.

Conclusion: Our analysis of the 498 spike-in PM probes suggests that different nucleotide compositions within a probe set can lead to large variation in background intensities. The spatial positions of nucleotides on a probe should be considered when modeling non-specific binding on Affychips.
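
As an illustration only, the three predictor sets named in the Results might be encoded for a single probe roughly as follows. This is a minimal sketch and not the authors' implementation: the function name probe_features, the feature names, and the choice to represent the position term as the sum of 1-based positions divided by probe length are all assumptions, since the abstract does not specify the exact parameterization. The resulting feature rows could then be passed to any LARS implementation for fitting.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
# All 16 ordered pairs of adjacent nucleotides, e.g. "AA", "AC", ...
DINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def probe_features(seq):
    """Encode one probe sequence as a dict of predictors (hypothetical scheme).

    Three groups of features, mirroring the three predictor sets in the text:
      count_X : number of occurrences of nucleotide X
      pos_X   : sum of 1-based positions of X divided by probe length
                (an assumed form of the position-linear term)
      pair_XY : number of times X is immediately followed by Y
    """
    seq = seq.upper()
    if not seq or any(base not in NUCLEOTIDES for base in seq):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")

    n = len(seq)
    features = {}

    # Sets 1 and 2: per-nucleotide counts and position-weighted terms.
    for base in NUCLEOTIDES:
        positions = [i for i, c in enumerate(seq, start=1) if c == base]
        features[f"count_{base}"] = len(positions)
        features[f"pos_{base}"] = sum(positions) / n

    # Set 3: counts of adjacent nucleotide pairs.
    for pair in DINUCLEOTIDES:
        features[f"pair_{pair}"] = 0
    for i in range(n - 1):
        features[f"pair_{seq[i:i + 2]}"] += 1

    return features


if __name__ == "__main__":
    # Toy 5-mer for demonstration; real probes on these chips are longer.
    for name, value in probe_features("ACGTA").items():
        if value:
            print(name, value)
```

Under this encoding each probe yields 24 predictors (4 counts, 4 position terms, 16 pair counts). The count features always sum to the probe length, so they are collinear with an intercept; a fitting procedure would need to account for that.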

 

 

 

Entire site © 2002-2006, Statistical and Applied Mathematical Sciences Institute. All Rights Reserved.

This page updated on July 6, 2006 4:08 PM