Bio: Next Generation Sequencing Errors

The "pipeline" is ubiquitous in bioinformatics research. Most pipelines take data and a suite of parameters (often left at their "defaults"); they often leak in somewhat mysterious ways; and they are well known to affect downstream analyses. Early parts of the pipeline might be controlled by "the company," by bioinformatics center technicians, or by computer scientists who can manage the massive data. Almost all of us are affected by a pipeline that we probably do not fully understand.

This working group will focus on pipelines for high-throughput sequence data. In particular, we identify several components common to most bioinformatics pipelines:
(1) base calling,
(2) error correction,
(3) alignment or assembly, and
(4) normalization.

Mathematical and statistical scientists have been involved to varying degrees in each of these components, and we could do more. Our preliminary plan is

(1) To learn about the most common pipeline components.
(2) To identify the components with the biggest effects on downstream analyses.
(3) To write a review on statistical concerns in pipelines: What you need to know.
(4) To develop specific research directions that increase the role of the mathematical sciences and researchers in the development of early bioinformatics pipelines.

The long-term goal is to increase the reproducibility of results by limiting or properly communicating the variability and biases introduced by the pipeline to downstream analyses. We welcome YOUR datasets, ideas, and suggestions.

Based on our discussions we are planning the following pipeline topics for our initial meetings:
1. Sequencing Technology (Rivera). We will focus on the Illumina platform, but will touch on the other surviving sequencing platforms.
2-3. Base-Calling (Cui). Base-calling methods for the Illumina platform.
4. Error Correction (Dorman). Error correction is not always part of the NGS pipeline, but it can be used to improve results when assembly is required.
5. Sequence Trimming (Murillo). Adapters and low-quality nucleotides are typically trimmed from the sequences.
6-?. Alignment (Olshen). If there is a reference genome, then alignment is a ubiquitous pipeline component.
Other potential topics: variant calling, assembly, integrating results of multiple pipelines.
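
As a rough illustration of the trimming step in topic 5, the sketch below removes an adapter and then a low-quality 3' tail from a single read. It is not the method of any particular tool: the function names, the exact-match adapter search, the Phred+33 quality encoding, the example adapter prefix, and the quality cutoff of 20 are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def phred33_scores(qual: str) -> List[int]:
    """Decode a quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in qual]


def trim_adapter(seq: str, qual: str, adapter: str) -> Tuple[str, str]:
    """Cut the read at the first exact occurrence of the adapter, if any.

    Real trimmers usually tolerate mismatches and partial adapters;
    exact matching is a simplification made here.
    """
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def trim_low_quality_tail(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Drop trailing bases whose quality score is below min_q."""
    scores = phred33_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def trim_read(seq: str, qual: str, adapter: str, min_q: int = 20) -> Tuple[str, str]:
    """Adapter trimming followed by 3' quality trimming."""
    seq, qual = trim_adapter(seq, qual, adapter)
    return trim_low_quality_tail(seq, qual, min_q)


# Hypothetical read: 8 genomic bases followed by an adapter prefix.
# 'I' decodes to score 40 and '#' to score 2 under Phred+33.
read = "ACGTACGT" + "AGATCGGAAG"
quality = "IIIIII##" + "IIIIIIIIII"
print(trim_read(read, quality, "AGATCGGAAG"))  # ('ACGTAC', 'IIIIII')
```

Even in this toy version, the cutoff `min_q` and the order of the two steps are parameters that change which bases reach alignment or assembly, which is the kind of pipeline effect the working group intends to examine.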

Bio: Next Generation Sequencing Errors

Monday, April 27, 2015
3:00 pm | Eastern Daylight Time (New York, GMT-04:00) | 2 hr

Join WebEx meeting:
https://samsi.webex.com

Meeting number: 681 212 715
Meeting password: SeqErrors1

Meeting Date: 
April 27, 2015 - 3:00pm - 5:00pm

Bio: Next Generation Sequencing Errors

Monday, April 13, 2015
3:00 pm | Eastern Daylight Time (New York, GMT-04:00) | 2 hr

Join WebEx meeting:
https://samsi.webex.com

Meeting number: 685 102 609
Meeting password: SeqErrors1

Meeting Date: 
April 13, 2015 - 3:00pm - 5:00pm

Bio: Next Generation Sequencing Errors

Monday, March 2, 2015
3:00 pm | Eastern Standard Time (New York, GMT-05:00) | 2 hr

Join WebEx meeting:
https://samsi.webex.com

Meeting number: 688 645 090
Meeting password: SeqErrors1

Meeting Date: 
March 2, 2015 - 3:00pm - 5:00pm

Integrated Base-Calling, Alignment, and Assembly, and Microassembly for Variant Detection

The talks presented by Dr. Bud Mishra and Dr. Giuseppe Narzisi demonstrate the advantages of a complete pipeline that integrates base-calling (TotalReCaller) with assembly (SUTTA) in a Bayesian manner. Some recent work on combining alignment and assembly (microassembly) for variant detection will also be discussed.

Meeting Date: 
March 2, 2015 - 3:00pm - 5:00pm

Bio: Next Generation Sequencing Errors

Monday, February 9, 2015
3:00 pm | Eastern Standard Time (New York, GMT-05:00) | 2 hr

Join WebEx meeting:
https://samsi.webex.com

Meeting number: 684 069 328
Meeting password: SeqErrors1

Meeting Date: 
February 9, 2015 - 3:00pm - 5:00pm

Base-calling review

Bio: Next Generation Sequencing Errors

We will discuss research progress on the comparison of base-calling methods.

Monday, January 26, 2015
4:00 pm | Eastern Standard Time (New York, GMT-05:00) | 1 hr

Please note: This week only, the meeting will start 1 hour later than the usual scheduled time.

Join WebEx meeting:
https://samsi.webex.com

Meeting number: 682 433 121
Meeting password: SeqErrors1

Meeting Date: 
January 26, 2015 - 4:00pm - 5:00pm