Bio: Next Generation Sequencing Errors
The "pipeline" is ubiquitous in bioinformatics research. Most pipelines take data and a suite of parameters (often left at their "defaults"); they often leak in somewhat mysterious ways; and they are well known to affect downstream analyses. Early parts of the pipeline might be controlled by "the company," by bioinformatics center technicians, or by computer scientists who can manage the massive data. Almost all of us are affected by a pipeline that we probably do not fully understand.
This working group will focus on pipelines for high throughput sequence data. In particular, we identify several components common in most bioinformatics pipelines:
(1) base calling,
(2) error correction,
(3) alignment or assembly, and
(4) normalization.
Mathematical and statistical scientists have been involved to varying degrees in each of these components, and we could do more. Our preliminary plan is
(1) To learn about the most common pipeline components.
(2) To identify the components with the biggest effects on downstream analyses.
(3) To write a review on statistical concerns in pipelines: What you need to know.
(4) To develop specific research directions that increase the role of the mathematical sciences and researchers in the development of early bioinformatics pipelines.
The long-term goal is to increase the reproducibility of results by limiting or properly communicating the variability and biases introduced by the pipeline to downstream analyses. We welcome YOUR datasets, ideas, and suggestions.
Based on our discussions we are planning the following pipeline topics for our initial meetings:
1. Sequencing Technology (Rivera). We will focus on the Illumina platform, but will touch on the other surviving sequencing platforms.
2-3. Base-Calling (Cui). Base-calling methods for the Illumina platform.
4. Error Correction (Dorman). Error correction is not always part of the NGS pipeline, but it can be used to improve results when assembly is required.
5. Sequence Trimming (Murillo). Adapters and low quality nucleotides are typically trimmed from the sequences.
6-?. Alignment (Olshen). If there is a reference genome, then alignment is a ubiquitous pipeline component.
Other potential topics: variant calling, assembly, and integrating the results of multiple pipelines.
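To make the trimming topic (item 5) concrete, here is a minimal Python sketch of adapter and low-quality trimming for a single read. It is an illustration under stated assumptions, not the method of any particular tool: the function names are hypothetical, the adapter string is only an example, qualities are assumed to be Phred+33 encoded, and the adapter is matched exactly (real trimmers usually tolerate mismatches and partial adapters). The only fixed fact used is the Phred definition, Q = -10 log10(p).

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_qualities(quality_string, offset=33):
    """Decode an ASCII quality string into integer Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if present."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def trim_low_quality_3prime(seq, qual, min_q=20, offset=33):
    """Remove bases from the 3' end until a base with quality >= min_q is reached."""
    scores = decode_qualities(qual, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Hypothetical read: 8 genomic bases followed by an example adapter string.
example_adapter = "AGATCGGAAGAGC"      # illustrative value, an assumption
read_seq = "ACGTACGT" + example_adapter
read_qual = "IIIIII##" + "I" * len(example_adapter)  # 'I' = Q40, '#' = Q2

seq, qual = trim_adapter(read_seq, read_qual, example_adapter)
seq, qual = trim_low_quality_3prime(seq, qual, min_q=20)
print(seq)   # ACGTAC
print(qual)  # IIIIII
```

The order of the two steps and the threshold `min_q=20` (an error probability of about 0.01 per base) are exactly the kind of "default" parameters whose effect on downstream analyses the group intends to examine.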
Monday, April 27, 2015
3:00 pm | Eastern Daylight Time (New York, GMT-04:00) | 2 hr
Join WebEx meeting:
https://samsi.webex.com
Meeting number: 681 212 715
Meeting password: SeqErrors1
Monday, April 13, 2015
3:00 pm | Eastern Daylight Time (New York, GMT-04:00) | 2 hr
Join WebEx meeting:
https://samsi.webex.com
Meeting number: 685 102 609
Meeting password: SeqErrors1
Manuscript for PREMIER: PRobabilistic Error-correction using Markov Inference in Errored Reads
Monday, March 2, 2015
3:00 pm | Eastern Standard Time (New York, GMT-05:00) | 2 hr
Join WebEx meeting:
https://samsi.webex.com
Meeting number: 688 645 090
Meeting password: SeqErrors1
Pensive Melancholy and Endless Uncertainty in Next Generation Genomics, by Professor Bud Mishra
Integrated Base-Calling, Alignment and Assembly, and Microassembly for Variant Detection:
The talks presented by Dr. Bud Mishra and Dr. Giuseppe Narzisi demonstrate the advantages of a complete pipeline that integrates base-calling (TotalReCaller) with assembly (SUTTA) in a Bayesian manner. Some recent work on combining alignment and assembly (microassembly) for variant detection will also be discussed.
Self-Validating Technology-Agnostic Genome Assembly Genomics to Lean-Omics by Bud Mishra
Monday, February 9, 2015
3:00 pm | Eastern Standard Time (New York, GMT-05:00) | 2 hr
Join WebEx meeting:
https://samsi.webex.com
Meeting number: 684 069 328
Meeting password: SeqErrors1
We will discuss research progress on the comparison of base-calling methods.
Monday, January 26, 2015
4:00 pm | Eastern Standard Time (New York, GMT-05:00) | 1 hr
Please note: This week only, the meeting will start one hour later than the usual scheduled time.
Join WebEx meeting:
https://samsi.webex.com
Meeting number: 682 433 121
Meeting password: SeqErrors1