Molecular Data Analysis Using R

Ortutay, Csaba
Ortutay, Zsuzsanna

87.05 € (VAT incl.)

This book addresses the difficulties that wet lab researchers experience with the statistical analysis of molecular biology data. The authors explain how to use R and Bioconductor to analyze experimental data in the field of molecular biology. The content is based on two university courses for bioinformatics and experimental biology students (Biological Data Analysis with R and High-throughput Data Analysis with R). The material is divided into chapters according to the experimental methods used in laboratories.

Key features include:

  • Broad appeal: the authors target their material at researchers at several levels, ensuring that the basics are always covered.
  • First book to explain how to use R and Bioconductor for the analysis of several types of experimental data in the field of molecular biology.
  • Focuses on R and Bioconductor, which are widely used for data analysis. One great benefit of R and Bioconductor is the vast user community and very active discussion around them, in addition to the practice of sharing code. Further, R is the platform for implementing new analysis approaches, so novel methods are available early to R users.

Table of contents:

1. Introduction to the R statistical environment: Basics are established.
2. Simple sequence analysis: Readers will learn to work with sequences. The simple analysis of their composition will also be covered. The libraries seqinR, ape, and GenomeGraphs will be used to accomplish the practical tasks.
3. Annotating gene groups: The focus in this part is on obtaining and assembling annotation data for gene groups and performing enrichment analysis on them. Gene enrichment analysis is a systematic approach for assigning ontologies, pathways, and transcription factors to gene lists, which usually result from high-throughput experiments. Readers will become familiar with the two most frequently applied approaches for locating the common features of large gene lists.
4. Next-generation sequencing: introduction and genomic applications: A general overview of next-generation sequencing (NGS) methods is given, with a close look at the data analysis possibilities offered by different R/Bioconductor packages. Since NGS data tends to be large, special computational approaches are needed to handle the data. The most important of these approaches are also introduced here.
5. Quantitative transcriptomics: qRT-PCR: Real-time PCR is an experimental method designed for monitoring the amplification of selected DNA segments during polymerase chain reaction. It can be used for quantitative measurements of target DNA segments in experimental samples. The most popular application of this method is measuring the amount of mRNA synthesized from selected genes in biological samples, and thus the change in gene expression levels across experimental conditions. This chapter contains an overview of the experimental background and enumerates the most often applied data analysis approaches for quantitative real-time PCR experiments used in transcriptome analysis.
6. Advanced transcriptomics: gene expression microarrays: Microarray experiments have dominated the genomic and transcriptomic landscapes. Tremendous work has been dedicated to perfecting the algorithms and statistical evaluation of microarray data, only for it to be rendered obsolete quickly by the spread of NGS techniques. Gene expression microarray data is still generated where the instruments are readily available and where they have been incorporated, for example, into diagnostic pipelines. Additionally, a vast body of microarray data is available from databases such as Gene Expression Omnibus or ArrayExpress. These are excellent sources for meta-analysis projects, for data mining, or as reference datasets for NGS experiments. The most important approaches for accessing and analyzing gene expression microarray datasets are introduced in this chapter.
7. Next-generation sequencing in transcriptomics: RNA-seq experiments: The rapid development of transcriptome analysis methods steers the focus of methodological research and development. For RNA-seq data analysis, many concepts from gene expression microarray experiments can be used, but there are also specific aspects that come from the NGS approach itself. In this section, the edgeR and DESeq packages are introduced. These are the most often used tools for analyzing read counts in genes and exons to find differential expression among samples.
8. Deciphering the regulome: from ChIP to ChIP-seq: Chromatin immunoprecipitation is a method for studying DNA regions that interact with proteins. These include transcription factor binding sites, histone binding regions, transcription initiation sites, and others. In this section, the application of this experimental approach is introduced. Readers will get an idea of how it can be applied in a high-throughput mode, using microarrays or NGS, to identify the DNA side of the interactions.
9. Inferring regulatory and other networks from gene expression data: The theoretical aspects of gene regulatory networks are covered. Two selected methods are shown for reconstructing these networks, and one more for visualizing other kinds of gene networks with Bioconductor packages.
10. Analysis of biological networks: Includes an introduction to networks and to the biological networks analyzed most often, along with an overview of using the igraph package for the most popular network measures.
11. Proteomics: mass spectrometry: Includes the most important aspects of mass spectrometry data analysis that can be relevant in large-scale proteomics projects.
12. Measuring protein abundance with ELISA (this chapter is under development and revision): Enzyme-Linked ImmunoSorbent Assay (ELISA) is a quantification method for measuring the concentration of any kind of molecular compound in biological liquids such as blood, serum, or cell culture supernatants. This method has been used in molecular biology for a long time, and medium-throughput instruments are available that produce measurement data for dozens of parallel samples. The data analysis aspects of ELISA are far from trivial, mostly because of the complicated mathematics of translating measured raw data into concentrations. Reliable statistical modeling solutions are introduced here with an R package specialized in the analysis of dose-response curve data.
13. Flow cytometry: counting and sorting stained cells: Flow cytometry, one of the earliest high-throughput methods in experimental biology, is introduced in this section. The typical steps of a general data analysis pipeline are also reviewed so that readers become familiar with the common terms of flow cytometric analysis. At the end of this block, readers will have a tool set to analyze FACS data independently and publish the results in a standardized way.

  • ISBN: 978-1-119-16502-6
  • Publisher: Wiley-Blackwell
  • Binding: Paperback
  • Pages: 376
  • Publication date: 30/12/2016
  • Number of volumes: 1
  • Language: English