Data Mining for Bioinformatics

Sumeet Dua, Pradeep Chowriappa (authors)
CRC Press
List price: 231,500 KRW

Book Information

· Title: Data Mining for Bioinformatics (Hardcover)
· Category: Foreign Books > Computers > Database Management > Data Mining
· ISBN: 9780849328015
· Pages: 348
· Publication date: 2012-11-06

Table of Contents

Introduction to Bioinformatics
Introduction
Transcription and Translation
     The Central Dogma of Molecular Biology
The Human Genome Project
Beyond the Human Genome Project
     Sequencing Technology
          Dideoxy Sequencing
          Cyclic Array Sequencing
          Sequencing by Hybridization
          Microelectrophoresis
          Mass Spectrometry
          Nanopore Sequencing
     Next-Generation Sequencing
          Challenges of Handling NGS Data
     Sequence Variation Studies
          Kinds of Genomic Variations
          SNP Characterization
     Functional Genomics
          Splicing and Alternative Splicing
          Microarray-Based Functional Genomics
     Comparative Genomics
     Functional Annotation
          Function Prediction Aspects
Conclusion
References

Biological Databases and Integration
Introduction: Scientific Workflows and Knowledge Discovery
Biological Data Storage and Analysis
     Challenges of Biological Data
     Classification of Bioscience Databases
          Primary versus Secondary Databases
          Deep versus Broad Databases
          Point Solution versus General Solution Databases
     Gene Expression Omnibus (GEO) Database
     The Protein Data Bank (PDB)
The Curse of Dimensionality
Data Cleaning
     Problems of Data Cleaning
     Challenges of Handling Evolving Databases
          Problems Associated with Single-Source Techniques
          Problems Associated with Multisource Integration
     Data Argumentation: Cleaning at the Schema Level
     Knowledge-Based Framework: Cleaning at the Instance Level
     Data Integration
          Ensembl
          Sequence Retrieval System (SRS)
          IBM’s DiscoveryLink
          Wrappers: Customizable Database Software
          Data Warehousing: Data Management with Query Optimization
          Data Integration in the PDB
Conclusion
References

Knowledge Discovery in Databases
Introduction
Analysis of Data Using Large Databases
     Distance Metrics
     Data Cleaning and Data Preprocessing
Challenges in Data Cleaning
     Models of Data Cleaning
          Proximity-Based Techniques
          Parametric Methods
          Nonparametric Methods
          Semiparametric Methods
          Neural Networks
          Machine Learning
          Hybrid Systems
Data Integration
     Data Integration and Data Linkage
     Schema Integration Issues
     Field Matching Techniques
          Character-Based Similarity Metrics
          Token-Based Similarity Metrics
          Data Linkage/Matching Techniques
Data Warehousing
     Online Analytical Processing
     Differences between OLAP and OLTP
     OLAP Tasks
     Life Cycle of a Data Warehouse
Conclusion
References

Section II

Feature Selection and Extraction Strategies in Data Mining
Introduction
Overfitting
Data Transformation
     Data Smoothing by Discretization
          Discretization of Continuous Attributes
     Normalization and Standardization
          Min-Max Normalization
          z-Score Standardization
          Normalization by Decimal Scaling
Features and Relevance
     Strongly Relevant Features
     Weakly Relevant to the Dataset/Distribution
     Pearson Correlation Coefficient
     Information Theoretic Ranking Criteria
Overview of Feature Selection
     Filter Approaches
     Wrapper Approaches
Filter Approaches for Feature Selection
     FOCUS Algorithm
     Relief Method: Weight-Based Approach
Feature Subset Selection Using Forward Selection
     Gram-Schmidt Forward Feature Selection
Other Nested Subset Selection Methods
Feature Construction and Extraction
     Matrix Factorization
          LU Decomposition
          QR Factorization to Extract Orthogonal Features
          Eigenvalues and Eigenvectors of a Matrix
     Other Properties of a Matrix
     A Square Matrix and Matrix Diagonalization
          Symmetric Real Matrix: Spectral Theorem
          Singular Value Decomposition (SVD)
     Principal Component Analysis (PCA)
          Jordan Decomposition of a Matrix
          Principal Components
     Partial Least-Squares-Based Dimension Reduction (PLS)
     Factor Analysis (FA)
     Independent Component Analysis (ICA)
     Multidimensional Scaling (MDS)
Conclusion
References

Feature Interpretation for Biological Learning
Introduction
Normalization Techniques for Gene Expression Analysis
     Normalization and Standardization Techniques
          Expression Ratios
          Intensity-Based Normalization
          Total Intensity Normalization
          Intensity-Based Filtering of Array Elements
     Identification of Differentially Expressed Genes
     Selection Bias of Gene Expression Data
Data Preprocessing of Mass Spectrometry Data
     Data Transformation Techniques
          Baseline Subtraction (Smoothing)
          Normalization
          Binning
          Peak Detection
          Peak Alignment
     Application of Dimensionality Reduction Techniques for MS Data Analysis
     Feature Selection Techniques
          Univariate Methods
          Multivariate Methods
Data Preprocessing for Genomic Sequence Data
     Feature Selection for Sequence Analysis
Ontologies in Bioinformatics
     The Role of Ontologies in Bioinformatics
          Description Logics
          Gene Ontology (GO)
          Open Biomedical Ontologies (OBO)
Conclusion
References

Section III

Clustering Techniques in Bioinformatics
Introduction
Clustering in Bioinformatics
Clustering Techniques
     Distance-Based Clustering and Measures
          Mahalanobis Distance
          Minkowski Distance
          Pearson Correlation
          Binary Features
          Nominal Features
          Mixed Variables
     Distance Measure Properties
     k-Means Algorithm
     k-Modes Algorithm
     Genetic Distance Measure (GDM)
Applications of Distance-Based Clustering in Bioinformatics
     New Distance Metric in Gene Expressions for Coexpressed Genes
     Gene Expression Clustering Using Mutual Information Distance Measure
     Gene Expression Data Clustering Using a Local Shape-Based Clustering
          Exact Similarity Computation
          Approximate Similarity Computation
Implementation of k-Means in WEKA
Hierarchical Clustering
     Agglomerative Hierarchical Clustering
     Cluster Splitting and Merging
     Calculate Distance between Clusters
     Applications of Hierarchical Clustering Techniques in Bioinformatics
          Hierarchical Clustering Based on Partially Overlapping and Irregular Data
          Cluster Stability Estimation for Microarray Data
          Comparing Gene Expression Sequences Using Pairwise Average Linking
Implementation of Hierarchical Clustering
Self-Organizing Maps Clustering
     SOM Algorithm
     Application of SOM in Bioinformatics
          Identifying Distinct Gene Expression Patterns Using SOM
          SOTA: Combining SOM and Hierarchical Clustering for Representation of Genes
Fuzzy Clustering
     Fuzzy c-Means (FCM)
     Application of Fuzzy Clustering in Bioinformatics
          Clustering Genes Using Fuzzy J-Means and VNS Methods
          Fuzzy k-Means Clustering on Gene Expression
          Comparison of Fuzzy Clustering Algorithms
Implementation of Expectation Maximization Algorithm
Conclusion
References

Advanced Clustering Techniques
Graph-Based Clustering
     Graph-Based Cluster Properties
     Cut in a Graph
     Intracluster and Intercluster Density
Measures for Identifying Clusters
     Identifying Clusters by Computing Values for the Vertices or Vertex Similarity
          Distance and Similarity Measure
          Adjacency-Based Measures
          Connectivity Measures
     Computing the Fitness Measure
          Density Measure
          Cut-Based Measures
Determining a Split in the Graph
     Cuts
     Spectral Methods
     Edge-Betweenness
Graph-Based Algorithms
     Chameleon Algorithm
     CLICK Algorithm
Application of Graph-Based Clustering in Bioinformatics
     Analysis of Gene Expression Data Using Shortest Path (SP)
     Construction of Genetic Linkage Maps Using Minimum Spanning Tree of a Graph
     Finding Isolated Groups in a Random Graph Process
     Implementation in Cytoscape
          Seeding Method
Kernel-Based Clustering
     Kernel Functions
     Gaussian Function
Application of Kernel Clustering in Bioinformatics
     Kernel Clustering
     Kernel-Based Support Vector Clustering
     Analyzing Gene Expression Data Using SOM and Kernel-Based Clustering
Model-Based Clustering for Gene Expression Data
     Gaussian Mixtures
     Diagonal Model
     Model Selection
Relevant Number of Genes
     A Resampling-Based Approach for Identifying Stable and Tight Patterns
     Overcoming the Local Minimum Problem in k-Means Clustering
     Tight Clustering
     Tight Clustering of Gene Expression Time Courses
Higher-Order Mining
     Clustering for Association Rule Discovery
     Clustering of Association Rules
     Clustering Clusters
Conclusion
References

Section IV

Classification Techniques in Bioinformatics
Introduction
     Bias-Variance Trade-Off in Supervised Learning
     Linear and Nonlinear Classifiers
     Model Complexity and Size of Training Data
     Dimensionality of Input Space
Supervised Learning in Bioinformatics
Support Vector Machines (SVMs)
     Hyperplanes
     Large Margin of Separation
     Soft Margin of Separation
     Kernel Functions
     Applications of SVM in Bioinformatics
          Gene Expression Analysis
          Remote Protein Homology Detection
Bayesian Approaches
     Bayes’ Theorem
     Naive Bayes Classification
          Handling of Prior Probabilities
          Handling of Posterior Probability
     Bayesian Networks
          Methodology
          Capturing Data Distributions Using Bayesian Networks
          Equivalence Classes of Bayesian Networks
          Learning Bayesian Networks
          Bayesian Scoring Metric
     Application of Bayesian Classifiers in Bioinformatics
          Binary Classification
          Multiclass Classification
          Computational Challenges for Gene Expression Analysis
Decision Trees
     Tree Pruning
Ensemble Approaches
     Bagging
          Unweighted Voting Methods
          Confidence Voting Methods
          Ranked Voting Methods
     Boosting
          Seeking Prospective Classifiers to Be Part of the Ensemble
          Choosing an Optimal Set of Classifiers
          Assigning Weight to the Chosen Classifier
     Random Forest
     Application of Ensemble Approaches in Bioinformatics
Computational Challenges of Supervised Learning
Conclusion
References

Validation and Benchmarking
Introduction: Performance Evaluation Techniques
Classifier Validation
     Model Selection
          Challenges of Model Selection
     Performance Estimation Strategies
          Holdout
          Three-Way Split
          k-Fold Cross-Validation
          Random Subsampling
Performance Measures
     Sensitivity and Specificity
     Precision, Recall, and f-Measure
     ROC Curve
Cluster Validation Techniques
     The Need for Cluster Validation
          External Measures
          Internal Measures
     Performance Evaluation Using Validity Indices
          Silhouette Index (SI)
          Davies-Bouldin and Dunn’s Index
          Calinski-Harabasz (CH) Index
          Rand Index
Conclusion

References

About the Authors

Pradeep Chowriappa (author)
Book database provided by Aladin (www.aladin.co.kr)