Computational biology(12)
-
Biopython example code
import Bio
print(Bio.__doc__)
Collection of modules for dealing with biological data in Python.
The Biopython Project is an international association of developers
of freely available Python tools for computational molecular biology.
http://biopython.org
DNA sequence handling
from Bio import Entrez
from Bio import SeqIO
Entrez.email = "A.N.Other@example.com"
with Entrez.efetch( db="nucleotide", rettype..
2024.11.27 -
Python-Basic
Python's built-in data types fall into 6 groups:
Numeric Types: int (integer), float (floating point), complex (complex number)
Sequence Types: str (string), list, tuple
Mapping Type: dict
Set Types: set
Boolean Type: bool
Binary Types: bytes, bytearray, memoryview
✍🏻 python
# Data types
v_str1 = "Niceman"  # str
v_str2 = "Goodgirl"  # str
v_bool = True  # bool
v_float = 10.3  # float
v_int = 7  # int
v_complex = 3 + 3j
v_dict = { ..
2024.11.27 -
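As a small illustration of the type groups listed in the Python-Basic entry above, the following sketch builds one value per built-in type and prints the name of its type. The `samples` dictionary, its labels, and the `type_names` helper are made up for this example and do not come from the original post.

```python
# One illustrative value per built-in type named in the entry above.
# The labels and values are hypothetical, chosen only for this sketch.
samples = {
    "int": 7,
    "float": 10.3,
    "complex": 3 + 3j,
    "str": "Niceman",
    "list": [1, 2, 3],
    "tuple": (1, 2, 3),
    "dict": {"name": "Niceman"},
    "set": {1, 2, 3},
    "bool": True,
    "bytes": b"ACGT",
    "bytearray": bytearray(b"ACGT"),
    "memoryview": memoryview(b"ACGT"),
}


def type_names(values):
    """Map each label to the name of the value's actual type."""
    return {label: type(value).__name__ for label, value in values.items()}


for label, name in type_names(samples).items():
    print(f"{label:<10} -> {name}")
```

One detail worth knowing when reading such output: `bool` is a subclass of `int`, so `isinstance(True, int)` is also true even though `type(True)` reports `bool`.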
biopython
1. Parsing: extract the desired parts from files in formats such as FASTA, FASTQ, GenBank, and KEGG.
2. Sequence handling: work with the parsed sequences as strings and characters.
3. Web access: data from online resources such as NCBI and ExPASy can be retrieved directly, which simplifies processing.
4. Analysis: BLAST, sequence alignment, and similar analyses can be run.
2024.11.27 -
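To make the parsing and sequence-handling points in the biopython entry above concrete, here is a minimal sketch of what a FASTA parser does. It deliberately uses only the Python standard library so it runs without any installation; in Biopython itself, `Bio.SeqIO.parse` covers this and many other formats. The FASTA text and the helper names `parse_fasta`, `_make_record`, and `reverse_complement` are assumptions invented for this illustration.

```python
from io import StringIO

# Hypothetical two-record FASTA text, invented for this sketch.
FASTA_TEXT = """>seq1 example record
ATGCGTAC
GGTA
>seq2 another record
TTGACA
"""


def _make_record(header, chunks):
    """Split a header into id and description, and join the sequence lines."""
    identifier, _, description = header.partition(" ")
    return identifier, description, "".join(chunks)


def parse_fasta(handle):
    """Yield (identifier, description, sequence) tuples from a FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield _make_record(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)


# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Treat the sequence as a plain string: complement each base, then reverse."""
    return sequence.translate(COMPLEMENT)[::-1]


records = list(parse_fasta(StringIO(FASTA_TEXT)))
for identifier, description, sequence in records:
    print(identifier, len(sequence), reverse_complement(sequence))
```

The sketch ignores everything a production parser has to handle (ambiguity codes, malformed headers, very large files), which is the main reason to prefer the library for real data.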
python libraries
Anaconda: Used for managing Python virtual environments. It is not limited to bioinformatics and is useful across data science. You can create a virtual environment and manage the version of each package within it. Below are conda channels useful for bioinformatics.
- bioconda
- conda-forge
pandas, numpy: useful for storing and processing data in the form of Excel-lik..
2024.11.27 -
scRNA-seq Raw Data Preprocessing: scRNA-seq quality control
The QC report summarizes the pass, fail, or warning status of each item.
Basic statistics
Basic statistics contains summary information such as file format, encoding, number of sequences, number of sequences flagged as poor quality in the FASTQ, sequence length, and GC content.
In good data, poor-quality sequences are few and the sequence length is uniform. The overall GC content of the analyzed organism shoul..
2024.11.27 -
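The basic statistics described in the quality control entry above (number of sequences, sequence length, GC content) can be computed directly from a FASTQ file. The following is a rough sketch under simplifying assumptions: every record occupies exactly four lines, and the input is a small in-memory example invented here. The names `read_fastq` and `basic_statistics` are hypothetical and are not taken from any QC tool.

```python
from io import StringIO

# Hypothetical FASTQ records invented for this sketch.
FASTQ_TEXT = """@read1
ACGTACGT
+
IIIIIIII
@read2
GGCCGGCC
+
IIII!!!!
@read3
ATATATAT
+
########
"""


def read_fastq(handle):
    """Yield (name, sequence, quality) assuming four lines per record."""
    while True:
        name = handle.readline().strip()
        if not name:
            break  # end of input
        sequence = handle.readline().strip()
        handle.readline()  # the '+' separator line is ignored
        quality = handle.readline().strip()
        yield name[1:], sequence, quality


def basic_statistics(records):
    """Summarize read count, length range, and overall GC percentage."""
    lengths = []
    gc_bases = 0
    for _, sequence, _ in records:
        lengths.append(len(sequence))
        gc_bases += sum(1 for base in sequence.upper() if base in "GC")
    total_bases = sum(lengths)
    return {
        "total_sequences": len(lengths),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "gc_percent": 100.0 * gc_bases / total_bases if total_bases else 0.0,
    }


stats = basic_statistics(read_fastq(StringIO(FASTQ_TEXT)))
for key, value in stats.items():
    print(f"{key}: {value}")
```

Equal `min_length` and `max_length` values correspond to the uniform sequence length the entry describes for good data.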
R packages
Key packages and functions
Gene expression analysis:
DESeq2: Differential expression analysis of RNA-Seq data.
edgeR: Differential gene expression analysis.
Gene sequencing:
Biostrings: Manipulation and analysis of DNA, RNA, and protein sequence data.
ShortRead: Next-generation sequencing (NGS) data analysis.
Functional interpretation:
clusterProfiler: Gene Ontology (GO) and pathway analysis.
topGO: Anal..
2024.11.27