Chemical & Biomolecular Formats
The Wolfram Language can import, and in many cases export, the standard file formats used in chemistry, molecular biology and bioinformatics. It handles a wide range of molecule types as well as genome-sized datasets.
Chemical Formats
"XYZ" — XYZ molecule geometry file (.xyz)
"MOL" — MDL MOL format (.mol)
"MOL2" — Tripos MOL2 format (.mol2)
"SDF" — MDL SDF format (.sdf)
"SMILES" — SMILES chemical format (.smi)
"HIN" — HyperChem molecular data format (.hin)
"CML" — Chemical Markup Language (.cml)
"CDX" — ChemDraw Exchange format (.cdx)
"CDXML" — ChemDraw Exchange XML format (.cdxml)
"Cube" — Gaussian Cube file (.cub)
"FCHK" — Formatted Checkpoint file (.fchk)
"GaussianLog" — Gaussian log file (.log)
"JCAMP-DX" — chemical spectroscopy format (.jdx, .dx, .jcm)
Bioinformatics Formats
"GenBank" — NCBI GenBank sequence format (.gb, .gbk)
"FASTA" — DNA, RNA, and amino acid sequence format (.fasta, .fa, .fsa, .mpfa)
"FASTQ" — DNA and RNA sequence format with base qualities (.fastq, .fq)
"NEXUS" — NEXUS phylogenetic data format (.nex, .ndk)
"AgilentMicroarray" — microarray data format (.txt)
"Affymetrix" — microarray data format (.cel, .cdf, .chp, .gin, .psi)
"SFF" — DNA sequence flowgram format (.sff)
Molecular Biology Formats
"PDB" — Protein Data Bank format (.pdb)
"MMCIF" — MMCIF 3D molecular model format (.cif)
"FCS" — flow cytometry data format (.fcs, .lmd)
Common Elements
"Molecule" — a symbolic representation of the molecule model
"StructureDiagram" — chemical structure diagram
"Graphics3D" — 3D molecular graphics
"VertexCoordinates" — 3D coordinates of atoms