The program gmconvert converts various molecular models (such as atomic models and 3D density maps) into a GMM (Gaussian mixture model). The EM (expectation-maximization) algorithm is employed for the conversion into a GMM. gmconvert also provides many other useful functions for handling GMMs.
The source code of gmconvert is written in C and assumes the "gcc" compiler in a Linux environment. After you download the file "gmconvert-src-[date].tar.gz", type the following commands:
tar zxvf gmconvert-src-[date].tar.gz
cd src
make
Then you will find the executable file "gmconvert" in the upper directory (../src).
Both atomic models and 3D density maps can be converted into a GMM.
gmconvert -ipdb [atomic model in pdb] -ogmm [output GMM file] -ng [number of Gaussian functions]
gmconvert -icif [atomic model in mmCIF] -assembly [assembly_id] -ogmm [output GMM file] -ng [number of Gaussian functions]
gmconvert -imap [3D density map] -ogmm [output GMM file] -ng [number of Gaussian functions]
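When many structures are converted in a batch, it can help to assemble these command lines programmatically. The sketch below only builds the argument list from the options shown above; the helper name gmm_command and the file names are hypothetical, and gmconvert itself is not invoked.

```python
import shlex

def gmm_command(input_flag, input_file, output_gmm, n_gauss, extra=()):
    """Build a gmconvert argument list for converting a model into a GMM.

    input_flag is one of the input options shown above:
    '-ipdb', '-icif' or '-imap'.
    """
    if input_flag not in ("-ipdb", "-icif", "-imap"):
        raise ValueError("unsupported input option: " + input_flag)
    args = ["gmconvert", input_flag, input_file]
    args += list(extra)                       # e.g. ['-assembly', '1'] for mmCIF
    args += ["-ogmm", output_gmm, "-ng", str(n_gauss)]
    return args

# Hypothetical file names, for illustration only.
cmd = gmm_command("-icif", "model.cif", "model_g10.gmm", 10,
                  extra=["-assembly", "1"])
command_line = " ".join(shlex.quote(a) for a in cmd)
print(command_line)
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the gmconvert executable is on the PATH.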
gmconvert -igmm [input GMM file] -omap [output 3D density map] -gw [grid_width]
gmconvert -igmm [input GMM file] -opdb [wireframe model in PDB] -gw [grid_width]
gmconvert -igmm [input GMM file] -owrl [surface/wireframe model in VRML] -gw [grid_width]
gmconvert -igmm [input GMM file] -oewrl [ellipsoidal model in VRML]
gmconvert -igmm [input GMM file] -oobj [surface model in obj] -gw [grid_width]
gmconvert -imap [input 3D density map] -opdb [wireframe model in PDB]
gmconvert -imap [input 3D density map] -owrl [surface/wireframe model in VRML]
gmconvert -imap [input 3D density map] -oobj [surface in Object]
gmconvert -ipdb [atomic model in pdb] -omap [output 3D density map] -reso [resolution]
gmconvert -igmm [GMMfile] -ogmm [output transformed GMM file] -ipdb [original PDBfile] -itpdb [target PDB file]
-ng: number of Gaussian distribution functions.
-emalg: Algorithm type for the EM algorithm. The default is
-emalg P: Point-input EM. A set of 3D points is employed as the observed input for the EM algorithm. For an atomic model, the centers of heavy atoms are used as the input 3D points. For a 3D density map, the grid positions are used as the input 3D points, together with their density values.
-emalg G: GMM-input EM. A set of 3D Gaussian distribution functions (GDFs) is employed as the observed input for the EM algorithm. For an atomic model, one isotropic GDF is assigned to each heavy atom. Its center is the center of the heavy atom, and its variance is rr2var * [radius] * [radius], where [radius] is the atomic radius. For a 3D density map, one isotropic GDF is assigned to each grid point. Its center is the position of the grid point, and its variance is ww2var * [grid_width] * [grid_width].
-emalg O: one-to-one_atom/grid. It simply assigns one GDF to one atom or one grid. It does not perform any modification by the EM algorithm.
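To make the point-input variant more concrete, here is a minimal one-dimensional sketch of weighted EM for a Gaussian mixture. It is not gmconvert's implementation: the program works in 3D with full covariance matrices, and the function name, the fixed iteration count, and the toy data are assumptions made for illustration. Each input point carries a weight, which plays the role of the density value at a grid point (a weight of 1 for every point corresponds to the atomic case).

```python
import math

def em_points_1d(points, weights, means, variances, mix, n_iter=30):
    """Weighted EM for a 1D Gaussian mixture (illustrative sketch only)."""
    means, variances, mix = list(means), list(variances), list(mix)
    total_w = sum(weights)
    for _ in range(n_iter):
        # E-step: responsibility of each Gaussian k for each point x.
        resp = []
        for x in points:
            p = [m * math.exp(-(x - mu) ** 2 / (2.0 * v))
                 / math.sqrt(2.0 * math.pi * v)
                 for m, mu, v in zip(mix, means, variances)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: re-estimate mean, variance and mixing weight of each Gaussian.
        for k in range(len(means)):
            nk = sum(w * r[k] for w, r in zip(weights, resp))
            means[k] = sum(w * r[k] * x
                           for w, r, x in zip(weights, resp, points)) / nk
            variances[k] = sum(w * r[k] * (x - means[k]) ** 2
                               for w, r, x in zip(weights, resp, points)) / nk
            mix[k] = nk / total_w
    return means, variances, mix

# Toy data: two well-separated clusters of unit-weight points.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
weights = [1.0] * len(points)
means, variances, mix = em_points_1d(points, weights,
                                     [0.0, 12.0], [4.0, 4.0], [0.5, 0.5])
print(means, variances, mix)
```

On this toy input the two means settle near the cluster centers (about 1 and 11) with equal mixing weights. The sketch has no safeguard against a collapsing variance, which a practical implementation would need.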
-I: Initialization of GMM. 'K'-means, 'R'andom, 'O':one-to-one_atom/voxel [K]
-delzw: Delete zero-weight GDFs from the GMM. ('T' or 'F') [T]
-delid: Delete identical GDFs in the GMM. ('T' or 'F') [T]
-hetatm: Read HETATM ('T' or 'F') [F]. With -hetatm F, the program reads only "ATOM" lines.
-ch: Chain ID (or 'auth_asym_id' in mmCIF). With -ch -, the program reads all the chains in the PDB/mmCIF file.
-assembly: assembly_id for mmCIF file (-icif). An assembly_id in the mmCIF file (such as 1, 2, 3, PAU, XAU, ...) can be assigned. The program applies symmetry operations to the asymmetric unit to generate the XYZ coordinates of the assembly. If assembly_id is not assigned, the program uses the asymmetric unit.
-atmsel: Atom selection. 'A'll atom except hydrogen, 'R'esidue-based (only ' CA ' and ' P ') [A]
-maxatm: maximum allowed number of atoms for '-atmsel A'. If the number of atoms exceeds this value, the program switches to '-atmsel R'. [-1]
-model: 'S':read only single model (for NMR). 'M':read multiple models (for biological unit) [S]
-atmrw: Model for radius and weight. 'A':atom model, 'R':residue model, 'U':uniform radius/weight, 'C':decide from content. [C]
-varatm: Variance type for atom (for -emalg G). 'A': var = rr2var * Rvdw*Rvdw for each atom, 'R': var = (resoatm/2.0)^2.[A]
-rr2var: Constant for variance = Const * Rvdw*Rvdw for -emalg G -varatm A. Default is 1/5. [0.200000].
-resoatm: Resolution for atom for -emalg G -varatm R. [0.000000].
-radtype: radius type for '-atmrw A'. V:van der Waals radius, C:covalent radius [V]
-raduni: radius for uniform model for '-atmrw U'. [1.900000]
-ccatm: Calculate the correlation coefficient between atoms and GMMs (this takes time). ('T' or 'F') [F]
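The two -varatm settings listed above reduce to simple arithmetic. The sketch below is a hypothetical helper, not part of gmconvert: only the two formulas and the default rr2var of 0.2 come from the option descriptions, while the function name and the example values (a radius of 1.9 angstrom, a resolution of 10 angstrom) are chosen for illustration.

```python
def atom_variance(varatm, rvdw=None, rr2var=0.2, resoatm=0.0):
    """Variance of the isotropic GDF assigned to one atom under -emalg G.

    'A': var = rr2var * Rvdw * Rvdw
    'R': var = (resoatm / 2.0) ** 2
    """
    if varatm == "A":
        if rvdw is None:
            raise ValueError("rvdw is required for -varatm A")
        return rr2var * rvdw * rvdw
    if varatm == "R":
        return (resoatm / 2.0) ** 2
    raise ValueError("unknown -varatm type: " + str(varatm))

var_a = atom_variance("A", rvdw=1.9)        # 0.2 * 1.9 * 1.9
var_r = atom_variance("R", resoatm=10.0)    # (10.0 / 2.0) ** 2
print(var_a, var_r)
```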
The options -zth and -zsd specify the threshold value of the 3D density map.
To obtain a proper GMM of the map, you should assign an appropriate threshold value, such as the 'ContourLevel' value in the EMDB entry.
-zth : if density < [-zth], it is regarded as zero density. [-1.000000]
If the density of a voxel is less than the -zth value, its density is set to zero. The voxel is then regarded as a place where no atom exists. Only a positive -zth value is meaningful; a negative -zth value is ignored.
-zsd : if density < MEAN + [-zsd]*SD, it is regarded as zero density. [3.000000]
If you do not know the proper threshold value, the statistics of the density map will help you. If the option -zsd is assigned, the threshold value is [MEAN of density] + [-zsd] * [SD of density]. Only a positive -zsd value is meaningful; a negative -zsd value is ignored. If both -zth and -zsd are positive, the option -zth has priority.
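The interplay of the two thresholding options can be sketched as follows. This is an illustration rather than gmconvert's code: the function names are hypothetical, the map is represented as a flat list of voxel densities, and the use of the population standard deviation is an assumption.

```python
import statistics

def map_threshold(densities, zth=-1.0, zsd=3.0):
    """Return the density threshold, or None if neither option is positive.

    A positive zth is used directly and takes priority; otherwise a positive
    zsd gives MEAN + zsd * SD (population SD assumed here).
    """
    if zth > 0.0:
        return zth
    if zsd > 0.0:
        return statistics.mean(densities) + zsd * statistics.pstdev(densities)
    return None

def apply_threshold(densities, threshold):
    """Set every density below the threshold to zero."""
    if threshold is None:
        return list(densities)
    return [d if d >= threshold else 0.0 for d in densities]

voxels = [0.0, 0.0, 0.0, 4.0]                      # toy map: mean 1.0, SD sqrt(3)
t_sd = map_threshold(voxels, zth=-1.0, zsd=1.0)    # MEAN + 1.0 * SD
t_raw = map_threshold(voxels, zth=2.0, zsd=1.0)    # -zth wins when both are positive
print(t_sd, t_raw, apply_threshold(voxels, t_sd))
```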
The computational time of the EM algorithm is roughly proportional to [Number of data points] x [Number of Gaussian functions].
This means that a 256x256x256 map requires at least 64 times more computational time than a 64x64x64 map.
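The factor of 64 follows directly from the voxel counts. A small sketch, with a hypothetical function name and assuming the cost is exactly proportional to voxels times Gaussians:

```python
def relative_em_cost(shape_a, shape_b, n_gauss_a=1, n_gauss_b=1):
    """Ratio of estimated EM cost, taking cost ~ voxels * Gaussians."""
    voxels_a = shape_a[0] * shape_a[1] * shape_a[2]
    voxels_b = shape_b[0] * shape_b[1] * shape_b[2]
    return (voxels_a * n_gauss_a) / (voxels_b * n_gauss_b)

ratio = relative_em_cost((256, 256, 256), (64, 64, 64))
print(ratio)  # 64.0, since (256 / 64) ** 3 = 4 ** 3
```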
If you want to speed up the computation by decreasing the resolution of the density map, we recommend using the '-redsize' option.
-redsize : reducing size scale (2,3,4,...)
If you add '-redsize 2', the program transforms a 256x256x256 map into a 128x128x128 map. If you add '-redsize 4', the program transforms a 256x256x256 map into a 64x64x64 map.
A similar reduction of voxel size can be done with the option '-maxsize'.
-maxsize : maximum voxel size of each axis (if over, reducing size). [-1]
If the total number of voxels exceeds [-maxsize]^3, the program automatically sets the -redsize value so that the number of voxels does not exceed [-maxsize]^3. For example, if a 180x180x120 map is given with the option '-maxsize 64', the program sets -redsize to 3, and the map becomes 60x60x40.
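The automatic choice of the reduction scale can be reproduced with a short search. The sketch below is an interpretation of the rule described above, not gmconvert's code: the function name is hypothetical, rounding each reduced axis up is an assumption (the exact rounding is not stated here), and treating a non-positive limit as "no reduction" is inferred from the default of -1.

```python
def auto_redsize(shape, maxsize):
    """Smallest integer scale r whose reduced voxel count is <= maxsize**3.

    Returns (r, reduced_shape). Each axis is assumed to shrink to ceil(n / r).
    """
    if maxsize <= 0:
        return 1, tuple(shape)        # assumed: non-positive limit disables reduction
    limit = maxsize ** 3
    r = 1
    while True:
        reduced = tuple(-(-n // r) for n in shape)   # ceiling division
        if reduced[0] * reduced[1] * reduced[2] <= limit:
            return r, reduced
        r += 1

scale, reduced_shape = auto_redsize((180, 180, 120), 64)
print(scale, reduced_shape)
```

For the 180x180x120 map, a scale of 2 still leaves 90 x 90 x 60 = 486,000 voxels, which is above 64^3 = 262,144, while a scale of 3 gives 60 x 60 x 40 = 144,000 voxels, in agreement with the example in the text.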
-mcth: raw density value for threshold density [-1000.000000]
-mcsd: SD value for threshold density [3.000000]
-mcnv: number of voxels for threshold density [-1]
-mcvo: volume (A^3) for threshold density [-1.000000]
These four options are ignored if a negative value is assigned. In the default setting, the option -mcsd 3.0 is assigned, which means that the threshold density := [average density] + 3.0 * [standard deviation of density]. If the putative volume of the molecule is known, you can assign the volume with the option -mcvo, in cubic angstroms.
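One plausible reading of the volume-based option is that the threshold is chosen so that the voxels at or above it enclose the requested volume. The sketch below illustrates that reading only; how gmconvert actually derives the threshold from -mcvo or -mcnv is not described here, and the function name, the voxel_width parameter, and the toy densities are all assumptions.

```python
def threshold_from_volume(densities, volume, voxel_width):
    """Density of the n-th densest voxel, with n = volume / voxel volume.

    Hypothetical interpretation: keep the densest voxels until the enclosed
    volume (in cubic angstroms) reaches the requested value.
    """
    n_voxels = int(round(volume / voxel_width ** 3))
    n_voxels = max(1, min(n_voxels, len(densities)))
    ranked = sorted(densities, reverse=True)
    return ranked[n_voxels - 1]

toy_densities = [0.1, 0.9, 0.4, 0.7, 0.2, 0.0, 0.8, 0.3]
# Three voxels of 4 x 4 x 4 = 64 cubic angstroms each give 192 cubic angstroms.
vol_threshold = threshold_from_volume(toy_densities, volume=192.0, voxel_width=4.0)
print(vol_threshold)
```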
The grid width is specified with the option -gw. The default value is 4 angstroms.
-gw : grid width (angstrom) [4.000000]
If you output in the VRML format (-owrl), you can choose the surface model with the option -mcSW.
-mcSW : model type. 'S'urface, 'W'ireframe [W]
In the surface VRML model, you can assign the color of the surface with the option -mcRGBT.
-mcRGBT: RGBT string (red:blue:green:transparency) [0:1:0:0]
An example of a GMM file (PDB code: 1omp, number of Gaussians = 2) is shown below:
HEADER 3D Gaussian Mixture Model
REMARK COMMAND gmconvert -ipdb /DB/PDBv3/om/pdb1omp.ent -ng 2 -ogmm 1omp_g2.gmm
REMARK START_DATE Feb 16,2013 16:40:19
REMARK END_DATE Feb 16,2013 16:40:19
REMARK COMP_TIME_SEC 0.024982 2.498198e-02
REMARK FILENAME 1omp_g2.gmm
REMARK NGAUSS 2
HETATM 1 GAU GAU A 1 1.491 -5.359 -14.106 0.473 0.473
REMARK GAUSS 1 W 0.4731064444
REMARK GAUSS 1 M 1.491210 -5.358893 -14.106107
REMARK GAUSS 1 CovM xx 68.3302668925 xy 1.4432837673 xz 20.6877652908
REMARK GAUSS 1 CovM yy 91.2978302824 yz -0.6854124916 zz 78.2751151639
HETATM 2 GAU GAU A 2 -1.303 4.885 12.899 0.527 0.527
REMARK GAUSS 2 W 0.5268935556
REMARK GAUSS 2 M -1.302614 4.884799 12.899464
REMARK GAUSS 2 CovM xx 99.7411943616 xy -54.9586151001 xz 7.6722146525
REMARK GAUSS 2 CovM yy 99.9101607174 yz -11.9555659402 zz 76.8822244943
TER
REMARK NGAUSS [Number of GDFs for GMM]
REMARK GAUSS [GDFnumber] W [Weight for GDF]
REMARK GAUSS [GDFnumber] M [Center position of GDF (x y z) ]
REMARK GAUSS [GDFnumber] CovM xx [xx of CovM] xy [xy of CovM] xz [xz of CovM]
REMARK GAUSS [GDFnumber] CovM yy [yy of CovM] yz [yz of CovM] zz [zz of CovM]
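Because every parameter is carried in a REMARK GAUSS record, a GMM file can be read with a few lines of whitespace-based parsing. The following sketch is not part of the gmconvert distribution; the function name and the dictionary layout are assumptions, and the sample text is taken from the 1omp example above.

```python
def parse_gmm(text):
    """Collect weight, mean and covariance of each GDF from REMARK GAUSS lines."""
    gdfs = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 4 or f[0] != "REMARK" or f[1] != "GAUSS":
            continue                              # skip HEADER, HETATM, TER, ...
        g = gdfs.setdefault(int(f[2]), {"cov": {}})
        if f[3] == "W":
            g["weight"] = float(f[4])
        elif f[3] == "M":
            g["mean"] = tuple(float(v) for v in f[4:7])
        elif f[3] == "CovM":
            # Remaining fields alternate: component name, value.
            for name, value in zip(f[4::2], f[5::2]):
                g["cov"][name] = float(value)
    return gdfs

sample = """\
REMARK NGAUSS 2
REMARK GAUSS 1 W 0.4731064444
REMARK GAUSS 1 M 1.491210 -5.358893 -14.106107
REMARK GAUSS 1 CovM xx 68.3302668925 xy 1.4432837673 xz 20.6877652908
REMARK GAUSS 1 CovM yy 91.2978302824 yz -0.6854124916 zz 78.2751151639
REMARK GAUSS 2 W 0.5268935556
REMARK GAUSS 2 M -1.302614 4.884799 12.899464
REMARK GAUSS 2 CovM xx 99.7411943616 xy -54.9586151001 xz 7.6722146525
REMARK GAUSS 2 CovM yy 99.9101607174 yz -11.9555659402 zz 76.8822244943
"""
gdfs = parse_gmm(sample)
total_weight = sum(g["weight"] for g in gdfs.values())
print(len(gdfs), total_weight)
```

Only the six independent components of each covariance matrix are stored in the file, since the matrix is symmetric. In this example the two weights sum to 1.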