CarbBuilder is a program for building a reasonable structure of a polysaccharide from a primary structure specification in the CASPER format.
CarbBuilder is implemented in C# and should be portable across a variety of architectures with support for the .NET framework.
If you use CarbBuilder in your research, please cite these references:
- CarbBuilder: Software for Building Molecular Models of Complex Oligo- and Polysaccharide Structures. M. M. Kuttel, Jonas Stahle, and Goran Widmalm, J. Comput. Chem., 37(22), 2098-2105 (2016)
- M. M. Kuttel et al., Proceedings of the 7th IEEE International Conference on e-Science, 2011, pp. 395-402
Installation
CarbBuilder is installed by downloading and expanding the archive, which contains:
+ CarbBuilder2.exe [the application file]
+ README.md [the README file]
+ structureFile directory [a directory containing structures for building and default settings]
On Linux or MacOS X you need to install Mono, an open source implementation of Microsoft's .NET Framework. Mono is free for download at http://www.mono-project.com/download/.
Execution
CarbBuilder is a command-line program: you need to open a terminal to execute it.
Examples of how to run CarbBuilder:
On Windows: CarbBuilder2.exe -i "->2)aLRha(1->2)aLRha(1->3)aLRha(1->3)bDGlcNAc(1->" -r 6 -o Shigella_Y_RU6
On Linux (requires the Mono framework): mono CarbBuilder2.exe -i "->2)aLRha(1->2)aLRha(1->3)aLRha(1->3)bDGlcNAc(1->" -r 6 -o Shigella_Y_RU6
On macOS (requires the Mono framework): mono CarbBuilder2.exe -i "->2)aLRha(1->2)aLRha(1->3)aLRha(1->3)bDGlcNAc(1->" -r 6 -o Shigella_Y_RU6
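When CarbBuilder is driven from a script, the only platform difference in the commands above is the `mono` prefix. The following is a minimal Python sketch of that rule; `build_command` and its parameters are hypothetical helper names, not part of CarbBuilder, and the sketch assumes `CarbBuilder2.exe` is in the working directory.

```python
import platform


def build_command(sequence, output, repeats=None, system=None):
    """Return the argument list for one CarbBuilder run (hypothetical helper)."""
    # `system` can be passed explicitly; otherwise ask the interpreter.
    system = system or platform.system()
    # Windows runs the .exe directly; Linux and macOS go through Mono.
    cmd = [] if system == "Windows" else ["mono"]
    cmd += ["CarbBuilder2.exe", "-i", sequence]
    if repeats is not None:
        cmd += ["-r", str(repeats)]
    cmd += ["-o", output]
    return cmd
```

The list can be handed to `subprocess.run(cmd, check=True)`. Passing the arguments as a list avoids shell quoting problems with the `->`, parentheses and square brackets in a CASPER sequence.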
COMMAND-LINE ARGUMENTS
- -i [CASPER_sequence] (REQUIRED) Specify the primary sequence for the carbohydrate in the CASPER format. The residues currently supported by CarbBuilder are listed here.
- -r [number_of_repeating_units] (Optional) Specify the number of repeating units of the primary sequence to build. This is ignored if the sequence is not given in the CASPER notation for a repeating unit. Example: ->4)aDGlc(1->
- -o [outputfile_name] Write the PDB output to outputfile_name.
- -d [dihedral_file_name] Read user-specified dihedral values for specific linkages from dihedral_file_name. Entries must be in the format:
[Res1] 1 [link position] [Res2],2,[phiVal] [psiVal],[altPhiVal1] [altPsiVal1],[altPhiVal2] [altPsiVal2]
e.g.: aDGlc 1 2 aDGlc,2,-26 -36,-39 169,-26 41,-151 24
- -all Generate all possible CarbBuilder structures (i.e. all possible combinations of the dihedral angle alternatives) for a carbohydrate sequence as frames in a single PDB file. The output is truncated if there are more than 4000 structures.
- -h Display help information.
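To check a dihedral file before passing it to -d, an entry can be split along the commas and spaces of the format shown above. This is a rough Python sketch, not CarbBuilder's own parser; `parse_dihedral_line` and the dictionary keys are hypothetical names. The meaning of the second comma-separated field is not stated in this README, so the sketch keeps it as raw text.

```python
def parse_dihedral_line(line):
    """Split one dihedral entry into its parts (illustrative only)."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) < 3:
        raise ValueError("expected at least one phi/psi pair")
    # First field: "[Res1] 1 [link position] [Res2]"
    res1, from_position, link_position, res2 = fields[0].split()
    # Remaining fields: "phi psi" pairs, preferred value first.
    pairs = []
    for field in fields[2:]:
        phi, psi = field.split()
        pairs.append((float(phi), float(psi)))
    return {
        "res1": res1,
        "from_position": from_position,
        "link_position": link_position,
        "res2": res2,
        "second_field": fields[1],  # meaning not documented here; kept as text
        "preferred": pairs[0],
        "alternatives": pairs[1:],
    }
```

Applied to the example entry, the preferred pair is (-26, -36) and the three remaining pairs are returned as alternatives.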
Examples
- CarbBuilder2.exe -i "->4)aDGlc(1->" -r 20 -o starch This will build 20 repeats of a linear homopolysaccharide and write the output to the starch.pdb file.
- CarbBuilder2.exe -i "aDMan(1->4)[aDGal(1->6)]aDGlc" -o trisacc This will build a branched trisaccharide.
- CarbBuilder2.exe -i "aDGlc3Ac" -o aDGlc3OAc This will build a glucose monosaccharide with 3-O-acetylation.
- CarbBuilder2.exe -i "->3)bDRibf(1->1)Ribol5P(O->" -r 6 -o Hib_6RU This will build 6 repeating units of the Haemophilus influenzae capsular polysaccharide. It illustrates both the linear ribitol residue and how to specify a phosphate linkage.
- CarbBuilder2.exe -i "->6)[aDNeu5Ac(2->3)bDGal(1->4)]bDGlcNAc(1->3)bDGal(1->4)bDGlc(1->" -r 6 -o strepBIII This will build 6 repeating units of the Group B Streptococcus serotype III polysaccharide. This example illustrates branching in a heteropolysaccharide, as well as how to specify linkage from the 2-position in sialic acid (aDNeu5Ac). Note that side chains are in square brackets in the CASPER format, here [aDNeu5Ac(2->3)bDGal(1->4)], and are always listed immediately before the branching residue.
CASPER format for structure representation in CarbBuilder
In the CASPER format, monosaccharide residues (e.g. aDGlc, bLGalf, bLRha6Ac, aDManNAc) are represented by (in order):
- a single lower case letter [a,b] representing the anomeric configuration
- a single upper case letter [D,L] representing the absolute configuration
- a three letter residue abbreviation, with the first letter capitalized (e.g. Glc, Gal, Man, Rha). The residue names currently supported by CarbBuilder are listed here.
- an optional 'f' to indicate a furanose residue (the default is pyranose)
- multiple optional substituents, in alphabetical order with substitution positions specified, e.g. 2Ac, 3OMe, 2NAc. Note that carboxylic acids do not have a position specified (e.g. GlcA, GalA etc.).
Linkages are described within round brackets, using "->" to indicate the direction. For example, aDGal(1->4)aDGlcOMe; aDNeuNAc(2->6)aDGal etc.
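The residue layout listed above maps onto a short regular expression. The Python sketch below is a simplification for illustration, not CarbBuilder's parser: `RESIDUE` and `parse_residue` are hypothetical names. It does not check the residue name against the supported list, leaves the substituents as one unparsed string, and rejects residues written without an anomeric prefix (such as Ribol5P).

```python
import re

RESIDUE = re.compile(
    r"^(?P<anomer>[ab])"        # anomeric configuration
    r"(?P<config>[DL])"         # absolute configuration
    r"(?P<name>[A-Z][a-z]{2})"  # three-letter residue abbreviation
    r"(?P<furanose>f?)"         # optional furanose marker
    r"(?P<substituents>.*)$"    # everything else, left unparsed
)


def parse_residue(token):
    """Split a residue token such as 'bLRha6Ac' into named parts."""
    match = RESIDUE.match(token)
    if match is None:
        raise ValueError(f"not a recognised residue token: {token!r}")
    parts = match.groupdict()
    parts["furanose"] = parts["furanose"] == "f"
    return parts
```

For instance, `parse_residue("bLGalf")` reports a beta, L-configured Gal residue in the furanose form with no substituents.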
Side chains are represented within square brackets placed immediately before the branching residue
e.g.
- aDGal(1->4)aDGlc(1->2)[aLRha(1->4)]aDGlcOMe
- aDMan(1->2)[aDMan(1->3)][aDMan(1->6)]aDMan
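Because side chains are bracketed, they can be pulled out of a sequence by tracking bracket depth. The following Python sketch shows one way to do this and to catch unbalanced brackets before running CarbBuilder; `side_chains` is a hypothetical helper, and the sketch returns only the outermost bracket groups.

```python
def side_chains(sequence):
    """Return the contents of each outermost [...] group, in order."""
    chains = []
    depth = 0
    start = None
    for i, ch in enumerate(sequence):
        if ch == "[":
            if depth == 0:
                start = i + 1  # first character inside the bracket
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched ']'")
            if depth == 0:
                chains.append(sequence[start:i])
    if depth != 0:
        raise ValueError("unmatched '['")
    return chains
```

For the second structure above, this yields the two side chains aDMan(1->3) and aDMan(1->6).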
Repeating units are represented by "open" linkages at either end:
->4)aDGlc(1->6)aDGal(1->
or, with a branch:
->4)[aDGlc(1->6)]aDGal(1->
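Since -r is ignored unless the sequence is written as a repeating unit, it can be useful to test for the open linkages at both ends before choosing a repeat count. The Python sketch below is illustrative only: the names `is_repeating_unit` and `linkages` are hypothetical, and the patterns are inferred from the examples in this README (including the "(O->" phosphate form), not from the CASPER specification.

```python
import re

OPEN_START = re.compile(r"^->\d+\)")        # e.g. "->4)" at the start
OPEN_END = re.compile(r"\((?:\d+|O)->$")    # e.g. "(1->" or "(O->" at the end
LINKAGE = re.compile(r"\((\d+)->(\d+)\)")   # closed linkage, e.g. "(1->6)"


def is_repeating_unit(sequence):
    """True if the sequence has an open linkage at both ends."""
    return bool(OPEN_START.search(sequence) and OPEN_END.search(sequence))


def linkages(sequence):
    """Return (from_position, to_position) for each closed numeric linkage."""
    return [(int(a), int(b)) for a, b in LINKAGE.findall(sequence)]
```

Both repeating-unit examples above pass `is_repeating_unit`, and each contains a single closed linkage, (1, 6).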
Some example structures:
A heptasaccharide:
bDGal(1->4)bDGlcNAc(1->2)aDMan(1->3)[bDGal(1->4)bDGlcNAc(1->2)aDMan(1->6)]aDMan
Shigella flexneri X:
->2)[aDGlc(1->3)]aLRha(1->2)aLRha(1->3)aLRha(1->3)bDGlcNAc(1->
Methyl beta-cellobioside:
bDGlc(1->4)bDGlcOMe
Xylose-containing N-glycan:
bDXyl(1->2)[aDMan(1->3)][aDMan(1->6)]bDMan(1->4)bDGlcNAc(1->4)bDGlcNAc
Repeating unit of E. coli O128 O-antigen:
->6)[aLFuc(1->2)]bDGal(1->3)bDGalNAc(1->4)aDGal(1->3)bDGalNAc(1->
Repeating unit of E. coli O105 O-antigen:
->4)aDGlcA2,3diAc(1->2)[bDRibf(1->3)]aLRha4Ac(1->3)bLRha(1->4)bLRha(1->3)bDGlcNAc6Ac(1->