Tuesday, January 11, 2011

SMILES......




SMILES





Introduction


The SMILES™ Toolkit is a chemical information programming library that supports a number of utility objects (streams, sequences, paths, substructs). It uses the most current SMILES™ language, providing full support for organic, inorganic, isotopic, and general (not limited to tetrahedral) chiral specifications, including partially specified chirality.



Objects supported by this Toolkit include:


Atom - object representing an atom in a molecule
Bond - object representing a bond in a molecule
Cycle - object representing a ring in a molecule
Integer - object representing an integer
Molecule - object representing a molecule
Real - object representing a real (floating-point) number
Sequence - object holding other objects in a particular order
Stream - object used to enumerate constituents of another object
String - object representing a string
Substruct - object representing a substructure


This Toolkit can be used for:


1. Molecular analysis and manipulation
2. Parsing of SMILES™
3. Generation of unique SMILES™ and unique isomeric SMILES™
4. SSSR (Smallest Set of Smallest Rings) analysis
5. Generic functionality (objects, error handling)
6. Serving as the prerequisite for all other toolkits





SMILES - A Simplified Chemical Language



SMILES (Simplified Molecular Input Line Entry System) is a line notation (a typographical method using printable characters) for entering and representing molecules and reactions. Some examples are:




SMILES contains the same information as might be found in an extended connection table. The primary reason SMILES is more useful than a connection table is that it is a linguistic construct, rather than a computer data structure. SMILES is a true language, albeit with a simple vocabulary (atom and bond symbols) and only a few grammar rules. SMILES representations of structure can in turn be used as "words" in the vocabulary of other languages designed for storage of chemical information (information about chemicals) and chemical intelligence (information about chemistry).
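To make the connection-table comparison concrete, here is a minimal sketch in Python that expands a SMILES string into a list of atoms and a list of bonds. It is not the Toolkit's parser. The function name smiles_to_table is hypothetical, and the sketch assumes a small subset of the language: organic-subset atoms written without brackets, the bond symbols -, = and #, branches in parentheses, and single-digit ring closures. Aromatic atoms, bracket atoms, charges, isotopes and chirality are left out, and implicit hydrogens are not listed.

```python
import re

# Tokens of the assumed subset: two-letter halogens first, then one-letter
# atoms, bond symbols, branch parentheses, and single-digit ring closures.
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[-=#]|[()]|\d")
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def smiles_to_table(smiles):
    """Return (atoms, bonds) for a small SMILES subset.

    atoms is a list of element symbols; bonds is a list of
    (atom_index, atom_index, bond_order) tuples.
    """
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError("unsupported SMILES syntax: " + smiles)

    atoms, bonds = [], []
    prev = None   # index of the atom the next atom will bond to
    order = 1     # bond order waiting to be used
    stack = []    # atoms at which a branch was opened
    rings = {}    # open ring-closure digit -> (atom index, bond order)

    for tok in tokens:
        if tok in BOND_ORDER:
            order = BOND_ORDER[tok]
        elif tok == "(":
            stack.append(prev)
        elif tok == ")":
            prev = stack.pop()
        elif tok.isdigit():
            if tok in rings:
                # Second occurrence of the digit closes the ring.
                start, start_order = rings.pop(tok)
                bonds.append((start, prev, max(order, start_order)))
            else:
                rings[tok] = (prev, order)
            order = 1
        else:
            atoms.append(tok)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, order))
            prev = idx
            order = 1

    if rings or stack:
        raise ValueError("unclosed ring or branch in: " + smiles)
    return atoms, bonds


# Acetic acid: four heavy atoms, one double bond inside a branch.
atoms, bonds = smiles_to_table("CC(=O)O")
print(atoms)
print(bonds)
```

The seven characters of CC(=O)O expand into four atom records and three bond records, which is the same information a connection table would store explicitly.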


Part of the power of SMILES is that unique SMILES exist. With standard SMILES, the name of a molecule is synonymous with its structure; with unique SMILES, the name is universal. Anyone in the world who uses unique SMILES to name a molecule will choose the exact same name.

One other important property of SMILES is that it is quite compact compared to most other methods of representing structure. A typical SMILES will take 50% to 70% less space than an equivalent connection table, even binary connection tables. For example, a database of 23,137 structures, with an average of 20 atoms per structure, uses only 1.6 bytes per atom when represented with SMILES. In addition, ordinary compression of SMILES is extremely effective. The same database cited above was reduced to 27% of its original size by Ziv-Lempel compression (i.e. 0.42 bytes per atom).
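The bytes-per-atom measurement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which Ziv-Lempel variant was used, so the sketch substitutes zlib (DEFLATE, which is built on LZ77); the function name size_report and the sample list are hypothetical; and heavy atoms are counted with a simple pattern that covers only the organic subset. The sample is tiny and repeated many times, so its numbers are not comparable to the 23,137-structure database cited in the text.

```python
import re
import zlib

# Heavy-atom symbols of the organic subset, including aromatic lowercase forms.
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def size_report(smiles_list):
    """Measure storage per heavy atom before and after compression."""
    raw = "\n".join(smiles_list).encode("ascii")
    packed = zlib.compress(raw, 9)
    atoms = sum(len(ATOM.findall(s)) for s in smiles_list)
    return {
        "raw_bytes_per_atom": len(raw) / atoms,
        "packed_bytes_per_atom": len(packed) / atoms,
        "packed_fraction": len(packed) / len(raw),
    }


# Hypothetical toy data set: five small molecules repeated 200 times.
sample = ["CCO", "CC(=O)O", "c1ccccc1", "C1CCCCC1", "CC(C)Cl"] * 200
report = size_report(sample)
for name, value in report.items():
    print(name, round(value, 3))
```

Running the same function over a real collection of SMILES, one per line, would give figures that can be compared with the ones quoted above.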






These properties open many doors to the chemical information programmer. Examples of uses for SMILES are:



a. Keys for database access
b. Mechanism for researchers to exchange chemical information
c. Entry system for chemical data
d. Part of languages for artificial intelligence or expert systems in chemistry



[Table: Structure | SMILES notation (entries not preserved)]





















Other examples










