Thursday, 04 December 2014

OPEN BABEL APPLICATION TUTORIAL


The obabel command line program converts chemical objects (currently molecules or reactions) from one file format to another. The Open Babel graphical user interface (GUI) is an alternative to using the command line and has the same capabilities. Since Open Babel 2.3, the GUI is available cross-platform on Windows, Linux and MacOSX. On Windows, you can find it in the Start Menu in the Open Babel folder; on Linux and MacOSX, the GUI can be started with the obgui command.
Since the functionality of the GUI mirrors that of obabel, you should consult the previous chapter to learn about available features and how to use them. This chapter describes the general use of the GUI and then focuses on features that are specific to the GUI.

Basic operation
Although the GUI presents many options, the basic operation is straightforward:
  • Select the type of the input file from the dropdown list.
  • Click the “...” button and select the file. Its contents are displayed in the textbox below.
  • Choose the output format and file in a similar way. You can merely display the output without saving it by not selecting an output file or by checking “Output below only..”.
  • Click the “Convert” button.
The message window below the button gives the number of molecules converted, and the contents of the output file are displayed.
By default, all the molecules in an input file are converted if the output format allows multiple molecules.
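
For readers who want to script the same conversion, the GUI steps above correspond roughly to a single obabel command. The following is a minimal sketch that only assembles the argument list and prints it; the helper name convert_command and the file names input.sdf and output.smi are hypothetical, chosen for illustration.

```python
import shlex


def convert_command(infile, in_format, out_format, outfile=None):
    """Build an obabel argument list for a single conversion.

    Leaving outfile as None mirrors the display-only case: obabel then
    writes the result to standard output instead of to a file.
    """
    cmd = ["obabel", f"-i{in_format}", infile, f"-o{out_format}"]
    if outfile is not None:
        cmd += ["-O", outfile]
    return cmd


# Hypothetical file names, used only for illustration.
print(shlex.join(convert_command("input.sdf", "sdf", "smi", "output.smi")))
```

The printed line can be pasted into a terminal, or the list can be handed to subprocess.run if obabel is installed and on the PATH.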



Options
The options in the middle are those appropriate for the type of chemical object being converted (molecule or reaction) and the input and output formats. They are derived from the description text that is displayed with the -Hxxx option in the command line interface and with the “Format info” buttons here. You can switch off the display of any of the various types of option using the View menu if the screen is getting too cluttered.

Multiple input files
You can select multiple input files in the input file dialog in the normal way (for example, using the Control key in Windows). In the input filename box, each filename is displayed relative to the path shown just above the box, which is the path of the first file. You can display any of the files by moving the highlight with Tab/Shift Tab, Page Up/Down, the mouse wheel, or by double clicking.
Selecting one or more new file names normally removes those already present, but they can instead be appended by holding the Control key down when leaving the file selection dialog.
Files can also be dragged and dropped (e.g. from Windows Explorer). A dropped file is added to the list when the Control key is pressed and replaces the existing files when it is not.
Normally each file is converted according to its extension, and the input files do not all have to be of the same type. If you want to use non-standard file names, set the checkbox “Use this format for all input files...”.
If you want to combine multiple molecules (from one or more files) into a single molecule with disconnected parts, use the option “Join all input molecules...”.

Wildcards in filenames
When input filenames are typed in directly, any of them can contain the wildcard characters * and ?. Typing Enter will replace these by a list of the matching files. The wildcarded names can be restored by typing Enter while holding down the Shift key. The original or the expanded versions will behave the same when the “Convert” button is pressed.
By including the wildcard * in both the input and output filenames you can carry out batch conversion. Suppose there are three files: first.smi, second.smi and third.smi. Using *.smi as the input filename and *.mol as the output filename would produce three files: first.mol, second.mol and third.mol. If the output filename were NEW_*.mol, the output files would be NEW_first.mol, NEW_second.mol and NEW_third.mol.
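
The naming rule for batch conversion can be illustrated with a few lines of Python. This sketch only mimics the rule as described above (the * in the output name is replaced by the input file's base name); it is not Open Babel's own implementation, and the function name batch_output_name is made up for the example.

```python
from pathlib import Path


def batch_output_name(input_name, output_pattern):
    """Replace '*' in the output pattern with the input file's base name."""
    stem = Path(input_name).stem  # "first.smi" -> "first"
    return output_pattern.replace("*", stem)


for name in ["first.smi", "second.smi", "third.smi"]:
    print(name, "->", batch_output_name(name, "NEW_*.mol"))
# first.smi -> NEW_first.mol
# second.smi -> NEW_second.mol
# third.smi -> NEW_third.mol
```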

Local input
By checking the “Input below...” checkbox you can type the input text directly. The text box changes colour to remind you that it is this text and not the contents of any files that will be converted.

Output file
The output file name can be fully specified with a path, but if it is not, then it is considered to be relative to the input file path.

Graphical display
The chemical structures being converted can be displayed (as SVG) in an external program. By default this is Firefox, but it can be changed from an item on the View menu (for instance, Opera and Chrome work fine). When “Display in firefox” (under the output file name) is checked, the structures will be shown in a new Firefox tab. With multiple molecules the display can be zoomed (mouse wheel) and panned (dragging with the mouse button depressed). Up to 100 molecules are easily handled, but with more the display may be slow to manipulate. It may also be slow to generate, especially if 2D atom coordinates have to be calculated (e.g. from SMILES). A new Firefox tab is opened each time Convert is pressed.

Using a restricted set of formats
It is likely that you will only be interested in a subset of the large range of formats handled by Open Babel. You can restrict the choice offered in the dropdown boxes, which makes routine selection easier. Clicking “Select set of formats” on the View menu lets you choose which formats are displayed. Subsequently, clicking “Use restricted set of formats” on the View menu toggles this facility on and off.
Using a restricted set also works around an irritating bug in the Windows version. In the file Open and Save dialogs, the files displayed can be filtered by the current format, All Chemical Formats, or All Files. The All Chemical Formats filter displays only the first 30 possible formats (alphabetically). The All Files filter does display all files, and the conversion processes are unaffected.

Other features
Most of the interface parameters, such as the selected format and the window size and position, are remembered between sessions.
Using the View menu, the input and output text boxes can be set not to wrap the text. At present you have to restart the program for this to take effect.
The message box at the top of the output text window receives program output on error and audit logging, and some progress reports. It can be expanded by dragging down the divider between the windows.

Example files
In the Windows distribution, there are three chemical files included to try out:
  • serotonin.mol, which has 3D atom coordinates
  • oxamide.cml, which is 2D and has a large number of properties that will be seen when converting to SDF
  • FourSmallMols.cml, which (unsurprisingly) contains four molecules with no atom coordinates and can be used to illustrate the handling of multiple molecules:
Setting the output format to SMI (which is easy to see), you can convert only the second and third molecules by entering 2 and 3 in the appropriate option boxes. Or convert only molecules with C-O single bonds by entering CO in the SMARTS option box.
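
As a rough command-line counterpart to the two GUI examples above, the sketch below builds the corresponding obabel argument lists. It assumes that the GUI option boxes map to the obabel options -f and -l (first and last molecule to convert) and -s (SMARTS filter), and that FourSmallMols.cml is in the current directory; nothing is executed.

```python
import shlex

# Common part: read FourSmallMols.cml and write SMILES to standard output.
base = ["obabel", "FourSmallMols.cml", "-osmi"]

# Convert only the second and third molecules.
second_and_third = base + ["-f", "2", "-l", "3"]

# Convert only molecules matching the SMARTS pattern CO.
with_co_bond = base + ["-s", "CO"]

for cmd in (second_and_third, with_co_bond):
    print(shlex.join(cmd))
```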

The full tutorial can be found here

Filtering Structure
Setup
We are going to use a dataset of 16 benzodiazepines. These all share the following substructure (image from Wikipedia):
  • Create a folder on the Desktop called Work and save benzodiazepines.sdf there
  • Set up a conversion from SDF to SMI and set benzodiazepines.sdf as the input file
  • Tick Display in Firefox
  • Click CONVERT
Remove duplicates
If you look carefully at the depictions of the first and last molecules (top left and bottom right) you will notice that they depict the same molecule.
  1. Look at the SMILES strings for the first and last molecules. If the two molecules are actually the same, why are the two SMILES strings different? (Hint: try using CAN - canonical SMILES instead of SMI.)
We can remove duplicates based on the InChI (for example):
  • Tick the box beside remove duplicates by descriptor and enter inchi as the descriptor
  • Click CONVERT
Duplicates can be removed based on any of the available descriptors. The full list can be found in the menu under Plugins > descriptors.
  1. Are any of the other descriptors useful for removing duplicates?
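
Outside the GUI, the duplicate removal above can be sketched with the obabel option --unique, which takes a descriptor name. The helper below is hypothetical and only builds the argument list; it assumes benzodiazepines.sdf is in the current directory.

```python
import shlex


def unique_command(descriptor, infile="benzodiazepines.sdf"):
    """Build an obabel command that drops duplicates by the given descriptor."""
    return ["obabel", infile, "-osmi", "--unique", descriptor]


# Same choice as in the GUI step above: compare molecules by their InChI.
print(shlex.join(unique_command("inchi")))
```

Passing a different descriptor name to unique_command is one way to experiment with the question above.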
Filtering by substructure
  1. How many of the molecules contain the following substructure?
The SMILES string for this molecule is c1ccccc1F. This is also a valid SMARTS string.
  1. Use the SMARTSviewer at the ZBH Center for Bioinformatics, University of Hamburg, to verify the meaning of the SMARTS string c1ccccc1F.
Let’s filter the molecules using this substructure:
  • In the Options section, enter c1ccccc1F into the box labeled Convert only if match SMARTS or mols in file
  • Click CONVERT.
  1. How many structures are matched?
  • Now find all those that are not matched by preceding the SMARTS filter with a tilde ~, i.e. ~c1ccccc1F.
  • Click CONVERT.
  1. How many structures are not matched?
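
The two SMARTS filters above can also be written as obabel commands. This is a minimal sketch, assuming the GUI box corresponds to the obabel option -s and that a leading tilde inverts the match there as well; the function name smarts_filter_command is made up for the example.

```python
import shlex


def smarts_filter_command(pattern, invert=False, infile="benzodiazepines.sdf"):
    """Build an obabel command that keeps (or, if invert, drops) SMARTS matches."""
    smarts = ("~" if invert else "") + pattern
    return ["obabel", infile, "-osmi", "-s", smarts]


matched = smarts_filter_command("c1ccccc1F")
not_matched = smarts_filter_command("c1ccccc1F", invert=True)

# shlex.join quotes the tilde so that a shell does not expand it.
for cmd in (matched, not_matched):
    print(shlex.join(cmd))
```

obabel also offers -v for "not matching", which could be used in place of the tilde form.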
Filter by descriptor
As discussed above, Open Babel provides several descriptors. Here we will focus on the molecular weight, MW.
To begin with, let’s show the molecular weights in the depiction:
  • Clear the existing title by entering a single space into the box Add or replace molecule title
  • Set the title to the molecular weight by entering MW into the box Append properties or descriptors in list to title
  • Click CONVERT
You should see the molecular weight below each molecule in the depiction. Notice also that the SMILES output has the molecular weight beside each molecule. This could be useful for preparing a spreadsheet with the SMILES string and various calculated properties.
Now let’s sort by molecular weight:
  • Enter MW into the box Sort by descriptor and click CONVERT
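
For comparison, the title, append and sort steps above might look like this on the command line. The sketch assumes the three GUI boxes map to the obabel options --title, --append and --sort; the single-space title simply mirrors the GUI step and has not been checked against every Open Babel version.

```python
import shlex

cmd = [
    "obabel", "benzodiazepines.sdf", "-osmi",
    "--title", " ",    # replace the existing title with a single space
    "--append", "MW",  # append the molecular weight to the title
    "--sort", "MW",    # order the output by molecular weight
]

# The result is a two-column listing (SMILES, then MW) that a
# spreadsheet can import.
print(shlex.join(cmd))
```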
Finally, here’s how to filter based on molecular weight. Note that none of the preceding steps are necessary for the filter to work. We will convert all those molecules with molecular weights between 300 and 320 (in the following expression & signifies Boolean AND):
  • Enter MW>300 & MW<320 into the box Filter convert only when tests are true and click CONVERT
  1. If | (the pipe symbol, beside Z on the UK keyboard) signifies Boolean OR, how would you instead convert all those molecules that do not have molecular weights between 300 and 320?
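
The molecular weight window used above can be passed to obabel through its --filter option. In this sketch the helper mw_window_filter is hypothetical; it just assembles the same expression that was typed into the GUI box.

```python
import shlex


def mw_window_filter(low, high):
    """Return a filter expression selecting low < MW < high."""
    return f"MW>{low} & MW<{high}"


cmd = ["obabel", "benzodiazepines.sdf", "-osmi",
       "--filter", mw_window_filter(300, 320)]

# Quoting matters here: > and & are special characters in most shells,
# so shlex.join wraps the whole expression in quotes.
print(shlex.join(cmd))
```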
