The Interface Panel: Part II

Creating a new Music File

1. Select a music program. To begin with, only the DEFAULT and ADH programs will be available, but you can create other programs from these. Highlight the DEFAULT program and then enter the name you want to use for your new music file (up to 8 characters), for example MUSIC, in the box by the Clone to button. The program will add the .KAM extension. Then click the Clone to button to save the DEFAULT program under the name MUSIC.KAM. The new file name, MUSIC.KAM, will then appear in the Music File listing.

2. Select a protein sequence to load into MUSIC.KAM. You can use one of the files provided with the program, or search the data bases for a sequence that interests you. The DEFAULT.PRO file is preloaded with the amino acid sequences of the beta-globin protein of four different species: human, bat, whale and echidna. The DEFAULT data file contains the information for these sequences.

3. To read a data file, highlight it and click the Edit PRO button. The sequence and some information about the sequence will appear in a Microsoft Notepad text file. The sequence is given using the standard single-letter amino acid codes used in protein data bases. For example, the sequence for echidna beta-globin reads:

VHLSG(SEKTAVTNLW)<GH>VN(VNELGGEALGRLLVV)
Y(PWTQRF)F(ESF)GDLSS(ADAVM)<G>N(AKVKAHGAKVLTSFGDAL)
<KN>(LDNLKGTFAKLSELHCD)<KL>HVD<P>
(ENFNRLGNVLVVVLARHFSKE)FT(PEAQAAWQKLVSGVSHALA)<HK>YH

Note that the file names all have the extension .PRO. When you add new files to your data base, you should save them to your program directory using this extension.

The marks ( ), [ ], and < > are used as flags to designate regions of alpha helix, beta sheet and turns respectively. These are the parts of the sequence that will play if you select Alp, Bta, or Trn in the Structure Selection sections of the Music Panels. However, these markings can also be used to enclose other parts of a sequence that interest you. Information on the location of alpha, beta and turn regions of many proteins is included in the protein data bases.
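As an illustration of how these flags partition a sequence, here is a minimal Python sketch that collects the residues enclosed by each kind of marker. It is not part of the BioSon program; the function name flagged_regions and the dictionary FLAG_PATTERNS are made up for this example, and the sketch assumes that flags are never nested.

```python
import re

# Marker pairs as described in the text:
# ( ) alpha helix, [ ] beta sheet, < > turn.
FLAG_PATTERNS = {
    "alpha": re.compile(r"\(([^)]*)\)"),
    "beta": re.compile(r"\[([^\]]*)\]"),
    "turn": re.compile(r"<([^>]*)>"),
}

def flagged_regions(sequence):
    """Return the residues enclosed by each kind of flag, in order of appearance."""
    # Remove spaces and line breaks so a region that wraps onto the next line stays whole.
    compact = "".join(sequence.split())
    return {name: pattern.findall(compact) for name, pattern in FLAG_PATTERNS.items()}

# First line of the echidna beta-globin sequence shown above.
fragment = "VHLSG(SEKTAVTNLW)<GH>VN(VNELGGEALGRLLVV)"
regions = flagged_regions(fragment)
print(regions["alpha"])  # ['SEKTAVTNLW', 'VNELGGEALGRLLVV']
print(regions["turn"])   # ['GH']
print(regions["beta"])   # []
```

The residues outside any marker (VHLSG and VN in this fragment) belong to none of the three groups.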

4. To load a protein sequence into the program, one of the sequences must be marked with a tilde (~) at each end, for example:

~VHLSG(SEKTAVTNLW)<GH>VN(VNELGGEALGRLLVV)
Y(PWTQRF)F(ESF)GDLSS(ADAVM)<G>N(AKVKAHGAKVLTSFGDAL)
<KN>(LDNLKGTFAKLSELHCD)<KL>HVD<P>
(ENFNRLGNVLVVVLARHFSKE)FT(PEAQAAWQKLVSGVSHALA)<HK>YH~

Only the parts of the text file included between the beginning and ending ~ marks are loaded into the Music program; all other sequences and any comments you might want to include with your sequence data will be ignored.

Note: if you import a file from one of the genetics data bases, be sure to check for and remove stray ~ marks. Other marks can be ignored; ( ), [ ] and < > marks are read only when they are between two ~ marks. In addition, if you wish to include other non-text markings within the sequence for your own information, you may do so: the program will read only letters and the structural markings noted above.

After you have marked a sequence with the ~, enter a carriage return after the tilde, then save and exit the text file. You will be returned to the Interface Panel.
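The loading rule in step 4 can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the program's own code: the function names marked_sequence and residues_and_flags are hypothetical, and treating any number of ~ marks other than two as an error is an assumption made here to catch stray marks in imported files.

```python
def marked_sequence(text):
    """Return the text between the two ~ marks.

    Raises ValueError unless the file contains exactly one pair of marks
    (an assumption of this sketch, used to detect stray ~ characters).
    """
    count = text.count("~")
    if count != 2:
        raise ValueError(f"expected exactly 2 '~' marks, found {count}")
    start = text.index("~") + 1
    end = text.index("~", start)
    return text[start:end]

def residues_and_flags(marked):
    """Keep only letters and the structural markers; drop digits, spaces and other marks."""
    return "".join(
        ch for ch in marked
        if (ch.isascii() and ch.isalpha()) or ch in "()[]<>"
    )

# A small made-up file: a comment, a marked fragment with a position number, a trailing note.
text = "Echidna beta-globin (fragment)\n~VHLSG(SEKTAVTNLW)<GH>VN  20\n~\nnotes after the sequence"
print(residues_and_flags(marked_sequence(text)))  # VHLSG(SEKTAVTNLW)<GH>VN
```

Note that any letters typed between the ~ marks would be kept as residues, which is why comments belong outside the marked region.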

5. To load your marked data file into the MUSIC program, decide which of the four data sets to load it into and click on the circle beside the set you have selected. Set 1 is the default selection. The echidna sequence marked above is preloaded into Set 4; however you can load it again, for practice. Select Set 4 and then click on the Update button.

You will get a program message telling you that the file has been uploaded. Near the bottom of the message, it should say:

146 amino acids translated. 0 stops; 30 flags.

This message tells you that the sequence data has been correctly entered and that the program has detected 30 flag markings. When you have read this to verify that your data has been loaded, close the message file by clicking on the upper right "X." This will take you to the Music Panels.
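You can check these numbers yourself before loading a file. The Python sketch below counts the letters and the flag markers between the ~ marks of the echidna sequence. It assumes that every opening or closing marker counts as one flag, which agrees with the 30 flags reported for this sequence; the function name count_residues_and_flags is made up for this example, and the count of stops is not reproduced because the text does not say how stops are detected.

```python
# The marked echidna beta-globin sequence from step 4.
ECHIDNA = """~VHLSG(SEKTAVTNLW)<GH>VN(VNELGGEALGRLLVV)
Y(PWTQRF)F(ESF)GDLSS(ADAVM)<G>N(AKVKAHGAKVLTSFGDAL)
<KN>(LDNLKGTFAKLSELHCD)<KL>HVD<P>
(ENFNRLGNVLVVVLARHFSKE)FT(PEAQAAWQKLVSGVSHALA)<HK>YH~"""

def count_residues_and_flags(marked_text):
    """Count amino acid letters and flag markers between the two ~ marks."""
    body = marked_text.split("~")[1]  # the text between the first and second ~
    residues = sum(1 for ch in body if ch.isascii() and ch.isalpha())
    flags = sum(1 for ch in body if ch in "()[]<>")
    return residues, flags

print(count_residues_and_flags(ECHIDNA))  # (146, 30)
```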

6. If you wish to load several different sequences into the same music program, either from the same or different data files, you must repeat the operations above for each protein sequence. If you have several different protein sequences stored in a single data file, you may load all of them (up to 4) from this file; however, you will need to edit the data file each time to move the ~ marks to the next sequence, and then load each sequence separately.

Setting up new data files

The data file below (SWISS-PROT: P03999) is an edited version of information from the Swiss-Prot <http://expasy.hcuge.ch/> data base. The protein is one of the human retinal pigments for color vision: the blue cone pigment. Cut and paste the entire file below into a Notepad text file and save it as BLUECONE.PRO in the BioSon Program directory.

Activate the BioSon program. In the Interface Panel, highlight your new BLUECONE.PRO file and click on the EDIT button. The file you have saved will come up as a Microsoft Notepad text file. The file does not contain information on alpha, beta, and turn regions of the protein, but does identify other structural features. This is a transmembrane protein that folds back and forth as it crosses and recrosses the membrane. Transmembrane, extracellular and cytoplasmic regions of the protein are identified in the FT lines of the data file and may be flagged using the alpha ( ), beta [ ] and turn < > flags. Insert the flag markers into the sequence by placing the opening marker before the first amino acid and the matching closing marker after the last amino acid of each region as identified in the data file; the positions given there are counted from the first amino acid of the sequence. Then mark the whole sequence for loading with a tilde (~) at each end, as described in step 4 above. Write yourself a note in the file, outside the ~ marks, to identify which flag markers you are using for which of the structural domains of the protein. When you have finished inserting the flag markers, resave the file and exit to the Interface Panel.

You can now CLONE a new music file called BLUECONE from the DEFAULT file. Load your new BLUECONE.PRO data file into the newly created (cloned) BLUECONE.KAM music file. You are now ready to begin composing your new piece.

SWISS-PROT: P03999
ID OPSB_HUMAN STANDARD; PRT; 348 AA.
AC P03999; Q13877;
DE BLUE-SENSITIVE OPSIN (BLUE CONE PHOTORECEPTOR PIGMENT).
GN BCP.
OS HOMO SAPIENS (HUMAN).
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; PRIMATES.
CC -!- FUNCTION: VISUAL PIGMENTS ARE THE LIGHT-ABSORBING MOLECULES THAT
CC MEDIATE VISION. THEY CONSIST OF AN APOPROTEIN, OPSIN, COVALENTLY
CC LINKED TO CIS-RETINAL.
CC -!- SUBCELLULAR LOCATION: INTEGRAL MEMBRANE PROTEIN.
CC -!- TISSUE SPECIFICITY: THE THREE COLOR PIGMENTS ARE FOUND IN THE CONE
CC PHOTORECEPTOR CELLS.
CC -!- PTM: SOME OR ALL OF THE CARBOXYL-TERMINAL SER OR THR RESIDUES MAY
CC BE PHOSPHORYLATED.
CC -!- DISEASE: DEFECTS IN BCP ARE THE CAUSE OF TRITAN COLOR BLINDNESS
CC (TRITANOPIA).
CC -!- THIS OPSIN HAS AN ABSORPTION MAXIMA AT 420 NM.
CC -!- SIMILARITY: BELONGS TO FAMILY 1 OF G-PROTEIN COUPLED RECEPTORS.
CC BELONGS TO THE OPSIN SUBFAMILY.
KW PHOTORECEPTOR; RETINAL PROTEIN; TRANSMEMBRANE; GLYCOPROTEIN; VISION;
KW PHOSPHORYLATION; G-PROTEIN COUPLED RECEPTOR; DISEASE MUTATION.
FT DOMAIN 1 33 EXTRACELLULAR.
FT TRANSMEM 34 58 1 (POTENTIAL).
FT DOMAIN 59 70 CYTOPLASMIC.
FT TRANSMEM 71 96 2 (POTENTIAL).
FT DOMAIN 97 110 EXTRACELLULAR.
FT TRANSMEM 111 130 3 (POTENTIAL).
FT DOMAIN 131 149 CYTOPLASMIC.
FT TRANSMEM 150 173 4 (POTENTIAL).
FT DOMAIN 174 199 EXTRACELLULAR.
FT TRANSMEM 200 227 5 (POTENTIAL).
FT DOMAIN 228 249 CYTOPLASMIC.
FT TRANSMEM 250 273 6 (POTENTIAL).
FT DOMAIN 274 281 EXTRACELLULAR.
FT TRANSMEM 282 306 7 (POTENTIAL).
FT DOMAIN 307 348 CYTOPLASMIC.
FT CARBOHYD 14 14 PROBABLE.
FT DISULFID 107 184 POTENTIAL.
FT BINDING 293 293 RETINAL CHROMOPHORE.

SQ SEQUENCE 348 AA; 39135 MW; 37CCD502 CRC32;
MRKMSEEEFY LFKNISSVGP WDGPQYHIAP VWAFYLQAAF MGTVFLIGFP LNAMVLVATL
RYKKLRQPLN YILVNVSFGG FLLCIFSVFP VFVASCNGYF VFGRHVCALE GFLGTVAGLV
TGWSLAFLAF ERYIVICKPF GNFRFSSKHA LTVVLATWTI GIGVSIPPFF GWSRFIPEGL
QCSCGPDWYT VGTKYRSESY TWFLFIFCFI VPLSLICFSY TQLLRALKAV AAQQQESATT
QKAEREVSRM VVVMVGSFCV CYVPYAAFAM YMVNNRNHGL DLRLVTIPSF FSKSACIYNP
IIYCFMNKQF QACIMKMVCG KAMTDESDTC SSQKTEVSTV SSTQVGPN
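Inserting the flag markers by hand means counting residues, which is easy to get wrong in a 348-residue sequence. The Python sketch below shows one way the positions in the FT lines above could be turned into a flagged, ~-marked sequence. It is an illustration only and not part of BioSon. The assignment of ( ) to transmembrane, [ ] to extracellular and < > to cytoplasmic regions is an arbitrary choice made for this example, the names MARKERS, parse_regions and flag_sequence are hypothetical, and the regions are assumed not to overlap.

```python
# Hypothetical choice of flag markers for each structural domain.
MARKERS = {"TRANSMEM": "()", "EXTRACELLULAR": "[]", "CYTOPLASMIC": "<>"}

def parse_regions(ft_lines):
    """Read (start, end, kind) from FT TRANSMEM and FT DOMAIN lines; positions are 1-based."""
    regions = []
    for line in ft_lines:
        fields = line.split()
        if len(fields) < 4 or fields[0] != "FT":
            continue
        if fields[1] == "TRANSMEM":
            kind = "TRANSMEM"
        elif fields[1] == "DOMAIN" and len(fields) > 4:
            kind = fields[4].rstrip(".")  # e.g. "EXTRACELLULAR." -> "EXTRACELLULAR"
        else:
            continue  # CARBOHYD, DISULFID, BINDING and similar lines are skipped
        if kind in MARKERS:
            regions.append((int(fields[2]), int(fields[3]), kind))
    return regions

def flag_sequence(sequence, regions):
    """Insert the markers around each region and wrap the result in ~ marks."""
    residues = "".join(ch for ch in sequence if ch.isalpha())  # drop the spaces between blocks
    opens = {start: MARKERS[kind][0] for start, end, kind in regions}
    closes = {end: MARKERS[kind][1] for start, end, kind in regions}
    parts = []
    for pos, residue in enumerate(residues, start=1):
        parts.append(opens.get(pos, "") + residue + closes.get(pos, ""))
    return "~" + "".join(parts) + "~"

# The first two FT lines and the first 60 residues of the data file above.
ft_lines = [
    "FT DOMAIN 1 33 EXTRACELLULAR.",
    "FT TRANSMEM 34 58 1 (POTENTIAL).",
]
first_line = "MRKMSEEEFY LFKNISSVGP WDGPQYHIAP VWAFYLQAAF MGTVFLIGFP LNAMVLVATL"
print(flag_sequence(first_line, parse_regions(ft_lines)))
```

Passing all of the FT lines and the full sequence produces the complete flagged text, which can then be pasted into BLUECONE.PRO in place of the unflagged sequence, followed by a carriage return after the closing ~.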