For this, the application WebLogo 3 (Crooks et al., 2004; http://weblogo.threeplusone.com) was used. Sequence logos have been useful in visualizing patterns in aligned sequence motifs (Schneider & Stephens, 1990) and have indeed been used to analyse Tat motifs (see e.g. Bendtsen et al., 2005). We used this to compare the Tat motifs of haloarchaeal Tat substrates with that of the consensus E. coli Tat motif
(S/TRRxFLK). Signal peptide-containing sequences were extracted from genomes of E. coli and three fully sequenced haloarchaea: H. marismortui, N. pharaonis, and H. salinarum. The datasets obtained (see Supporting Information, Table S1) were filtered as outlined in Materials and methods to minimize the number of false-positive hits. Current information on prokaryotic signal peptides in general, and the Tat system more specifically, is mostly derived from bacterial systems, and as such, our searches may have been biased towards bacterial-like signal peptides. This, and our additional filtering, has most likely led to the absence of some genuine Tat signal peptides. Indeed, some proteins that are known to be Tat substrates in E. coli are missing from our dataset, including FdnH, HyaA, and HybO, all of which have been shown experimentally to be Tat substrates (Hatzixanthis et al., 2003; Berks et al., 2005).
However, these three contain C-terminal transmembrane helices, which is why our filtering steps rejected them. Nevertheless, only a fairly small proportion of Tat substrates have such additional
membrane-spanning domains, and we think that this approach has also resulted in datasets with very few or no false-positive proteins. The twin-arginine motifs obtained were aligned manually and used to generate sequence logos (Fig. 1). As can be observed from the top panel, our method indeed led to a motif with the consensus SRRxFLK, as observed before (Berks, 1996). The twin-arginine motifs in haloarchaea were similar, but with a number of notable differences. Firstly, the dominance of Phe in position 5 is less pronounced than in E. coli; Val is found in that position at a very similar frequency. Secondly, Leu in position 6 appears to be far more frequent in haloarchaeal Tat motifs than in the E. coli Tat motif. Finally, the Lys in position 7 is less common in haloarchaea than in E. coli. Some of these differences may be attributable to the overall differences in amino acid composition between halophilic and nonhalophilic proteins. For instance, haloarchaea contain, on average, fewer large hydrophobic residues such as Phe, as well as a relatively low percentage of lysine residues, as compared with bacteria such as E. coli or Bacillus subtilis (Bolhuis et al., 2007). In this respect, the prominence of Leu in position 6 is particularly interesting, as this residue is, like Phe, less frequent in haloarchaeal proteins.
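The quantities a sequence logo displays, and the per-position comparisons made above, can be illustrated with a minimal Python sketch. It is not the WebLogo 3 implementation: it computes plain residue frequencies and the uncorrected information content per position in the sense of Schneider & Stephens (1990), and it omits the small-sample correction that logo software typically applies. The four motifs in `toy_motifs` are invented for illustration and are not taken from Table S1; the function names are likewise hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids; sets the maximum information per position.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(motifs):
    """Return one {residue: relative frequency} dict per aligned position."""
    length = len(motifs[0])
    if any(len(m) != length for m in motifs):
        raise ValueError("all motifs must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(m[i] for m in motifs)
        total = sum(counts.values())
        freqs.append({aa: n / total for aa, n in counts.items()})
    return freqs


def information_content(freqs):
    """Uncorrected information content (bits) per position: log2(20) - H."""
    max_bits = math.log2(len(AMINO_ACIDS))
    bits = []
    for column in freqs:
        entropy = -sum(p * math.log2(p) for p in column.values())
        bits.append(max_bits - entropy)
    return bits


# Hypothetical 7-residue twin-arginine motifs (illustrative only).
toy_motifs = ["SRRQFLK", "TRRSFLK", "SRRDFLK", "SRRAVLG"]

freqs = column_frequencies(toy_motifs)
bits = information_content(freqs)

for pos, (column, b) in enumerate(zip(freqs, bits), start=1):
    top = max(column, key=column.get)
    print(f"position {pos}: most frequent {top} ({column[top]:.2f}), {b:.2f} bits")
```

Running the same two functions on an E. coli set and a haloarchaeal set would allow the frequencies of, for example, Phe and Val at position 5 to be compared directly between the two.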