Can Peptides Replace DNA and Contribute to the Encoded Compound Library?

Beyond DNA, peptides, as polymers of amino acids, have great potential for information storage (Figure 1A). Recently, Stephen L. Buchwald and Bradley L. Pentelute of the Massachusetts Institute of Technology (MIT) published a paper in Science, "Abiotic peptides as carriers of information for the encoding of small-molecule library synthesis," which describes the construction of peptide-encoded libraries (PELs) (Figure 1B).

Through careful design, the encoding peptides achieve high information density and chemical stability. Their sequences are optimized to fine-tune polarity and to simplify sequencing, enabling high-fidelity decoding by tandem mass spectrometry. The peptide “barcode” is chemically stable and tolerates a wide range of reaction conditions, including acidic conditions and transition-metal catalysis that are currently incompatible with DNA-encoded libraries (DELs), which broadens the chemistry available for building small molecules. To verify this, the authors construct PELs containing tens of thousands of drug-like small molecules and identify small molecules with high affinity for carbonic anhydrase IX, BRD4, and MDM2.

Figure 1. The potential of peptides to store information, PELs construction, and small molecule hit screening.

The authors identify a set of 16 amino acid monomers as information units (Figure 2A); both natural and unnatural amino acids can easily be incorporated by chemical synthesis when their side chains are suitably protected. The 11-monomer peptide tag shown in Figure 2A includes four spacer monomers (green triangles and blue diamonds) that appear at fixed positions and act as structural constraints, increasing sequencing fidelity during compound screening. Each of the eight coding sites (one is a spacer) can be occupied by any of the 16 monomers (a hexadecimal code), giving a theoretical 16^8, or about 4.3 billion, distinct codes, far more than binary or quaternary (the four DNA bases of a DEL) coding provides for the same number of positions (Figure 2C). In addition, basic residues enhance the solubility of the peptide tags and improve sequencing accuracy, while protic side chains fine-tune polarity. Basic and protic residues are screened at different fixed positions to find the optimal tag structure, which features a lysine near each of the C- and N-termini, a serine at the core of the tag, and an aliphatic residue at the N-terminus. The information encoded by the peptide tags is readily decoded by nLC-MS/MS (Figure 2B). After optimization, as little as 10 fmol of peptide tag can be sequenced successfully by nLC-MS/MS.
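The capacity comparison above is simple arithmetic and can be checked with a short sketch. The code below is only an illustration: the symbols `0`-`F` are placeholders standing in for the 16 monomers (the actual monomer set is not listed here), and the function names are hypothetical.

```python
# Placeholder symbols for the 16 monomers (assumption: the real monomers
# are amino acids; hex digits are used here only for illustration).
MONOMERS = "0123456789ABCDEF"
CODING_SITES = 8


def coding_capacity(alphabet_size: int, coding_sites: int) -> int:
    """Number of distinct codes for a tag with the given alphabet and length."""
    return alphabet_size ** coding_sites


def encode(index: int, sites: int = CODING_SITES) -> str:
    """Write a compound index as a fixed-length sequence of monomer symbols."""
    base = len(MONOMERS)
    if not 0 <= index < base ** sites:
        raise ValueError("index does not fit in the available coding sites")
    symbols = []
    for _ in range(sites):
        index, remainder = divmod(index, base)
        symbols.append(MONOMERS[remainder])
    return "".join(reversed(symbols))


def decode(tag: str) -> int:
    """Recover the compound index from a sequence of monomer symbols."""
    value = 0
    for symbol in tag:
        value = value * len(MONOMERS) + MONOMERS.index(symbol)
    return value


for name, size in [("binary", 2), ("quaternary (DNA)", 4), ("hexadecimal (peptide)", 16)]:
    print(f"{name}: {coding_capacity(size, CODING_SITES):,}")
# binary: 256
# quaternary (DNA): 65,536
# hexadecimal (peptide): 4,294,967,296
```

With eight positions, the 16-letter alphabet gives 4,294,967,296 codes, which is the "4.3 billion" quoted above, against 65,536 for a four-letter alphabet of the same length.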

Figure 2. High stability and high capacity of peptide labels in information storage

Advantages of PELs

Figure 3 shows that PEL technology has several advantages over DEL. Most notably, peptide tags are more stable than DNA tags and tolerate more demanding and diverse chemistry, including metal-catalyzed reactions and reactions that require strongly acidic or basic conditions, so PELs can give access to a wider range of drug-like molecules. Another advantage is that solid-phase synthesis of both the peptide and the small molecule allows reagents to be used in excess, which leads to higher yields and purity of the final small molecules and is expected to significantly improve the quality of the compound library.

Figure 3. Stability of model peptides and DNA labels under relevant conditions

They subsequently design a molecular scaffold with two sites for orthogonal synthesis (Figure 4A), which enables PEL synthesis by alternating small-molecule and peptide synthesis steps under a fully orthogonal protecting-group strategy. The scaffold is attached to polystyrene beads, the solid support, through a Rink amide linker, which is cleaved under strongly acidic conditions (with concomitant global deprotection). A lysine residue acts as the branching point that covalently connects the peptide and the small molecule. The peptide is attached to the scaffold through the Seramox linker (Smx), which can be cleaved orthogonally under optimized oxidative conditions to release the peptide for sequencing. The peptide and the small molecule are elaborated in turn using orthogonal protecting groups (Fmoc, Trt, Alloc), so that each small-molecule functionalization step is encoded by coupling a specific amino acid to the peptide either before or after that step. As shown in Figure 4B, the authors carry out palladium-catalyzed cross-coupling reactions, taking advantage of the chemical stability of the protected peptide and the benefits of solid-phase synthesis. For palladium-catalyzed C-C bond formation, they find that a fourth-generation palladium precatalyst bearing the biaryl phosphine ligand XPhos (XPhos Pd G4) efficiently cross-couples 36 aryl boronic acids with 14 resin-bound aryl bromide-peptide conjugates in purity sufficient for library synthesis (>70%). Similarly, an AlPhos-ligated palladium dimer efficiently cross-couples 41 aniline derivatives with 11 resin-bound aryl bromide-peptide conjugates in purity equally suitable for library synthesis (>70%). Notably, the substrate scope and purity of these reactions are significantly better than those of palladium-catalyzed cross-couplings performed in the presence of DNA, enabling cross-coupling of various heterocycles prevalent in drugs (Figure 4C).

The compatibility of the peptide tags with other synthesis conditions is further demonstrated with acid-mediated Pictet-Spengler reactions, for which loss of encoding information has previously been reported even in stable DEL systems. The authors then prepare two PELs by combinatorial chemistry. Each library member is built around a central building block (BB1) that contains a protected amine, which is coupled to a carboxylic acid building block (BB2), and an aryl bromide, which undergoes palladium-catalyzed cross-coupling with an amine or boronic acid (BB3). This gives a C-N library of 41,000 molecules and a C-C library of 39,000 molecules, respectively. Notably, the high efficiency of solid-phase synthesis allows this multistep library synthesis to be completed in less than a week. In addition, quantitative evaluation of the drug-like properties of individual library members shows that most compounds have satisfactory properties (Figure 4E).
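The relationship between the three building-block cycles and the peptide tag can be sketched in a few lines. This is a toy model under stated assumptions, not the authors' encoding scheme: the building-block names, the placeholder monomer symbols, and the choice of two coding sites per cycle (`WORD_WIDTH`) are all hypothetical.

```python
from itertools import product

MONOMERS = "0123456789ABCDEF"  # placeholder symbols for 16 monomers (assumption)
WORD_WIDTH = 2  # coding sites spent per synthesis cycle (assumption)

# Hypothetical building blocks; real libraries use far larger sets.
BB1 = ["core_a", "core_b"]
BB2 = ["acid_1", "acid_2", "acid_3"]
BB3 = ["aniline_x", "aniline_y"]


def cycle_codebook(blocks):
    """Assign every building block of one cycle a fixed-width monomer word."""
    base = len(MONOMERS)
    if len(blocks) > base ** WORD_WIDTH:
        raise ValueError("too many building blocks for this word width")
    book = {}
    for index, block in enumerate(blocks):
        word = ""
        remaining = index
        for _ in range(WORD_WIDTH):
            remaining, digit = divmod(remaining, base)
            word = MONOMERS[digit] + word
        book[block] = word
    return book


def build_library(*cycles):
    """Enumerate all building-block combinations and the tag encoding each one."""
    books = [cycle_codebook(blocks) for blocks in cycles]
    library = {}
    for combo in product(*cycles):
        tag = "".join(book[block] for book, block in zip(books, combo))
        library[tag] = combo
    return library


library = build_library(BB1, BB2, BB3)
print(len(library))        # 12 = 2 x 3 x 2 combinations
print(library["010201"])   # ('core_b', 'acid_3', 'aniline_y')
```

The library size is the product of the building-block counts per cycle, and reading a tag back is a dictionary lookup, which is what sequencing the released peptide makes possible after a selection.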

Figure 4. A peptide conjugate consisting of 3 blocks and 11 encoded amino acids

Discovery of high-affinity small molecules based on PELs

The authors use the two PELs in affinity selections to identify small molecules with nanomolar affinity for carbonic anhydrase IX (CA IX) (Figure 5A). In an automated procedure, biotinylated CA IX is immobilized on streptavidin-coated magnetic beads and incubated with the PEL. Unbound molecules are then removed by repeated washing. The encoding peptides of the retained conjugates are released under oxidative conditions and analyzed by nLC-MS/MS. The authors obtain 11 hit molecules from the PELs, which are readily resynthesized on solid support within 2 days and require only a single purification. Validation shows that all hit molecules bind CA IX with affinities ranging from single-digit to tens of nanomolar (Figure 5B). In addition, PEL-based hit molecules with high affinity for BRD4 and MDM2 are identified (Figure 6).

Figure 5. Discovery of small molecules with high affinity for CA IX based on PEL

Figure 6. Discovery of small molecules with high affinity for BRD4 and MDM2 based on PEL

Conclusion

In this paper, the researchers demonstrate that abiotic peptides can serve as information storage media to encode small-molecule synthesis, and they establish an innovative drug discovery platform, peptide-encoded libraries (PELs). Hits from a PEL can be resynthesized by solid-phase synthesis under the same conditions used to build the library, which speeds up resynthesis and makes potential by-products easier to identify. PELs are regarded as one of the starting points for next-generation encoded library technologies that are expected to have a broad impact on drug discovery and biochemical research.