The growing use and continued development of next-generation sequencing (NGS) have expanded its potential for disease diagnosis, clinical oncology, variant detection, comparative genomic profiling, new genome assembly, and exploratory research. Among the many components that NGS requires, sequencing adapters are critical in ligation-based workflows. This article covers the basics of sequencing adapters, including their structure, their functions, and important workflow considerations.

Structure and function of NGS adapters

Adapters are short DNA sequences ligated onto the ends of DNA inserts (target sequences) during library preparation. The basic structure of an NGS adapter contains several important parts required for its functions.

Flow cell binding sequence (P5 and P7): This region allows the sequencing library to bind to the flow cell and prevents libraries from being washed away during the sequencing process.

Index sequence (i5 and i7): Most adapters contain index sequences that serve as “barcodes” or “tags” to identify the sample library that an individual DNA insert belongs to. During analysis, these indexes are used to separate the sequencing reads by sample when multiple libraries are combined in the same sequencing run, a practice known as multiplexing.

Sequencing primer binding site (Rd1 SP and Rd2 SP): During sequencing, a primer binds to this region and initiates sequencing by synthesis of the target DNA.

Unique molecular identifier (UMI): A short sequence inside the adapter that functions as an additional barcode, similar to an index. However, a UMI identifies individual target molecules within a sample, whereas index sequences distinguish different sample libraries from each other.

Figure 1. The basic structure of an NGS adapter prior to amplification, showing key functional motifs. Post-amplification, the two strands will be complementary to each other along their entire length.
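
To make the role of the index sequences concrete, the following is a minimal sketch of how reads might be separated by sample (demultiplexed) after a multiplexed run. The sample sheet, the index sequences, and the `demultiplex` function are hypothetical placeholders for illustration, not the format or API of any particular sequencer or software package; real demultiplexers typically also tolerate a small number of index mismatches.

```python
from collections import defaultdict

# Hypothetical sample sheet: (i7, i5) index pair -> sample name.
# The index sequences are arbitrary placeholders, not taken from any kit.
SAMPLE_SHEET = {
    ("AAGGTTCC", "GATCGATC"): "sample_A",
    ("CCTTAAGG", "TCAGTCAG"): "sample_B",
}


def demultiplex(reads, sample_sheet):
    """Group insert sequences by sample using each read's index pair.

    `reads` is an iterable of (i7, i5, insert) tuples. Index pairs that are
    not in the sample sheet are collected under "undetermined".
    """
    by_sample = defaultdict(list)
    for i7, i5, insert in reads:
        sample = sample_sheet.get((i7, i5), "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


reads = [
    ("AAGGTTCC", "GATCGATC", "ACGTACGT"),
    ("CCTTAAGG", "TCAGTCAG", "GGCATTCA"),
    ("AAGGTTCC", "GATCGATC", "TTGACCGA"),
    ("GGGGGGGG", "GATCGATC", "CCCCAAAA"),  # unknown i7: cannot be assigned
]
demuxed = demultiplex(reads, SAMPLE_SHEET)
for sample, inserts in demuxed.items():
    print(sample, inserts)
```

Exact matching keeps the sketch short; it only shows that the index pair, not the insert, determines which sample a read is assigned to.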

Advantages of unique dual-indexing schemes

In unique dual indexing schemes, no index sequence (i5 or i7) is shared between the adapters used in a run; this contrasts with combinatorial dual indexing (DI) schemes, in which the same index sequence may be reused across samples. Adapters that contain unique dual indexes (UDIs) reduce the effect of index hopping, a common sequencing error.1 Index hopping occurs when index sequences are improperly swapped with those of other libraries or free-floating adapters. This leads to index misassignment, in which a read is attributed to the wrong sequencing library. Because each set of UDIs consists of unique, known index pairs, unexpected index pairs created by index hopping are instead discarded from further data analysis. The use of UDIs also allows for increased multiplexing, which lowers the sequencing cost per sample.

Figure 2. Demonstration of how UDIs can prevent inaccurate data. When index hopping occurs in workflows with combinatorial dual indexes (top), this can lead to reads being assigned to the wrong sample during data analysis. When workflows utilizing UDIs (bottom) encounter index hopping, incorrectly indexed reads can be readily removed from further analysis.
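
The difference between the two schemes can be sketched in a few lines. The example below assumes exact index matching and uses arbitrary placeholder index sequences and sample names; `classify` is a hypothetical helper, not part of any demultiplexing tool. It shows a single hopped read being silently assigned to another sample under a combinatorial scheme, but rejected under a unique dual scheme.

```python
from itertools import product


def classify(pair, expected):
    """Return the sample for an observed (i7, i5) pair, or None to discard."""
    return expected.get(pair)


# Placeholder index sequences (not from any real kit).
i7s = ["AAGGTTCC", "CCTTAAGG"]
i5s = ["GATCGATC", "TCAGTCAG"]

# Combinatorial dual indexing: every i7/i5 combination labels a sample,
# so each individual index is shared between two samples.
combinatorial = {
    pair: f"sample_{n}" for n, pair in enumerate(product(i7s, i5s), start=1)
}

# Unique dual indexing: each i7 and each i5 is used by exactly one sample.
unique_dual = {
    ("AAGGTTCC", "GATCGATC"): "sample_1",
    ("CCTTAAGG", "TCAGTCAG"): "sample_2",
}

# A read from sample_1 whose i5 index was swapped for another library's i5.
hopped = ("AAGGTTCC", "TCAGTCAG")

print(classify(hopped, combinatorial))  # a valid pair for a different sample
print(classify(hopped, unique_dual))    # None: the pair is unexpected
```

In the combinatorial case the hopped pair is indistinguishable from a legitimate read of another sample, which is the misassignment shown in the top panel of Figure 2.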

Improving data quality with UMIs

Because each UMI sequence is unique, UMIs can be used to detect and filter out PCR duplicates.2 This is important for quantitative sequencing protocols and for confirming variants (as seen in Figure 3). Adapters containing UMIs are not required, but they are recommended for investigating low-level transcripts, detecting variants, or working with low-input samples.

Figure 3. Using UMIs to confirm true mutations using PCR duplicates. The duplicates are aligned during data analysis: a variant present across all duplicates is a true mutation, whereas a variant seen in only a single duplicate is an artifact that is removed from analysis.
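
A simplified version of this idea can be sketched as follows. The code groups reads into families that share an alignment position and a UMI, then takes a per-base majority vote within each family. This is an illustrative simplification, assuming equal-length, already aligned reads and error-free UMIs; the `collapse_by_umi` function and the read tuples are hypothetical, and dedicated UMI tools handle UMI sequencing errors and base qualities as well.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Collapse PCR duplicates into one consensus sequence per UMI family.

    `reads` is an iterable of (position, umi, sequence) tuples. Reads that
    share both position and UMI are treated as copies of one original molecule.
    """
    families = defaultdict(list)
    for position, umi, sequence in reads:
        families[(position, umi)].append(sequence)

    consensus = {}
    for key, sequences in families.items():
        # Majority vote at each base position across the family's duplicates.
        consensus[key] = "".join(
            Counter(bases).most_common(1)[0][0] for bases in zip(*sequences)
        )
    return consensus


reads = [
    # Family 1: an "A" at the third base appears in only one duplicate (artifact).
    (100, "ACGT", "TTGCA"),
    (100, "ACGT", "TTGCA"),
    (100, "ACGT", "TTACA"),
    # Family 2: the "A" appears in every duplicate (consistent with a true variant).
    (100, "GGTA", "TTACA"),
    (100, "GGTA", "TTACA"),
    (100, "GGTA", "TTACA"),
]
collapsed = collapse_by_umi(reads)
for family, sequence in collapsed.items():
    print(family, sequence)
```

Six reads collapse to two molecules, which is also why UMIs matter for quantitative protocols: counts reflect original molecules rather than PCR copies.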

Common types of adapters

A full-length adapter is a type of NGS adapter that contains all the essential regions needed for sequencing once it is ligated to the insert. These adapters do not need to be further modified and are frequently used because they fit into many workflows and often require fewer steps during library preparation. Both UDIs and UMIs are available with full-length adapters, although their configuration is preset.

Truncated adapters are shorter adapters that are ligated onto the insert without indexes or the flow cell binding region. These additional sequences are added to the adapter later in library preparation through an additional PCR step. Truncated adapters can be selected with UMIs and have many options for indexes, including UDIs. Many workflows utilize truncated adapters because of their increased ligation efficiency and their tendency to produce fewer adapter dimers than full-length adapters.

Workflow considerations

There are several important considerations when determining the proper adapters for an NGS workflow. The most suitable adapter depends on the library prep, the number of samples, and the scope of the study. Take the following workflow options into consideration before choosing an adapter:

  • Sample workflows that are PCR-free (e.g., WGS) require the use of full-length adapters; however, either full-length or truncated adapters are appropriate for most library preps that contain a PCR step.
  • Adapters with UMIs are recommended for studies involving low-frequency variant detection, deep sequencing, and low-input samples because they enable the removal of PCR duplicates and increase the accuracy of rare variant detection.
  • Sequencers with patterned flow cells have higher rates of index-hopping,3 and therefore, the use of adapters containing UDIs is strongly encouraged to mitigate index misassignment.
  • Truncated adapters may be preferable for users seeking lower rates of adapter dimers, increased ligation efficiencies, and more indexing options.
  • Adapters with UDIs can be used to increase multiplexing and improve data quality, consequently lowering the overall sequencing cost per sample.
  • Library prep kits sometimes have specific requirements for the types of adapters that may be used; this may dictate which adapter types are compatible with some workflows.
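
As a rough illustration, the considerations above can be condensed into a small helper. The function name, parameters, and returned fields are hypothetical and cover only the points listed; they are not a substitute for the requirements of a specific library prep kit.

```python
def suggest_adapter(pcr_free, patterned_flow_cell, low_input_or_rare_variants):
    """Summarize the adapter considerations listed above for one workflow.

    All three arguments are booleans describing the planned workflow.
    """
    return {
        # PCR-free preps cannot add indexes and flow cell binding regions by
        # PCR, so the adapter must already be full-length at ligation.
        "adapter_type": "full-length" if pcr_free else "full-length or truncated",
        # Patterned flow cells show higher rates of index hopping.
        "udis_strongly_encouraged": patterned_flow_cell,
        # UMIs help remove PCR duplicates and support rare variant detection.
        "umis_recommended": low_input_or_rare_variants,
    }


suggestion = suggest_adapter(
    pcr_free=True,
    patterned_flow_cell=True,
    low_input_or_rare_variants=False,
)
print(suggestion)
```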

References

1. Costello M, et al. BMC Genomics. 2018;19:332.

2. Fu Y, et al. BMC Genomics. 2018;19:531.

3. Illumina. Effects of Index Misassignment on Multiplexing and Downstream Analysis. 2017. Accessed October 2022.

About the Author

Benjamin Atha has over 9 years of experience working in molecular biology laboratories. He received his B.A. in biology from Hood College and his M.S. in biological sciences from Towson University, where his thesis focused on protein functions and post-translational modifications. After graduation, Ben began working with next-generation sequencing at Walter Reed Army Institute of Research and for the USDA. He now writes for Biocompare and serves as the editor for SEQanswers.