KmerKeys: a web resource for searching indexed genome assemblies and variants

Volume 50 Issue W1 5 July 2022

Dmitri S Pavlichin, HoJoon Lee, Stephanie U Greer, Susan M Grimes, Tsachy Weissman, Hanlee P Ji

Dmitri S Pavlichin: Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA

HoJoon Lee: Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA

Stephanie U Greer: Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA

Susan M Grimes: Stanford Genome Technology Center West, Stanford University, Palo Alto, CA 94304, USA

Tsachy Weissman: Department of Electrical Engineering, Stanford University, Palo Alto, CA 94304, USA

Hanlee P Ji: Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA; Stanford Genome Technology Center West, Stanford University, Palo Alto, CA 94304, USA. To whom correspondence should be addressed. Tel: +1 650 721 1503; Fax: +1 650 725 1420; Email: genomics_ji@stanford.edu

Co-first authors.

Nucleic Acids Research, Volume 50, Issue W1, 5 July 2022, Pages W448–W453, https://doi.org/10.1093/nar/gkac266

Article history: Received 18 January 2022; Revision received 22 March 2022; Accepted 20 April 2022; Published 26 April 2022


Abstract

K-mers are short DNA sequences that are used for genome sequence analysis. Applications that use k-mers include genome assembly and alignment. However, the wider bioinformatic use of these short sequences faces challenges related to the massive scale of genomic sequence data. A single human genome assembly has billions of k-mers. As a result, the computational requirements for analyzing k-mer information are enormous, particularly when complete genome assemblies are involved. To address these issues, we developed a new indexing data structure based on a hash table tuned for the lookup of short sequence keys. This web application, referred to as KmerKeys, provides rapid query speeds for cloud computation on genome assemblies. We enable fuzzy as well as exact sequence searches of assemblies. To enable robust and speedy performance, the website implements cache-friendly hash tables, memory mapping and massive parallel processing. Our method employs a scalable and efficient data structure that can be used to jointly index and search a large collection of human genome assembly information. One can include variant databases and their associated metadata such as the gnomAD population variant catalogue. This feature enables the incorporation of future genomic information into sequencing analysis. KmerKeys is freely accessible at https://kmerkeys.dgi-stanford.org.

INTRODUCTION

Large genomic sequencing projects have been generating a wealth of variants in the population. For example, catalogues of variants derived from population studies include the Genome Aggregation Database (gnomAD) (1), ClinVar (2) and The Cancer Genome Atlas (TCGA). These studies are broadening our understanding of the full range of human genetic diversity and facilitate the discovery of genetic factors that influence disease susceptibility (3,4). The interpretation of variants requires links to the human reference genome. However, the current reference was constructed using the genome sequence of a small number of individuals and, as a result, does not account for many genomic features across the breadth of human genetic diversity (5–7). To address this limitation, ongoing projects are constructing a pangenome reference derived from a broader sampling of the human population (8,9). These aggregated reference assemblies from hundreds of individuals would improve sequence analysis (8,10). Linking variant catalogues to this next generation of human reference genomes at this scale poses a significant challenge, which limits accessibility for genomic researchers. Therefore, annotation of genomic features requires a format that can be related to different genome assemblies.

We developed a method for using K-mers for genomic annotation that includes genomic coordinates, counts, and pointers to datasets. K-mers are nucleotide sequences of length K. K-mer analysis methods are appealing in their conceptual simplicity, because these short sequences can be readily manipulated and compared among different sequence data sets. K-mer-based tools have a variety of different functions that include: enumeration (11,12), read filtering (13), evolutionary distance estimation (14), metagenomics (15), and RNAseq analysis (16). The majority of applications are geared towards mapping sequences from FASTA/Q files. Beyond mapping, k-mers have specific advantages for organizing and querying sequence databases; one can index genomic data, facilitate the organization of these data sets and offer highly efficient querying of large collections of genomic sequence data. Along these lines, we developed a website data resource that indexes k-mer sequences for reference assemblies and links them to variant catalogues from gnomAD.

MATERIALS AND METHODS

Overview of KmerKeys

KmerKeys is our web portal (https://kmerkeys.dgi-stanford.org/) deployed as a public cloud-hosted service on Amazon Web Services (AWS). KmerKeys has the following cloud-based indices: ∼2.5 billion distinct 31-mers from two genome assemblies (GRCh38 and a T2T assembly of CHM13) and 17 million exonic variants from gnomAD (version 2.1.1) (Figure 1A). We used six different k-mer lengths: 19, 20, 21, 25, 30 and 31. All indices on KmerKeys are searchable either by sequence in FASTA-style input or by a set of intervals within a selected dataset in genomic coordinates. When querying by sequence, an input string is decomposed into its constituent k-mers in a sliding-window fashion, and each k-mer is then queried individually against the hash table. When querying by coordinate, we first retrieve the sequence at the specified coordinates and then query its constituent k-mers. Both exact and fuzzy queries are supported, the latter allowing up to two mismatches.
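The two query modes described above can be illustrated with a minimal Python sketch. It is not the KmerKeys implementation (which is written in Julia and backed by a custom hash table); a plain dictionary stands in for the index, and the function names `kmers`, `build_index`, `query_sequence` and `query_interval`, as well as the 0-based half-open coordinates, are assumptions made for illustration only.

```python
def kmers(sequence, k):
    """Yield (offset, k-mer) for every sliding window of length k."""
    sequence = sequence.upper()
    for i in range(len(sequence) - k + 1):
        yield i, sequence[i:i + k]


def build_index(reference, k):
    """Toy exact-match index: k-mer -> list of 0-based start positions."""
    index = {}
    for pos, kmer in kmers(reference, k):
        index.setdefault(kmer, []).append(pos)
    return index


def query_sequence(index, query, k):
    """Sequence query: look up each constituent k-mer of the input string."""
    return [(kmer, index.get(kmer, [])) for _, kmer in kmers(query, k)]


def query_interval(index, reference, start, end, k):
    """Coordinate query: fetch reference[start:end], then query its k-mers.
    Coordinates are 0-based and half-open (an assumption of this sketch)."""
    return query_sequence(index, reference[start:end], k)


# Hypothetical toy reference; real indices hold billions of k-mers.
reference = "ACGTACGTTACG"
index = build_index(reference, k=4)
hits = query_sequence(index, "ACGTT", k=4)
interval_hits = query_interval(index, reference, 3, 8, k=4)
```

Here `hits` lists the positions of `ACGT` (two occurrences) and `CGTT` (one occurrence), so the per-k-mer frequency is simply the length of each position list.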

Figure 1.

(A) Web application of KmerKeys. Three indices are available to query: i) GRCh38, ii) the T2T assembly of CHM13, and iii) gnomAD v2. Users can query these indices based on k-mer length, using either sequences or coordinates as input. Results are generated according to the user's choice of fuzzy search, set by the minimum/maximum allowed mismatches. (B) The summary output shows the overall frequency of all query k-mers grouped by edit distance in the following format: i) the sequence of the k-mer, ii) the number of locations with an exact match, iii) the number of locations with an edit distance of 1 and iv) the number of locations with an edit distance of 2. (C) The detailed output shows: i) the frequency of a given sequence at a specific assembly coordinate, ii) the number of neighbor k-mers that are a small edit distance away, iii) the frequency of neighbor k-mers with their locations, and iv) the positions of the mismatching nucleotides on the neighbor k-mers.

The KmerKeys website was designed and optimized for high query speed, anticipating that indices would be rarely constructed and frequently queried. This choice of trade-off is suited for a cloud-based shared resource where the same memory and compute resources are shared by multiple users, and the architecture can straightforwardly scale to ever more assemblies and larger datasets. All visitors’ queries to the KmerKeys public resource use the same pool of memory and threads, thus reducing the average cost per query. The web application also allows anyone without a computational background to readily access our tool.

Inputs and outputs

KmerKeys takes input either as a sequence in FASTA style or as a set of intervals within a selected dataset in genomic coordinates. There are two types of outputs (Figure 1B and C): i) summary outputs and ii) detailed outputs. A summary output allows the user to rapidly review the landscape of sequence uniqueness across a region of interest. Further, we provide a visual summary plot directly adjacent to the table. Detailed output shows the locations of matched sequences and neighboring sequences with the positions of mismatched nucleotides. To enable efficient online querying, the detailed output displays only the first 1000 lines. For queries whose results extend beyond 1000 lines, files are written to an AWS S3 bucket and a download link, available for one hour, is generated for the user.

Data structure of KmerKeys

KmerKeys is a performant data structure that associates arbitrary genomic metadata with k-mer keys, allowing for high query speed and fuzzy search. In the hash table of KmerKeys, the bipartite variant graph has billions of k-mers (circles) and millions of locations (squares) (Figure 2A). For GRCh38 and the CHM13 assembly, KmerKeys has a hash table with all k-mers from both genomes as keys and associates them with the following metadata: i) the frequencies of the k-mer in the GRCh38/CHM13 assembly and ii) the k-mer location(s) in the GRCh38/CHM13 assembly. Therefore, each location is associated with the short sequence at that position, and a k-mer is linked to multiple locations if it appears multiple times. To increase performance, we employed a number of mathematical concepts previously unexploited in the k-mer indexing setting. They include an invertible Fibonacci hash function together with linear hash collision resolution and a quotient filter-inspired bitpacking scheme. Together, these features offer fast (constant expected time) queries that leverage memory caching for speed, bitpacking to reduce space and a simpler implementation than related data structures like the quotient filter (see Supplementary Methods). Further, we used this hash table to associate k-mers with metadata, supporting optional memory mapping of the values (the metadata) or the keys to reduce memory usage, and optimizing further for the setting of indexing locations and counts in a FASTA file. As a result, our implementation of this hash table supports millions of table lookups per second on a single thread. KmerKeys is designed to provide fast queries in O(1), that is, constant time in k and in the length of the indexed sequence, at the expense of extra memory relative to existing indexing approaches such as Burrows-Wheeler Transform (BWT) search. BWT search scales logarithmically, O(m log n), in the indexed sequence length n, whereas KmerKeys is O(1) for a single search and O(m) for multiple searches, independent of n.
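To make the idea of an invertible Fibonacci hash with linear collision resolution concrete, the following Python sketch shows one way such a table could be organized. It is an illustration under stated assumptions, not the KmerKeys code: the 2-bit base encoding, the 64-bit multiplier `FIB`, the class name `KmerTable` and its methods are all hypothetical, and the bitpacking and memory-mapping steps described above are omitted. Because the multiplier is odd, multiplication modulo 2^64 is a bijection, so the stored hash alone is enough to recover the k-mer.

```python
MASK64 = (1 << 64) - 1
FIB = 0x9E3779B97F4A7C15         # odd constant close to 2**64 / golden ratio
FIB_INV = pow(FIB, -1, 1 << 64)  # modular inverse exists because FIB is odd

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(kmer):
    """Pack a k-mer (k <= 32) into an integer, two bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | CODE[base]
    return value


def decode(value, k):
    """Inverse of encode for a known k."""
    return "".join(BASE[(value >> (2 * (k - 1 - i))) & 3] for i in range(k))


def fib_hash(value):
    return (value * FIB) & MASK64


def fib_unhash(h):
    return (h * FIB_INV) & MASK64


class KmerTable:
    """Open-addressing table: home slot = top bits of the hash, linear probing."""

    def __init__(self, bits, k):
        self.bits, self.k, self.size = bits, k, 1 << bits
        self.hashes = [None] * self.size   # stored hash doubles as the key
        self.values = [None] * self.size   # metadata, here a list of positions
        self.count = 0

    def _probe(self, h):
        slot = h >> (64 - self.bits)
        while self.hashes[slot] is not None and self.hashes[slot] != h:
            slot = (slot + 1) % self.size  # step to the adjacent slot
        return slot

    def add_location(self, kmer, pos):
        h = fib_hash(encode(kmer))
        slot = self._probe(h)
        if self.hashes[slot] is None:
            if self.count + 1 >= self.size:  # always keep one empty slot
                raise MemoryError("table full")
            self.hashes[slot], self.values[slot] = h, []
            self.count += 1
        self.values[slot].append(pos)

    def get(self, kmer):
        slot = self._probe(fib_hash(encode(kmer)))
        return self.values[slot] if self.hashes[slot] is not None else []

    def keys(self):
        """Recover every stored k-mer by inverting its hash."""
        return [decode(fib_unhash(h), self.k) for h in self.hashes if h is not None]


table = KmerTable(bits=4, k=4)
for pos, kmer in enumerate(["ACGT", "CGTA", "ACGT"]):
    table.add_location(kmer, pos)
```

In this sketch a lookup touches the home slot and, on collision, the adjacent slots, which is what makes linear probing friendly to the memory cache. Python integers are unbounded, so the mask emulates 64-bit arithmetic.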

Figure 2.

Overview of KmerKeys. (A) Data structure of KmerKeys. The hash function associates each k-mer key (sequence) with its metadata (locations and frequencies). (B) k-mer representation of variants. In the bipartite variant graph, each k-mer key (circle) is a k-mer generated by a variant in the GRCh38 sequence. The k-mer keys are associated with metadata (square), which includes the variant coordinate, sequence, type, and other useful information.

K-mer based representation of variants

We developed a method to represent genetic variation that includes single nucleotide variants (SNVs) and insertions and deletions (indels). The set of k-mers of a given length k that overlap the substituted base pair or span the insertion or deletion is associated with coordinates based on GRCh38 (Figure 2B). For a single base pair substitution, this is the set of k-mers overlapping the substituted base pair. For short indels, this is the set of k-mers spanning the insertion or deletion (see Supplementary Methods). This representation of a variant allows use of the same schema that we used for indexing assemblies; a collection of variants represented in this way corresponds to a bipartite graph, with k-mers on one side and variants on the other, denoted as circles and squares in Figure 2B. Importantly, this representation of a variant does not depend on a reference coordinate system. Therefore, we can associate any assembly coordinates with any other kind of metadata, like clinical information.
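The variant representation can be sketched in a few lines of Python. The exact window rules used by KmerKeys are given in the Supplementary Methods; the sketch below assumes that a substitution or insertion contributes every k-mer overlapping an altered base and that a deletion contributes every k-mer spanning the deletion junction. The function names, the variant identifiers and the convention of alleles without an anchoring base are hypothetical.

```python
def variant_kmers(reference, pos, ref_allele, alt_allele, k):
    """Return the k-mers of the alternate sequence that cover the variant.

    pos is the 0-based start of ref_allele in reference. Alleles carry no
    anchoring base: an insertion has ref_allele == "" and a deletion has
    alt_allele == "" (conventions assumed for this sketch).
    """
    if reference[pos:pos + len(ref_allele)] != ref_allele:
        raise ValueError("ref_allele does not match the reference at pos")
    alt_seq = reference[:pos] + alt_allele + reference[pos + len(ref_allele):]
    if alt_allele:
        # Windows overlapping at least one substituted or inserted base.
        last_start = pos + len(alt_allele) - 1
    else:
        # Deletion: windows containing both bases flanking the junction.
        last_start = pos - 1
    first_start = max(0, pos - k + 1)
    last_start = min(last_start, len(alt_seq) - k)
    return [alt_seq[s:s + k] for s in range(first_start, last_start + 1)]


def index_variants(reference, variants, k):
    """Bipartite graph as a dict: k-mer -> set of variant identifiers."""
    graph = {}
    for variant_id, (pos, ref_allele, alt_allele) in variants.items():
        for kmer in variant_kmers(reference, pos, ref_allele, alt_allele, k):
            graph.setdefault(kmer, set()).add(variant_id)
    return graph


# Hypothetical toy reference and variants.
reference = "ACGTACGTTACG"
variants = {
    "snv_C5T": (5, "C", "T"),   # substitution at position 5
    "del_AC4": (4, "AC", ""),   # two-base deletion starting at position 4
}
snv_kmers = variant_kmers(reference, 5, "C", "T", k=4)
del_kmers = variant_kmers(reference, 4, "AC", "", k=4)
graph = index_variants(reference, variants, k=4)
```

In this toy example the k-mer `TGTT` is produced by both variants, which illustrates why the structure is a bipartite graph rather than a one-to-one map: one k-mer key can point to several variants, and each variant is represented by several k-mers.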

Web implementation

We developed KmerKeys in the Julia programming language (17,18). The primary benefit of Julia is its level of language expressiveness and concision similar to Python, enabling rapid prototyping and experimentation without sacrificing much performance relative to compiled languages like C and C++. Thus, using Julia enabled us to prototype and release a performant version of our tool in the same language, which accelerated development. The front-end is a web app created using Angular (https://angularjs.org/). The front-end interfaces with a computational back-end running on a separate server, an AWS EC2 instance with sufficiently large memory to support billions of k-mer indices. Queries submitted via the front-end are sent to the back-end, which generates a response returned via the front-end.

RESULTS

Uniqueness of k-mers in the human reference genome (GRCh38) and the T2T assembly

Knowing the uniqueness of any sequence is critical information for a range of applications. This property is characterized by the outputs of KmerKeys. Figure 3A and B show the summary and detailed outputs from a query of the coordinates chr17:7671806–7671856 of GRCh38, which lie within the intron between exons 5 and 6 of TP53. The example search result shows that the k-mer sequences from the first 5 positions are unique within 2 mismatches, while the sequence at the 6th position has 5 neighbor sequences within 2 mismatches (Figure 3A). Detailed output shows the nucleotides and positions of mismatches relative to the query k-mers (Figure 3B). Similar trends were observed in the CHM13 assembly, although there are minor differences. As shown in Supplementary Figure S1, the 9th 25-mer is not unique in GRCh38 but is unique in the CHM13 assembly. These examples demonstrate that fuzzy, approximate searching reveals the extent of uniqueness of k-mers. The information about neighbor k-mers is not easily retrieved by widely used tools such as Jellyfish (11) and KMC (12).
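The source does not spell out how the fuzzy search is executed, but given constant-time lookups one straightforward approach is to enumerate every neighbor of the query k-mer and look each one up. The Python sketch below takes that approach as an assumption, treats a mismatch as a substitution (Hamming distance) and uses a plain dictionary in place of the index; `neighbors` and `fuzzy_counts` are hypothetical names. For a 25-mer the neighborhood within two mismatches contains 1 + 3·25 + 9·C(25,2) = 2776 sequences, so a fuzzy query costs a few thousand lookups.

```python
from itertools import combinations, product


def neighbors(kmer, max_mismatches=2):
    """Yield (sequence, distance) for every sequence within the Hamming bound."""
    yield kmer, 0
    for distance in range(1, max_mismatches + 1):
        for positions in combinations(range(len(kmer)), distance):
            # At each chosen position, try the three other bases.
            choices = [[b for b in "ACGT" if b != kmer[p]] for p in positions]
            for substitution in product(*choices):
                chars = list(kmer)
                for p, base in zip(positions, substitution):
                    chars[p] = base
                yield "".join(chars), distance


def fuzzy_counts(index, kmer, max_mismatches=2):
    """Summary row: number of indexed locations at distance 0, 1, 2, ..."""
    counts = [0] * (max_mismatches + 1)
    for neighbor, distance in neighbors(kmer, max_mismatches):
        counts[distance] += len(index.get(neighbor, []))
    return counts


# Hypothetical toy index: k-mer -> list of locations.
index = {"ACGT": [0, 4], "ACGA": [10], "TCGA": [20]}
summary = fuzzy_counts(index, "ACGT")
neighborhood_size = sum(1 for _ in neighbors("A" * 25))
```

A k-mer whose summary row is `[1, 0, 0]` is unique within two mismatches in the sense used above; here `ACGT` has two exact locations, one location at one mismatch and one at two mismatches.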

Figure 3.

The uniqueness of k-mers measured by fuzzy search. (A) Example of summary output. The summary output displays each k-mer sequence in sequential order with counts of k-mers at each edit distance up to 2 mismatches, along with a plot that visually displays those counts. (B) Example of detailed output. Identical nucleotides are indicated by a dot (.) and different nucleotides are shown at their positions. The example search result showed that the k-mer sequences from the first 5 positions are unique within 2 mismatches while the sequence at the 6th position has 5 neighbor sequences with 2 mismatches. For instance, the 25-mer at the 6th position is identical to the 25-mer at chr6:137348026–137348050 except for two mismatches: i) A instead of T at the 3rd base position and ii) G instead of A at the 19th base position. (C) Example k-mers with unique exact matches identified by KmerKeys but not found by the web versions of BLAT/BLAST. (D) Example of detailed output from a KmerKeys web application query of 25-mers in gnomAD v2 in an intronic region of TP53.

In addition, KmerKeys offers accurate search capabilities for specific short sequences, which is an improvement over existing tools such as BLAST (19) and BLAT (20). For example, KmerKeys identified k-mers appearing uniquely in GRCh38 that were not identified by BLAST or BLAT (Figure 3C). We randomly sampled 100,000 unique 20, 21, 30 and 31-mers in GRCh38 (that is, each k-mer occurs at exactly one position in GRCh38). BLAT failed to identify approximately 1% of 20 and 21-mers and about 0.4% of 30 and 31-mers (Supplementary File S1). We also found that the web-based BLAST (21), though not the standalone software, sometimes missed unique k-mers (Supplementary Figure S2). In general, these missed unique k-mers contain over-represented 11-mers (appearing more than 1024 times), and the vast majority of them would be masked as repeat elements by RepeatMasker. Interestingly, several of the missing k-mers were located in coding regions (Figure 3C). To save computation time, BLAT and BLAST use 11-mers for the initial search for potential genomic regions where the actual sequence could be found. KmerKeys, on the other hand, simply indexes all k-mers from a given reference, thus guaranteeing comprehensive searches.
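The contrast between seed-based search and full k-mer indexing can be illustrated with a toy model in Python. This is not the BLAT or BLAST algorithm; it only assumes, for illustration, a seed index that discards seeds occurring more often than a threshold, with a seed length of 3 and a threshold of 3 standing in for the 11-mers and the 1024-occurrence cutoff mentioned above. All function names are hypothetical.

```python
from collections import defaultdict


def build_seed_index(reference, seed_len, max_occurrences):
    """Index short seeds, dropping those that are over-represented."""
    positions = defaultdict(list)
    for i in range(len(reference) - seed_len + 1):
        positions[reference[i:i + seed_len]].append(i)
    return {s: p for s, p in positions.items() if len(p) <= max_occurrences}


def seeded_search(reference, seed_index, query, seed_len):
    """Find exact matches, but only at places suggested by a retained seed."""
    hits = set()
    for offset in range(len(query) - seed_len + 1):
        for pos in seed_index.get(query[offset:offset + seed_len], []):
            start = pos - offset
            if start >= 0 and reference[start:start + len(query)] == query:
                hits.add(start)
    return sorted(hits)


def full_index_search(reference, query):
    """Exhaustive exact search, equivalent to indexing every k-mer."""
    k = len(query)
    return [i for i in range(len(reference) - k + 1) if reference[i:i + k] == query]


# Hypothetical reference in which the seeds ATA and TAT each occur four times.
reference = "ATATGATATATCTATA"
seed_index = build_seed_index(reference, seed_len=3, max_occurrences=3)

missed = seeded_search(reference, seed_index, "ATATAT", seed_len=3)
found = full_index_search(reference, "ATATAT")
control = seeded_search(reference, seed_index, "GATA", seed_len=3)
```

In this toy model the query `ATATAT` occurs exactly once in the reference, yet every seed it contains has been discarded, so the seeded search returns nothing while the exhaustive search reports position 5. A query that contains at least one retained seed, such as `GATA`, is still found.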

Population variant searching from gnomAD

We demonstrated the extensibility of our data structure for population-based genetic variation. This involved generating an index of all exonic variants in gnomAD (v2.1.1) through our k-mer-based variant representation. KmerKeys linked 17,119,203 variants in gnomAD with 523,498,431 31-mers. Users can query whether a sequence or a set of coordinates based on GRCh38 contains variants reported in gnomAD. We demonstrate an example using the gnomAD variants found in a genomic region of TP53 (Figure 3D). All the 25-mers within this region, except for 10 bp of the upstream portion, overlap with at least one variant. It is important to note that none of the k-mers associated with variants are present in GRCh38. The fuzzy search function makes it possible to demonstrate how variants with unique k-mers from other genomes can be mapped back to the reference. This feature is unique among web-based resources. This function could provide useful information about whether 20-mers of interest could be unique in other individuals. In addition, we provide a compressed BED file (Supplementary File S2) that contains all indexed variants with their 21 different allele frequencies (AFs). Users can download it and quickly retrieve all 21 AFs of variants based on the GRCh38 genomic coordinates using tabix of Samtools (22). In fact, we designed primers for RPP30, a typical control gene for human DNA, that bind to genomic regions where no variants are reported by gnomAD (23). This feature helps maximize the on-target rate for primers.

DISCUSSION

In this study, we describe KmerKeys, a web data application that provides k-mer-based querying of human genome assemblies. For this application, we achieved the following: 1) we developed a data structure that efficiently and accurately associates arbitrary metadata with k-mers, 2) we devised a k-mer-based representation of variants that allows lists of variants to be jointly indexed with assemblies and primary sequencing, and 3) we launched a web application demonstrating the above, allowing users to query the locations and counts of k-mers in two whole human genome assemblies and exonic gnomAD v2 variants. KmerKeys has the potential to be used for DNA primer design and CRISPR/Cas9 target design. Using its search function, one can identify primer candidates that have the potential for off-target sites. Further, our data structure could provide a framework for representing variants at the population level and across multiple genomes simultaneously.

DATA AVAILABILITY

KmerKeys is freely accessible at https://kmerkeys.dgi-stanford.org.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

We would like to thank Billy Lau and Shubham Chandak for helpful discussions. We also thank Lucas Johnson for AWS setup and Jung Yoo for providing comments on the manuscript.

FUNDING

This work was supported by National Institutes of Health grant [U01HG01096] and the Clayville Foundation.

Conflict of interest statement. None declared.

REFERENCES

1. Karczewski K.J., Francioli L.C., Tiao G., Cummings B.B., Alfoldi J., Wang Q., Collins R.L., Laricchia K.M., Ganna A., Birnbaum D.P. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020; 581:434–443.

2. Landrum M.J., Lee J.M., Riley G.R., Jang W., Rubinstein W.S., Church D.M., Maglott D.R. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014; 42:D980–D985.

3. VanHout C.V., Tachmazidou I., Backman J.D., Hoffman J.D., Liu D., Pandey A.K., Gonzaga-Jauregui C., Khalid S., Ye B., Banerjee N. et al. Exome sequencing and characterization of 49,960 individuals in the UK biobank. Nature. 2020; 586:749–756.

4. Dewey F.E., Grove M.E., Pan C., Goldstein B.A., Bernstein J.A., Chaib H., Merker J.D., Goldfeder R.L., Enns G.M., David S.P. et al. Clinical interpretation and implications of whole-genome sequencing. JAMA. 2014; 311:1035–1045.

5. Schneider V.A., Graves-Lindsay T., Howe K., Bouk N., Chen H.C., Kitts P.A., Murphy T.D., Pruitt K.D., Thibaud-Nissen F., Albracht D. et al. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res. 2017; 27:849–864.

6. Lander E.S., Linton L.M., Birren B., Nusbaum C., Zody M.C., Baldwin J., Devon K., Dewar K., Doyle M., FitzHugh W. et al. Initial sequencing and analysis of the human genome. Nature. 2001; 409:860–921.

7. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature. 2004; 431:931–945.

8. Sherman R.M., Salzberg S.L. Pan-genomics in the human genome era. Nat. Rev. Genet. 2020; 21:243–254.

9. Li R., Li Y., Zheng H., Luo R., Zhu H., Li Q., Qian W., Ren Y., Tian G., Li J. et al. Building the sequence map of the human pan-genome. Nat. Biotechnol. 2010; 28:57–63.

10. Sherman R.M., Forman J., Antonescu V., Puiu D., Daya M., Rafaels N., Boorgula M.P., Chavan S., Vergara C., Ortega V.E. et al. Assembly of a pan-genome from deep sequencing of 910 humans of African descent. Nat. Genet. 2019; 51:30–35.

11. Marcais G., Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011; 27:764–770.

12. Kokot M., Dlugosz M., Deorowicz S. KMC 3: counting and manipulating k-mer statistics. Bioinformatics. 2017; 33:2759–2761.

13. Chen S., Huang T., Wen T., Li H., Xu M., Gu J. MutScan: fast detection and visualization of target mutations by scanning FASTQ data. BMC Bioinformatics. 2018; 19:16.

14. Deorowicz S., Gudys A., Dlugosz M., Kokot M., Danek A. Kmer-db: instant evolutionary distance estimation. Bioinformatics. 2019; 35:133–136.

15. Wood D.E., Lu J., Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol. 2019; 20:257.

16. Bray N.L., Pimentel H., Melsted P., Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 2016; 34:525–527.

17. Bezanson J., Edelman A., Karpinski S., Shah V.B. Julia: A Fresh Approach to Numerical Computing. SIAM Review. 2017; 59:65–98.

18. Perkel J.M. Julia: come for the syntax, stay for the speed. Nature. 2019; 572:141–142.

19. Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990; 215:403–410.

20. Kent W.J. BLAT–the BLAST-like alignment tool. Genome Res. 2002; 12:656–664.

21. Wheeler D., Bhagwat M. BLAST quickstart: example-driven web-based BLAST tutorial. Methods Mol. Biol. 2007; 395:149–176.

22. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and SAMtools. Bioinformatics. 2009; 25:2078–2079.

23. Lau B.T., Pavlichin D., Hooker A.C., Almeda A., Shin G., Chen J., Sahoo M.K., Huang C.H., Pinsky B.A., Lee H.J. et al. Profiling SARS-CoV-2 mutation fingerprints that range from the viral pangenome to individual infection quasispecies. Genome Med. 2021; 13:62.

© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
