Defining KIR and HLA Class I Genotypes at Highest Resolution via High-Throughput Sequencing


an immune response. On infected or transformed cell surfaces, pathogen-specific or tumor-specific peptides are bound to HLA class I, and gross changes in the surface level of HLA class I can be induced. All such differences activate lymphocytes and the immune response.[4,19] For the interactions between KIRs and HLA class I molecules to be effective, they have to respond to a wide diversity of tumors and pathogens, many of which are rapidly evolving. This has been achieved with a diversity of interactions within each individual and differences in those interactions from one individual to another. The latter provides barriers that can impede the spread of infection within families, communities, and populations.

Crucial features that distinguish KIR and HLA alleles from those of most other genes are the depth, breadth, and functional importance of their sequence divergence. Thus, alleles can differ by multiple nucleotide substitutions, and three or four alternative nucleotides are present at functionally critical positions. KIR and HLA alleles segregate as constituents of distinct lineages, which are further diversified by intra-genic and inter-genic recombination.[13,21] In turn, these lineages are maintained in all human populations, and both genomic regions exhibit clear evidence of the impact of balancing selection.[22,23] Moreover, the strong, highly reproducible signals of natural selection observed for the HLA class I and KIR regions suggest that their genomic variation is critical for human survival.[24,25]

The development of methods for assessing the nature and extent of KIR genomic diversity has been limited by the complexity of the region. The widely used methods that exist for typing KIRs focus principally on gene content.[12,26–28] In contrast, the methods being used for determining allelic variation are costly, time consuming,[6,16,29] and unsuitable for high-throughput studies. The results of the few allele-level population studies of KIRs,[16,29–32] however, show that such investigation is likely to be informative. For example, some KIRs are restricted to population groups of specific geographic ancestry.[30,31] Other KIRs have lost expression but appear common and widely distributed.[29,32] To extend such studies to other populations, as well as disease cohorts, we have developed a sequencing and bioinformatics method that determines complete KIR and HLA class I genomic diversity.

Material and Methods

Overview

To target KIR and HLA class I genes for next-generation nucleotide sequencing (NGS), we designed sets of specific oligonucleotide probes to capture the KIR region (140–240 kb) and HLA-A, HLA-B, and HLA-C (each ~3 kb) from libraries prepared from sheared genomic DNA. We then developed a bioinformatics pipeline (PING [Pushing Immunogenetics to the Next Generation]) specifically to convert sequence data obtained from the highly polymorphic KIR genes into high-resolution genotypes. A summary of the pipeline is shown in Figure 1A. PING first sorts the sequence reads to isolate those that represent fragments from the KIR genomic region from those that do not (a process termed filtering). PING then obtains the final KIR genotypes from these filtered reads by using a composite of two core modules that describe the gene and allele content for each individual and also return information on newly identified SNPs and recombinant alleles. The first module (PING_gc), which determines the KIR gene copy number, is used to inform the second module (PING_allele), which generates allele data (Figure 1A and Figure S1). Each module is split into two sub-modules. KIR Filter Fish (KFF), which is used in both main modules, probes the KIR sequence data with specific sequence search strings and determines which genes (KFFgc) or alleles (KFFallele) are present. The function served by KFF is equivalent to genotyping with sequence-specific oligonucleotide probes (SSOPs).
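
The probing step performed by KFF can be pictured with a minimal Python sketch. It is an illustration only, not the PING code: the gene labels, the search strings, the reads, and the `min_reads` threshold are all hypothetical, and the sketch assumes that a gene is called present when enough reads contain one of its search strings on either strand.

```python
from collections import Counter

# Hypothetical search strings keyed by hypothetical gene labels.
# The actual KFF search strings are not given in this text.
PROBES = {
    "KIR_GENE_A": ["ACGTTGCA", "GGATCCAT"],
    "KIR_GENE_B": ["TTGACCGA"],
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_probe_hits(reads, probes):
    """Count, per gene, the reads containing any of its search strings."""
    hits = Counter()
    for read in reads:
        rc = reverse_complement(read)
        for gene, strings in probes.items():
            # A read counts once per gene, whichever strand matches.
            if any(s in read or s in rc for s in strings):
                hits[gene] += 1
    return hits


def call_present(hits, probes, min_reads=2):
    """Call a gene present when at least min_reads reads matched (assumed rule)."""
    return {gene: hits[gene] >= min_reads for gene in probes}


reads = [
    "TTACGTTGCAGG",  # contains ACGTTGCA
    "CCTGCAACGTAA",  # reverse complement of the first read
    "AAGGATCCATTT",  # contains GGATCCAT
    "CCCCCCCCCCCC",  # matches nothing
]
hits = count_probe_hits(reads, PROBES)
presence = call_present(hits, PROBES)
print(presence)  # {'KIR_GENE_A': True, 'KIR_GENE_B': False}
```

Exact substring matching mirrors the all-or-nothing behaviour of an SSOP hybridisation; a real implementation would also have to handle sequencing errors and search strings shared between genes.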

To complement KFF, MIRAgc (based around the program MIRA) and Son of SAMtools (SOS; based around SAMtools) create alignments to reference sequences in order to determine the gene and allele content, respectively. The output is designed to comply with the genotype list (GL string) format that is used for reporting HLA and KIR data by clinical transplantation laboratories.
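
As a rough illustration of the GL string format, the Python sketch below assembles a genotype from per-locus allele calls. It uses three of the format's delimiters: `/` between ambiguous alleles, `+` between gene copies, and `^` between loci. The function name, the input layout, and the allele calls are assumptions made for this example, and the sketch does not reproduce how PING itself writes its output.

```python
def gl_string(genotype):
    """Join allele calls into a GL string.

    genotype maps a locus name to a list of gene copies; each copy is a
    list of allele names, with more than one name when the call is ambiguous.
    """
    loci = []
    for locus in sorted(genotype):
        # "/" separates alternative alleles for one gene copy.
        copies = ["/".join(alleles) for alleles in genotype[locus]]
        # "+" separates the gene copies at one locus.
        loci.append("+".join(copies))
    # "^" separates loci.
    return "^".join(loci)


# Illustrative calls only, not data from the study.
example = {
    "HLA-A": [["HLA-A*02:01"], ["HLA-A*03:01"]],
    "HLA-B": [["HLA-B*07:02", "HLA-B*07:61"], ["HLA-B*08:01"]],
}
print(gl_string(example))
# HLA-A*02:01+HLA-A*03:01^HLA-B*07:02/HLA-B*07:61+HLA-B*08:01
```

The format also defines `|` for genotype-level ambiguity and `~` for alleles phased on one haplotype; both are left out here to keep the sketch short.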

We validated the typing obtained from the complete capture, NGS, and bioinformatics method (hereafter referred to as the capture/NGS method) by using standard molecular techniques, and we further tested the bioinformatics component by using existing datasets from whole-genome sequencing experiments. A summary of the data generated or otherwise obtained is shown in Figure 1B. KIR and HLA class I allele sequences used for probe design and as reference data were obtained from the Immuno Polymorphism Database (IPD; see Web Resources). Throughout this paper, any unique DNA sequence that spans a coding region (coding DNA sequence [CDS]) is considered a distinct allele. An explanation of KIR and HLA nomenclature is given in Appendix A.
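
One reading of the allele definition above is that sequences are compared on their CDS alone, so that two genomic sequences differing only outside the coding region count as the same allele. The Python sketch below illustrates that reading with toy data. The exon coordinates, sequence names, and sequences are hypothetical, and the sketch assumes that all sequences share the same exon coordinates (no insertions or deletions).

```python
def extract_cds(genomic, exons):
    """Concatenate exon slices (0-based, end-exclusive) into a CDS."""
    return "".join(genomic[start:end] for start, end in exons)


def group_by_cds(sequences, exons):
    """Group sequence names by CDS; each distinct CDS is one allele."""
    groups = {}
    for name, genomic in sequences.items():
        groups.setdefault(extract_cds(genomic, exons), []).append(name)
    return groups


# Toy gene: two 6-bp exons separated by a 4-bp intron.
exons = [(0, 6), (10, 16)]
sequences = {
    "seq1": "ATGGCC" + "GTAA" + "TTCTGA",
    "seq2": "ATGGCC" + "GTCA" + "TTCTGA",  # differs from seq1 only in the intron
    "seq3": "ATGGCA" + "GTAA" + "TTCTGA",  # differs from seq1 in the first exon
}
groups = group_by_cds(sequences, exons)
for cds, names in groups.items():
    print(cds, names)
```

Under this reading, seq1 and seq2 fall into one allele and seq3 forms a second.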

Human Subjects and Data

Ethical approval for this study was obtained from the Stanford University Administrative Panels on Laboratory Care and Human Subjects in Medical Research and the Committee on Human Research at the University of California, San Francisco. Written informed consent was obtained from all individuals.

To develop and validate the capture/NGS method, we generated data from three sources of human genomic DNA:

1. A Panel of IHWG Lymphoblastoid B Cell Lines. Genomic DNA was extracted from 97 International Histocompatibility Working Group (IHWG) cell lines. These cells have been used extensively in developing methods for genotyping polymorphic loci, including KIR and HLA.[37–41] Most of the cell lines (93%) are homozygous for HLA-A, HLA-B, and HLA-C.[37–41] A substantial majority of the IHWG cells (80%) are derived from donors of European origin and represent many of the common HLA alleles.[42,43] Also studied was genomic DNA from a chimpanzee B cell line, derived from Clint (Yerkes pedigree number C0471), a chimpanzee of the Pan troglodytes verus (western chimpanzee) subspecies and subject of the chimpanzee genome project.

2. West African Trios. Genomic DNA samples from 30 family trios (both of the parents and one child) from Mali in West Africa were analyzed.

3. European Control Samples. De-identified DNA samples from 188 unrelated healthy individuals of European origin,


The American Journal of Human Genetics 99, 375–391, August 4, 2016
