Methodology

A. Blood sample collection:

Blood samples can be collected from the jugular vein of the animal. They can also be collected on standard FTA cards and sent to a laboratory or commercial vendor for genotyping. FTA cards require neither a cold chain nor any special infrastructure for storing the blood samples, so they can be sent even by ordinary surface mail for data generation.

B. DNA Extraction:

A total of 500 blood samples were selected from unrelated animals showing the typical phenotypic characteristics of their breed. The animals represented eight domesticated cattle breeds, viz. Dangi (67), Khillar (66), Nimari (65), Malvi (68), Kankrej (56), Gir (33), Gaolao (75) and Kenkatha (70), sampled from their respective breeding tracts in western and central India (Table 1). Genomic DNA was isolated from all blood samples using a rapid non-enzymatic method followed by standard phenol–chloroform extraction (John et al., 1991). Eighteen microsatellite loci, viz. CSRM60, ILSTS005, ILSTS011, ILSTS006, MM12, ILSTS030, BM1824, HAUT27, BM1818, ETH152, INRA035, ETH10, INRA005, ILSTS034, CSSM66, ETH3, INRA063 and ILSTS033 (Table 2), were chosen from the microsatellite panel recommended by the Food and Agriculture Organization (FAO) and the International Society for Animal Genetics (ISAG) (FAO 2004), and after consulting previous studies (Chaudhary et al., 2009; Kale et al., 2010), for evaluating cattle genetic diversity and breed characterization. Genotyping data were analysed using the software described in Table 3.

C. DNA Fingerprinting Data Generation:

Eighteen microsatellite loci (Table A) were chosen from the microsatellite panel recommended by the Food and Agriculture Organization (FAO) and the International Society for Animal Genetics (ISAG) (FAO 2004), and after consulting previous studies (Chaudhary et al. 2009; Kale et al. 2010), for evaluating cattle genetic diversity and breed characterization. The objective was to use highly polymorphic markers spread across the genome. Amplification conditions for the primers of these loci, such as annealing temperature and reaction composition, were optimized, and the loci were grouped into three multiplex panels (Table B). Polymerase chain reaction (PCR) amplifications were carried out in a total volume of 15 µL containing 0.025–0.04 µM of each primer, with the forward primer labelled at the 5′ end with FAM, HEX, TAMRA or ROX (Eurofins MWG Operon, India), 1X PCR Master Mix (Fermentas, USA) and ~90 ng DNA. PCRs were performed in a Veriti® Thermal Cycler (Applied Biosystems, USA) with the following cycling parameters: 95 °C for 10 min, followed by 35 cycles of 95 °C for 45 s, annealing at 55 °C for 45 s and 72 °C for 45 s, and a final extension at 72 °C for 10 min. One microlitre of each PCR product was then mixed with an internal size standard (GENESCAN™ 500 LIZ™, Applied Biosystems) according to the manufacturer’s instructions. Products were electrophoresed in POP-4 polymer on an ABI PRISM® 310 Genetic Analyzer (Applied Biosystems), and fragment analysis was carried out using GENEMAPPER v3.7 (Applied Biosystems). Genotyping was repeated for samples in which PCR amplification failed.
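As a rough check on the reaction set-up and cycling programme described above, the following Python sketch computes the amount of each primer per reaction implied by the stated concentration range, and the total programmed hold time. All names are illustrative and not part of the original protocol. Ramp times between temperature steps are ignored, so the actual run time on the instrument will be longer.

```python
# Illustrative arithmetic only; the names below are assumptions, not protocol terms.
REACTION_VOLUME_UL = 15.0
PRIMER_CONC_UM = (0.025, 0.04)  # µM of each primer, as stated in the text


def primer_pmol(conc_um: float, volume_ul: float) -> float:
    """Amount of primer in pmol (1 µM corresponds to 1 pmol per µL)."""
    return conc_um * volume_ul


# Cycling programme as (step name, temperature in °C, hold time in seconds).
INITIAL_DENATURATION = ("initial denaturation", 95, 10 * 60)
CYCLE = [
    ("denaturation", 95, 45),
    ("annealing", 55, 45),
    ("extension", 72, 45),
]
N_CYCLES = 35
FINAL_EXTENSION = ("final extension", 72, 10 * 60)


def total_hold_seconds() -> int:
    """Sum of all programmed holds, excluding ramping between steps."""
    per_cycle = sum(hold for _, _, hold in CYCLE)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


if __name__ == "__main__":
    low, high = (primer_pmol(c, REACTION_VOLUME_UL) for c in PRIMER_CONC_UM)
    print(f"Primer per reaction: {low:.3f} to {high:.3f} pmol")
    print(f"Programmed hold time: {total_hold_seconds() / 60:.2f} min")
```

Under these assumptions, each 15 µL reaction contains roughly 0.375 to 0.6 pmol of each primer, and the programmed holds add up to a little under 100 minutes.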

Table A: Primer sequences of the microsatellite loci used to estimate genetic variability among Dangi and Khillar cattle breeds

| S. No. | Microsatellite locus | Primer sequence | Repeat unit | Dye label | Amplicon size range (bp) |
|---|---|---|---|---|---|
| 1 | INRA005 | 5'-CAATCTGCATGAAGTATAAATAT-3' | (TG)n | ROX | 132-150 |
| | | 5'-CTTCAGGCATACCCTACACC-3' | | | |
| 2 | CSSM66 | 5'-ACACAAATCCTTTCTGCCAGCTGA-3' | (CA)n | FAM | 172-210 |
| | | 5'-AATTTAATGCACTGAGGAGCTTGG-3' | | | |
| 3 | INRA035 | 5'-ATCCTTTGCAGCCTCCACATTG-3' | (TG)n | FAM | 97-123 |
| | | 5'-TTGTGCTTTATGACACTATCCG-3' | | | |
| 4 | ETH3 | 5'-GAACCTGCCTCTCCTGCATTGG-3' | (TG)n | FAM | 97-123 |
| | | 5'-ACTCTGCCTGTGGCCAAGTAGG-3' | | | |
| 5 | MM12 | 5'-CAAGACAGGTGTTTCAATCT-3' | (CA)n | TAMRA | 91-139 |
| | | 5'-ATCGACTCTGGGGATGATGT-3' | | | |
| 6 | ILSTS033 | 5'-TATTAGAGTGGCTCAGTGCC-3' | (TG)n | ROX | 128-162 |
| | | 5'-ATGCAGACAGTTTTAGAGGG-3' | | | |
| 7 | INRA063 | 5'-ATTTGCACAAGCTAAATCTAACC-3' | (CA)n | ROX | 149-189 |
| | | 5'-AAACCACAGAAATGCTTGGAAG-3' | | | |
| 8 | ILSTS011 | 5'-GCTTGCTACATGGAAAGTGC-3' | (TG)n | FAM | 258-272 |
| | | 5'-CTAAAATGCAGAGCCCTACC-3' | | | |
| 9 | HAUT27 | 5'-TTTTATGTTCATTTTTTGACTGG-3' | (CA)n | HEX | 130-152 |
| | | 5'-AACTGCTGAAATCTCCATCTTA-3' | | | |
| 10 | ETH10 | 5'-GTTCAGGACTGGCCCTGCTAACA-3' | (CA)n | FAM | 203-223 |
| | | 5'-CCTCCAGCCCACTTTCTCTTCTC-3' | | | |
| 11 | BM1824 | 5'-GAGCAAGGTGTTTTTCCAATC-3' | (TG)n | TAMRA | 176-196 |
| | | 5'-CATTCTCCAACTGCTTCCTTG-3' | | | |
| 12 | ILSTS030 | 5'-CTGCAGTTCTGCATATGTGG-3' | (CA)n | TAMRA | 146-160 |
| | | 5'-CTTAGACAACAGGGGTTTGG-3' | | | |
| 13 | ILSTS006 | 5'-TGTCTGTATTTCTGCTGTGG-3' | (TG)n | FAM | 276-302 |
| | | 5'-ACACGGAAGCGATCTAAACG-3' | | | |
| 14 | CSRM60 | 5'-AAGATGTGATCCAAGAGAGAGGCA-3' | (CA)n | FAM | 69-121 |
| | | 5'-AGGACCAGATCGTGAAAGGCATAG-3' | | | |
| 15 | ETH152 | 5'-AGGGAGGGTCACCTCTGC-3' | (TG)n | TAMRA | 184-206 |
| | | 5'-CTTGTACTCGTAGGGCAGGC-3' | | | |
| 16 | ILSTS034 | 5'-AAGGGTCTAAGTCCACTGGC-3' | (TG)n | HEX | 126-196 |
| | | 5'-GACCTGGTTTAGCAGAGAGC-3' | | | |
| 17 | ILSTS005 | 5'-GGAAGCAATGAAATCTATAGCC-3' | (TATATG)n | FAM | 177-193 |
| | | 5'-TGTTCTGTGAGTTTGTAAGC-3' | | | |
| 18 | BM1818 | 5'-AGCTGGGAATATAACCAAAGG-3' | (TG)n | HEX | 235-285 |
| | | 5'-AGTGCTTTCAAGGTCCATGC-3' | | | |

 

Table B: Multiplex PCR panels of markers

| Panel | No. of markers | List of markers |
|---|---|---|
| Panel I | 8 | CSRM60, ILSTS006, MM12, ILSTS030, BM1824, MM8, ILSTS005, ILSTS011 |
| Panel II | 8 | HEL1, HAUT27, BM1818, HEL51, INRA035, ETH10, INRA005, ETH152 |
| Panel III | 6 | CSSM66, ETH3, ILSTS034, HEL9, ILSTS033, INRA063 |


D. Algorithm: Memory-Based Learning (MBL)

Memory-Based Learning (MBL) is based on the hypothesis that performance in cognitive tasks depends on reasoning on the basis of similarity of new situations to stored representations of earlier experiences, rather than on the application of mental rules abstracted from earlier experiences. This approach has surfaced in different contexts using a variety of alternative names such as similarity-based, example-based, exemplar-based, analogical, case-based, instance-based, and lazy learning (Cost and Salzberg, 1993; Aha, Kibler, and Albert, 1991; Aha, 1997). Historically, MBL algorithms are descendants of the k-nearest neighbour (henceforth k-NN) algorithm (Aha, Kibler, and Albert, 1991).

An MBL system has two components, viz. a learning component which is memory-based since it involves adding training instances to memory, and a performance component which is similarity-based. In the performance component, the product of the learning component is used as a basis for mapping input to output; this usually takes the form of performing classification.

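The most basic metric that works for patterns with symbolic features is the Overlap metric (also known as the Hamming distance, Manhattan metric, city-block distance or L1 metric), given in the following equations, where $\Delta(X, Y)$ is the distance between instances $X$ and $Y$, each represented by $n$ features, and $\delta(x_i, y_i)$ is the distance per feature. The k-NN algorithm with this metric is called IB1 (Aha, Kibler and Albert, 1991).

$$\Delta(X, Y) = \sum_{i=1}^{n} \delta(x_i, y_i) \tag{1}$$

where

$$\delta(x_i, y_i) = \begin{cases} \left| \dfrac{x_i - y_i}{\max_i - \min_i} \right| & \text{if feature } i \text{ is numeric} \\ 0 & \text{if } x_i = y_i \\ 1 & \text{if } x_i \neq y_i \end{cases} \tag{2}$$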
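The Overlap metric described above can be sketched in a few lines of Python. This is a minimal illustration rather than TiMBL's implementation: the function names, the optional `ranges` argument used to mark numeric features, and the example allele values are all assumptions made here for illustration. The source does not state how genotypes were encoded as features.

```python
from typing import Optional, Sequence, Tuple


def feature_distance(x, y, value_range: Optional[Tuple[float, float]] = None) -> float:
    """Per-feature distance delta(x_i, y_i).

    Numeric features (value_range given) use the absolute difference scaled
    by the feature's range; symbolic features use a 0/1 mismatch.
    """
    if value_range is not None:
        lo, hi = value_range
        return abs(x - y) / (hi - lo) if hi != lo else 0.0
    return 0.0 if x == y else 1.0


def overlap_distance(X: Sequence, Y: Sequence, ranges: Optional[Sequence] = None) -> float:
    """Distance Delta(X, Y): the sum of per-feature distances over n features."""
    if len(X) != len(Y):
        raise ValueError("instances must have the same number of features")
    if ranges is None:
        ranges = [None] * len(X)  # treat every feature as symbolic
    return sum(feature_distance(x, y, r) for x, y, r in zip(X, Y, ranges))


if __name__ == "__main__":
    # Hypothetical instances: allele sizes treated as symbolic feature values.
    animal_a = ("132", "136", "172", "180")
    animal_b = ("132", "138", "172", "180")
    print(overlap_distance(animal_a, animal_b))  # the two differ in one feature
```

With purely symbolic features the distance is simply the number of mismatching feature values, so two instances that differ at one position are at distance 1.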

The major difference between the IB1 algorithm as implemented in TiMBL (Tilburg Memory-Based Learner) and the algorithm originally proposed by Aha, Kibler, and Albert (1991) is that, in the TiMBL version, the value of k refers to the k nearest distances rather than the k nearest examples. With k = 1, for instance, TiMBL’s nearest-neighbour set can contain several instances that are equally distant from the test instance. TiMBL, which is used in this study, is an open-source software package implementing several memory-based learning (MBL) algorithms.
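The distinction between the k nearest distances and the k nearest examples can be illustrated with a small Python sketch. It is not TiMBL code: the class and method names are hypothetical, the distance is a plain mismatch count over symbolic features, the tie-breaking in the vote is a simplification, and the toy instances and labels are made up for illustration.

```python
from collections import Counter, defaultdict


def mismatch_count(x, y) -> int:
    """Overlap distance for symbolic features: number of differing positions."""
    return sum(a != b for a, b in zip(x, y))


class MemoryBasedClassifier:
    """Toy memory-based learner in which k counts distinct distances."""

    def __init__(self, k: int = 1):
        self.k = k
        self.memory = []  # stored (features, label) pairs

    def fit(self, instances, labels):
        # Learning component: simply add the training instances to memory.
        self.memory.extend(zip(instances, labels))
        return self

    def neighbour_set(self, query):
        # Group stored labels by their distance to the query.
        by_distance = defaultdict(list)
        for features, label in self.memory:
            by_distance[mismatch_count(features, query)].append(label)
        # Keep every instance lying at one of the k smallest distances.
        nearest = sorted(by_distance)[: self.k]
        return [label for d in nearest for label in by_distance[d]]

    def predict(self, query):
        # Performance component: majority vote over the neighbour set.
        if not self.memory:
            raise ValueError("no instances stored in memory")
        votes = Counter(self.neighbour_set(query))
        return votes.most_common(1)[0][0]  # ties go to the label seen first


if __name__ == "__main__":
    # Made-up instances and labels, for illustration only.
    X = [("a", "b", "c"), ("a", "b", "d"), ("a", "e", "c"), ("f", "g", "h")]
    y = ["breed_A", "breed_A", "breed_B", "breed_B"]
    clf = MemoryBasedClassifier(k=1).fit(X, y)
    query = ("a", "b", "e")
    print(clf.neighbour_set(query))
    print(clf.predict(query))
```

In this toy example two stored instances lie at the smallest distance from the query, so with k = 1 the neighbour set already contains both of them, whereas a k-nearest-examples rule would keep only one.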