Machine learning refers to the development of systems that can learn from data. A machine learning algorithm can, after exposure to an initial set of data, evaluate new, previously unseen examples and relate them to the initial "training" data. Machine learning is ideally suited to classification problems that involve implicit patterns, and it is most effective when used with large amounts of data. Although machine learning has not previously been used in DNA mixture analysis, it is well suited to such analysis because of two key problem characteristics. First, there is a large repository of human DNA mixture data in electronic format. Second, patterns in such data are often obscure and beyond the capability of manual analysis; however, they can be statistically evaluated by using one or more machine learning algorithms. The system was trained, tested, and validated using electronic data obtained from 1,405 non-simulated DNA mixture samples composed of 1 to 4 contributors and generated from a combination of 16 individuals. This report concludes that the proposed method for DNA mixture deconvolution, including determining the number of contributors, is a robust and reproducible method that was developed using an expansive set of AmpFlSTR Identifiler PCR Amplification Kit data. A description of materials and methods covers data acquisition and exportation, the locus-sample-specific threshold (LSST) calculation, data partitioning, feature scaling, feature selection, and machine learning algorithms. The optimized system will be discussed in more detail in the Final Report. 10 figures, 8 tables, and 21 references.
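As a rough illustration of the data partitioning, feature scaling, and classification steps listed above, the following minimal Python sketch predicts the number of contributors from two features. Everything in it is an assumption made for illustration: the abstract does not name the features, the scaling method, or the algorithms used, so the two toy features, the min-max scaling, the nearest-centroid classifier, and all function names are hypothetical stand-ins. The data are synthetic placeholders, not values from the report, and the LSST calculation and feature selection steps are omitted.

```python
import random
from collections import defaultdict


def stratified_partition(samples, test_fraction=0.25, seed=0):
    """Split (features, label) pairs into train/test, keeping every label in train."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[label].append((features, label))
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = int(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def fit_min_max(rows):
    """Per-feature (min, max), computed from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row, bounds):
    """Min-max scale one row with training bounds; constant features map to 0."""
    return [(x - lo) / (hi - lo) if hi > lo else 0.0
            for x, (lo, hi) in zip(row, bounds)]


def fit_centroids(rows, labels):
    """Mean feature vector per label (nearest-centroid classifier)."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {label: [sum(col) / len(col) for col in zip(*group)]
            for label, group in grouped.items()}


def predict(row, centroids):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Synthetic placeholder data: two made-up features per sample, labeled with
# a hypothetical number of contributors (1 to 4). Not data from the report.
samples = [([2.0 * n + d, 1.2 * n + d / 2], n)
           for n in (1, 2, 3, 4)
           for d in (-0.2, -0.1, 0.0, 0.1, 0.2)]

train, test = stratified_partition(samples)
bounds = fit_min_max([features for features, _ in train])
train_x = [scale(features, bounds) for features, _ in train]
train_y = [label for _, label in train]
centroids = fit_centroids(train_x, train_y)

correct = sum(predict(scale(features, bounds), centroids) == label
              for features, label in test)
accuracy = correct / len(test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

The scaling bounds are fitted on the training split alone and then reused for the held-out samples, so no information from the test data leaks into training. The accuracy printed here says nothing about real mixture data, since the toy classes are separable by construction.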
Similar Publications
- Detection of Ignitable Liquid Residues in Fire Debris by Using Direct Analysis in RealTime Mass Spectrometry (DART-MS)
- Post-burn and Post-blast Rapid Detection of Trace and Bulk Energetics by 3D-printed Cone Spray Ionization Mass Spectrometry
- Skeletal Trauma in Forensic Anthropology: Improving the Accuracy of Trauma Analysis and Expert Testimony