Statistical Examination of Handwriting Characteristics using Automated Tools


Sargur N. Srihari

Abstract

In the examination of handwritten items, the questioned document (QD) examiner follows a sequence of steps, many of which involve a degree of uncertainty that must be resolved by experience and power of recall. Examples of such decisions are: determining type/comparability, determining whether there is an adequate quantity of information, and determining whether a set of characteristics is individualizing or representative of a class. Statistical models can play a significant role in assisting the QD examiner in dealing with this uncertainty. In particular, the need for a statistical description of handwriting characteristics has long been felt. Efforts have been limited by the lack of efficient methods for collecting the data, the computational problems of dealing with the very large number of combinatorial possibilities, and the lack of clear direction for use of such results by the QD examiner. This research developed new statistical methods and software tools to: (i) extract samples of commonly encountered letter forms from the handwriting of typical writers, (ii) determine characteristics that would be used by QD examiners to describe common letter forms, (iii) have QD examiners enter perceived characteristics of the samples with a user interface, (iv) determine the frequency of occurrence of combinations of handwriting characteristics, (v) use those frequencies to construct a probabilistic model while handling the combinatorial possibilities and sample requirements, and (vi) use such models to infer the probability of characteristics, both to determine whether they are individualizing and to assist in forming an opinion. Previously collected samples of extended handwriting, whose writers were representative of the United States population, were used to extract snippets of common letter combinations. From these scanned images the letter combinations "th" and "and" were extracted.
The word snippets of each writer were presented to QD examiners, who entered values for several characteristics using an interactive tool developed for the purpose; the characteristics depended on writing type: cursive or hand-printed. From this data the frequencies of the characteristics and their combinations were evaluated. Since the number of combinations of characteristics is very large, exact statistical models are infeasible. Instead, probabilistic graphical models are used to model the joint distribution. Both directed and undirected graphical models were learnt from data using algorithms that rely on independence tests between pairs of variables and a global goodness measure. Methods for inferring useful probabilities from the models were developed, e.g., rarity as a measure of individualizing characteristics, and the probability of random correspondence of the observed sample among n writers. Using these methods, the probabilities of nearly 1,500 writing styles of "and" were determined and tabulated. An indication of how the developed techniques can be incorporated into the workflow of the QD examiner is given.
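
As a rough illustration of the two inferred quantities named above, the sketch below computes a rarity score and the chance that a given writing style recurs among n writers. It is not taken from the report: the function names are hypothetical, rarity is assumed here to be the reciprocal of the style's probability, and writers are assumed to be independent, so the report's own definitions may differ.

```python
def rarity(p):
    """Rarity of a writing style, ASSUMED here to be 1 / p.

    p is the probability of the style under some fitted model.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return 1.0 / p


def prob_match_among_n(p, n):
    """Probability that at least one of n other writers shows the same style.

    ASSUMPTION: writers are independent and each shows the style
    with probability p, so the chance that none matches is (1 - p) ** n.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    style_probability = 0.01  # made-up value for illustration only
    print(rarity(style_probability))
    print(prob_match_among_n(style_probability, 100))
```

Under these assumptions, a style that looks rare in isolation can still have a substantial chance of recurring once the pool of candidate writers is large, which is why the two quantities are reported separately.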

