Christopher Calderon
Numerica Corporation

Coupling Multiple Hypothesis Testing with Proportion Estimation in Heterogeneous Categorical Sensor Signal Networks

This is joint work with A. Jones, S. Lundberg, and R. Paffenroth.
False alarms generated by sensors pose a substantial problem to a variety of fusion applications. We focus on situations where genuine alarms are “rare” but the false alarm rate is relatively high. The work is motivated by chemical and biological threat detection applications. The goal is to mitigate false alarms while retaining high power to detect true events (in the applications of interest, missing a true signal is considered much more detrimental than declaring a false alarm). Furthermore, we would like to “fuse information” by utilizing a multiple testing framework. Problems facing our application include: 1) the frequency of a genuine rare attack is not easy to quantify; 2) the misclassification rates are often unknown (or are not accurately described by nominal false alarm rates); 3) the statistical properties differ substantially from sensor to sensor. We propose to utilize data streams contaminated by false alarms (generated in the field) to compute statistics on sensor misclassification rates. The nominal misclassification rate of a deployed sensor is often suspect because it is unlikely that these estimates were tuned to the specific environmental conditions in which the sensor was deployed (i.e., sensor performance can exhibit nontrivial spatial and temporal effects). Recent categorical measurement error methods will be applied to the collection of data streams to “train” the sensors and provide point estimates, along with confidence intervals, for the misclassification rates and the estimated prevalence. Open questions remain as to how best to combine these estimated signals to make a decision about the presence of a chemical or biological threat. There are also questions on how to efficiently assess or detect changes in population parameters by statistical means.
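As a rough illustration of why the misclassification rates matter when the true event is rare, the sketch below corrects an observed alarm fraction for a single binary sensor using the standard Rogan-Gladen style adjustment. It is not the categorical measurement error method referred to above: it assumes the false positive and false negative rates are already known or estimated, and the function name and example numbers are hypothetical.

```python
def corrected_prevalence(alarm_fraction, false_pos_rate, false_neg_rate):
    """Adjust an observed alarm fraction for misclassification.

    Assumes one binary sensor whose false positive and false negative
    rates are given (an assumption; in the field they must be estimated).
    """
    sensitivity = 1.0 - false_neg_rate
    # Equals sensitivity + specificity - 1; must be positive for the
    # sensor to carry any information about the true state.
    denom = sensitivity - false_pos_rate
    if denom <= 0:
        raise ValueError("sensor is no better than chance")
    estimate = (alarm_fraction - false_pos_rate) / denom
    # Clip to [0, 1]: sampling noise can push the raw estimate outside.
    return min(1.0, max(0.0, estimate))


# Hypothetical numbers: 5.9% of readings alarm, but 5% are false positives.
print(corrected_prevalence(0.059, 0.05, 0.05))
```

With these made-up inputs most of the observed alarms are attributable to the false positive rate, which is why a nominal rate that is slightly off can dominate the prevalence estimate.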
Directions explored to date include false discovery rate methods aiming to roughly incorporate correlation effects into the computed false discovery rate statistics via “empirical nulls”. We have also started preliminary work investigating resampling-based approaches applied to “dimension-reduced” sensor output, with the hope that a more precise estimate of the correlation between the reduced dimensions can be empirically obtained and utilized in testing and decision making.
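A minimal sketch of the false discovery rate direction, under stated assumptions: each sensor is summarized by a single z-score, the null center and scale are estimated from the data with a median and MAD (a crude stand-in for a fitted empirical null, not the estimator used in this work), and the Benjamini-Hochberg step-up rule is applied to the resulting one-sided p-values. All function names and the level `q` are hypothetical.

```python
from statistics import NormalDist, median


def empirical_null(z_scores):
    """Robust center and scale of the bulk of the z-scores."""
    center = median(z_scores)
    mad = median([abs(z - center) for z in z_scores])
    return center, 1.4826 * mad  # 1.4826 makes MAD consistent for a normal


def benjamini_hochberg(p_values, q=0.1):
    """Indices rejected by the Benjamini-Hochberg step-up rule at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff = rank  # keep the largest rank that passes
    return sorted(order[:cutoff])


def flag_sensors(z_scores, q=0.1):
    """Flag sensors whose z-scores are unusually high under the estimated null."""
    center, scale = empirical_null(z_scores)
    if scale == 0:
        raise ValueError("cannot estimate a null scale from constant data")
    null = NormalDist(center, scale)
    p_values = [1.0 - null.cdf(z) for z in z_scores]  # one-sided, upper tail
    return benjamini_hochberg(p_values, q)
```

Estimating the null from the data rather than assuming a standard normal is what lets the procedure absorb, at least roughly, a shift or inflation of the z-scores caused by correlation across sensors.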