Netstat Ltd.

Probabilistic networks --- a remarkable technology

Probabilistic networks are a class of statistical modeling tools for structurally inter-related observations (random variables). They have been an area of intense research and development during the last two decades, particularly since the publication of Probabilistic Reasoning in Intelligent Systems by Judea Pearl, Morgan Kaufmann Publishers, San Mateo, 1988. Also known as "Probabilistic Graphical Models", they include subclasses known under the names "Bayesian Networks", "Markov Networks", "Hidden Markov Models (HMM)" and "Decision Graphs".

The main advantage of probabilistic networks lies in their capability to extend the reach of formal statistical analysis beyond just a few variables. Under general conditions, modeling the combined state of interrelated random variables quickly gets out of hand and becomes computationally intractable due to a "combinatorial explosion" of the joint state space. Consequently, the number of observations required to model and estimate this general case quickly becomes astronomical.
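To give a sense of scale, the following minimal sketch counts the entries of an unrestricted joint probability table. It assumes every variable has the same number of discrete states; the function name joint_table_size is purely illustrative.

```python
def joint_table_size(num_variables: int, states_per_variable: int = 2) -> int:
    """Entries in a joint probability table with no structural assumptions.

    Assumes (for illustration only) that every variable has the same
    number of discrete states.
    """
    return states_per_variable ** num_variables


if __name__ == "__main__":
    # The table size grows exponentially with the number of variables.
    for n in (5, 10, 20, 40):
        print(f"{n:>2} binary variables: {joint_table_size(n):,} joint states")
```

With only 40 binary variables the table already has more than a trillion entries, each of which would need to be estimated from data.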

Probabilistic networks exploit structural relationships between the random variables to contain this combinatorial explosion. Key to this is a generalization of the so-called Markov-chain property of probability theory from simple chains to more general network (graph) structures: each variable depends directly on only a few neighbouring variables rather than on all the others. The network structures are intuitively interpretable as associations or causal relationships between the variables of the application domain, while also allowing for efficient algorithms. As a result, many problems that were previously computationally intractable become feasible. A multitude of statistical methods are applicable to these models, including parameter estimation, finding the most likely explanation given partial observations, and automated decision procedures, all backed up by solid theoretical foundations.
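The effect of exploiting structure can be illustrated with a small sketch. It compares the number of free parameters of an unrestricted joint distribution with that of a simple chain, and builds the joint distribution of a three-variable chain A -> B -> C from its local factors. All names and probability values here are made up for illustration and are not taken from any particular software package.

```python
from itertools import product


def full_joint_parameters(n: int, k: int = 2) -> int:
    """Free parameters of an unrestricted joint over n variables with k states."""
    return k ** n - 1


def network_parameters(parents: dict, k: int = 2) -> int:
    """Free parameters of a network, given each variable's list of parents.

    Each variable needs (k - 1) numbers per configuration of its parents.
    """
    return sum((k - 1) * k ** len(p) for p in parents.values())


def chain_parameters(n: int, k: int = 2) -> int:
    """Special case: a chain X1 -> X2 -> ... -> Xn."""
    return (k - 1) + (n - 1) * k * (k - 1)


# Hypothetical chain A -> B -> C with made-up probabilities.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.5, 1: 0.5}}


def chain_joint(a: int, b: int, c: int) -> float:
    """Markov-chain factorization: P(a, b, c) = P(a) * P(b | a) * P(c | b)."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]


# The factorized joint is still a proper distribution (sums to 1).
total = sum(chain_joint(a, b, c) for a, b, c in product((0, 1), repeat=3))

if __name__ == "__main__":
    print("full joint, 20 binary variables:", full_joint_parameters(20))
    print("chain, 20 binary variables:     ", chain_parameters(20))
    print("sum of factorized joint:        ", round(total, 12))
```

For 20 binary variables the chain needs 39 parameters instead of 1,048,575. General network structures fall between these two extremes, depending on how many parents each variable has.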

The current state of the art is extensively covered in the (1,200-page!) textbook Probabilistic Graphical Models by Daphne Koller and Nir Friedman, MIT Press, Cambridge, MA, 2009.

A number of software implementations of these methods exist, and their potential is remarkable, as many research prototypes of domain applications have demonstrated. But there are two main impediments to more widespread use. First, the statistical methodology is still relatively little known beyond its core community of research institutions. The uptake of current research into mainstream practice appears to be much slower among statistics practitioners than in the traditionally faster-paced IT community. Second, for their full benefits to take effect, probabilistic networks must be well integrated into the systems, processes and information flow of the application domain they support.

Successful applications range from fault diagnostics of motor vehicles to detection of insurance fraud, from environmental monitoring to medical diagnostics in constrained indication areas, from image recognition to autonomous robotics.

But the core idea is simple: wherever repeated assessments or decisions have to be made in a standardised setting governed by limited observations and statistical uncertainty, probabilistic networks can deliver optimal solutions. Netstat Ltd. intends to overcome the two major impediments to technology transfer by working closely with our clients to develop realistic statistical models of their application domains and to integrate them seamlessly into existing information flows. We also intend to pursue further market development by identifying promising areas of application and demonstrating their commercial value.

External links on probabilistic networks: