Our goal is to learn about the political interests and preferences of Members of Parliament (MPs) by mining their parliamentary activity in order to develop a recommendation/filtering system to determine how relevant documents should be distributed among MPs. We propose the use of positive unlabeled learning to tackle this problem since we only have information about relevant documents (the interventions of each MP in debates) but not about irrelevant documents and so it is not possible to use standard binary classifiers which have been trained with positive and negative examples. Additionally, we have also developed a new positive unlabeled learning algorithm that compares favourably with: a) a baseline approach which assumes that every intervention by any other MP is irrelevant; b) another well-known positive unlabeled learning method; and c) an approach based on information retrieval methods that matches documents and legislators' representations. The experiments have been conducted with data from the regional Spanish Andalusian Parliament.

In this paper, we examine the problem of building a user profile from a set of documents. This profile will consist of a subset of the most representative terms in the documents that best represent user preferences or interests. Inspired by the discrete concentration theory we have conducted an axiomatic study of seven properties that a selection function should fulfill: the minimum and maximum uncertainty principle, invariant to adding zeros, invariant to scale transformations, principle of nominal increase, transfer principle and the richest get richer inequality. We also present a novel selection function based on the use of similarity metrics, and more specifically the cosine measure which is commonly used in information retrieval, and demonstrate that this verifies six of the properties in addition to a weaker variant of the transfer principle, thereby representing a good selection approach. The theoretical study was complemented with an empirical study to compare the performance of different selection criteria (weight- and unweight-based) using real data in a parliamentary setting. In this study, we analyze the performance of the different functions focusing on the two main factors affecting the selection process: profile size (number of terms) and weight distribution. These profiles are then used in a document filtering task to show that our similarity-based approach performs well in terms not only of recommendation accuracy but also efficiency (we obtain smaller profiles and consequently faster recommendations).

One step towards breaking down barriers between citizens and politicians is to help people identify those politicians who share their concerns. This paper is set in the field of expert finding and is based on the automatic construction of politicians' profiles from their speeches on parliamentary committees. These committee-based profiles are treated as documents and are indexed by an information retrieval system. Given a query representing a citizen's concern, a profile ranking is then obtained. In the final step, the different results for each candidate are combined in order to obtain the final politician ranking. We explore the use of classic combination strategies for this purpose and present a new approach that improves state-of-the-art performance and which is more stable under different conditions. We also introduce a two-stage model where the identification of a broader concept (such as the committee) is used to improve the final politician ranking.

In the context of e-government and more specifically that of parliament, this paper tackles the problem of finding Members of Parliament (MPs) according to their profiles which have been built from their speeches in plenary or committee sessions. The paper presents a common solution for two problems: firstly, a member of the public who is concerned about a certain issue might want to know who the best MP is for dealing with their problem (recommending task); and secondly, each new piece of textual information that reaches the house must be correctly allocated to the appropriate MP according to its content (filtering task). This paper explores both these ways of searching for relevant people conceptually by encapsulating them into a single problem: that of searching for the relevant MP's profile given an information need. Our research work proposes various profile construction methods (by selecting and weighting appropriate terms) and compares these using different retrieval models to evaluate their quality and suitability for different types of information needs in order to simulate real and common situations.

The amount of information we are exposed to on a daily basis is increasing exponentially. Besides, Information Retrieval Systems (IRSs) return the same results for a given query regardless of who submitted it. In order to address the problems of finding useful, relevant information and adapting the results to the user, the use of personalization techniques is now more necessary than ever. They are not, however, particularly popular in live environments as users remain unconvinced about their reliability and, more importantly, are concerned about privacy issues. We have developed and compared six generic user profile representations in order to improve the personalization process and address the problem of privacy. We propose a new weighting scheme to build the profiles and a new personalization technique to join the advantages of some of the previous profiles. A comprehensive evaluation study of the proposed generic user profiles was performed and this revealed very good personalization performance results and some interesting conclusions about their use in a political context, more specifically with official documents from the Andalusian Parliament.

Due to the information overload we are faced with nowadays, personalization services are becoming almost essential, in order to find relevant information tailored to each individual or group of people with common interests. Therefore, it is very important to be able to build efficient and robust personalization techniques to be part of these services. The evaluation step is a crucial stage in their development and improvement, so much more research is needed to overcome this issue. We have proposed an automatic evaluation methodology for personalized information retrieval systems (ASPIRE), which joins the advantages of both system-centred (repeatable, comparable and generalizable results) and user-centred (considers the user) evaluation approaches, and makes the evaluation process easy and fast. Its reliability and robustness have been assessed by means of a user-oriented evaluation. ASPIRE may be considered as an interesting alternative to the costly and difficult user studies, able to discriminate between either different personalization techniques or different parameter configurations of a given personalization method.

Multilabel classification is a relatively recent subfield of machine learning. Unlike to the classical approach, where instances are labeled with only one category, in multilabel classification, an arbitrary number of categories is chosen to label an instance. Due to the problem complexity (the solution is one among an exponential number of alternatives), a very common solution (the {\em binary} method) is frequently used, learning a binary classifier for every category, and combining them all afterwards. The assumption taken in this solution is not realistic, and in this work we give examples where the decisions for all the labels are not taken independently, and thus, a supervised approach should learn those existing relationships among categories to make a better classification. Therefore, we show here a generic methodology that can improve the results obtained by a set of independent probabilistic binary classifiers, by using a combination procedure with a classifier trained on the co-occurrences of the labels. We show an exhaustive experimentation in three different standard corpora of labeled documents (Reuters-21578, Ohsumed-23 and RCV1), which present noticeable improvements in all of them, when using our methodology, in three probabilistic base classifiers.

As the amount of information increases every day and the users normally formulate short and ambiguous queries, personalized search techniques are becoming almost a must. Using the information about the user stored in a user profile, these techniques retrieve results that are closer to the user preferences. On the other hand, the information is being stored more and more in an semi-structured way, and XML has emerged as a standard for representing and exchanging this type of data. XML search allows a higher retrieval effectiveness, due to its ability to retrieve and to show the user specific parts of the documents instead of the full document. In this paper we propose several personalization techniques in the context of XML retrieval. We try to combine the different approaches where personalization may be applied: query reformulation, reranking of results and retrieval model modification. The experimental results obtained from a user study using a parliamentary document collection support the validity of our approach.

Within probabilistic classification problems, learning the Markov boundary of the class variable consists in the optimal approach for feature subset selection. In this paper we propose two algorithms that learn the Markov boundary of a selected variable. These algorithms are based on the score+search paradigm for learning Bayesian networks. Both algorithms use standard scoring functions but they perform the search in constrained spaces of class-focused directed acyclic graphs, going through the space by means of operators adapted for the problem. The algorithms have been validated experimentally by using a wide spectrum of databases, and their results show a performance competitive with the state-of-the-art.

This paper presents a new approach for memory-based collaborative filtering algorithms. In general, user-based rating prediction is a process in which each neighbor suggests a rating for the target item and the suggestions are combined by weighting the contribution of each neighbor. We present a new alternative that is independent of user rating scales and is based on what we call predictive probabilities. We explore how these probabilities can be used to select nearest neighbors for recommendations and integrate different types of dependence in the ratings. The neighborhood selection criterion depends on the capability of a user to predict past ratings. Our hypothesis is that if a user was good when predicting past ratings for an active user, then his predictions will also be helpful in recommending ratings for the same user in the future.

We consider the scenario in which an automatic classifier (previously built) is available. It is used to classify new instances but, in some cases, the classifier may request the intervention of a human (the oracle), who gives it the correct class. In this scenario, first it is necessary to study how the performance of the system should be evaluated, as it cannot be based solely on the predictive accuracy obtained by the classifier but it should also take into account the cost of the human intervention; second, studying the concrete circumstances under which the classifier decides to query the oracle is also important. In this paper we study these two questions and include also an experimental evaluation of the different proposed alternatives.

Recommender systems enable users to access products or articles that they would otherwise not be aware of due to the wealth of information to be found on the Internet. The two traditional recommendation techniques are content-based and collaborative filtering. While both methods have their advantages, they also have certain disadvantages, some of which can be solved by combining both techniques to improve the quality of the recommendation. The resulting system is known as a hybrid recommender system. In the context of artificial intelligence, Bayesian networks have been widely and successfully applied to problems with a high level of uncertainty. The field of recommendation represents a very interesting testing ground to put these probabilistic tools into practice. This paper therefore presents a new Bayesian network model to deal with the problem of hybrid recommendation by combining content-based and collaborative features. It has been tailored to the problem in hand and is equipped with a flexible topology and efficient mechanisms to estimate the required probability distributions so that probabilistic inference may be performed. The effectiveness of the model is demonstrated using the Movie-Lens and IMDB data sets.

Focusing on the context of XML retrieval, in this paper we propose a general methodology for managing structured queries (involving both content and structure) within any given structured probabilistic information retrieval system which is able to compute posterior probabilities of relevance for structural components given a non-structured query (involving only query terms but not structural restrictions). We have tested our proposal using two specific information retrieval systems (Garnata and PF/Tijah), and the structured document collections from the last six editions of the INitiative for the Evaluation of XML Retrieval (INEX).

Building Recommender Systems has received considerable attention in recent years. The main problem with those systems is presented in those items for which we have little information that causes incorrect predictions. In this paper we present a novel idea with which we are going to obtain quality information from the database, and which will allow to improve the predictions of the systems. To test our idea, we are going to use two different Recommender Systems (an own probabilistic model and other based on neighbourhood foreing to ourself) and different databases (Jester and Movielens).

In the Andalusian Parliament, at Spain, all the matters discussed in the sessions are stored in two different media: a textual transcription containing all the speeches of the Members of the Parliament, plus a video recording all the session. The only link between these two information sources is the event that they represent. But it could be very interesting and useful for a user, that given a part of a record of parliamentary proceedings, she could access directly to the part of the video related to this text. To make this connection, a previous process of video segmentation must be performed, in order to split the video in the most appropriate units. Later, a synchronization step between the different parts of the transcriptions and the segments is needed. In this paper, approaches to give a solution to this problem are presented.

Purpose: This paper presents an overview of the reorganization of the Andalusian Parliament's digital library to improve the electronic representation and access of its official corpus by taking advantage of a documents' internal organization. Video recordings of the parliamentary sessions have also been integrated with their corresponding textual transcriptions. Design/methodology/approach: After analysing the state of the Andalusian parliament's digital library and determining the aspects that could be improved both in the repository and access mechanisms, this paper describes each component of the developed integrated information system. Findings: In this project, we have developed a methodology to tackle the problem and this methodology could be applied to other similar institutions and organizations. Exploiting the internal structure of the Parliament's official documents has also proved to be extremely interesting for users as they are directed towards the most relevant parts of the documents. Originality/value: The application of an information retrieval system for structured documents to a real framework and the integration of multimedia sources (e.g. text and video) for retrieval purposes.

We propose a method which, given a document to be classified, automatically generates an ordered set of appropriate descriptors extracted from a thesaurus. The method creates a Bayesian network to model the thesaurus and uses probabilistic inference to select the set of descriptors having high posterior probability of being relevant given the available evidence (the document to be classified). Our model can be used without having preclassified training documents, although it improves its performance as long as more training data become available. We have tested the classification model using a document dataset containing parliamentary resolutions from the regional Parliament of Andalucia at Spain, which were manually indexed from the Eurovoc thesaurus, also carrying out an experimental comparison with other standard text classifiers.

While the problem of building recommender systems has attracted considerable attention in recent years, most recommender systems are designed for recommending items to individuals. The aim of this paper is to automatically recommend and rank a list of new items to a group of users. We will investigate the value of using Bayesian networks to represent the different uncertainties involved in a group recommending process, i.e. those uncertainties related to mechanisms that govern the interactions between group members and the processes leading to the final choice or recommendation. We will also show how the most common aggregation strategies might be encoded using a Bayesian network formalism. The proposed model can be considered as a collaborative Bayesian network-based group recommender system, where group ratings are computed from the past voting patterns of other users with similar tastes.

The problem of building Recommender Systems has attracted considerable attention in recent years. The objective of this paper is to automatically suggest and rank a list of new items to a user based on the past voting patterns of other users with similar tastes. The proposed model can be considered as a collaborative Bayesian network-based Recommender System. In order to model the problem, we will also take into account the fact that the user can usually use vague ratings for the products, which might be represented as fuzzy labels. The combination of Bayesian networks that enables an intuitive representation of the mechanisms that govern the relationships between the users and also the Fuzzy Set theory enabling us to represent ambiguity or vagueness in the description of the ratings, and this in turn improves the accuracy of the system.

The use of several types of structural restrictions within algorithms for learning Bayesian networks is considered. These restrictions may codify expert knowledge in a given domain, in such a way that a Bayesian network representing this domain should satisfy them. The main goal of this paper is to study whether the algorithms for automatically learning the structure of a Bayesian network from data can obtain better results by using this prior knowledge. Three types of restrictions are formally defined: existence of arcs and/or edges, absence of arcs and/or edges, and ordering restrictions. We analize the possible interactions between these types of restrictions and also how the restrictions can be managed within Bayesian network learning algorithms based on both the score+search and conditional independence paradigms. Then we particularize our study to two classical learning algorithms: a local search algorithm guided by a scoring function, with the operators of arc addition, arc removal and arc reversal, and the PC algorithm. We also carry out experiments using these two algorithms on several data sets.

We propose a new scoring function for learning Bayesian networks from data using score+search algorithms. This is based on the concept of mutual information and exploits some well-known properties of this measure in a novel way. Essentially, a statistical independence test based on the chi-square distribution, associated with the mutual information measure, together with a property of additive decomposition of this measure, are combined in order to measure the degree of interaction between each variable and its parent variables in the network. The result is a non-Bayesian scoring function called MIT (mutual information tests) which belongs to the family of scores based on information theory. The MIT score also represents a penalization of the Kullback-Leibler divergence between the joint probability distributions associated with a candidate network and with the available dataset. Detailed results of a complete experimental evaluation of the proposed scoring function and its comparison with the well-known K2, BDeu and BIC/MDL scores are also presented.

There is a commonly held opinion that the algorithms for learning unrestricted types of Bayesian networks, especially those based on the score+search paradigm, are not suitable for building competitive Bayesian network-based classifiers. Several specialized algorithms that carry out the search into different types of directed acyclic graph (DAG) topologies have since been developed, most of these being extensions (using augmenting arcs) or modifications of the Naive Bayes basic topology. In this paper, we present a new algorithm to induce classifiers based on Bayesian networks which obtains excellent results even when standard scoring functions are used. The method performs a simple local search in a space unlike unrestricted or augmented DAGs. Our search space consists of a type of partially directed acyclic graph (PDAG) which combines two concepts of DAG equivalence: classification equivalence and independence equivalence. The results of exhaustive experimentation indicate that the proposed method can compete with state-of-the-art algorithms for classification.

Due to the uncertainty of many of the factors that influence the performance of an emergency medical service, we propose using Bayesian networks to model this kind of system. We use different algorithms for learning Bayesian networks in order to build several models, from the hospital manager's point of view, and apply them to the specific case of the emergency service of a Spanish hospital. This first study of a real problem includes preliminary data processing, the experiments carried out, the comparison of the algorithms from different perspectives, and some potential uses of Bayesian networks for management problems in the health service.

The retrieval performance of an Information Retrieval System usually increases when it uses the relationships among the terms contained in a given document collection. But the main problems of this use are how to obtain these relationships efficiently and later, how to use them to retrieve documents given a user's query. To exploit term relationships, overcoming these two drawbacks, in this paper a new retrieval model based on a Bayesian network that represents term relationships is presented. An efficient learning method to capture these relationships, based on term clustering, as well as their use for retrieval purposes, are also shown.

In this paper we present an Information Retrieval System (IRS) which is able to work with structured document collections. The model is based on the Influence Diagrams formalism: a generalization of Bayesian Networks that provides a visual representation of a decision problem. These offer an intuitive way to identify and display the essential elements of the domain (the structured document components and their usefulness) and also how these are related to each other. They have also associated quantitative knowledge that measures the strength of the interactions. By means of this approach, we shall present structured retrieval as a decision-making problem. Two different models have been designed: SID (Simple Influence Diagram) and CID (Context-based Influence Diagram). The main difference between these two models is that the latter also takes into account influences provided by the context in which each structural component is located.

A common approach for learning Bayesian networks from data is based on the use of a scoring metric, to evaluate the fitness of any given candidate network to the data, and a method to explore the search space, which usually is the set of directed acyclic graphs. The most efficient search methods used in this context are greedy Hill-Climbing, either deterministic or stochastic. One of these methods that has been applied with some success is Hill-Climbing with random restart. In this paper we study a new algorithm of this type to restart a local search when it is trapped at a local optimum. It uses problem-specific knowledge about Bayesian networks and the information provided by the database itself (by testing the conditional independences which are true in the current solution of the search process). We also study a new definition of neighborhood for the space of directed acyclic graphs, by using the classical operators of arc addition and arc deletion togheter with a new operator for arc reversal. The proposed methods are empirically tested using two different domains: ALARM and INSURANCE.

In this paper a new probabilistic information retrieval model, based on Bayesian networks, is proposed. We first consider a basic model, which represents only direct relationships between the documents in the collection and the terms or keywords used to index them. Next, we study two versions of an extended model, that also represents direct relationships between documents. In either case the Bayesian networks are used to efficiently compute, by means of a new and exact propagation algorithm, the posterior probabilities of relevance of the documents in the collection given a query. The performance of the proposed retrieval models is tested through a series of experiments with several standard document collections.

Relevance Feedback consists in automatically formulating a new query according to the relevance judgments provided by the user after evaluating a set of retrieved documents. In this paper, we introduce several relevance feedback methods for the {\it Bayesian Network Retrieval Model}. The theoretical frame on which our methods are based uses the concept of partial evidences, which summarize the new pieces of information gathered after evaluating the results obtained by the original query. These partial evidences are inserted into the underlying Bayesian network and a new inference process (probabilities propagation) is run in order to compute the posterior relevance probabilities of the documents in the collection given the new query. The quality of the proposed methods is tested using a preliminary experimentation with different standard document collections.

Although many algorithms have been designed to construct Bayesian network structures using different approaches and principles, they all employ only two methods: those based on independence criteria, and those based on a scoring function and a search procedure (although some methods combine the two). Within the score+search paradigm, the dominant approach uses local search methods in the space of directed acyclic graphs (DAGs), where the usual choices for defining the elementary modifications (local changes) that can be applied are arc addition, arc deletion, and arc reversal. In this paper, we propose a new local search method that uses a different search space, and which takes account of the concept of equivalence between network structures: restricted acyclic partially directed graphs (RPDAGs). In this way, the number of different configurations of the search space is reduced, thus improving efficiency. Moreover, although the final result must necessarily be a local optimum given the nature of the search method, the topology of the new search space, which avoids making early decisions about the directions of the arcs, may help to find better local optima than those obtained by searching in the DAG space. Detailed results of the evaluation of the proposed search method on several test problems, including the well-known Alarm Monitoring System, are also presented.

This paper presents an Information Retrieval model based on the Bayesian Network formalism. The topology of the network (representing the dependence relationships between terms and documents) as well as the quantitative knowledge (the probabilities encoding the strength of these relationships) will be mined from the document collection using automatic learning algorithms. The relevance of a document to a given query is obtained by means of an inference process through a complex network of dependences. A new inference technique, called Propagation+Evaluation, has been developed in order to obtain the exact probabilities of relevance in the whole network efficiently.

The Bayesian Network Retrieval Model is able to represent the main (in)dependence relationships among the terms from a document collection by means of a specific type of Bayesian network, namely a polytree. But, although the learning and propagation algorithms designed for this topology are very efficient, in collections where the number of terms is very large, these two tasks could be very time-consuming processes. In this paper it is shown how, by reducing the size of the polytree, which will be composed only of a subset of terms selected attending to their retrieval quality, the performance of the model is mantained, but the efforts needed to learn and, later, propagate in the model are considerably reduced. A method to select the best terms, based on their inverse document frequency and term discrimination value, is also presented.

Abductive inference in Bayesian belief networks (BBNs) is intended as the process of generating the K most probable configurations given an observed evidence. When we are only interested in a subset of the network's variables, this problem is called partial abductive inference. Both problems are NP-hard, and so exact computation is not always possible. In this paper a genetic algorithm is used to perform partial abductive inference in BBNs. The main contribution is the introduction of new genetic operators specificly designed for this problem. By using these genetic operators we try to take advantage of the calculations previously carried out, when a new individual is being evaluated. The algorithm is tested using the alarm belief network and compared with a previous genetic algorithm based on classical genetic operators. From the experimental results we can conclude that the new genetic operators improve the behaviour of the operators previously used.

An important type of methods for learning Bayesian networks from data are those based on the use of a scoring metrics, to evaluate the fitness of any given candidate network to the data base, and a search procedure to explore the set of candidate networks. The most usual search methods are greedy hill-climbing, either deterministic or stochastic, although other techniques, as genetic algorithms, have also been used. In this paper we propose a new algorithm for learning Bayesian networks based on a recently introduced search metaheuristics, which has been successfully applied to solve a variety of combinatorial optimization problems: Ant Colony Optimization (ACO). We describe all the elements necessary to tackle our learning problem using this metaheuristics, and experimentally compare the performance of our ACO-based algorithm with other algorithms based on different search procedures used in the literature. The experimental work is carried out using two different domains: ALARM and INSURANCE.

Partial abductive inference in Bayesian belief networks (BBNs) is intended as the process of generating the K most probable configurations for a set of unobserved variables (the explanation set). This problem is NP-hard and so exact computation is not always possible. In previous works genetic algorithms (GAs) have been used to solve the problem in an approximate way by using exact probabilities propagation as the evaluation function. However, although the translation of a partial abductive inference problem into a (set of) probabilities propagation problem(s) enlarges the class of solvable problems, it is not enough for large networks. In this paper we try to enlarge the class of solvable problems by reducing the size of the graphical structure in which probabilities propagation will be carried out. To achieve this reduction we present a method that yields a (forest of) clique tree(s) from which the variables of the explanation set have been removed, but in which configurations of these variables can be evaluated. Experimental results show a significant speedup of the evaluation function when propagation is performed over the obtained reduced graphical structure.

An essential component in Machine Learning processes is to estimate any uncertainty measure reflecting the strength of the relationships between variables in a dataset. In this paper we focus on those particular situations where the dataset has incomplete entries, as most real-life datasets have. We present a new approach to tackle this problem. The basic idea is to initially estimate a set of probability intervals that will be used to complete the missing values. Then, these values are used to obtain new bounds of the expected number of entries in the dataset. The probability intervals are narrowed iteratively until convergence. We have shown that the same processes can be used to estimate both, probability intervals and probability distributions, and give conditions that guarantee that the estimator is the correct one.

Previous algoritms for the construction of belief networks structures from data are mainly based either on independence criteria or on scoring metrics. The aim of this paper is to present a hybrid methodology that is a combination of these two approaches, which benefits from characteristics of each one, and to develop two operative algoritms based on this methodology. Results of the evaluation of the algorithms on the well-known Alarm network are presented, as well as the algorithms performance issues and some open problems.

Abductive inference in Bayesian belief networks (BBN) is intended as the process of generating the K most probable configurations given observed evidence. When we are only interested in a subset of the network variables, this problem is called partial abductive inference. Due to the noncommutative behaviour of the two operators (summation and maximum) involved in the computational process of solving partial abductive inference in BBNs, the process can be unfeasible by exact computation even for networks in which other types of probabilistic reasoning are not very complicated. This paper describes an approximate method to perform partial abductive inference in BBNs based on the simulated annealing (SA) algorithm. The algorithm can be applied to multiple connected networks and for any value of K. The evaluation function is based on clique tree propagation, and allow to evaluate neighbour configurations by means of local computations, in this way the efficiency with respect to previous algorithms (based on the use of genetic algorithms (GAs)) is improved.

The problem of assessing numerical values of possibility distributions is considered in the paper. Particularly, we are interested in estimating the possibility values from a data set. The proposed estimation methods are based on the idea of transforming a probability distribution (obtained from the data set) into a possibility one. They also take into account that smaller data base sizes involve a greater uncertainty about the model and therefore, less precise assessments should be obtained in such cases. Moreover, in order to validate the estimated joint possibility distribution, a set of properties which guarantee that we obtain reasonable results will be studied. Finally, we are interested in analyzing the feasibility of the decisions or the conclusions that can be obtained by manipulating the estimated possibility distribution, so that some of the properties of this distribution, after applying to it the marginalization and conditioning operators, are also studied.

Abductive inference in Bayesian belief networks is intended as the process of generating the K most probable configurations given an observed evidence. These configurations are called explanations and in most of the approaches found in the literature, all the explanations have the same number of literals. In this paper we propose some criteria to simplify the explanations in such a way that the resulting configurations are still accounting for the observed facts. Computational methods to perform the simplification task are also presented. Finally the algorithms are experimentally tested using a set of experiments which involves three different Bayesian belief networks.

In the paper we describe a new independence-based approach for learning Belief Networks. The proposed algorithm avoids some of the drawbacks of this approach by making an intensive use of low order conditional independence tests. Particularly, the set of zero and first order independence statements are used in order to obtain a prior skeleton of the network, and also to fix and remove arrows from this skeleton. Then, a refinement procedure, based on minimum cardinality d-separating sets, which uses a small number of conditional independence tests of higher order, is carried out to produce the final graph. Our algorithm needs an ordering of the variables in the model as the input. An algorithm that partially overcomes this problem is also presented.

Abductive inference in Bayesian belief networks is the process of generating the K most probable configurations given an observed evidence. When we are only interested in a subset of the network's variables, this problem is called partial abductive inference. Both problems are NP-hard, and so exact computation is not always possible. This paper describes an approximate method based on genetic algorithms to perform partial abductive inference. We have tested the algorithm using the alarm network and from the experimental results we can conclude that the algorithm presented here is a good tool to perform this kind of probabilistic reasoning.

The notion of independence is of great importance in any formalism for managing uncertainty, for both theoretical and practical reasons. In this paper we study the concept of independence in the framework of possibility theory. Our approach to defining conditional independence relationships is based on comparing conditional possibility measures. Different comparison criteria are presented, based on the ideas of `not to modify', `not to gain', and `to obtain similar' information after conditioning. For each definition of independence considered, an axiomatic study has been carried out. Moreover, there are different operators to define conditional possibility measures, which are related to different views of possibility theory. Particularly, in the first part of the paper, we use Hisdal conditioning (whereas Dempster conditioning will be used in the second part). Finally, we study the marginal problem for possibility measures and, as an application, we show that it is possible to store large n-dimensional possibility distributions efficiently, using independence relationships among variables.

From both a theoretical and a practical point of view, the study of the concept of independence has great importance in any formalism that manages uncertainty. In Independence Concepts in Possibility Theory: Part I several independence relationships were proposed, using different comparison criteria between conditional possibility measures, and using Hisdal conditioning as the conditioning operator. In this paper, we follow the same approach, but considering possibility measures as particular cases of consonant plausibility measures and, therefore, using Dempster conditioning instead of Hisdal's. We formalize several intuitive ideas to define independence relationships, namely `not to modify', `not to gain' and `to obtain similar' information after conditioning, and study their properties. We also compare the results with the previous ones obtained in \cite{ParteI} using Hisdal conditioning. Finally, the marginal problem, i.e., how to obtain a joint possibility distribution from a set of marginals, and the problem of factorizing large possibility distributions, in terms of its conditionally independent components, are considered.

Graphical structures such as Bayesian networks or Markov networks are very useful tools for representing irrelevance or independency relationships, and they may be used to efficiently perform reasoning tasks. Singly connected networks are important specific cases where there is no more than one undirected path connecting each pair of variables. The aim of this paper is to investigate the kind of properties that a dependency model must verify in order to be equivalent to a singly connected graph structure, as a way of driving automated discovery and construction of singly connected networks in data. The main results are the characterizations of those dependency models which are isomorphic to singly connected graphs (either via the d-separation criterion for directed acyclic graphs or via the separation criterion for undirected graphs), as well as the development of efficient algorithms for learning singly connected graph representations of dependency models.

Belief networks are graphical structures able to represent dependence and independence relationships among variables in a given domain of knowledge. We focus in the problem of automatic learning of these structures from data, and restrict our study to a specific type of belief networks: Simple Graphs, i.e., directed acyclic graphs where every pair of nodes with a common direct child have no common ancestor nor is one an ancestor of the other. Our study is based on an analysis of the independences that can be represented by means of simple graphs. This theoretical study, which includes new characterizations of simple graphs in terms of independence relationships, is the basis of an efficient algorithm that recovers simple graphs using only zero and first order conditional independence tests, thus overcoming some of the practical difficulties of the existing algorithms.

This paper presents an abstract definition of partial inconsistency and one operator used to remove it: normalization. When there is partial inconsistency in the combination of two pieces of information, this partial inconsistency is propagated to all the information in the system thereby degrading it. To avoid this effect, it is necessary to apply normalization. Four different formalisms are studied as particular cases of the axiomatic framework presented in this paper: probability theory, infinitesimal probabilities, possibility theory, and symbolic evidence theory. It is shown how, in one of these theories (probability), normalization is not an important problem: a property which is verified in this case gives rise to the equivalence of all the different normalization strategies. Things are very different for the other three theories: there are a number of different normalization procedures. The main objective of this paper will be to determine conditions based on general principles indicating how and when the normalization operator should be applied.

Decomposable dependency models possess a number of interesting and useful properties. This paper presents new characterizations of decomposable models in terms of independence relationships, which are obtained by adding a single axiom to the well-known set characterizing dependency models that are isomorphic to undirected graphs. We also briefly discuss a potential application of our results to the problem of learning graphical models from data.

We study probability intervals as an interesting tool to represent uncertain information. A number of basic operations necessary to develop a calculus with probability intervals, such as combination, marginalization, conditioning and integration are studied in detail. Moreover, probability intervals are compared with other uncertainty theories, such as lower and upper probabilities, Choquet capacities of order two and belief and plausibility functions. The advantages of probability intervals with respect to these formalisms in computational efficiency are also highlighted.

The problem of learning rules for a fuzzy inference model directly from empirical observations, without resorting to assessments from experts is considered. We develop a method that builds uncertain rules from a set of examples. These rules match the following pattern: If X is A then Y is B is [alpha, beta], where A and B are fuzzy sets representing fuzzy restrictions on the variables X and Y; alpha and beta are real numbers expressing lower and upper degrees of certainty in the truth of the rule. The method is based on the minimization of a distance measure between the real output associated to a given input and the output predicted by the inference model using a parameterized version of the same rule to be learnt. Our approach is computationally efficient in running time as well as in storage requirements. Moreover, it can be used in both training (batch-processing) and adaptation (iterative-processing) modes of learning.

The management of uncertainty and imprecision is becoming more and more important in knowledge-based systems. Fuzzy logic provides a systematic basis for representing and inferring with this kind of knowledge. This paper describes an approach for fuzzy inference based on an uncertainty forward propagation method and a change in the granularity of the elements involved. The proposed model is able to handle very general kinds of facts and rules, and it also verifies the most usual properties required by a fuzzy inference model.

Starting from the concept of 'equiordered functions' we characterize two types of fuzzy integrals that can be defined on any kind of fuzzy measure: Sugeno and Choquet integrals. The characterization theorems that we obtain allow us to carry out a comparative study of these two fuzzy integrals, pointing out their formal similarities and their conceptual differences.

In order to propagate uncertain information through a belief network, we need tools for both forward and backward propagation. In this article we propose a mechanism to propagate uncertain information represented by a very general class of fuzzy measures (namely, lower and upper probabilities) forward. Once the general solution to this problem is established, we study several interesting particular cases by considering classes of fuzzy measures more restrictive than lower and upper probabilities. Finally, in order to handle only possibility measures, some approximation techniques of the resultant measures are considered.

Recently, the two most general and well-known fuzzy integrals (Sugeno's integral and Choquet's integral) were characterized. It was also shown how both integrals fitted the same formal model, and they only differed in the operators used. In this paper we try to analyze whether it is possible to improve those characterization theorems, and we also study the possibility of defining other fuzzy integrals following the same model but using other operators.

In this article a concept of conditional fuzzy measure is presented, which is a generalization of conditional probability measure. Its properties are studied in the general case and in some particular types of fuzzy measures as representable measures, capacities of order two, and belief-plausibility measures. In the case of capacities of order two it coincides with the concept given by Dempster for representable measures. However, it differs from the Dempster's rule for conditioning belief-plausibility measures. As it is shown, Dempster's rule of conditioning is based on the idea of combining information and our definition is based on a restriction in the set of possible worlds.

In a recent paper a method to study fuzzy measures by means of certain sets of associated probabilities has been developed. In this paper, distances in the set of fuzzy measures are defined trough distances between the associated probabilities. Finally, some applications of these distances to measure the uncertainty and specificity of fuzzy measures as well as to approximate fuzzy measures, are considered.

We define a functional on fuzzy measures (called monotone expectation) based on Choquet's integral operator, and we study its convergence properties. The monotone expectation is used to extend fuzzy measures to fuzzy subsets. Also Sugeno's bound for the difference between his integral and the mathematical expectation is generalized.

In this paper we develop a method to study fuzzy measures associating certain sets of probabilities to them. This development is based on the consideration of one property of a fuzzy integral, called monotone expectation. Some results about the monotone expectation in relation to associated probabilities are also obtained.

Starting from a subjective assignation of weights related to the relative importance of the different level sets, we define a new method of comparison of fuzzy numbers. This procedure is an extension of some well-known indices (Adamo, Tsumura et al., Yager). Some properties of our index are studied in this paper, as well as its behaviour on several particular cases.