Visual Information Processing Group

Learning from Crowds with Variational Gaussian Processes


Pablo Ruiz, Pablo Morales-Álvarez, Rafael Molina, and Aggelos K. Katsaggelos, “Learning from Crowds with Variational Gaussian Processes”, Pattern Recognition, vol. 88, pp. 298-311, 2019. doi:10.1016/j.patcog.2018.11.021


Solving a supervised learning problem requires labeling a training set. This task is traditionally performed by an expert, who provides a label for each sample. The proliferation of social web services (e.g., Amazon Mechanical Turk) has introduced an alternative: crowdsourcing. Anybody with a computer can register with one of these services and label a dataset, either partially or completely. The labeling effort is then shared among a large number of annotators. However, this approach introduces scientifically challenging problems such as combining the unknown expertise of the annotators, handling disagreements on the annotated samples, or detecting the existence of spammer and adversarial annotators. All these problems require probabilistically sound solutions that go beyond the naive use of majority voting plus classical classification methods. In this work we introduce a new crowdsourcing model and inference procedure which trains a Gaussian Process classifier using the noisy labels provided by the annotators. Variational Bayes inference is used to estimate all unknowns. The proposed model can predict the class of new samples and assess the expertise of the involved annotators. Moreover, the Bayesian treatment allows for a solid uncertainty quantification. Because some annotations may be available for a new sample at prediction time, we also show how our method can naturally incorporate this additional information. A comprehensive experimental section evaluates the proposed method with synthetic and real experiments, showing that it consistently outperforms other state-of-the-art crowdsourcing approaches.


  • Gaussian Processes are used to address the crowdsourcing problem.
  • Variational inference is used for the first time to train the model.
  • Annotations provided for test instances can be integrated into the prediction.
  • We provide an experimental comparison with state-of-the-art crowdsourcing methods in both synthetic and real datasets.
  • The proposed method outperforms all state-of-the-art methods it was compared against.

A synthetic example

    We introduce a controlled one-dimensional example to show the behavior of the proposed method. Figure 1(a) shows the underlying synthetic classification dataset. The features are uniformly sampled in the interval [-π, π]. The true label of each sample is assigned according to the sign of its cosine: class C1 if the cosine is positive, class C0 if it is negative.
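The data-generating rule just described can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB code: the function name `make_cosine_dataset`, the sample count, and the seed are assumptions, since the page does not state how many samples were drawn.

```python
import math
import random

def make_cosine_dataset(n_samples=200, seed=0):
    """Draw features uniformly in [-pi, pi]; label 1 (C1) if cos(x) > 0, else 0 (C0).

    n_samples and seed are hypothetical choices made for this sketch.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-math.pi, math.pi) for _ in range(n_samples)]
    zs = [1 if math.cos(x) > 0 else 0 for x in xs]
    return xs, zs

xs, zs = make_cosine_dataset()
print(f"{sum(zs)} of {len(zs)} samples fall in class C1")
```

Because the cosine is positive exactly on (-π/2, π/2), roughly half of the samples should end up in each class.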

    Figure 1. (a) Original dataset labeled using the sign of the cosine function. (b)-(f) Labels provided by annotators 1, 2, 3, 4, and 5, respectively.

    Our goal is to learn an automatic classifier which distinguishes between samples belonging to class C1 and samples belonging to class C0. The first step is to label a training set. Unlike a classical classification problem, where only one or two experts annotate the whole training dataset, in this example we assume that this effort is shared by 5 annotators with different levels of expertise.

    The 5 annotators are simulated by fixing the values of sensitivity and specificity (α and β in Fig. 1(b-f)). That is, if the true label of a given sample is 1 (resp. 0), the annotator assigns it to class C1 (resp. C0) with probability α (resp. β). In Fig. 1(b-f) we show the labels assigned by each annotator. As expected from the sensitivity and specificity values, annotators 1, 2, 3, and 5 make fewer mistakes than annotator 4, who assigns most samples to the opposite class (that is, annotator 4 behaves adversarially).
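As a rough illustration of this simulation, the Python sketch below flips true labels according to a given sensitivity α and specificity β, then recovers both rates empirically. The function names and the (α, β) pairs are hypothetical; the values actually used for the five annotators appear only in Fig. 1 and are not reproduced here.

```python
import random

def simulate_annotator(true_labels, alpha, beta, rng):
    """Keep a true 1 with probability alpha and a true 0 with probability beta."""
    noisy = []
    for z in true_labels:
        p_correct = alpha if z == 1 else beta
        noisy.append(z if rng.random() < p_correct else 1 - z)
    return noisy

def empirical_rates(true_labels, noisy_labels):
    """Fraction of true 1s labeled 1 (sensitivity) and true 0s labeled 0 (specificity)."""
    pos = [y for z, y in zip(true_labels, noisy_labels) if z == 1]
    neg = [y for z, y in zip(true_labels, noisy_labels) if z == 0]
    sensitivity = sum(pos) / len(pos)
    specificity = (len(neg) - sum(neg)) / len(neg)
    return sensitivity, specificity

rng = random.Random(0)
true_labels = [1] * 500 + [0] * 500
# Hypothetical (alpha, beta) pairs, NOT the values used in the paper.
settings = {"reliable": (0.9, 0.85), "adversarial": (0.1, 0.2)}
for name, (alpha, beta) in settings.items():
    noisy = simulate_annotator(true_labels, alpha, beta, rng)
    print(name, empirical_rates(true_labels, noisy))
```

Note that `empirical_rates` needs the true labels, which are unavailable in a real crowdsourcing setting; the proposed method has to infer them jointly with α and β.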

    During the training step, the proposed method learns the underlying probabilistic model using ONLY the information provided by the annotators; the true labels are never observed. Once trained, it offers the following:

    1. Given a new sample, the proposed method can predict its label, as well as the uncertainty about this prediction.
    2. The unknown true labels of the training set are estimated during training.
    3. The proposed method estimates the unknown sensitivity and specificity values of each annotator, detecting spammer and adversarial behaviors.
    4. If one or several annotators provide labels for a test sample, the proposed method includes this additional information in a natural way to produce a combined human-machine prediction.
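To give some intuition for item 4, here is a minimal Python sketch of how a classifier's predictive probability could be combined with annotations on a test sample through Bayes' rule. It is not the paper's inference procedure: it assumes annotators are conditionally independent given the true label and uses point estimates of α and β, whereas a full Bayesian treatment would also account for the uncertainty in those quantities. The function name `combine_with_annotations` and all numeric values are hypothetical.

```python
def combine_with_annotations(p_model, annotations, alpha, beta):
    """Posterior P(z=1 | x, annotations) under conditional independence.

    p_model     -- classifier's predictive probability P(z=1 | x)
    annotations -- dict: annotator id -> observed label (0 or 1)
    alpha, beta -- dicts: annotator id -> sensitivity / specificity (point estimates)
    """
    weight1 = p_model          # unnormalized mass for z = 1
    weight0 = 1.0 - p_model    # unnormalized mass for z = 0
    for a, y in annotations.items():
        weight1 *= alpha[a] if y == 1 else (1.0 - alpha[a])
        weight0 *= (1.0 - beta[a]) if y == 1 else beta[a]
    return weight1 / (weight1 + weight0)

# Hypothetical estimates for three annotators.
alpha = {"expert": 0.9, "spammer": 0.5, "adversary": 0.1}
beta = {"expert": 0.8, "spammer": 0.5, "adversary": 0.1}

print(combine_with_annotations(0.5, {"expert": 1}, alpha, beta))
print(combine_with_annotations(0.5, {"spammer": 1}, alpha, beta))
print(combine_with_annotations(0.5, {"adversary": 0}, alpha, beta))
```

In this sketch a label from a spammer (α + β = 1) leaves the prediction unchanged, while a label 0 from an adversarial annotator actually raises the probability of class C1, which is why estimating each annotator's behavior matters.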


    The proposed method is evaluated on three different types of datasets: Synthetic (samples and crowdsourcing annotations are synthetically generated), semi-synthetic (samples are real but crowdsourcing annotations are synthetic), and real (samples and crowdsourcing annotations are real).

  • Synthetic data can be downloaded here.
  • Semi-synthetic data was obtained from the UCI Machine Learning Repository: Heart and Sonar. The processed data and the annotations generated for our experiments can be downloaded here.
  • The real examples can be downloaded from the author’s website [1].

MATLAB code

    A MATLAB implementation of the proposed method can be downloaded here. In the experimental section of the paper, the proposed method is compared against the following state-of-the-art methods: [1] Rodrigues, [2] Raykar, and [3] Yan. In these links we provide our own MATLAB implementations of Raykar and Yan. A MATLAB implementation of Rodrigues can be downloaded from the author’s website.


    [1] F. Rodrigues, F. Pereira, B. Ribeiro, Gaussian process classification and active learning with multiple annotators, in: ICML, 2014, pp. 433–441.

    [2] V. Raykar, S. Yu, L. Zhao, G. Hermosillo-Valadez, C. Florin, L. Bogoni, L. Moy, Learning from crowds, J. Mach. Learn. Res. 11 (2010) 1297–1322.

    [3] Y. Yan, R. Rosales, G. Fung, M. Schmidt, G. Hermosillo-Valadez, L. Bogoni, L. Moy, J. Dy, Modeling annotator expertise: Learning when everybody knows a bit of something, in: AISTATS, 2010, pp. 932–939.


    The programs are granted free of charge for research and education purposes only. Scientific results produced using the software provided shall acknowledge the use of the VGPCR implementation provided by us. If you plan to use it for non-scientific purposes, don't hesitate to contact us.

    Because the programs are licensed free of charge, there is no warranty for the program, to the extent permitted by applicable law. Except when otherwise stated in writing, the copyright holders and/or other parties provide the program "as is" without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the quality and performance of the program is with you. Should the program prove defective, you assume the cost of all necessary servicing, repair or correction.

    In no event unless required by applicable law or agreed to in writing will any copyright holder, or any other party who may modify and/or redistribute the program, be liable to you for damages, including any general, special, incidental or consequential damages arising out of the use or inability to use the program (including but not limited to loss of data or data being rendered inaccurate or losses sustained by you or third parties or a failure of the program to operate with any other programs), even if such holder or other party has been advised of the possibility of such damages.

    Visual Image Processing
    University of Granada