Evaluating machine-assisted annotation in under-resourced settings

Felt, Paul; Ringger, Eric K.; Seppi, Kevin; Heal, Kristian S.; Haertel, Robbie A.; Lonsdale, Deryle

doi:10.1007/s10579-013-9258-8

Original Paper
Published: 15 November 2013

Volume 48, pages 561–599, (2014)

Language Resources and Evaluation

Paul Felt¹,
Eric K. Ringger¹,
Kevin Seppi¹,
Kristian S. Heal²,
Robbie A. Haertel¹ &
…
Deryle Lonsdale³

Abstract

Machine assistance is vital to managing the cost of corpus annotation projects. Identifying effective forms of machine assistance through principled evaluation is particularly important and challenging in under-resourced domains and highly heterogeneous corpora, as the quality of machine assistance varies. We perform a fine-grained evaluation of two machine-assistance techniques in the context of an under-resourced corpus annotation project. This evaluation requires a carefully controlled user study crafted to test a number of specific hypotheses. We show that human annotators performing morphological analysis of text in a Semitic language perform their task significantly more accurately and quickly when even mediocre pre-annotations are provided. When pre-annotations are at least 70 % accurate, annotator speed and accuracy show statistically significant relative improvements of 25–35 and 5–7 %, respectively. However, controlled user studies are too costly to be suitable for under-resourced corpus annotation projects. Thus, we also present an alternative analysis methodology that models the data as a combination of latent variables in a Bayesian framework. We show that modeling the effects of interesting confounding factors can generate useful insights. In particular, correction propagation appears to be most effective for our task when implemented with minimal user involvement. More importantly, by explicitly accounting for confounding variables, this approach has the potential to yield fine-grained evaluations using data collected in a natural environment outside of costly controlled user studies.

Notes

http://mturk.amazon.com.
https://developers.google.com/web-toolkit/.
Details about feature status and release dates will be available as these changes occur at https://facwiki.cs.byu.edu/nlp/index.php/Ccash.
http://www.responsa.co.il/home.en-US.aspx.

References

Alex, B., Grover, C., Haddow, B., Kabadjov, M., Klein, E., Matthews, M., et al. (2008). Assisted curation: Does text mining really help? In Pacific symposium on biocomputing, Citeseer, Vol. 5, pp. 56–67.
Baldridge, J., & Osborne, M. (2004). Active learning and the total cost of annotation. In Proceedings of the 42nd annual meeting of the association for computational linguistics, pp. 9–16.
Bangalore, S., & Joshi, A. (1999). Supertagging: An approach to almost parsing. Computational Linguistics, 25(2), 237–265.
Barque, L., Nasr, A., & Polguère, A. (2010). From the definitions of the ‘Trésor de la Langue Française’ to a semantic database of the French language. In Proceedings of the XIV Euralex international congress, pp. 245–252.
Barry, J. (2011). Doing Bayesian data analysis: A tutorial with R and BUGS. Europe’s Journal of Psychology 7(4), 778–779.
Berger, J., & Berry, D. (1988). Statistical analysis and the illusion of objectivity. American Scientist, 76(2), 159–165.
Brants, T., & Plaehn, O. (2000). Interactive corpus annotation. In Proceedings of the second international conference on language resources and evaluation.
Carmen, M., Felt, P., Haertel, R., Lonsdale, D., McClanahan, P., Merkling, O., et al. (2010). Tag dictionaries accelerate manual annotation. In Proceedings of the seventh international conference on language resources and evaluation.
Carter, D. (1997). The TreeBanker. A tool for supervised training of parsed corpora. In Proceedings of the workshop on computational environments for grammar development and linguistic engineering, pp. 9–15.
Chang, M., Ratinov, L., & Roth, D. (2007). Guiding semi-supervision with constraint-driven learning. In Proceedings of the 45th annual meeting of the association for computational linguistics, pp. 280–287.
Chiou, F., Chiang, D., & Palmer, M. (2001). Facilitating treebank annotation using a statistical parser. In Proceedings of the first international conference on human language technology research, pp. 1–4.
Cowie, J., & Lehnert, W. (1996). Information extraction. Communications of the ACM 39(1), 80–91.
Dandapat, S., Biswas, P., Choudhury, M., & Bali, K. (2009) Complex linguistic annotation—No easy way out!: A case from Bangla and Hindi POS labeling tasks. In Proceedings of the third linguistic annotation workshop, pp. 10–18.
Davies, M. (2009). The 385+ million word corpus of contemporary American English (1990–2008+): Design, architecture, and linguistic insights. International Journal of Corpus Linguistics, 14(2), 159–190.
Davies, M. (2010). The corpus of historical American English: 400 million words, 1810–2009. Available online at http://corpus.byu.edu/coha/.
Druck, G., Mann, G., & McCallum, A. (2008). Learning from labeled features using generalized expectation criteria. In Proceedings of the 31st annual international conference on research and development in information retrieval, pp. 595–602.
Felt, P., Merkling, O., Carmen, M., Ringger, E., Lemmon, W., Seppi, K., et al. (2010). CCASH: A web application framework for efficient distributed language resource development. In Proceedings of the seventh international conference on language resources and evaluation.
Felt, P., Ringger, E., Seppi, K., Heal, K., Haertel, R., & Lonsdale, D. (2012). First results in a study evaluating pre-labeling and correction propagation for machine-assisted Syriac morphological analysis. In Proceedings of the Eighth international conference on language resources and evaluation.
Ferrucci, D., Brown, E., Chu-Carroll, J., Fan, J., Gondek, D., Kalyanpur, A., et al. (2010). Building Watson: An overview of the DeepQA project. AI Magazine, 31(3), 59–79.
Fort, K., & Sagot, B. (2010). Influence of pre-annotation on POS-tagged corpus development. In Proceedings of the fourth linguistic annotation workshop, association for computational linguistics, pp. 56–63.
Ganchev, K., Pereira, F., Mandel, M., Carroll, S., & White, P. (2007). Semi-automated named entity annotation. In Proceedings of the 45th annual meeting of the association for computational linguistics: Linguistic annotation workshop, pp. 53–56.
Ganchev, K., Graça, J., Blitzer, J., & Taskar, B. (2008). Multi-view learning over structured and non-identical outputs. In Proceedings of the conference on Uncertainty in Artificial Intelligence, pp. 204–211.
Gelman, A. (2004). Bayesian data analysis. CRC Press: Boca Raton, FL.
Haertel, R., Ringger, E., Seppi, K., Carroll, J., & McClanahan, P. (2008a). Assessing the costs of sampling methods in active learning for annotation. In Proceedings of the 46th annual meeting of the association for computational linguistics, pp. 65–68.
Haertel, R., Seppi, K., Ringger, E., & Carroll, J. (2008b). Return on investment for active learning. In Proceedings of the 22nd annual conference on neural information processing systems: Workshop on cost-sensitive learning.
Kiraz, G. (1994). Automatic concordance generation of Syriac texts. Orientalia Christiana Analecta, 247, 461–475.
Kittur, A., Chi, E., & Suh, B. (2008). Crowdsourcing user studies with Mechanical Turk. In Proceedings of the 26th annual conference on human factors in computing systems, pp. 453–456.
Klebanov, B., & Beigman, E. (2009). From annotator agreement to noise models. Computational Linguistics 35(4), 495–503.
Kristjansson, T., Culotta, A., Viola, P., & McCallum, A. (2004). Interactive information extraction with constrained conditional random fields. In Proceedings of the 19th conference on artificial intelligence, pp. 412–418.
Leech, G., Garside, R., & Bryant, M. (1994). CLAWS4: The tagging of the British National Corpus. In Proceedings of the 15th conference on computational linguistics, association for computational linguistics, Vol. 1, pp. 622–628.
Liang, P., Jordan, M., & Klein, D. (2009). Learning from measurements in exponential families. In Proceedings of the 26th annual international conference on machine learning, pp. 641–648.
Liang, P., Jordan, M. I., & Klein, D. (2013). Learning dependency-based compositional semantics. Computational Linguistics, 39(2), 389–446.
Manning, C., & Schütze, H. (1999). Foundations of statistical natural language processing. MIT Press: Cambridge, MA.
Marcus, M., Marcinkiewicz, M., & Santorini, B. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.
McClanahan, P., Busby, G., Haertel, R., Heal, K., Lonsdale, D., Seppi, K., et al. (2010). A probabilistic morphological analyzer for Syriac. In Proceedings of the conference on empirical methods in natural language processing, pp. 810–820.
Menke, J., & Martinez, T. (2004). Using permutations instead of student’s t distribution for P-values in paired-difference algorithm comparisons. In Proceedings of the international joint conference on neural networks, pp. 1331–1335.
Michel, J., Shen, Y., Aiden, A., Veres, A., Gray, M., Pickett, J., et al. (2011). Quantitative analysis of culture using millions of digitized books. Science 331(6014), 176.
Naseem, T., Chen, H., Barzilay, R., & Johnson, M. (2010). Using universal linguistic knowledge to guide grammar induction. In Proceedings of the conference on empirical methods in natural language processing, pp. 1234–1244.
Ngai, G., & Yarowsky, D. (2000). Rule writing or annotation: cost-efficient resource usage for base noun phrase chunking. In Proceedings of the 38th annual meeting of the association for computational linguistics, pp. 117–125.
Oepen, S., Toutanova, K., Shieber, S., Manning, C., Flickinger, D., & Brants, T. (2002). The LinGO redwoods treebank motivation and preliminary applications. In Proceedings of the 19th international conference on computational linguistics, association for computational linguistics, Vol. 2, pp. 1–5.
Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1–135.
Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up?: Sentiment classification using machine learning techniques. In Proceedings of the conference on empirical methods in natural language processing, pp. 79–86.
Rehbein, I., Ruppenhofer, J., & Sporleder, C. (2009). Assessing the benefits of partial automatic pre-labeling for frame-semantic annotation. In Proceedings of the third linguistic annotation workshop, association for computational linguistics, pp. 19–26.
Ringger, E., Carmen, M., Haertel, R., Seppi, K., Lonsdale, D., McClanahan, P., et al. (2008). Assessing the costs of machine-assisted corpus annotation through a user study. In Proceedings of the sixth international conference on language resources and evaluation.
Settles, B. (2010). Active learning literature survey. Computer sciences technical report 1648, Madison: University of Wisconsin.
Shneiderman, B., & Plaisant, C. (2006). Strategies for evaluating information visualization tools: Multi-dimensional in-depth long-term case studies. In Proceedings of the 2006 AVI workshop on beyond time and errors: Novel evaluation methods for information visualization, pp. 1–7.
Smith, N. A. (2011). Linguistic structure prediction. Synthesis Lectures on Human Language Technologies, 4(2), 1–274.
Smith, D., Rydberg-Cox, J., & Crane, G. (2000). The Perseus project: A digital library for the humanities. Literary and Linguistic Computing, 15(1), 15–25.
Soon, W., Ng, H., & Lim, D. (2001). A machine learning approach to coreference resolution of noun phrases. Computational Linguistics 27(4), 521–544.
Stan Development Team. (2013). Stan: A C++ library for probability and sampling, version 1.1. URL http://mc-stan.org/.
Tanaka, T., Bond, F., Oepen, S., & Fujita, S. (2005). High precision treebanking: Blazing useful trees using POS information. In Proceedings of the 43rd annual meeting on association for computational linguistics, association for computational linguistics, pp. 330–337.
Tomanek, K., Wermter, J., & Hahn, U. (2007). Efficient annotation with the Jena annotation environment (JANE). In Proceedings of the 45th annual meeting of the association for computational linguistics: Linguistic annotation workshop, pp. 9–16.
Tov, E., & Reynolds, N. (2006). The Dead Sea Scrolls electronic library. Brigham Young University and the Neal A. Maxwell Institute for Religious Scholarship, Provo, UT.
Von Ahn, L., & Dabbish, L. (2008). Designing games with a purpose. Communications of the ACM, 51(8), 58–67.
Wright, W. (1871). Apocryphal acts of the apostles. London: Williams and Norgate.
Zettlemoyer, L., & Collins, M. (2005). Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In Proceedings of the twenty-first conference annual conference on uncertainty in artificial intelligence (pp. 658–666). Arlington, VA: AUAI Press.

Author information

Authors and Affiliations

Department of Computer Science, Brigham Young University, Provo, UT, USA
Paul Felt, Eric K. Ringger, Kevin Seppi & Robbie A. Haertel
Neal A. Maxwell Institute for Religious Scholarship, Center for the Preservation of Ancient Religious Texts, Brigham Young University, Provo, UT, USA
Kristian S. Heal
Department of Linguistics and English Language, Brigham Young University, Provo, UT, USA
Deryle Lonsdale

Corresponding author

Correspondence to Paul Felt.

Appendices

Appendix 1: RStan time model

Appendix 2: RStan accuracy model

Appendix 3: Derivation of complete conditionals

This appendix walks through a mathematical derivation of the complete conditional distributions necessary to use Gibbs sampling to obtain samples from the posterior probability distributions over the unobserved parameters in the latent variable model of annotation time first presented in Sect. 5.3.1.

Note that the following derivation is not needed by those who simply wish to formulate and run a model such as that described in Sect. 5.3 using existing statistical computing libraries (see the "Appendix 2" section). It is intended for readers whose background may not be in statistics but who wish to acquaint themselves with the details required to implement their own Gibbs sampler.

1.1 Introduction

A complete conditional distribution for a variable represents the probability of that variable given the priors, the data, and values for every other parameter in the model. These distributions are obtained by first calculating ln(g), the unnormalized joint posterior distribution. After it is defined, ln(g) is used as a basis for finding the complete conditional distributions over each parameter.

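We first take a moment to summarize the model. We model the number of seconds taken to annotate a word, $y_{hatbro}$, as a combination of the following variables, where the subscript $hatbro$ collects the indices $h$, $a$, $t$, $b$, $r$, and $o$ of the indexed variables: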

1.2 Variables

$\sigma^2$: Variance common to all words
$\theta_h$: Annotators
$\alpha_a$: Current condition
$\tau_t$: Grammatical category
$\beta_b$: Bucketed word position (0, 1, 2, 3+)
$\rho_r$: Hyperlinks clicked
$\omega_o$: Hyperlinks shown
$\kappa$: Offset common to all words

1.3 Priors

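Prior justifications may be found in Sect. 5.3.1. Gamma distributions are parameterized by shape and scale. The normal priors are written with a mean and a standard deviation; the derivation below squares the second argument to obtain the variance.

$\sigma^2 \sim \mathrm{Gamma}(50,50)$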
$\theta_h \sim N(0,\frac{40}{3})$
$\alpha_a \sim N(0,\frac{40}{3})$
$\tau_t \sim N(0,\frac{40}{3})$
$\beta_b \sim N(0,\frac{40}{3})$
$\rho_r \sim N(0,\frac{40}{3})$
$\omega_o \sim N(0,\frac{40}{3})$
$\kappa \sim N(90,\frac{50}{3})$
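
As a rough arithmetic check on what these priors imply, the following sketch assumes the shape and scale parameterization stated above and takes the second argument of each normal prior as a standard deviation. The symbols $k$ (shape) and $s$ (scale) are introduced here only for illustration and do not appear in the original model.

$$ \begin{aligned} E[\sigma^2] &= k s = 50 \cdot 50 = 2500, \qquad \sqrt{E[\sigma^2]} = 50\\ \mathrm{SD}[\sigma^2] &= \sqrt{k}\, s = \sqrt{50} \cdot 50 \approx 353.6\\ 3 \cdot \tfrac{40}{3} &= 40, \qquad 3 \cdot \tfrac{50}{3} = 50 \end{aligned} $$

Under that reading, the noise prior is centered on a variance whose square root is 50 seconds, each effect lies within about 40 seconds of zero at three prior standard deviations, and the offset $\kappa$ lies between roughly 40 and 140 seconds.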

1.4 Likelihood

A single data point is distributed as $y_{hatbro} \mid \sigma^2, \theta_h, \alpha_a, \tau_t, \beta_b, \rho_r, \omega_o, \kappa \sim N(\theta_h + \alpha_a + \tau_t + \beta_b + \rho_r + \omega_o + \kappa, \sigma^2)$, where the second argument here is the variance. Assuming that the data points are independent, the likelihood, or probability of the data set, may be written as the product of the probabilities of the individual data points.

$$ \begin{aligned} L(\underline{y}|\underline{\varTheta}) &= L(\underline{y}|\sigma^2,\underline{\theta},\underline{\alpha},\underline{\tau}, \underline{\beta},\underline{\rho},\underline{\omega},\kappa)\\ &= \prod_{y_{hatbro} \in \underline{y}} p(y_{hatbro}|\sigma^2,\theta_h,\alpha_a,\tau_t,\beta_b,\rho_r,\omega_o,\kappa)\\ &= \prod_{y_{hatbro} \in \underline{y}} (2\pi)^{-\frac{1}{2}} (\sigma^2)^{-\frac{1}{2}} e^{-\frac{1}{2\sigma^2} (y_{hatbro}-(\theta_h+\alpha_a+\tau_t+\beta_b+\rho_r+\omega_o+\kappa))^2}\\ &= \prod_{y_{hatbro} \in \underline{y}} (2\pi)^{-\frac{1}{2}} (\sigma^2)^{-\frac{1}{2}} e^{-\frac{1}{2\sigma^2} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2}\\ &= (2\pi)^{-\frac{N}{2}} (\sigma^2)^{-\frac{N}{2}} e^{-\frac{1}{2\sigma^2} \sum_{y_{hatbro} \in \underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2} \end{aligned} $$
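
The step from the product of per-word densities to the collapsed final line can be checked numerically. The sketch below is illustrative only: the toy effect values, sample size, and function names are hypothetical, and only the annotator and condition effects are given nonzero values (the remaining effects are treated as zero for brevity).

```python
import math
import random


def loglik_pointwise(y, mu, sigma2):
    """Log of the product form: sum of per-word normal log densities."""
    return sum(
        -0.5 * math.log(2 * math.pi) - 0.5 * math.log(sigma2)
        - (y_n - mu_n) ** 2 / (2 * sigma2)
        for y_n, mu_n in zip(y, mu)
    )


def loglik_collapsed(y, mu, sigma2):
    """Log of the final line: constants raised to N, squared residuals summed."""
    n = len(y)
    sse = sum((y_n - mu_n) ** 2 for y_n, mu_n in zip(y, mu))
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - sse / (2 * sigma2))


# Hypothetical toy data: 2 annotators, 3 conditions, 20 annotated words.
rng = random.Random(0)
theta = [-5.0, 5.0]          # annotator effects (made-up values)
alpha = [0.0, -10.0, 3.0]    # condition effects (made-up values)
kappa, sigma2 = 90.0, 2500.0
h = [rng.randrange(len(theta)) for _ in range(20)]
a = [rng.randrange(len(alpha)) for _ in range(20)]
mu = [theta[h_n] + alpha[a_n] + kappa for h_n, a_n in zip(h, a)]
y = [rng.gauss(mu_n, math.sqrt(sigma2)) for mu_n in mu]

# The two forms should agree up to floating-point rounding.
print(loglik_pointwise(y, mu, sigma2), loglik_collapsed(y, mu, sigma2))
```

Working with the sum of log densities rather than the product of densities is the same precaution the derivation takes later when it moves to the logarithm of g.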

1.5 Joint posterior

Our end goal is to estimate the joint posterior distribution over all of our parameters given the evidence provided by the data, $p(\underline{\varTheta}|\underline{y})$, where $\underline{\varTheta}$ represents all of our parameters. Using Bayes’ rule, we write the posterior as a combination of the likelihood of the data and our prior probability distributions. $p(\underline{\varTheta}|\underline{y}) = \frac{L(\underline{y}|\underline{\varTheta})p(\underline{\varTheta})}{p(\underline{y})} = \frac{L(\underline{y}|\underline{\varTheta})p(\underline{\varTheta})}{\int\cdots\int L(\underline{y}|\underline{\varTheta})p(\underline{\varTheta}) d\underline{\varTheta}}$. Because the normalizing constant in the denominator of this quantity involves integrating over every parameter in $\underline{\varTheta}$, it is intractable to compute. Fortunately, using Gibbs sampling to get samples from the joint posterior does not require being able to compute the normalizing constant. We drop the constant and calculate the numerator of our joint posterior, which we will call g. Recall that because the denominator of the posterior is a constant, g is proportional to the posterior distribution. We will similarly drop any other constants we find as we derive g. Finally, because of machine precision issues when working with small probabilities, it is most useful to work directly with the logarithm of g.

$$ ln(g) = ln(L(\underline{y}|\underline{\varTheta})p(\underline{\varTheta})) $$

Now insert our own parameter names.

$$= ln(L(\underline{y}|\sigma^2,\underline{\theta},\underline{\alpha}, \underline{\tau},\underline{\beta},\underline{\rho},\underline{\omega},\kappa) p(\sigma^2,\underline{\theta},\underline{\alpha}, \underline{\tau},\underline{\beta},\underline{\rho},\underline{\omega},\kappa)) $$

Because our parameters are all independent of one another a priori, we can write their joint prior probability as a product of individual prior probabilities.

$$ \begin{aligned} &=ln(L(\underline{y}|\sigma^2,\underline{\theta},\underline{\alpha},\underline{\tau}, \underline{\beta},\underline{\rho},\underline{\omega},\kappa) p(\sigma^2) \prod_{h=1}^H p(\theta_h) \prod_{a=1}^A p(\alpha_a)\\ &\quad\prod_{t=1}^T p(\tau_t) \prod_{b=1}^B p(\beta_b) \prod_{r=1}^R p(\rho_r) \prod_{o=1}^O p(\omega_o) p(\kappa) ) \end{aligned} $$

Distribute the logarithm.

$$ \begin{aligned} &= ln(L(\underline{y}|\sigma^2,\underline{\theta},\underline{\alpha}, \underline{\tau},\underline{\beta},\underline{\rho},\underline{\omega},\kappa))\\ &+ln(p(\sigma^2)) +\sum_{h=1}^H ln(p(\theta_h))\\ &+\sum_{a=1}^A ln(p(\alpha_a))\\ & +\sum_{t=1}^T ln(p(\tau_t))\\ &+\sum_{b=1}^B ln(p(\beta_b))\\ & +\sum_{r=1}^R ln(p(\rho_r))\\ & +\sum_{o=1}^O ln(p(\omega_o))\\ &+ln(p(\kappa)) \end{aligned} $$

Now substitute the numerical form of the likelihood and priors.

$$ \begin{aligned} &=ln((2\pi)^{-\frac{N}{2}} (\sigma^2)^{-\frac{N}{2}} e^{-\frac{1}{2\sigma^2} \sum_{y_{hatbro} \in \underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2})\\ &\quad+ln(\frac{1}{\Upgamma(50)50^{50}} \sigma^{2(50-1)} e^{-\frac{\sigma^2}{50}})\\ &\quad+\sum_{h=1}^H ln\left((2\pi*(\frac{40}{3})^2)^{-\frac{1}{2}} e^{-\frac{(\theta_h-0)^2}{2*(\frac{40}{3})^2} }\right)\\ &\quad+\sum_{a=1}^A ln\left( (2\pi*(\frac{40}{3})^2)^{-\frac{1}{2}} e^{-\frac{(\alpha_a-0)^2}{2*(\frac{40}{3})^2} }\right)\\ &\quad+\sum_{t=1}^T ln\left( (2\pi*(\frac{40}{3})^2)^{-\frac{1}{2}} e^{-\frac{(\tau_t-0)^2}{2*(\frac{40}{3})^2} }\right)\\ &\quad+\sum_{b=1}^B ln\left( (2\pi*(\frac{40}{3})^2)^{-\frac{1}{2}} e^{-\frac{(\beta_b-0)^2}{2*(\frac{40}{3})^2} }\right)\\ &\quad+\sum_{r=1}^R ln\left( (2\pi*(\frac{40}{3})^2)^{-\frac{1}{2}} e^{-\frac{(\rho_r-0)^2}{2*(\frac{40}{3})^2} }\right)\\ &\quad+\sum_{o=1}^O ln\left( (2\pi*(\frac{40}{3})^2)^{-\frac{1}{2}} e^{-\frac{(\omega_o-0)^2}{2*(\frac{40}{3})^2} }\right)\\ & \quad + ln\left((2\pi*(\frac{50}{3})^2)^{-\frac{1}{2}} e^{-\frac{(\kappa-90)^2}{2*(\frac{50}{3})^2} }\right) \end{aligned} $$

Distribute logarithms deeper into terms and drop additive constants.

$$ \begin{aligned} &\propto - \frac{N}{2}ln(\sigma^2) - \frac{1}{2\sigma^2} \sum_{y_{hatbro} \in \underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2\\ &\quad+49ln(\sigma^2) - \frac{\sigma^2}{50}\\ &- \sum_{h=1}^H \frac{\theta_h^2}{2(\frac{40}{3})^2}\\ &\quad- \sum_{a=1}^A \frac{\alpha_a^2}{2(\frac{40}{3})^2}\\ &\quad- \sum_{t=1}^T \frac{\tau_t^2}{2(\frac{40}{3})^2}\\ &\quad- \sum_{b=1}^B\frac{\beta_b^2}{2(\frac{40}{3})^2}\\ &\quad- \sum_{r=1}^R \frac{\rho_r^2}{2(\frac{40}{3})^2}\\ &\quad- \sum_{o=1}^O \frac{\omega_o^2}{2(\frac{40}{3})^2}\\ &\quad-\frac{(\kappa-90)^2}{2(\frac{50}{3})^2} \end{aligned} $$
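
The final expression for ln(g) translates fairly directly into code. The following is a minimal sketch, not the authors' implementation: the function name `log_g`, the dictionary layout of `params`, and the index structure `idx` are all assumptions made for illustration. Each entry `idx[k][n]` is taken to give the level of effect `k` that applies to word `n`.

```python
import math

EFFECTS = ("theta", "alpha", "tau", "beta", "rho", "omega")
EFFECT_SD = 40.0 / 3.0                      # prior sd of every effect
KAPPA_MEAN, KAPPA_SD = 90.0, 50.0 / 3.0     # prior mean and sd of the offset


def log_g(params, y, idx):
    """Unnormalized log joint posterior with additive constants dropped."""
    s2 = params["sigma2"]
    if s2 <= 0.0:
        return -math.inf  # the Gamma prior puts no mass on sigma^2 <= 0
    # Sum of squared residuals around the additive mean.
    sse = 0.0
    for n, y_n in enumerate(y):
        mu_n = params["kappa"] + sum(params[k][idx[k][n]] for k in EFFECTS)
        sse += (y_n - mu_n) ** 2
    out = -0.5 * len(y) * math.log(s2) - sse / (2.0 * s2)   # likelihood
    out += 49.0 * math.log(s2) - s2 / 50.0                  # Gamma(50, 50) prior
    for k in EFFECTS:                                       # normal priors on effects
        out -= sum(v * v for v in params[k]) / (2.0 * EFFECT_SD ** 2)
    out -= (params["kappa"] - KAPPA_MEAN) ** 2 / (2.0 * KAPPA_SD ** 2)
    return out


# Hypothetical one-word example: every effect has a single level set to zero.
params = {k: [0.0] for k in EFFECTS}
params.update(kappa=90.0, sigma2=50.0)
idx = {k: [0] for k in EFFECTS}
y = [90.0]
print(log_g(params, y, idx))
```

In this one-word example the residual and all quadratic prior terms vanish, so the value reduces to (49 − 1/2) ln(50) − 1, which makes it easy to check the bookkeeping by hand.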

1.6 Complete conditionals

Now it remains to derive the complete conditional distributions of each parameter. A complete conditional distribution over parameter $\varTheta$ represents the probability of that parameter given the data and the value of every other parameter in the graph, and is necessary for Gibbs sampling to function correctly. It turns out that we can use g, which we have already calculated, to derive the complete conditional of each parameter simply by treating all variables except the parameter of interest as constants, and dropping as many of these constants as possible. Because g is not normalized, and complete conditionals are defined as proper probability distributions, we symbolically add to each complete conditional the constant c that would correctly normalize it. However, this is only for form’s sake, since Gibbs sampling does not require normalized complete conditionals. Again, because of machine precision issues, we express our conditionals in log space. Because these distributions are not in the form of a known distribution that we can easily sample from, it is necessary to do Metropolis-Hastings within Gibbs sampling. For details on the Metropolis-Hastings algorithm, we refer the reader to Gelman (2004).

$$ \begin{aligned} &\left[ \sigma^2 \right] = (49-\frac{N}{2})ln(\sigma^2) - \frac{\sigma^2}{50} - \frac{1}{2\sigma^2} \sum_{\underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2 + c \\ &\left[ \alpha_a \right] = - \frac{\alpha_a^2}{2(\frac{40}{3})^2} - \frac{1}{2\sigma^2} \sum_{\underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2 + c \\ &\left[ \theta_h \right] = - \frac{\theta_h^2}{2(\frac{40}{3})^2} - \frac{1}{2\sigma^2} \sum_{\underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2 + c \\ &\left[ \beta_b \right] = - \frac{\beta_b^2}{2(\frac{40}{3})^2} - \frac{1}{2\sigma^2} \sum_{\underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2 + c \\ &\left[ \tau_t \right] = - \frac{\tau_t^2}{2(\frac{40}{3})^2} - \frac{1}{2\sigma^2} \sum_{\underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2 + c \\ &\left[ \rho_r \right] = - \frac{\rho_r^2}{2(\frac{40}{3})^2} - \frac{1}{2\sigma^2} \sum_{\underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2 + c \\ &\left[ \omega_o \right] = - \frac{\omega_o^2}{2(\frac{40}{3})^2} - \frac{1}{2\sigma^2} \sum_{\underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2 + c \\ &\left[ \kappa \right] = -\frac{(\kappa-90)^2}{2(\frac{50}{3})^2} - \frac{1}{2\sigma^2} \sum_{\underline{y}} (y_{hatbro}-\theta_h-\alpha_a-\tau_t-\beta_b-\rho_r-\omega_o-\kappa)^2 + c \end{aligned} $$
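
To make the sampling procedure concrete, here is a minimal sketch of Metropolis-within-Gibbs for this model. It is an illustration under stated assumptions rather than the authors' sampler: it uses a symmetric random-walk proposal (a special case of Metropolis-Hastings), and the step sizes, starting values, iteration count, function names, and synthetic data are all hypothetical. Because each complete conditional equals ln(g) up to terms that do not involve the parameter being updated, the acceptance test can be written with differences of ln(g).

```python
import math
import random

EFFECTS = ("theta", "alpha", "tau", "beta", "rho", "omega")
EFFECT_SD, KAPPA_MEAN, KAPPA_SD = 40.0 / 3.0, 90.0, 50.0 / 3.0


def log_g(p, y, idx):
    """Unnormalized log joint posterior (additive constants dropped)."""
    s2 = p["sigma2"]
    if s2 <= 0.0:
        return -math.inf
    sse = sum((y_n - p["kappa"] - sum(p[k][idx[k][n]] for k in EFFECTS)) ** 2
              for n, y_n in enumerate(y))
    out = (49.0 - 0.5 * len(y)) * math.log(s2) - s2 / 50.0 - sse / (2.0 * s2)
    out -= sum(v * v for k in EFFECTS for v in p[k]) / (2.0 * EFFECT_SD ** 2)
    return out - (p["kappa"] - KAPPA_MEAN) ** 2 / (2.0 * KAPPA_SD ** 2)


def get(p, name, i):
    return p[name] if i is None else p[name][i]


def put(p, name, i, value):
    if i is None:
        p[name] = value
    else:
        p[name][i] = value


def metropolis_update(p, name, i, step, y, idx, rng):
    """Random-walk Metropolis step for one scalar, all other parameters fixed."""
    old, log_old = get(p, name, i), log_g(p, y, idx)
    put(p, name, i, old + rng.gauss(0.0, step))
    log_new = log_g(p, y, idx)
    # Symmetric proposal: accept with probability min(1, g_new / g_old).
    if math.log(1.0 - rng.random()) < log_new - log_old:
        return True
    put(p, name, i, old)  # rejected: restore the previous value
    return False


def run_sampler(y, idx, levels, n_iter=200, seed=0):
    rng = random.Random(seed)
    p = {k: [0.0] * levels[k] for k in EFFECTS}   # hypothetical starting values
    p.update(kappa=KAPPA_MEAN, sigma2=2500.0)
    draws = []
    for _ in range(n_iter):
        for k in EFFECTS:                          # sweep every effect level
            for i in range(levels[k]):
                metropolis_update(p, k, i, 5.0, y, idx, rng)
        metropolis_update(p, "kappa", None, 5.0, y, idx, rng)
        metropolis_update(p, "sigma2", None, 300.0, y, idx, rng)
        draws.append({"kappa": p["kappa"], "sigma2": p["sigma2"]})
    return draws


# Synthetic data, made up for illustration: 60 words with random effect levels.
levels = {"theta": 2, "alpha": 3, "tau": 2, "beta": 4, "rho": 2, "omega": 2}
data_rng = random.Random(1)
idx = {k: [data_rng.randrange(levels[k]) for _ in range(60)] for k in EFFECTS}
y = [data_rng.gauss(90.0, 50.0) for _ in range(60)]
draws = run_sampler(y, idx, levels)
print(len(draws), draws[-1])
```

Recomputing the full ln(g) for every proposal keeps the sketch short but is wasteful; a practical sampler would update only the terms that involve the parameter being changed, and would discard an initial burn-in portion of the draws before summarizing them.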

About this article

Cite this article

Felt, P., Ringger, E.K., Seppi, K. et al. Evaluating machine-assisted annotation in under-resourced settings. Lang Resources & Evaluation 48, 561–599 (2014). https://doi.org/10.1007/s10579-013-9258-8

Issue Date: December 2014