Monday, July 17, 2006

 

PASCAL RTE 1 Results

Success rates in RTE 1 are low.

Looking at accuracy (the fraction of correct answers), the systems scored between 50% and 60%.

Systems can optionally report a confidence weight for each answer (0: I have no confidence in my answer, 1: I am fully confident). The confidence-weighted scores (cws) fall between 0.5 and 0.7.
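To make cws concrete, here is a minimal Python sketch of the score as the challenge paper defines it: sort the answers by decreasing confidence, then average, over every rank i, the fraction of correct answers among the top i. The function name and the (confidence, is_correct) input format are my own assumptions for illustration, not part of any official scorer.

```python
def confidence_weighted_score(predictions):
    """Compute cws for a list of (confidence, is_correct) pairs.

    The input format is an assumption made for this sketch.
    """
    if not predictions:
        return 0.0
    # Rank answers from most to least confident.
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    correct_so_far = 0
    total = 0.0
    for rank, (_confidence, is_correct) in enumerate(ranked, start=1):
        if is_correct:
            correct_so_far += 1
        # Precision among the top `rank` answers.
        total += correct_so_far / rank
    return total / len(ranked)


# Two confident correct answers, two unconfident wrong ones:
# accuracy is 0.5, but cws rewards being right when confident.
example = [(0.9, True), (0.8, True), (0.3, False), (0.1, False)]
print(confidence_weighted_score(example))
```

A system whose accuracy is near 50% can still get a cws well above 0.5 if its most confident answers are the correct ones, which is why the two numbers are reported side by side.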

The top-ranked system has accuracy/cws = 70% / 0.782. However, this system (automatically) covered only 19% of the dataset. It uses nothing but a "Word Overlap" algorithm.
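For a feel of what a word-overlap approach looks like, here is a rough Python sketch. It is not the top system's actual method, whose details are not given here. The tokenizer, the overlap measure (share of hypothesis words that also occur in the text) and the 0.75 threshold are all assumptions chosen for illustration.

```python
import re


def tokens(sentence):
    # Lowercased set of alphanumeric word tokens (a deliberately crude tokenizer).
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def word_overlap(text, hypothesis):
    # Fraction of hypothesis words that also appear in the text.
    hyp = tokens(hypothesis)
    if not hyp:
        return 0.0
    return len(hyp & tokens(text)) / len(hyp)


def predict_entailment(text, hypothesis, threshold=0.75):
    # Hypothetical decision rule: call it entailment when overlap is high enough.
    return word_overlap(text, hypothesis) >= threshold


print(predict_entailment("The cat sat on the mat.", "The cat sat."))
print(predict_entailment("A dog barked loudly.", "The cat sat."))
```

A rule like this ignores word order and negation entirely, so it can only be trusted on pairs where lexical overlap happens to track meaning.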

The dataset is assembled from the datasets of the following tasks: Information Extraction (IE), Information Retrieval (IR), Comparable Documents (CR), Reading Comprehension (RC), Question Answering (QA), Machine Translation (MT), and Paraphrase Acquisition (PP).

In general, systems got more answers right on the Comparable Documents (CR) portion. Restricted to the CR part alone, one system reaches accuracy/cws = 87% / 0.95.

How the annotators built T-H pairs from Comparable Documents (CR):

"Annotators identified T-H pairs by examining a
cluster of comparable news articles that cover a
common story. They examined "aligned" sentence
pairs that overlap lexically, in which semantic entailment
may or may not hold. Some pairs were
identified on the web using Google news and others
taken from an available resource of aligned
English sentences. The
motivation for this setting is the common use of
lexical overlap as a hint for semantic overlap in
comparable documents, e.g. for multi-document
summarization."

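This description lines up with the technique used by the top-ranked system [Perez (Madrid)]: the CR pairs were picked for their lexical overlap, which is exactly what a word-overlap method measures. Incidentally, when this system covers 100% of the dataset it does not do well at all: accuracy = 0.495.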
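The gap between the partial-coverage and full-coverage numbers is easier to see when coverage and accuracy are computed separately. Below is a small Python sketch, assuming a system marks pairs it declines to answer with None. That convention and the function name are mine, and the example data is made up purely to show the arithmetic.

```python
def coverage_and_accuracy(predictions, gold):
    """Return (coverage, accuracy on answered pairs).

    `predictions` holds True/False, or None where the system abstained
    (an assumed convention for this sketch); `gold` holds the true labels.
    """
    answered = [(p, g) for p, g in zip(predictions, gold) if p is not None]
    coverage = len(answered) / len(gold) if gold else 0.0
    if not answered:
        return coverage, 0.0
    accuracy = sum(p == g for p, g in answered) / len(answered)
    return coverage, accuracy


# Toy data: the system answers 2 of 5 pairs and gets 1 of those right.
preds = [True, None, False, None, None]
gold = [True, False, True, True, False]
print(coverage_and_accuracy(preds, gold))
```

A high accuracy at low coverage only says the system is good at picking the pairs it answers, so it should be read together with the full-coverage figure.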

For more information:
The PASCAL Recognising Textual Entailment Challenge: http://www.cs.biu.ac.il/~glikmao/rte05/dagan_et_al.pdf
