Saturday, July 22, 2006
I wrote a parser for the RTE 2 dataset (in Python)
We can now parse the RTE 2 dataset.
That is, for each T-H (Text-Hypothesis) pair we now have:
(raw data)
THPair:
- id
- entailment: TRUE/FALSE
- task: QA/SUM/IE/IR
- text: "... ... ... ... .."
- hypothesis: ".. .. ..."
or
(preprocessed data)
XTHPair:
- id
- entailment: TRUE/FALSE
- task: QA/SUM/IE/IR
- text:
  - XSentence
  - XSentence
  - XSentence
  - XSentence
XSentence:
- serial (in other words, index within the context)
- nodes:
  - XNode (in other words, atomic tokens in the sentence)
  - XNode
  - XNode
  - ...
  - XNode
XNode:
- id (unique within the sentence)
- word
- lemma
- pos
- relation:
  - relation type
  - parent XNode
The 'parent node' in the relation is what lets us represent each sentence as a parse tree.
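To make the parent-pointer idea concrete, here is a minimal sketch in modern Python (dataclasses, Python 3.7+) of how the XSentence/XNode layout above could be turned into a tree. This is not the parser's actual code: the `Relation` class, the `rel_type` and `parent_id` field names, the `children_by_parent` and `render_tree` helpers, the relation labels, and the sample sentence are all assumptions made for illustration. In particular, the sketch stores the parent's id rather than a reference to the parent XNode, and uses `None` to mark the root.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Relation:
    rel_type: str             # relation type (the label set here is made up)
    parent_id: Optional[int]  # id of the parent XNode; None marks the root


@dataclass
class XNode:
    id: int        # unique within the sentence
    word: str
    lemma: str
    pos: str
    relation: Relation


@dataclass
class XSentence:
    serial: int                                   # index within the text
    nodes: List[XNode] = field(default_factory=list)

    def children_by_parent(self) -> Dict[Optional[int], List[XNode]]:
        """Group the nodes under the id of their parent node."""
        children: Dict[Optional[int], List[XNode]] = {}
        for node in self.nodes:
            children.setdefault(node.relation.parent_id, []).append(node)
        return children

    def render_tree(self) -> List[str]:
        """Return the parse tree as indented lines, root first."""
        children = self.children_by_parent()
        lines: List[str] = []

        def visit(node: XNode, depth: int) -> None:
            lines.append("  " * depth + f"{node.word} ({node.relation.rel_type})")
            for child in children.get(node.id, []):
                visit(child, depth + 1)

        for root in children.get(None, []):
            visit(root, 0)
        return lines


# Hypothetical three-token sentence, not taken from the RTE 2 data.
sentence = XSentence(serial=0, nodes=[
    XNode(1, "dogs", "dog", "N", Relation("subj", 2)),
    XNode(2, "chase", "chase", "V", Relation("root", None)),
    XNode(3, "cats", "cat", "N", Relation("obj", 2)),
])
print("\n".join(sentence.render_tree()))
```

Storing only the parent in each node keeps the flat node list as the single source of truth; the child lists are derived on demand, so the tree can be rebuilt at any time from the parsed data.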