Extraction of Compound Terms from Japanese Domain-Specific Text Corpora

Koyama, Teruo; Kageura, Kyo; Takeuchi, Koichi

Permalink : http://escholarship.lib.okayama-u.ac.jp/47734

ID	47734
Full-text URL	ipsj_sigtr_2006_124_55-60.pdf (216 KB)
Title (alternative)	A Method for Extracting Composite Terms from Japanese Domain Corpora
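Authors	Koyama, Teruo (National Institute of Informatics); Kageura, Kyo (Graduate School of Education, The University of Tokyo); Takeuchi, Koichi (Graduate School of Natural Science and Technology, Okayama University)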
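Abstract	Term extraction from text corpora is an important application of natural language processing technology. Conventional methods for extracting term candidates from a text corpus have mainly judged termhood using statistical measures of candidate occurrence, but statistical methods have difficulty judging candidates that occur with low frequency. In this presentation, we focus on compound words and show that extracting term candidates by excluding morpheme occurrence patterns that impair termhood makes it possible to extract compound terms with high precision.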
Abstract (alternative)	Term extraction is one of the most important applications of natural language processing technologies. Statistical criteria are widely adopted to evaluate the termhood of extracted candidates. However, it is difficult to evaluate the termhood of less frequent candidates. In this study we propose a method for Japanese composite term extraction in which improper morpheme patterns are eliminated. Using the new method, high precision of term extraction can be attained for Japanese composite terms.
Publication date	2006-11-22
Publication title	情報処理学会研究報告. 自然言語処理研究会報告
Publication title (alternative)	IPSJ SIG Technical Report
Volume	2006
Issue	124
Publisher	情報処理学会
Publisher (alternative)	Information Processing Society of Japan
Start page	55
End page	60
ISSN	09196072
NCID	AN10115061
Resource type	Technical Report
Official URL	http://www.bookpark.ne.jp/cm/ipsj/search.asp?flag=6&keyword=IPSJ-NL06176008&mode=PDF
Language	Japanese
Copyright holder	Notice for the use of this material: The copyright of this material is retained by the Information Processing Society of Japan (IPSJ). This material is published on this web site with the agreement of the author(s) and the IPSJ. Please comply with the Copyright Law of Japan and the Code of Ethics of the IPSJ if you wish to reproduce, make derivative works of, distribute, or make available to the public any part or the whole thereof. All Rights Reserved, Copyright (C) Information Processing Society of Japan.
Paper version	publisher
Peer reviewed	Yes