ID 67516
Full-text URL
fulltext.pdf 1.18 MB
Authors
Naing, Inzali Department of Information and Communication Systems, Okayama University
Aung, Soe Thandar Department of Information and Communication Systems, Okayama University
Wai, Khaing Hsu Department of Information and Communication Systems, Okayama University
Funabiki, Nobuo Department of Information and Communication Systems, Okayama University
Abstract
Collecting reference papers from the Internet is one of the most important activities for advancing research and writing papers about the results. Unfortunately, the current process using Google Scholar may not be efficient, since many paper files cannot be accessed directly by the user. Even when they are accessible, their effectiveness must be checked manually. In this paper, we propose a reference paper collection system that uses web scraping to automate paper collection from websites. The system collects or monitors data from the Internet, which is treated as the environment, using Selenium, a popular web scraping tool, as the sensor; it then examines the similarity to the search target by comparing keywords using the BERT model. BERT is a deep learning model for natural language processing (NLP) that can understand context by analyzing the relationships between words in a sentence bidirectionally. Flask, a Python web framework, is adopted for the web application server, and Angular is used for data presentation. For the evaluation, we measured the performance, investigated the accuracy, and asked members of our laboratory to use the proposed method and provide feedback. The results confirm the method's effectiveness.
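The keyword-comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Selenium scraping step is omitted, and a simple bag-of-words count vector stands in for BERT embeddings so that the example runs with the standard library alone. The function names, the threshold value, and the sample paper records are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: lowercase word counts (NOT BERT; assumption for illustration)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty keyword list matches nothing
    return dot / (norm_a * norm_b)


def rank_papers(query, papers, threshold=0.2):
    """Return (score, title) pairs scoring at or above threshold, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(p["keywords"])), p["title"]) for p in papers]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Hypothetical scraped records; a real run would obtain these via Selenium.
papers = [
    {"title": "Paper A", "keywords": "web scraping selenium data collection"},
    {"title": "Paper B", "keywords": "antenna design microwave filter"},
]
ranked = rank_papers("web scraping data collection", papers)
for score, title in ranked:
    print(f"{score:.3f}  {title}")
```

In a system like the one described, the count vectors would be replaced by BERT sentence embeddings, while the cosine ranking and threshold filter could stay broadly the same.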
Keywords
web scraping
Google Scholar
data collection
BERT
Selenium
Flask framework
Angular
Publication date
2024-07-10
Publication title
Electronics
Volume 13
Issue 14
Publisher
MDPI
Start page
2700
ISSN
2079-9292
Resource type
Journal article
Language
English
OAI-PMH Set
Okayama University
Copyright holder
© 2024 by the authors.
Article version
publisher
DOI
Web of Science KeyUT
Related URL
isVersionOf https://doi.org/10.3390/electronics13142700
License
https://creativecommons.org/licenses/by/4.0/
Citation
Naing, I.; Aung, S.T.; Wai, K.H.; Funabiki, N. A Reference Paper Collection System Using Web Scraping. Electronics 2024, 13, 2700. https://doi.org/10.3390/electronics13142700