Bibliographical Data Analytics
Requirements:
This project aims to develop a framework (queries, libraries, etc.) to analyse very large datasets
of academic publications using graph databases. The main target dataset is the Open
Academic Graph, which includes title, authors, year, publication venue, citation count,
references, etc. Example goals of the project include: clustering papers in order to identify
research topics; defining and implementing metrics of a paper's importance, popularity, and
topic coverage; and generating reading lists based on an article provided as a starting point.
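The example goals above can be illustrated on a toy citation graph. The following is a minimal sketch in plain Python, not the project's method: it assumes citations are held in an in-memory dictionary (the real data would live in a graph database), uses the citation count (in-degree) as a stand-in importance metric, and builds a reading list by following references breadth-first from a starting paper. The paper IDs and the function names `citation_counts` and `reading_list` are hypothetical.

```python
from collections import Counter, deque

# Toy citation graph (hypothetical IDs): paper -> papers it references.
REFERENCES = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": [],
    "E": ["C"],
}


def citation_counts(references):
    """Count how often each paper is cited (its in-degree)."""
    counts = Counter({paper: 0 for paper in references})
    for cited in references.values():
        counts.update(cited)
    return counts


def reading_list(references, start, max_depth=2):
    """Collect papers reachable from `start` via references, up to
    `max_depth` hops, ordered by citation count (highest first)."""
    counts = citation_counts(references)
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        paper, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for ref in references.get(paper, []):
            if ref not in seen:
                seen.add(ref)
                found.append(ref)
                queue.append((ref, depth + 1))
    # Sort by descending citation count, then by ID for a stable order.
    return sorted(found, key=lambda p: (-counts[p], p))


if __name__ == "__main__":
    print(reading_list(REFERENCES, "A"))
```

Citation count is only one possible notion of importance; a popularity or topic-coverage metric could be swapped in by changing the sort key.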
Data Source:
https://www.openacademic.ai/oag/
Neo4j download:
https://neo4j.com/download/other-releases/
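The project consists of three parts:
1. Modelling
a) Publications share many common attributes, but these need to be broken down into finer
categories to refine the model, so that accurate reading lists can be generated later.
b) Have a rough approach by around 20 April; complete a first draft by 10 May and hand it to
the supervisor for review, then revise it according to the supervisor's feedback; finish all
modelling work by late May.
2. Importing the data
3. Querying the data and generating recommended reading lists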
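The modelling and import steps above can be sketched on a single record. This is a minimal sketch under assumptions, not the project's actual model: the node labels (`Paper`, `Author`, `Venue`), the relationship types (`AUTHORED`, `PUBLISHED_IN`, `CITES`), and the record keys (`id`, `title`, `authors`, `year`, `venue`, `n_citation`, `references`) are hypothetical stand-ins for the fields listed in the requirements and should be checked against the real Open Academic Graph files. The output is plain node and relationship tuples, which could then be loaded into Neo4j by whichever import route the project chooses.

```python
import json

# One publication record as a JSON line (hypothetical keys and values).
SAMPLE_LINE = json.dumps({
    "id": "p1",
    "title": "An Example Paper",
    "authors": [{"name": "Ada Example"}, {"name": "Bob Sample"}],
    "year": 2017,
    "venue": "Example Conference",
    "n_citation": 12,
    "references": ["p2", "p3"],
})


def record_to_graph(line):
    """Split one JSON record into (nodes, relationships).

    nodes:         (label, key, properties)
    relationships: (start_key, type, end_key)
    """
    rec = json.loads(line)
    paper_key = rec["id"]
    nodes = [("Paper", paper_key, {
        "title": rec.get("title"),
        "year": rec.get("year"),
        "n_citation": rec.get("n_citation", 0),
    })]
    rels = []

    # Authors become their own nodes so co-authorship can be queried.
    for author in rec.get("authors", []):
        name = author["name"]
        nodes.append(("Author", name, {"name": name}))
        rels.append((name, "AUTHORED", paper_key))

    # The venue is a shared node, so papers can be grouped by venue.
    venue = rec.get("venue")
    if venue:
        nodes.append(("Venue", venue, {"name": venue}))
        rels.append((paper_key, "PUBLISHED_IN", venue))

    # Each reference becomes a CITES edge to another Paper node.
    for ref_id in rec.get("references", []):
        rels.append((paper_key, "CITES", ref_id))

    return nodes, rels


if __name__ == "__main__":
    nodes, rels = record_to_graph(SAMPLE_LINE)
    for node in nodes:
        print(node)
    for rel in rels:
        print(rel)
```

Keeping authors and venues as separate nodes is one possible design choice. Using the author name as a key is a simplification, since names are not unique.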
PS:
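1. All code must be made public and explained until it is fully understood.
2. This document covers only the modelling part, which is expected to be completed between
mid-May 2018 and the end of May 2018.
3. Data import is expected to start in early June 2018.
4. The schedule for querying and generating reading lists has not been decided yet.
5. All programs are expected to be finished by mid-September 2018; any schedule changes will
be announced.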
Related reading:
Attachment 1: Tracking the Flow of Ideas through the Programming Languages Literature
Attachment 2: Connectivity in a Citation Network: The Development of DNA Theory