I suggest you consider adding the following content.

## Introduce the database we use: Neo4j

## Introduce the AWS cloud computing platform we use

## Data

Data source, schema, size

https://www.openacademic.ai/oag/

## Building the graph database

A graph database can store any kind of data using a few simple concepts:

1. Nodes – graph data records
2. Relationships – connect nodes
3. Properties – named data values

For this project, we build the graph as follows:

1. Nodes: papers and authors.

2. Relationships: paper A references paper B (these relationships come from the `references` field), and author A writes paper B.

3. Properties: all fields except `references` can be taken as properties (name-value pairs).
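The mapping above can be sketched in Python as follows. This is only an illustration: the field names `id` and `authors` (each author having `id` and `name`) and the relationship type names `WRITES` and `REFERENCES` are assumptions, not confirmed by the dataset schema. The sketch also leaves `authors` out of the paper properties, since authors become their own nodes.

```python
def record_to_graph(record):
    """Split one paper record into node tuples and relationship tuples."""
    paper_id = record["id"]
    # Fields other than `references` (and `authors`, an assumption) become properties.
    properties = {
        key: value
        for key, value in record.items()
        if key not in ("references", "authors")
    }
    nodes = [("Paper", paper_id, properties)]
    relationships = []
    # One Author node and one WRITES relationship per author (hypothetical names).
    for author in record.get("authors", []):
        nodes.append(("Author", author["id"], {"name": author["name"]}))
        relationships.append((author["id"], "WRITES", paper_id))
    # One REFERENCES relationship per entry of the `references` field.
    for cited_id in record.get("references", []):
        relationships.append((paper_id, "REFERENCES", cited_id))
    return nodes, relationships


sample_record = {
    "id": "p1",
    "title": "Example paper",
    "n_citation": 3,
    "authors": [{"id": "a1", "name": "Example Author"}],
    "references": ["p2", "p3"],
}
sample_nodes, sample_relationships = record_to_graph(sample_record)
```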

## Data import

Because our dataset is very large, efficient import matters. We use Neo4j's bulk import tool, `neo4j-admin import`.

https://neo4j.com/docs/operations-manual/current/tutorial/import-tool/
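The import tool reads CSV files whose header row declares IDs, labels, and relationship types. A minimal sketch of producing such files from paper records might look like this. The record field names (`id`, `title`, `n_citation`, `references`), the ID space name `Paper`, and the relationship type `REFERENCES` are assumptions for illustration.

```python
import csv
import io

# Header rows in the style expected by the import tool.
PAPER_HEADER = ["paperId:ID(Paper)", "title", "n_citation:int", ":LABEL"]
REFERENCE_HEADER = [":START_ID(Paper)", ":END_ID(Paper)", ":TYPE"]


def write_import_csv(records, papers_file, references_file):
    """Write one node CSV (papers) and one relationship CSV (references)."""
    papers = csv.writer(papers_file)
    references = csv.writer(references_file)
    papers.writerow(PAPER_HEADER)
    references.writerow(REFERENCE_HEADER)
    for record in records:
        papers.writerow(
            [record["id"], record.get("title", ""), record.get("n_citation", 0), "Paper"]
        )
        for cited_id in record.get("references", []):
            references.writerow([record["id"], cited_id, "REFERENCES"])


# In-memory buffers stand in for real files in this example.
papers_buffer, references_buffer = io.StringIO(), io.StringIO()
write_import_csv(
    [{"id": "p1", "title": "Example paper", "n_citation": 3, "references": ["p2"]}],
    papers_buffer,
    references_buffer,
)
```

For the real dataset the buffers would be replaced by files opened with `newline=""`, and the resulting files passed to the import tool as described in the tutorial linked above.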

## Computing paper PageRank

https://github.com/neo4j-contrib/neo4j-graph-algorithms

After the computation finishes, each paper node in the Neo4j database has one additional property named `pagerank`.
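To make clear what the `pagerank` property measures, here is a textbook power-iteration sketch over a citation graph. It is not the implementation used by the Neo4j plugin, and the plugin's values may be scaled differently. The damping factor 0.85 and the iteration count 20 are assumed defaults for illustration.

```python
def pagerank(references, damping=0.85, iterations=20):
    """Textbook PageRank. `references` maps a paper id to the ids it cites."""
    papers = set(references)
    for cited in references.values():
        papers.update(cited)
    count = len(papers)
    rank = {paper: 1.0 / count for paper in papers}
    for _ in range(iterations):
        # Every paper starts each round with the "random jump" share.
        new_rank = {paper: (1.0 - damping) / count for paper in papers}
        for paper in papers:
            cited = references.get(paper, [])
            if cited:
                # A paper passes its rank on to the papers it cites.
                share = damping * rank[paper] / len(cited)
                for cited_id in cited:
                    new_rank[cited_id] += share
            else:
                # A paper with no references spreads its rank evenly.
                share = damping * rank[paper] / count
                for other in papers:
                    new_rank[other] += share
        rank = new_rank
    return rank


example_ranks = pagerank({"a": ["c"], "b": ["c"], "c": []})
```

In this tiny example, paper `c` is cited by both other papers and ends up with the highest rank.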

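## Determining paper topics

First determine which topics exist: use MapReduce to count keyword frequencies and take the high-frequency keywords as the topics.

We used Hadoop Python streaming to count the frequencies. Because the data volume is large, Hadoop's distributed computation can solve this efficiently.

You can cover **Hadoop**, **MapReduce Python streaming**, and related topics here.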
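The keyword count can be sketched as a mapper and a reducer in the Hadoop streaming style. This sketch assumes the input is one JSON record per line with a `keywords` list, which is an assumption about the data layout. In an actual streaming job, each function would live in its own script that iterates over `sys.stdin` and prints every yielded line.

```python
import json
from itertools import groupby


def map_keywords(lines):
    """Mapper: emit 'keyword<TAB>1' for every keyword of every paper record."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for keyword in record.get("keywords", []):
            yield f"{keyword}\t1"


def reduce_counts(sorted_lines):
    """Reducer: sum the counts of consecutive lines that share a key.

    Hadoop sorts mapper output by key before the reducer sees it,
    so equal keywords arrive next to each other.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for keyword, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{keyword}\t{sum(int(count) for _, count in group)}"


# Local simulation of map -> sort -> reduce on two sample records.
sample_lines = [
    '{"keywords": ["deep learning", "neural network"]}',
    '{"keywords": ["deep learning"]}',
]
keyword_counts = list(reduce_counts(sorted(map_keywords(sample_lines))))
```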

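During this step we need to preprocess the keywords, mainly data cleaning and normalizing their form, as described in the document I sent you earlier: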

https://docs.qq.com/doc/BGww0r2bdHlv0mp4qG2bKveX1rxMLY1i2JfX3IQmKC2Cjyb92DNwhy29cH8t176uIT0ev6iT1
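As a rough illustration of what such normalization can look like, the sketch below lowercases a keyword, trims it, and collapses runs of whitespace, hyphens, and underscores into a single space. These rules are assumptions chosen for the example. The actual cleaning rules are the ones in the linked document.

```python
import re


def normalize_keyword(keyword):
    """Normalize one keyword (assumed rules, for illustration only)."""
    keyword = keyword.strip().lower()
    # Treat whitespace, hyphens and underscores as the same separator.
    return re.sub(r"[\s_\-]+", " ", keyword)


normalized_examples = [
    normalize_keyword("  Deep-Learning "),
    normalize_keyword("Neural   Network"),
]
```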

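Through the steps above, we obtain the set of topic words.

For each paper, we determine its topics by checking which of its keywords appear in the topic-word set. Each topic of a paper is added to that paper's node as a label in the Neo4j database. One paper can have multiple topic words.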
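The labeling step can be sketched as follows. The helper names, the `id` property used to look up a paper, and the `$paper_id` parameter are hypothetical. Labels cannot be passed as query parameters in Cypher, so the sketch builds them into the statement text and escapes backticks by doubling them.

```python
def assign_topics(keywords, topic_words):
    """Return the paper's keywords that are topic words, without duplicates."""
    topics = []
    for keyword in keywords:
        if keyword in topic_words and keyword not in topics:
            topics.append(keyword)
    return topics


def label_statement(topics):
    """Build a Cypher statement that adds each topic as a label to one paper.

    Returns None when the paper has no topics, since there is nothing to set.
    """
    if not topics:
        return None
    labels = "".join(":`" + topic.replace("`", "``") + "`" for topic in topics)
    return f"MATCH (p:Paper {{id: $paper_id}}) SET p{labels}"


topic_words = {"deep learning", "neural network"}
paper_topics = assign_topics(
    ["deep learning", "image retrieval", "deep learning"], topic_words
)
statement = label_statement(paper_topics)
```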

## Database query examples

top 10 papers with topic deep learning and neural network ordered by pagerank value

```sql
match (p:`deep learning`:`neural network`)
return p.title, p.pagerank
order by p.pagerank desc limit 10;
```

top 10 papers with topic algorithm design and analysis ordered by number of citations

```sql
match (p:`algorithm design and analysis`)
return p.title, p.n_citation
order by p.n_citation desc limit 10;
```

top 10 papers that reference or are referenced by 'Random search for hyper-parameter optimization'
ordered by pagerank value

```sql
match (a:Paper) -- (b:Paper {title: 'Random search for hyper-parameter optimization'})
return a.title, a.pagerank
order by a.pagerank desc limit 10;
```

top 10 papers that reference ‘Random search for hyper-parameter optimization’
ordered by pagerank value

```sql
match (a:Paper) --> (b:Paper {title: 'Random search for hyper-parameter optimization'})
return a.title, a.pagerank
order by a.pagerank desc limit 10;
```

top 10 papers referenced by ‘Random search for hyper-parameter optimization’
ordered by pagerank value

```sql
match (a:Paper) <-- (b:Paper {title: 'Random search for hyper-parameter optimization'})
return a.title, a.pagerank
order by a.pagerank desc limit 10;
```