Week 1
Week 2
Week 3
Week 4
Week 5
Week 6
Eval 1
Week 7
Week 8
Week 9
Week 10
Week 11
Week 12
Eval 2
Project Description
–> Completed Tasks –> Ongoing Tasks
# | Objectives | Associated Deliverables | Issue(s) |
---|---|---|---|
1 | Adapt the old reference database system to the new one. | Publication View page should work; Elasticsearch should work | #1154 and !653; #1244 and !695 |
2 | Clean the current bibliography data. | Data workflow as well as resulting data table. | This repository |
3 | Identify the reference relationships between our publications and our entities, and populate the database with new data. | Data workflow as well as resulting data table. | 1. .ipynb for finding provenience-pub relationships; 2. Resulting CSV |
4 | Enable single publication file submission and suggestion for connecting new entities. | hello world | #1270 and !710 |
# | Objectives | Associated Deliverables | Issue(s) |
---|---|---|---|
1 | Enable bulk-upload submission. | New interface on the website | Not yet done |
2 | Miscellaneous issue fixing. | Numerous functional improvements and patches | Issues: #997 #1053 #1127 |
–> Completed Tasks –> Ongoing Tasks –> Work Demonstration
Week | Objectives | Deliverables |
---|---|---|
1 | Adapt the old reference database system to the new one. | Publication View page should work; Elasticsearch should work |
2 | Adapt the old reference database system to the new one. | |
3 | Tested whether the refactoring broke anything. Initial exploration of the publications dataset to see how to clean it. | Test PR; .ipynb notebooks written on the data |
4 | Sent the initial round of data cleaning to Adam for proofreading; refined the script. | |
5 | Explored ML-based methods for reference parsing (they worked poorly). Wrote regex methods for matching and sent exact_reference to Emilie to proofread. | |
6 | Modified the script to account for BibTeX key updates. Prefilled merge exact_ref on the merge page. | !997 |
7 | Attempted to use machine-learning-based methods for PDF mining to find publication-provenience relationships. | |
8 | Switched to a pattern-matching approach for PDF mining. | |
9 | Finished PDF mining with pattern matching and generated a preliminary publication-provenience connection dataset on GitHub. | 1. .ipynb for finding provenience-pub relationships; 2. Resulting CSV |
10 | Explored how to combine Node.js and Python to run the single-publication file parsing script. | |
11 | Finalized the Node.js and Python scripts that run the single-publication file parsing. Started writing CakePHP code to read from the resulting CSV file. | |
12 | Buffer week: cleaned up the newly converted dataset and fixed some issues that had just come up. | |