This week I decided to ditch the SciWING method because it is not 100% accurate at this point and it is very slow at parsing the different entries. Regex seemed more useful here, so I wrote several regular expressions to match the authors and to filter out data that is irrelevant to exact_reference, such as the year.

Regex turned out to be good at this job. However, Emilie pointed out that the BibTeX keys may also be updated this way, and Lars pointed out some issues that may lead to merge conflicts in the future, so I decided to revert these changes.
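
To give a rough idea of the regex approach, here is a minimal sketch rather than the actual script. It assumes a hypothetical citation layout of "Surname, I. I. Year, rest"; the names `AUTHOR_RE`, `YEAR_RE` and `split_reference` are made up for this illustration, and treating the leftover text as the exact_reference value is also an assumption.

```python
import re

# Assumed layout (hypothetical): "Surname, I. I. & Surname, I. Year, rest".
# A surname, a comma, then one or more capitalised initials.
AUTHOR_RE = re.compile(r"[A-Z][\w'\-]+, (?:[A-Z]\.\s?)+")
# A four-digit year from 1800 to 2099, optionally followed by a letter (e.g. 1990a).
YEAR_RE = re.compile(r"\b(?:1[89]|20)\d{2}[a-z]?\b")


def split_reference(raw):
    """Split a raw citation string into authors, year, and the remaining text."""
    authors = [m.group(0).strip() for m in AUTHOR_RE.finditer(raw)]
    year_match = YEAR_RE.search(raw)
    year = year_match.group(0) if year_match else None

    # Remove the matched authors and the first year, then tidy what is left.
    rest = AUTHOR_RE.sub("", raw)
    rest = YEAR_RE.sub("", rest, count=1)
    rest = re.sub(r"^[\s,&.;:]+", "", rest)       # leading separators
    rest = re.sub(r"\s{2,}", " ", rest).strip()   # collapse repeated spaces

    # Assumption: the leftover text is what belongs in exact_reference.
    return {"authors": authors, "year": year, "exact_reference": rest}


if __name__ == "__main__":
    print(split_reference("Dahl, J. & Englund, R. K. 2005, pl. 3"))
```

Real entries are messier than this layout, so the patterns would need adjusting against the actual data.
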
# | Day | Date | A short description of the work done |
---|---|---|---|
1 | Monday | 2022/07/11 | Created scripts that use regex to match the contents |
2 | Tuesday | 2022/07/12 | Fixed bugs where parts of “pl.” and “p.” were misidentified as an author name |
3 | Wednesday | 2022/07/13 | CDLI meeting |
4 | Thursday | 2022/07/14 | Fixed the issue of extraneous fields by only picking publications with an author attached. |
5 | Friday | 2022/07/15 | Sent Emilie the fully processed data for checking |
6 | Saturday | 2022/07/16 | Started working on fixing the BibTeX keys. |
7 | Sunday | 2022/07/17 | Took a break |
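
The Tuesday and Thursday fixes can be pictured with a small sketch like the one below. It is only an illustration under assumptions: the record layout (a dict with an `authors` list) and the helper names `clean_authors` and `keep_with_author` are hypothetical, not taken from the real scripts.

```python
import re

# Page and plate abbreviations that should never count as an author.
PAGE_MARKER_RE = re.compile(r"^(?:pp?|pl)\.$", re.IGNORECASE)


def clean_authors(candidates):
    """Drop candidate author strings that are really page or plate markers."""
    return [c for c in candidates if not PAGE_MARKER_RE.match(c.strip())]


def keep_with_author(records):
    """Keep only records that still have at least one author after cleaning."""
    kept = []
    for record in records:
        authors = clean_authors(record.get("authors", []))
        if authors:
            kept.append({**record, "authors": authors})
    return kept


if __name__ == "__main__":
    # Made-up sample records, not real data.
    sample = [
        {"authors": ["Englund, R. K.", "pl."], "exact_reference": "pl. 3"},
        {"authors": ["p."], "exact_reference": "p. 12"},
        {"authors": [], "exact_reference": ""},
    ]
    print(keep_with_author(sample))
```

Filtering after matching keeps the author pattern itself simple; the alternative would be to build the exclusion into the regex with a negative lookahead.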