TODOs and Progress
Spam Progress And Notes
- Tag blur computation by known-good URLs
  - To do this, must:
    - Identify known-good URLs from DMOZ [done]
    - Compute tag similarities based on these URLs' tags [in progress]
    - Use these similarities to compute tag blur for Givealink posts [code written but not run]
- Tag spam computation by known-spam URLs
  - To do this, must:
    - Compute Pr(t) for known-spam URLs in Bibsonomy [done]
    - Compute tag_spam for Givealink posts [in progress]
- Both of the above must be inserted into the Givealink database
  - To do this, must:
    - Get migrations working [Ben]
    - Create another migration to add tables / columns where appropriate [started]
    - Modify the above scripts to output to the database [not started]
- Some manual labeling of spam in Givealink
  - Identify likely spam candidates using the previous two measures
  - Create a page in the administrator controller that allows manual spam classification
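
Rough sketch for the tag spam item above. These notes do not define Pr(t) or tag_spam, so this assumes Pr(t) is the fraction of labeled posts carrying tag t that are spam, and that tag_spam for a post is the mean Pr(t) over its tags. The function names, the (tags, is_spam) input format, and the default for unseen tags are all hypothetical, not taken from the existing scripts.

```python
from collections import Counter


def tag_spam_probabilities(labeled_posts):
    """Estimate Pr(t) for every tag seen in the labeled data.

    ASSUMPTION: Pr(t) = (spam posts with tag t) / (all posts with tag t).
    labeled_posts is an iterable of (tags, is_spam) pairs (hypothetical format).
    """
    total = Counter()
    spam = Counter()
    for tags, is_spam in labeled_posts:
        # Count each tag at most once per post.
        for tag in set(tags):
            total[tag] += 1
            if is_spam:
                spam[tag] += 1
    return {tag: spam[tag] / total[tag] for tag in total}


def tag_spam(post_tags, pr, default=0.0):
    """ASSUMPTION: tag_spam is the mean Pr(t) over the post's distinct tags.

    Tags missing from pr fall back to `default`; a post with no tags
    also returns `default`.
    """
    tags = set(post_tags)
    if not tags:
        return default
    return sum(pr.get(tag, default) for tag in tags) / len(tags)


if __name__ == "__main__":
    # Toy labeled posts, made up purely for illustration.
    labeled = [
        (["free", "pills"], True),
        (["python", "free"], False),
        (["python"], False),
    ]
    pr = tag_spam_probabilities(labeled)
    print(tag_spam(["free", "pills"], pr))
```

Posts could then be sorted by this score, highest first, to pick candidates for manual labeling. How unseen tags should be scored is left open here; 0.0 is only a placeholder.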