I think I’ve made enough progress with the coding to have something useful. And everything is using the AngularJS framework, so I’m pretty buzzword-compliant. Well, that may be too bold a statement, but at least it can make data that can be analyzed by something a bit more rigorous than just looking at it in a spreadsheet.
Here is the current state of things:
The user app: http://tajour.com/iRevApps/irev3.php
The data analysis app: http://tajour.com/iRevApps/irevdb.html
- Improved the UX on both apps. The main app first.
- You can now look at other posters’ posts without ‘logging in’. You have to type the passphrase to add anything, though.
- It’s now possible to search through the posts by search term and date
- The code is now more modular and maintainable (I know, not that academic, but it makes me happy)
- Twitter and Facebook crossposting are coming.
- For the db app
- Dropdown selection of common queries
- Tab selection of query output
- Rule-based parsing (currently keyDownUp, keyDownDown and word)
- Excel-ready CSV output for all rules
- WEKA-ready output for keyDownUp, keyDownDown and word. One caveat: the WEKA ARFF format wants to have all session information in a single row. This has two ramifications:
- There has to be a column for every key/word, including the misspellings. For the training task it’s not so bad, but for the free form text it means there are going to be a lot of columns. WEKA has a marker ‘?’ for missing data, so I’m going to start with that, but it may be that the data will have to be ‘cleaned’ by deleting uncommon words.
- Since there is only one column per key/word, keys and words that are typed multiple times have to be grouped somehow. Right now I’m averaging, but that loses a lot of information. I may add a standard deviation measure, but that will mean double the columns. Something to ponder.
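To make the one-row-per-session layout concrete, here is a minimal sketch in Python. It is not the code the db app actually runs; the function name `to_arff`, the shape of the `sessions` input and the sample timings are all made up for illustration. It builds one numeric column per key/word, averages repeated timings, and writes ‘?’ where a session never produced that key/word.

```python
from statistics import mean


def quote(name):
    """Quote an attribute name so spaces and punctuation are safe in ARFF."""
    return "'" + name.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_arff(relation, sessions):
    """Build ARFF text with one row per session.

    sessions is a hypothetical structure: session id -> {key/word -> [timings in ms]}.
    """
    # One column for every key/word seen in any session, misspellings included.
    columns = sorted({name for timings in sessions.values() for name in timings})

    lines = ["@RELATION " + quote(relation), ""]
    for name in columns:
        lines.append("@ATTRIBUTE " + quote(name) + " NUMERIC")
    lines += ["", "@DATA"]

    for session_id in sorted(sessions):
        timings = sessions[session_id]
        row = []
        for name in columns:
            values = timings.get(name)
            # Repeated keys/words collapse to their mean; absent ones become '?'.
            row.append(f"{mean(values):.1f}" if values else "?")
        lines.append(",".join(row))

    return "\n".join(lines)


# Invented sample data: two sessions, one of which contains a misspelling.
sample = {
    "s1": {"the": [100, 120], "teh": [90]},
    "s2": {"the": [110]},
}
arff_text = to_arff("irev", sample)
print(arff_text)
```

Adding a standard deviation would mean emitting a second attribute per key/word in the same loop, which is where the doubling of columns comes from.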
Lastly, Larry Sanger (co-founder of Wikipedia) has started a wiki-ish news site. It’s possible that I could piggyback on this effort, or at least use some of their ideas/code. It’s called infobitt.com. There’s a good manifesto here.
Normally, I would be able to start analyzing data now, with WEKA and SPSS (which I bought/leased about a week ago), but my home dev computer died and I’m waiting for a replacement right now. Frustrating.