Extracting CSV summaries by day
Gwern Branwen
gwern at gwern.net
Thu Oct 31 17:25:37 CET 2013
Once I've compiled a full set of classification rules, I want to
export my data to CSV so I can analyze it in R along with the rest
of my data.
The default CSV output from arbtt-stats is pretty close to what I want:
$ arbtt-stats -f '$sampleage < 24:00' --output-format=csv
Tag,Time,Percentage
WWW,38m40s,69.05
IRC, 7m20s,13.10
PDF, 3m20s,5.95
Ideally the day would be one column and each tag would be its own
column, so the output would look more like this:
Date,WWW,IRC,PDF
...
2013-10-19,301,35,61
2013-10-20,600,0,30
...
I'm not sure how to do this. The $date variable seems like it *might*
be helpful, judging by
http://darcs.nomeata.de/arbtt/doc/users_guide/configuration.html#idp32216
It works for selecting an arbitrary range of days:
$ arbtt-stats -f '$date >= 2013-09-10 && $date <= 2013-09-11' --output-format=csv
Tag,Time,Percentage
WWW, 6h06m00s,55.01
IRC, 1h54m00s,17.13
PDF,41m20s,6.21
Music,21m20s,3.21
So I guess I could loop over the past few years and run a modified
call to arbtt-stats for each and every day and then process it into an
appropriate form. Would be pretty ugly though.
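A minimal sketch of that per-day loop in Python, for concreteness. It
only builds the command line for one day and turns one Tag,Time,Percentage
summary into a row of seconds per tag. The helper names (to_seconds,
day_command, summary_to_row) are made up for illustration, the filter
simply mirrors the form of the command above (I have not checked whether
the upper bound is inclusive), and the duration format is assumed to be
limited to the h/m/s pieces seen in the sample output.

```python
import csv
import io
import re
from datetime import date, timedelta

# Durations as they appear in the sample output: "38m40s", " 6h06m00s", "40s".
# Assumption: only hour/minute/second components occur.
_DURATION = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def to_seconds(text):
    """Convert a duration such as ' 6h06m00s' to a number of seconds."""
    text = text.strip()
    match = _DURATION.fullmatch(text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def day_command(day):
    """Build the arbtt-stats argument list for one day.

    The filter copies the shape of the hand-written command
    ($date >= D && $date <= D+1); treat the bounds as an assumption.
    """
    next_day = day + timedelta(days=1)
    flt = f"$date >= {day.isoformat()} && $date <= {next_day.isoformat()}"
    return ["arbtt-stats", "-f", flt, "--output-format=csv"]


def summary_to_row(day, summary_csv):
    """Turn one Tag,Time,Percentage summary into {'Date': ..., tag: seconds}."""
    row = {"Date": day.isoformat()}
    for record in csv.DictReader(io.StringIO(summary_csv)):
        row[record["Tag"].strip()] = to_seconds(record["Time"])
    return row
```

Each day's output could then be captured with something like
subprocess.run(day_command(d), capture_output=True, text=True).stdout
and passed to summary_to_row, giving one row per day to write out with
csv.DictWriter. It still means one arbtt-stats run per day.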
Or I could possibly kludge together something from the --intervals
flag by looping over the full output and summing:
$ arbtt-stats --output-format=csv --intervals=WWW --intervals=IRC --intervals=PDF
...
PDF,10/29/13 21:31:36,10/29/13 21:31:36,40s
PDF,10/30/13 03:09:12,10/30/13 03:10:32, 2m00s
PDF,10/30/13 03:14:32,10/30/13 03:17:52, 4m00s
PDF,10/30/13 03:19:12,10/30/13 03:21:52, 3m20s
PDF,10/30/13 14:57:51,10/30/13 14:57:51,40s
PDF,10/30/13 16:09:15,10/30/13 16:09:15,40s
PDF,10/30/13 17:31:19,10/30/13 17:32:39, 2m00s
PDF,10/30/13 17:43:59,10/30/13 17:43:59,40s
PDF,10/30/13 20:30:07,10/30/13 20:30:07,40s
PDF,10/31/13 15:22:45,10/31/13 15:23:25, 1m20s
PDF,10/31/13 15:42:46,10/31/13 15:42:46,40s
PDF,10/31/13 15:45:26,10/31/13 15:46:06, 1m20s
PDF,10/31/13 15:59:27,10/31/13 16:00:07, 1m20s
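A rough sketch of the summing approach in Python, assuming the interval
rows have exactly the four fields shown above (tag, start, end, duration)
and that timestamps are MM/DD/YY. The function names are hypothetical.
Each interval is credited entirely to the day it starts on, so an interval
that crosses midnight is not split, and totals are reported in seconds.

```python
import csv
import io
import re
from collections import defaultdict
from datetime import datetime

# Assumption: durations only use h/m/s components, e.g. "40s" or " 2m00s".
_DURATION = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def to_seconds(text):
    """Convert a duration such as ' 2m00s' to a number of seconds."""
    text = text.strip()
    match = _DURATION.fullmatch(text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def daily_totals(intervals_csv):
    """Sum interval durations into {ISO date: {tag: seconds}}."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in csv.reader(io.StringIO(intervals_csv)):
        if len(row) != 4:
            continue  # not an interval row
        tag, start, _end, duration = row
        try:
            day = datetime.strptime(start.strip(), "%m/%d/%y %H:%M:%S").date()
        except ValueError:
            continue  # header line or a row in an unexpected format
        totals[day.isoformat()][tag.strip()] += to_seconds(duration)
    return totals


def to_wide_csv(totals, tags):
    """Render the totals as Date,<tag>,... with one line per day."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Date", *tags])
    for day in sorted(totals):
        writer.writerow([day, *(totals[day].get(tag, 0) for tag in tags)])
    return out.getvalue()
```

Feeding the full --intervals output through daily_totals and then
to_wide_csv(totals, ["WWW", "IRC", "PDF"]) would give the Date,WWW,IRC,PDF
layout sketched earlier, with a single arbtt-stats run.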
Is there any clean solution I am missing?
--
gwern
http://www.gwern.net