Listing samples which are not matched by any tags?
Gwern Branwen
gwern at gwern.net
Sun Jun 29 23:35:49 CEST 2014
On Sun, Nov 3, 2013 at 6:36 PM, Gwern Branwen <gwern at gwern.net> wrote:
> On Sun, Nov 3, 2013 at 5:31 PM, Joachim Breitner
> <mail at joachim-breitner.de> wrote:
>> But now you surely want to know what these selected samples look like,
>> right? That leads us to the discussion we had on the list with Waldir
>> in June: What should the tool look like that combines the dumping of
>> arbtt-dump with the sample selection of arbtt-stats... I’m unsure about
>> the proper design here.
>
> To me, it seems pretty simple. Keeping the same interface, apply a
> categorize.cfg's set of rules to each sample and then print or not
> based on what tags matched or didn't match.

Has there been any more thought on this issue? After repairing my logs
& working out how to use the CSV, I wondered how much data I was
missing for lack of a matching tag. This is apparently reported by
the -i flag. Even after adding some more tagging, this is what I get:

$ arbtt-stats -i -m 0 -f '$sampleage <100:00'
General Information
===================
FirstRecord                        | 2014-06-26 01:33:16.291076 UTC
LastRecord                         | 2014-06-29 21:28:30.625435 UTC
Number of records                  | 7485
Total time recorded                | 3d19h31m00s
Total time selected                | 1d12h41m10s
Fraction of total time recorded    | 100%
Fraction of total time selected    | 40%
Fraction of recorded time selected | 40%

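As a sanity check on the 40% figure, the two durations in the output can be divided by hand. A small Python sketch (the to_seconds helper is my own and not part of arbtt; it only assumes the d/h/m/s duration format shown above):

```python
import re


def to_seconds(duration: str) -> int:
    """Convert a duration such as '3d19h31m00s' into seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)([dhms])", duration))


recorded = to_seconds("3d19h31m00s")  # "Total time recorded"
selected = to_seconds("1d12h41m10s")  # "Total time selected"
fraction = selected / recorded

print(f"selected: {fraction:.1%}, not selected: {1 - fraction:.1%}")
```

So the reported 40% is simply selected time over recorded time, and the remaining roughly 60% is whatever was not selected.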
Given the existence of the flag '--also-inactive' (help text: 'include
samples with the tag "inactive"'), I infer that all of the recorded time
reported is active time. But that means fully *60%* of my activity is
not being classified in any way! That's a heck of a lot of lost data.
And I don't know what the lost data is: I already classified
everything I could think of. What am I missing? I have no way of
knowing unless arbtt shows me samples of active time which match no
tag, so I can go 'aha, I need to classify $X/Y/Z as tag A! Much
better.'
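
To make the request concrete, here is a rough Python sketch of the kind of filter I have in mind. It does not use arbtt's real log format or the categorize.cfg syntax: the Sample record, the rule functions, the tag names, and the 60-second idle cutoff are all hypothetical stand-ins for illustration.

```python
# Hypothetical sketch: these names and data structures are illustrative
# stand-ins, not arbtt's real log format or categorize.cfg syntax.
import re
from typing import Callable, Iterable, Iterator, List, NamedTuple, Optional


class Sample(NamedTuple):
    program: str    # program owning the focused window
    title: str      # title of the focused window
    idle_secs: int  # seconds since the last user input


# A rule inspects one sample and returns a tag name, or None if it does not match.
Rule = Callable[[Sample], Optional[str]]


def title_rule(pattern: str, tag: str) -> Rule:
    """Build a rule that assigns `tag` when the window title matches `pattern`."""
    regex = re.compile(pattern)
    return lambda s: tag if regex.search(s.title) else None


def tags_for(sample: Sample, rules: Iterable[Rule]) -> List[str]:
    """Collect every tag that the rules assign to one sample."""
    tags = (rule(sample) for rule in rules)
    return [t for t in tags if t is not None]


def untagged(samples: Iterable[Sample], rules: Iterable[Rule],
             idle_limit: int = 60) -> Iterator[Sample]:
    """Yield active samples that no rule matched.

    Samples idle for longer than `idle_limit` seconds are treated as
    inactive and skipped (an assumed cutoff, chosen for this example).
    """
    rules = list(rules)
    for s in samples:
        if s.idle_secs > idle_limit:
            continue
        if not tags_for(s, rules):
            yield s


rules = [
    title_rule(r"Emacs", "Program:editor"),
    title_rule(r"Mozilla Firefox$", "Program:browser"),
]
samples = [
    Sample("emacs", "notes.txt - Emacs", 2),
    Sample("mpv", "lecture.mkv - mpv", 5),
    Sample("firefox", "arbtt docs - Mozilla Firefox", 400),
    Sample("evince", "paper.pdf", 10),
]

missing = list(untagged(samples, rules))
for s in missing:
    print(s.program, "|", s.title)
```

With these made-up samples, the mpv and evince entries are printed: they are active but match no rule, which is exactly the list I would want to read through before writing new tags.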
--
gwern
http://www.gwern.net