From mail at joachim-breitner.de Wed Jan 1 02:31:41 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Wed, 01 Jan 2014 02:31:41 +0100
Subject: tagging inactive intervals
In-Reply-To:
References: <1388511157.5709.27.camel@kirk>
Message-ID: <1388539901.13556.1.camel@kirk>

Hi,

On Tuesday, 31.12.2013, at 19:23 +0000, Oren Gampel wrote:
> That's my problem. I usually use separate desktops for separate tasks so
> the other task's windows stay open.
> The coming solution to the famous Issue #1
> (https://bitbucket.org/nomeata/arbtt/issue/1) would actually solve all
> my troubles. :)

so you use a browser window on the workspace corresponding to your task?

Great, I don't think Issue #1 will cause any problems, so expect me to
implement it soon.

> > I am a bit wary because it requires looking into the future of the
> > currently investigated tag, which raises the complexity of code and
> > algorithms, and will prevent you from getting accurate results for
> > samples taken just now, i.e. their tags will change depending on what
> > you do later.
>
> I assume you're doing a "single pass" on the log, and I guess any
> algorithm suggested should be a single pass one. I don't think there is a
> "mathematically correct" solution, so keeping it simple should be indeed
> the way to go.

Single-pass, preferably stateless, is of course preferred. If needed,
one can do something more fancy... but it better be worth it :-)

Greetings,
Joachim

-- 
Joachim "nomeata" Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0x4743206C
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL:

From mail at joachim-breitner.de Wed Jan 1 15:33:54 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Wed, 01 Jan 2014 15:33:54 +0100
Subject: tagging inactive intervals
In-Reply-To: <1388539901.13556.1.camel@kirk>
References: <1388511157.5709.27.camel@kirk>
 <1388539901.13556.1.camel@kirk>
Message-ID: <1388586834.2801.7.camel@kirk>

Hi,

On Wednesday, 01.01.2014, at 02:31 +0100, Joachim Breitner wrote:
> On Tuesday, 31.12.2013, at 19:23 +0000, Oren Gampel wrote:
> > That's my problem. I usually use separate desktops for separate tasks so
> > the other task's windows stay open.
> > The coming solution to the famous Issue #1
> > (https://bitbucket.org/nomeata/arbtt/issue/1) would actually solve all
> > my troubles. :)
>
> so you use a browser window on the workspace corresponding to your task?
>
> Great, I don't think Issue #1 will cause any problems, so expect me to
> implement it soon.

done! You can use "$desktop" in categorize.cfg. Can you check that it
works for you as requested?

This changes the on-disk format of capture.log. It is possible to run
the old and new arbtt-capture alternately (i.e. it is safe to kill the
stable version and run the repository version for a while, and to return
to the other later), but then only the repo version of arbtt-stats can
read the data. Or work with flags to use a test file instead of
~/.arbtt/capture.log. Or simply switch to darcs altogether, and tell me
about all the bugs you find :-)

Greetings,
Joachim

-- 
Joachim "nomeata" Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0x4743206C
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...

From mail at joachim-breitner.de Wed Jan 1 15:42:51 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Wed, 01 Jan 2014 15:42:51 +0100
Subject: several feature suggestion
In-Reply-To:
References: <1388330745.2502.16.camel@kirk> <l9rkbe$vf$1@ger.gmane.org>
 <1388403621.2589.3.camel@kirk>
Message-ID: <1388587371.2801.8.camel@kirk>

Hi,

On Monday, 30.12.2013, at 13:09 +0000, Oren Gampel wrote:
> >> > Ok, so there are two problems:
> >> > 1. Plasma desktops also appear as windows, and hence are useless to
> >> >    record. This leads to the question: How to detect such windows
> >> >    (without adding configuration options and preferably without
> >> >    hard-coding their name). Does xprop list anything useful for them?
> >> >    What is their _NET_WM_WINDOW_TYPE?
> >>
> >> I'm afraid _NET_WM_WINDOW_TYPE(ATOM) = _NET_WM_WINDOW_TYPE_DESKTOP
> >>
> >> but...
> >
> > why "I'm afraid" - this is actually good. I guess arbtt-capture should
> > ignore all windows with that window type, right?
>
> Ok, gotcha. Yes! That's just noise in the log as it is now...

also done. Can you make sure (e.g. with the new "arbtt-capture --dump"
feature) that these windows no longer appear in the log file?

Greetings,
Joachim

-- 
Joachim "nomeata" Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0x4743206C
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
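
The window-type check discussed in this thread can also be done by hand. The following is a minimal sketch, assuming xprop is installed; the helper name is_desktop_window is made up for illustration and is not part of arbtt, and it says nothing about how arbtt-capture itself performs the test.

```shell
#!/bin/sh
# Hypothetical helper (not part of arbtt): reads xprop output on stdin and
# succeeds if the window advertises the desktop window type.
is_desktop_window() {
    grep -q '_NET_WM_WINDOW_TYPE_DESKTOP'
}

# Usage on a running X session (xprop asks you to click a window):
#   xprop _NET_WM_WINDOW_TYPE | is_desktop_window && echo "desktop window"
```

A window for which this succeeds is the kind that showed up as noise in the log before the change.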

From oren at orengampel.com Wed Jan 1 23:19:42 2014
From: oren at orengampel.com (Oren Gampel)
Date: Wed, 1 Jan 2014 22:19:42 +0000 (UTC)
Subject: tagging inactive intervals
References: <1388511157.5709.27.camel@kirk> <l9v5jv$lgd$2@ger.gmane.org>
 <1388539901.13556.1.camel@kirk> <1388586834.2801.7.camel@kirk>
Message-ID:

Hey,

>> Great, I don't think Issue #1 will cause any problems, so expect me to
>> implement it soon.
>
> done! You can use "$desktop" in categorize.cfg. Can you check that it
> works for you as requested?
>

Perfect! Working well, on KDE + Plasma.
This would make the .cfg file so much simpler.

Thanks a lot Joachim!

From oren at orengampel.com Wed Jan 1 23:20:13 2014
From: oren at orengampel.com (Oren Gampel)
Date: Wed, 1 Jan 2014 22:20:13 +0000 (UTC)
Subject: several feature suggestion
References: <1388330745.2502.16.camel@kirk> <1388403621.2589.3.camel@kirk>
 <1388587371.2801.8.camel@kirk>
Message-ID:

Hey,

>> > why "I'm afraid" - this is actually good. I guess arbtt-capture
>> > should ignore all windows with that window type, right?
>>
>> Ok, gotcha. Yes! That's just noise in the log as it is now...
>
> also done. Can you make sure (e.g. with the new "arbtt-capture --dump"
> feature) that these windows no longer appear in the log file?
>

Working well, on KDE + Plasma.
I no longer get the former (unnecessary) plasma entries.

From ianwojtowicz at gmail.com Thu Jan 2 04:54:21 2014
From: ianwojtowicz at gmail.com (Ian Wojtowicz)
Date: Thu, 2 Jan 2014 03:54:21 +0000 (UTC)
Subject: arbtt as a library?
References: <52495BAD.30909@ocharles.org.uk> <1380540040.4640.28.camel@kirk>
Message-ID:

> > As an aside, I thought I could make do with the csv output, but I can't
> > pipe that:
> >
> >
> > ollie io ~> arbtt-stats -c IRC --output-format=csv | grep -i haskell
> > arbtt-stats: ioctl: invalid argument (Invalid argument)
>
> For now I'd prefer interaction and integration via the commands. I just
> fixed this particular bug in the darcs repository.

I'm having the same problem with arbtt 0.7. I tried building the repo code,
but it failed on a bunch of dependencies.

Is there a debian package of the fixed code I can get from somewhere?

Ian

From mail at joachim-breitner.de Thu Jan 2 11:16:02 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Thu, 02 Jan 2014 11:16:02 +0100
Subject: arbtt as a library?
In-Reply-To:
References: <52495BAD.30909@ocharles.org.uk> <1380540040.4640.28.camel@kirk>
Message-ID: <1388657762.2542.1.camel@kirk>

Hi,

On Thursday, 02.01.2014, at 03:54 +0000, Ian Wojtowicz wrote:
> > > As an aside, I thought I could make do with the csv output, but I can't
> > > pipe that:
> > >
> > >
> > > ollie io ~> arbtt-stats -c IRC --output-format=csv | grep -i haskell
> > > arbtt-stats: ioctl: invalid argument (Invalid argument)
> >
> > For now I'd prefer interaction and integration via the commands. I just
> > fixed this particular bug in the darcs repository.
>
> I'm having the same problem with arbtt 0.7. I tried building the repo code,
> but it failed on a bunch of dependencies.
>
> Is there a debian package of the fixed code I can get from somewhere?

I could manually build one, but it would be easier if you manage to
build from the repo. Can you try
  $ apt-get build-dep arbtt
  $ apt-get install cabal-install
  $ cd ..../arbtt/
  $ darcs pull # or git pull
  $ cabal install
and if that does not work, tell us the error message?

Greetings,
Joachim

-- 
Joachim "nomeata" Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0x4743206C
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...

From rich at racitup.com Tue Jan 7 14:39:59 2014
From: rich at racitup.com (Richard Case)
Date: Tue, 7 Jan 2014 13:39:59 +0000
Subject: Nested if then else in categorize.cfg
Message-ID:

Hi guys,

I'm new to this so bear with me but I'm trying to do a rule like this for
Firefox using arbtt v0.7:

-- Firefox
current window ($program == "Navigator") ==>
  if $title =~ /^(.*) - (.*@.*) - .* Mail - Mozilla Firefox$/ then tag Email:$2-$1 else
  if $title =~ /^(.*) - Calendar - Mozilla Firefox$/ then tag Calendar:$1 else
  if $title =~ /^(.*) - Mozilla Firefox$/ then tag Firefox:$1 else tag Firefox,

But I get:
Parser error:
"/home/rich/.arbtt/categorize.cfg" (line 29, column 3):
unexpected "i"
expecting "else"

It looks to be correct according to the syntax rules; am I doing something
wrong or is this a bug?

Cheers,
Rich
-------------- next part --------------
An HTML attachment was scrubbed...

From mail at joachim-breitner.de Tue Jan 7 15:44:06 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Tue, 07 Jan 2014 14:44:06 +0000
Subject: Nested if then else in categorize.cfg
In-Reply-To:
References:
Message-ID: <1389105846.14997.1.camel@kirk>

Hi Rich,

On Tuesday, 07.01.2014, at 13:39 +0000, Richard Case wrote:
> I'm new to this so bear with me but I'm trying to do a rule like this
> for Firefox using arbtt v0.7:
>
> -- Firefox
> current window ($program == "Navigator") ==>
>   if $title =~ /^(.*) - (.*@.*) - .* Mail - Mozilla Firefox$/ then tag Email:$2-$1 else
>   if $title =~ /^(.*) - Calendar - Mozilla Firefox$/ then tag Calendar:$1 else
>   if $title =~ /^(.*) - Mozilla Firefox$/ then tag Firefox:$1 else tag Firefox,
>
>
> But I get:
> Parser error:
> "/home/rich/.arbtt/categorize.cfg" (line 29, column 3):
> unexpected "i"
> expecting "else"
>
>
>
> It looks to be correct according to the syntax rules; am I doing
> something wrong or is this a bug?

thanks for bringing this up. This is a bug, and my proposed fix also
uncovers a bug. See my answer on SE:
http://stackoverflow.com/a/20973950/946226

And also thanks for posting on Stackexchange; we are now present there
with our very own arbtt tag:
http://stackoverflow.com/questions/tagged/arbtt
I'll be happy to answer more arbtt questions there.

Greetings,
Joachim

-- 
Joachim "nomeata" Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0x4743206C
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...

From ian at miscellaneousprojects.com Mon Feb 10 00:35:06 2014
From: ian at miscellaneousprojects.com (Ian Wojtowicz)
Date: Sun, 9 Feb 2014 15:35:06 -0800
Subject: Time Formatting
Message-ID:

I would like to see future versions of arbtt improve the data export
functionality so we can chain this great utility to reporting and
visualization tools.

One small request would be to change the time format from ##h##m##s to
##:##:##. This would improve compatibility with most other software systems.

The other request is about CSV exporting. I'm not sure what the best
solution would be, but I would like to be able to export project and tag
stats by time periods. For example, at the end of a week of work, I'd like
to be able to export a series of daily values for a set of projects and tags.

Right now, this requires a lot of manual editing of CSV files. It should be
simpler...

_____________________________________________________
http://miscellaneousprojects.com
Talk to us. We're interactive.
617 466 4701
-------------- next part --------------
An HTML attachment was scrubbed...

From mail at joachim-breitner.de Mon Feb 10 23:53:28 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Mon, 10 Feb 2014 22:53:28 +0000
Subject: Time Formatting
In-Reply-To:
References:
Message-ID: <1392072808.18345.11.camel@kirk>

Hi,

On Sunday, 09.02.2014, at 15:35 -0800, Ian Wojtowicz wrote:
> I would like to see future versions of arbtt improve the data export
> functionality so we can chain this great utility to reporting and
> visualization tools.

Sure, any suggestions are welcome.

> One small request would be to change the time format from ##h##m##s to
> ##:##:##. This would improve compatibility with most other software
> systems.

This has come up so often now, I guess it's about time (pun intended):
https://bitbucket.org/nomeata/arbtt/issue/7/machine-readable-time-format

Do you agree that for human-readable output, the current output is
easier to read, given the wildly varying ranges, from seconds to weeks?

> The other request is about CSV exporting. I'm not sure what the best
> solution would be, but I would like to be able to export project and
> tag stats by time periods. For example, at the end of a week of work,
> I'd like to be able to export a series of daily values for a set of
> projects and tags.
>
>
> Right now, this requires a lot of manual editing of CSV files. It
> should be simpler...

there are so many different ways of generating reports, so we'll have to
see which are generally useful, or how to make the reporting
compositional.

Have you tried
$ arbtt-stats --for-each=day --filter='$date >= 2014-02-01'
which can be combined with your favorite report, and also with
--output-format=csv? It seems to be what you need. (You need to run the
repository version for that, though.)

Greetings,
Joachim

-- 
Joachim Breitner
  e-Mail: mail at joachim-breitner.de
  Homepage: http://www.joachim-breitner.de
  Jabber-ID: nomeata at joachim-breitner.de
-------------- next part --------------
A non-text attachment was scrubbed...
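
Until a machine-readable time format is available, the ##h##m##s durations can be converted outside arbtt. The following is a minimal sketch, assuming durations look like "1h31m00s" or "49m00s"; the "d" unit for days and the helper name to_hms are assumptions made for illustration, and nothing here is an arbtt feature.

```shell
#!/bin/sh
# Hypothetical helper (not an arbtt feature): convert a duration such as
# "1h31m00s", "49m00s" or "2d03h00m05s" into H:MM:SS.
to_hms() {
    printf '%s\n' "$1" | awk '{
        secs = 0
        s = $0
        # Peel off "<number><unit>" pairs from the left of the string.
        while (match(s, /^[0-9]+[dhms]/)) {
            n = substr(s, 1, RLENGTH - 1) + 0
            u = substr(s, RLENGTH, 1)
            if (u == "d") secs += n * 86400
            else if (u == "h") secs += n * 3600
            else if (u == "m") secs += n * 60
            else secs += n
            s = substr(s, RLENGTH + 1)
        }
        printf "%d:%02d:%02d\n", int(secs / 3600), int((secs % 3600) / 60), secs % 60
    }'
}

to_hms 1h31m00s   # prints 1:31:00
to_hms 49m00s     # prints 0:49:00
```

Days are folded into the hour count, so the result stays a single H:MM:SS field that spreadsheet and statistics tools can usually parse.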

From mail at joachim-breitner.de Sat Mar 29 19:21:20 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Sat, 29 Mar 2014 19:21:20 +0100
Subject: arbtt
In-Reply-To: <5335CC99.3030908@unity.pl>
References: <5335CC99.3030908@unity.pl>
Message-ID: <1396117280.2549.32.camel@kirk>

Dear Krzysztof,

may I invite you to join the arbtt mailing list:
https://lists.nomeata.de/mailman/listinfo/arbtt
It is a better forum to discuss arbtt issues, as there are more users
who can contribute and comment.

Your feature request of "--only-tags" is sensible. I'm not fully sure
about the name. Note that "--only" selects which samples are to be
considered, _not_ which tags are to be shown. I believe you want
something like "--output-only-uncategorized-tags"...

If I wanted to avoid introducing a new option, maybe "--output-only :"
should do that - it could arguably mean "output only tags that have no
category".

What do you think?

Greetings,
Joachim

On Friday, 28.03.2014, at 20:25 +0100, Krzysztof Jakowczyk wrote:
> Hello,
>
> I miss the time summaries for each tag without category listing, eg.
> now:
>
> _______________________________________Tag_|______Time_|_Percentage_
>                                   terminal |  1h31m00s |      53.85
>                                        www |    49m00s |      28.99
>                                      opera |    45m00s |      26.63
>                              obsluga-biura |    26m00s |      15.38
>              terminal:xelite_s-kjakowcz___ |    19m00s |      11.24
>                    terminal:root_tpsprap01 |    17m00s |      10.06
>                                     jabber |    17m00s |      10.06
>                      jabber:sebastian_berc |    15m00s |       8.88
>                 terminal:arbtt-dump___less |     9m00s |       5.33
> terminal:categorize_cfg_______arbtt__-_VIM |     8m00s |       4.73
>           terminal:root_nova-controller___ |     6m00s |       3.55
>   terminal:categorize_cfg_____arbtt__-_VIM |     6m00s |       3.55
>       terminal:vim____arbtt_categorize_cfg |     5m00s |       2.96
>                        terminal:root_test4 |     5m00s |       2.96
>                    terminal:root_bjgprap01 |     5m00s |       2.96
>                                     chrome |     4m00s |       2.37
>                                  iceweasel |     3m00s |       1.78
>                                    icedove |     3m00s |       1.78
>                    terminal:ssh_10_1_1_199 |     2m00s |       1.18
>
>
> ...but i want only: (something like --only-tags)
>
> _______________________________________Tag_|______Time_|_Percentage_
>                                   terminal |  1h31m00s |      53.85
>                                        www |    49m00s |      28.99
>                                      opera |    45m00s |      26.63
>                              obsluga-biura |    26m00s |      15.38
>                                     jabber |    17m00s |      10.06
>                                     chrome |     4m00s |       2.37
>                                  iceweasel |     3m00s |       1.78
>                                    icedove |     3m00s |       1.78
>
>
>

-- 
Joachim "nomeata" Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0x4743206C
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
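
A report like the one quoted in this thread can be narrowed to plain tags by post-filtering it outside arbtt. The following is a minimal sketch, assuming the "|"-separated table layout shown in the quote with the header on the first line; the function name only_plain_tags is made up for illustration and is not an arbtt option.

```shell
#!/bin/sh
# Hypothetical post-processing filter (not an arbtt option): keep the header
# line and every row whose tag column contains no "Category:" prefix.
only_plain_tags() {
    awk -F'|' 'NR == 1 || $1 !~ /:/'
}

# Usage, given a saved copy of the table (report.txt is a placeholder name):
#   only_plain_tags < report.txt
```

Only the first column is tested for a colon, so the time and percentage columns never affect which rows are kept.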

From mail at joachim-breitner.de Fri Apr 18 22:28:10 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Fri, 18 Apr 2014 22:28:10 +0200
Subject: arbtt 0.8 released
Message-ID: <1397852890.2539.21.camel@kirk>

Dear arbtt users,

I just uploaded arbtt 0.8 to Hackage. See
http://arbtt.nomeata.de/doc/users_guide/release-notes.html#release-notes-0.8
for a list of changes.

This release is the result of slightly above 100 commits, closing 8 bugs
and took a bit more than a year. I hope you like it!

Greetings,
Joachim

-- 
Joachim "nomeata" Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...

From miffoljud at gmail.com Fri Apr 18 23:12:59 2014
From: miffoljud at gmail.com (Arash Rouhani)
Date: Fri, 18 Apr 2014 23:12:59 +0200
Subject: arbtt 0.8 released
In-Reply-To: <1397852890.2539.21.camel@kirk>
References: <1397852890.2539.21.camel@kirk>
Message-ID: <5351955B.1050900@gmail.com>

Great job Joachim!

/Arash

On 2014-04-18 22:28, Joachim Breitner wrote:
> Dear arbtt users,
>
> I just uploaded arbtt 0.8 to Hackage. See
> http://arbtt.nomeata.de/doc/users_guide/release-notes.html#release-notes-0.8
> for a list of changes.
>
> This release is the result of slightly above 100 commits, closing 8 bugs
> and took a bit more than a year. I hope you like it!
>
> Greetings,
> Joachim
>
>
>
> _______________________________________________
> arbtt mailing list
> arbtt at lists.nomeata.de
> https://lists.nomeata.de/mailman/listinfo/arbtt

-------------- next part --------------
An HTML attachment was scrubbed...

From eual.jp at gmail.com Tue Jun 3 18:03:15 2014
From: eual.jp at gmail.com (Alexander Batischev)
Date: Tue, 3 Jun 2014 19:03:15 +0300
Subject: arbtt-stats: Unsupported TimeLogEntry version tag 0
Message-ID: <20140603160315.GA32379@antaeus>

Hello,

I'm running version 0.8 here.
I start arbtt-capture from my ~/.xsession like that:

    arbtt-capture &

After running the daemon for some time, arbtt-stats starts reporting the
following error:

    arbtt-stats: Unsupported TimeLogEntry version tag 0

(and no report is shown, of course).

I hypothesised that it is caused by arbtt-capture being terminated in a
wrong way when I terminate my X session (which I do every few weeks). I
tried to run arbtt-capture in foreground and terminate it with Ctrl-C, but
it didn't cause a problem. So I deleted the log, then ran arbtt-capture for
a few minutes to get some entries, and then truncated it by one byte:

    dd if=capture.log of=new count=1331 ibs=1 && mv -f new capture.log

That did the trick: the error showed up again.

So it seems that I sometimes manage to terminate my X session while
arbtt-capture is writing its log. Can something be done to make
arbtt-capture more resilient to situations like that?

-- 
Regards,
Alexander Batischev
-------------- next part --------------
A non-text attachment was scrubbed...

From mail at joachim-breitner.de Tue Jun 3 23:42:29 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Tue, 03 Jun 2014 23:42:29 +0200
Subject: arbtt-stats: Unsupported TimeLogEntry version tag 0
In-Reply-To: <20140603160315.GA32379@antaeus>
References: <20140603160315.GA32379@antaeus>
Message-ID: <1401831749.15404.4.camel@kirk>

Hi,

thanks for the report.

First of all: You can usually recover from a broken log using
arbtt-recover. But that is not nice if it happens often.

I have been running arbtt-capture for years now, and I shut down X daily,
and haven't seen that problem, so there must be more to it, but no idea
what.

But you are right: The writing could be more reliable.
I guess instead of always appending, I could append and then write the length somewhere at the beginning of the file (hoping that writing one word is atomic); when opening the file again I truncate at that point... Greetings, Joachim -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org From gwern at gwern.net Thu Jun 26 22:53:49 2014 From: gwern at gwern.net (Gwern Branwen) Date: Thu, 26 Jun 2014 16:53:49 -0400 Subject: Extracting CSV summaries by day In-Reply-To: References: <1383508997.4978.8.camel@kirk> <1383605744.5011.6.camel@kirk> <1383640126.2353.1.camel@kirk> <1383688444.18235.4.camel@kirk> <1383690170.18235.9.camel@kirk> Message-ID: On Tue, Nov 5, 2013 at 5:56 PM, Gwern Branwen wrote: > On Tue, Nov 5, 2013 at 5:22 PM, Joachim Breitner > wrote: >> ok, try this and tell me what you think of it: >> http://people.debian.org/~nomeata/arbtt/arbtt_0.7.1-1~pre1_amd64.deb > > Looks good. I think I can use it. (R has routines for converting long > to wide format, so the column vs row thing isn't a really big > problem.) Yes, turns out to work pretty nicely. I've been looking at a factor analysis of my various metrics, and it turned out to be pretty easy to incorporate the arbtt csv output (once I repaired the log with arbtt-recover, yet again).
So for my current purpose my workflow goes:
$ arbtt-stats --logfile=/home/gwern/doc/arbtt/2013-2014.log --output-format="csv" --for-each="day" --min-percentage=0 > 2013-2014-arbtt.txt
$ emacs -nw 2013-2014-arbtt.txt # delete before 2 March 2014 and after 24 June 2014; rename 'Day'->'Date'
$ mv 2013-2014-arbtt.txt 2014-marchjune-arbtt.csv
$ R
Then in R we can get a nice clean wide format dataset thusly:
arbtt <- read.csv("2014-marchjune-arbtt.csv")
arbtt$Percentage <- NULL # we don't care
# Convert time-lengths to second-counts: "0:16:40" to 1000 (seconds); "7:57:30" to 28650 (seconds) etc.
# We prefer units of seconds since arbtt has sub-minute resolution and not all categories
# will have a lot of time each day.
interval <- function(x) {
  if (!is.na(x)) {
    if (grepl(" s", x)) {
      as.integer(sub(" s", "", x))
    } else {
      y <- unlist(strsplit(x, ":"))
      as.integer(y[[1]])*3600 + as.integer(y[[2]])*60 + as.integer(y[[3]])
    }
  } else NA
}
arbtt$Time <- sapply(as.character(arbtt$Time), interval)
library(reshape)
arbtt <- reshape(arbtt, v.names="Time", timevar="Tag", idvar="Date", direction="wide")
-- gwern http://www.gwern.net From gwern at gwern.net Sun Jun 29 23:35:49 2014 From: gwern at gwern.net (Gwern Branwen) Date: Sun, 29 Jun 2014 17:35:49 -0400 Subject: Listing samples which are not matched by any tags? In-Reply-To: References: <1383508486.4978.2.camel@kirk> <1383517885.12868.10.camel@kirk> Message-ID: On Sun, Nov 3, 2013 at 6:36 PM, Gwern Branwen wrote: > On Sun, Nov 3, 2013 at 5:31 PM, Joachim Breitner > wrote: >> But now you surely want to know what these selected samples look like, >> right? That leads us to the discussion we had on the list with Waldir >> in June: What should the tool look like that combines the dumping of >> arbtt-dump with the sample selection of arbtt-stats... I'm unsure about >> the proper design here. > > To me, it seems pretty simple.
Keeping the same interface, apply a > categorize.cfg's set of rules to each sample and then print or not > based on what tags matched or didn't match. Has there been any more thought on this issue? After repairing my logs & working out how to use the CSV, I wondered how much data I was missing due to a lack of matching tag. This apparently is reported by the -i flag. Even after adding some more tagging, this is what I get:
$ arbtt-stats -i -m 0 -f '$sampleage <100:00'
General Information
===================
FirstRecord                        | 2014-06-26 01:33:16.291076 UTC
LastRecord                         | 2014-06-29 21:28:30.625435 UTC
Number of records                  | 7485
Total time recorded                | 3d19h31m00s
Total time selected                | 1d12h41m10s
Fraction of total time recorded    | 100%
Fraction of total time selected    | 40%
Fraction of recorded time selected | 40%
Given the existence of the flag '--also-inactive include samples with the tag "inactive"', I infer all this recorded time reported is active time. But that means fully *60%* of my activity is not being classified in any way! That's a heck of a lot of lost data. And I don't know what the lost data is: I already classified everything I could think of. What am I missing? I have no way of knowing unless arbtt will tell me and give me samples of active time which don't match so I can go 'aha, I need to classify $X/Y/Z as tag A! Much better.' -- gwern http://www.gwern.net From adrian.wilkins at gmail.com Thu Jul 3 12:26:59 2014 From: adrian.wilkins at gmail.com (Adrian Wilkins) Date: Thu, 03 Jul 2014 11:26:59 +0100 Subject: Hello and questions.. Message-ID: <53B52FF3.5040308@gmail.com> Hi there. Having just spent a day in Timesheet Hell I have sworn to automate the process as much as possible. I conceived of something that had three stages
1. Record information about what I was doing
2. Process the log to distil values for each project / task
3.
Automatically upload the digests to our time-tracking system
* We use Redmine
I even got so far as to test some code to do stage 1 on Windows.... but lo and behold, arbtt would seem to do stages 1 and 2 already - on both the platforms I use. Hooray! This underscores just why I love the Free Software community. So ; questions (and apologies if they are questions that have already been asked)
* Can arbtt aggregate multiple event logs?
The reason I ask is that my typical working day is conducted across two machines - my work-issued Windows laptop, and my personal Linux installation. The vast bulk of the work takes place on the Linux machine, so I imagine it would be a fair representation of my work time in the main, but I do sometimes have to switch to the Windows machine (for emails, for example).
* Can arbtt sample mouse position?
I switch between the machines via two methods ; one is by using remote desktop, the other by Synergy (a network mouse/keyboard sharing program) ; I can imagine there might be a place for logging the mouse position at time of sampling for this reason - ordinarily it would be useless information but when using Synergy you could write rules based on whether the mouse was actually on the screen of the machine that is logging. (can already write rules to ignore all events collected while using a remote desktop, or allocate the time out depending on the machine being connected to).
* Can arbtt aggregate other sources of events?
I guess this is a corollary to the above - if, for example, someone were to write something that rummaged through your Outlook calendar and produced appropriate TimeLogEntry objects for the calendar event ; if I'm in a meeting, I want to book all the time in that meeting to the instigator of that meeting (regardless of what I'm actually doing IN the meeting).
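[Editorial sketch, not part of the original message.] The calendar idea in the last question can be illustrated with a few lines of Python. This is only a rough sketch under stated assumptions: the tuple shapes for samples and meetings, the `book_time` helper and the `Meeting:` tag prefix are all hypothetical and are not arbtt's log format or API. It shows nothing more than the rule "a sample taken during a meeting is booked to the meeting's instigator, whatever the active window was".

```python
from datetime import datetime

# Hypothetical data shapes, for illustration only (not arbtt's log format):
#   sample  = (timestamp, tag derived from the active window)
#   meeting = (start, end, instigator)

def book_time(samples, meetings):
    """Return (timestamp, tag) pairs in which any sample that falls inside
    a meeting is booked to that meeting's instigator."""
    booked = []
    for when, tag in samples:
        for start, end, instigator in meetings:
            if start <= when < end:
                # The meeting overrides whatever the window-based tag was.
                tag = "Meeting:" + instigator
                break
        booked.append((when, tag))
    return booked

meetings = [(datetime(2014, 7, 3, 12, 30), datetime(2014, 7, 3, 13, 0), "alice")]
samples = [
    (datetime(2014, 7, 3, 12, 29), "Project:Other"),  # just before the meeting
    (datetime(2014, 7, 3, 12, 32), "Project:Other"),  # during the meeting
]
result = book_time(samples, meetings)
```

With these made-up values the first sample keeps its window-based tag and the second one is rebooked to the meeting; how such data would actually reach arbtt is exactly the open question asked above.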
From mail at joachim-breitner.de Thu Jul 3 12:40:30 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Thu, 03 Jul 2014 12:40:30 +0200 Subject: Hello and questions.. In-Reply-To: <53B52FF3.5040308@gmail.com> References: <53B52FF3.5040308@gmail.com> Message-ID: <1404384030.19001.9.camel@kirk> Hi Adrian, On Thursday, 03.07.2014, 11:26 +0100, Adrian Wilkins wrote: > * Can arbtt aggregate multiple event logs? > > The reason I ask is that my typical working day is conducted across two > machines - my work-issued Windows laptop, and my personal Linux > installation. The vast bulk of the work takes place on the Linux > machine, so I imagine it would be a fair representation of my work time > in the main, but I do sometimes have to switch to the Windows machine > (for emails, for example). in principle, yes. Most reports (everything but the interval report, IIRC) work correctly if you have samples in the wrong order. So you can use "arbtt-dump --format=Show" to export your various binary logs, concatenate them and load them back into a new file with "arbtt-import". Eventually we can consider supporting passing --logfile multiple times to arbtt-stats. > * Can arbtt sample mouse position? > > I switch between the machines via two methods ; one is by using remote > desktop, the other by Synergy (a network mouse/keyboard sharing program) > ; I can imagine there might be a place for logging the mouse position at > time of sampling for this reason - ordinarily it would be useless > information but when using Synergy you could write rules based on > whether the mouse was actually on the screen of the machine that is > logging. (can already write rules to ignore all events collected while > using a remote desktop, or allocate the time out depending on the > machine being connected to). Currently, arbtt does nothing about the mouse position. With Synergy, what X window is under the mouse when it appears to you as if you are working on the other machine?
I would expect that you can somehow tell the two situations apart by looking at the active window, but I don't know the details of Synergy. > * Can arbtt aggregate other sources of events? > > I guess this is a corollary to the above - if, for example, someone were > to write something that rummaged through your Outlook calendar and > produced appropriate TimeLogEntry objects for the calendar event ; if > I'm in a meeting, I want to book all the time in that meeting to the > instigator of that meeting (regardless of what I'm actually doing IN the > meeting). That is not supported out of the box. But you can easily emulate that using a small program that just gives you a text input box (or a drop down or whatever) and puts the information into its title. Then you can match on that program's title. Greetings, Joachim -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org From adrian.wilkins at gmail.com Thu Jul 3 12:51:04 2014 From: adrian.wilkins at gmail.com (Adrian Wilkins) Date: Thu, 03 Jul 2014 11:51:04 +0100 Subject: Windows build of ARBTT Message-ID: <53B53598.1000402@gmail.com> More questions from the noob ...
* Is there a technical reason why the Windows build of ARBTT has lagged behind the Linux build so much, or is it just inertia about building it?
From my POV I'll process the stats on my Linux machine, so if arbtt-capture still has the same capability and format, I don't mind so much.. but...
* I tried to start up the current 0.61 version and got
OpenProcess: permission denied
This seems to be a problem with requesting the PROCESS_QUERY_INFORMATION permission.
PROCESS_QUERY_LIMITED_INFORMATION might be more appropriate (but XP has no such permission level). Also our corporate image has some unpleasant corporate malware on it, and I'm not making any effort to start arbtt-capture with elevated rights. I managed to collect window titles from the current focussed window by using hooks without this kind of problem and without elevating, so maybe alternate approaches / Win32 API calls might be worth looking at. From adrian.wilkins at gmail.com Thu Jul 3 13:06:29 2014 From: adrian.wilkins at gmail.com (Adrian Wilkins) Date: Thu, 03 Jul 2014 12:06:29 +0100 Subject: Hello and questions.. In-Reply-To: <1404384030.19001.9.camel@kirk> References: <53B52FF3.5040308@gmail.com> <1404384030.19001.9.camel@kirk> Message-ID: <53B53935.8050605@gmail.com> On 03/07/14 11:40, Joachim Breitner wrote: > Hi Adrian, > > On Thursday, 03.07.2014, 11:26 +0100, Adrian Wilkins wrote: >> * Can arbtt aggregate multiple event logs? > in principle, yes. Most reports (everything but the interval report, > IIRC) work correctly if you have samples in the wrong order. So you can > use "arbtt-dump --format=Show" to export your various binary logs, > concatenate them and load them back into a new file with > "arbtt-import". > On looking I'm thinking it might be appropriate to merge the set of CaptureData for a given sample interval together (assuming that the two machines are operating on the same clock time, you could work out which TimeLogEntry items go together even with slight clock drift, and merge their CaptureData sets, or is this overthinking things?) > Eventually we can consider supporting passing --logfile multiple times > to arbtt-stats. > >> * Can arbtt sample mouse position? > With Synergy, > what X window is under the mouse when it appears to you as if you are > working on the other machine? I would expect that you can somehow tell > the two situations apart by looking at the active window, but I don't > know the details of Synergy.
> Next time I'm in the office and using it I'll have a look at the logs. >> * Can arbtt aggregate other sources of events? > That is not supported out of the box. But you can easily emulate that > using a small program that just gives you a text input box (or a drop > down or whatever) and puts the information into its title. Then you can > match on that program's title. > That would be a neat workaround ; just open a notepad with a file titled "MeetingFor.txt" (for notepads that show the filename in the title) :-) I'm envisaging a program that goes through your calendar and adds entries for each minute of the meeting along with its title / requester / etc ; in keeping with the spirit of no interruptions and not having to manually remember to do things. If you could aggregate logs, programs that served as alternate event sources would just naturally feed into that feature. My Haskell is virtually non-existent, but then my time-logging blues are intense, so my motivation to try and help is strong :-) From mail at joachim-breitner.de Thu Jul 3 13:26:52 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Thu, 03 Jul 2014 13:26:52 +0200 Subject: Hello and questions.. In-Reply-To: <53B53935.8050605@gmail.com> References: <53B52FF3.5040308@gmail.com> <1404384030.19001.9.camel@kirk> <53B53935.8050605@gmail.com> Message-ID: <1404386812.19001.13.camel@kirk> Hi, On Thursday, 03.07.2014, 12:06 +0100, Adrian Wilkins wrote: > On 03/07/14 11:40, Joachim Breitner wrote: > > Hi Adrian, > > > > On Thursday, 03.07.2014, 11:26 +0100, Adrian Wilkins wrote: > >> * Can arbtt aggregate multiple event logs? > > > in principle, yes. Most reports (everything but the interval report, > > IIRC) work correctly if you have samples in the wrong order. So you can > > use "arbtt-dump --format=Show" to export your various binary logs, > > concatenate them and load them back into a new file with > > "arbtt-import".
> > > > On looking I'm thinking it might be appropriate to merge the set of > CaptureData for a given sample interval together (assuming that the two > machines are operating on the same clock time, you could work out which > TimeLogEntry items go together even with slight clock drift, and merge > their CaptureData sets, or is this overthinking things?) ah, I see. You are working on two machines simultaneously. That's currently not supported and would be non-trivial. > >> * Can arbtt aggregate other sources of events? > > > That is not supported out of the box. But you can easily emulate that > > using a small program that just gives you a text input box (or a drop > > down or whatever) and puts the information into its title. Then you can > > match on that program's title. > > > > That would be a neat workaround ; just open a notepad with a file titled > "MeetingFor.txt" (for notepads that show the filename in the > title) :-) > > I'm envisaging a program that goes through your calendar and adds > entries for each minute of the meeting along with its title / requester > / etc ; in keeping with the spirit of no interruptions and not having to > manually remember to do things. Right. And this program can be technically completely independent from arbtt, in the spirit of Unix and its small dedicated tools. > If you could aggregate logs, programs > that served as alternate event sources would just naturally feed into > that feature. Sorry, I can't follow. How is that related to aggregating logs? > My Haskell is virtually non-existent, but then my time-logging blues are > intense, so my motivation to try and help is strong :-) Great :-). And a few things you can do without Haskell, anyways. Greetings, Joachim -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de •
GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org From mail at joachim-breitner.de Thu Jul 3 13:29:43 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Thu, 03 Jul 2014 13:29:43 +0200 Subject: Windows build of ARBTT In-Reply-To: <53B53598.1000402@gmail.com> References: <53B53598.1000402@gmail.com> Message-ID: <1404386983.19001.16.camel@kirk> Hi, On Thursday, 03.07.2014, 11:51 +0100, Adrian Wilkins wrote: > More questions from the noob ... > > * Is there a technical reason why the Windows build of ARBTT has lagged > behind the Linux build so much, or is it just inertia about building it? Inertia. I'm not aware of any Windows users... (and for building it I need to install Haskell and mingw and the installer and stuff under wine, which is quite annoying). I would be very happy if someone would step up and maintain the Windows version of it, given that I don't run Windows and can't properly test this. > From my POV I'll process the stats on my Linux machine, so if > arbtt-capture still has the same capability and format, I don't mind so > much.. but... > > * I tried to start up the current 0.61 version and got > > OpenProcess: permission denied > > This seems to be a problem with requesting the PROCESS_QUERY_INFORMATION > permission. PROCESS_QUERY_LIMITED_INFORMATION might be more appropriate > (but XP has no such permission level). > > Also our corporate image has some unpleasant corporate malware on it, > and I'm not making any effort to start arbtt-capture with elevated > rights. I managed to collect window titles from the current focussed > window by using hooks without this kind of problem and without > elevating, so maybe alternate approaches / Win32 API calls might be > worth looking at. Hmm.
I guess this needs someone to look into it, but that won't be me... it can be you, though :-) Greetings, Joachim -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org From adrian.wilkins at gmail.com Thu Jul 3 13:48:50 2014 From: adrian.wilkins at gmail.com (Adrian Wilkins) Date: Thu, 03 Jul 2014 12:48:50 +0100 Subject: Hello and questions.. In-Reply-To: <1404386812.19001.13.camel@kirk> References: <53B52FF3.5040308@gmail.com> <1404384030.19001.9.camel@kirk> <53B53935.8050605@gmail.com> <1404386812.19001.13.camel@kirk> Message-ID: <53B54322.7060009@gmail.com> On 03/07/14 12:26, Joachim Breitner wrote: > Hi, > > >> I'm envisaging a program that goes through your calendar and adds >> entries > > Right. And this program can be technically completely independent from > arbtt, in the spirit of Unix and its small dedicated tools. > Indeed. This puts me in mind of an alternate approach - a program (or more than one) that arbtt-capture interrogates for event data ; so you could have arbtt-meeting-daemon etc, and arbtt-capture could add their information to its TimeEventLog entries. But I still like the idea of being able to work with multiple logs, if only because I could write the meeting log thing at the end of the month and be able to retrospectively include the events in my arbtt-stats run - in much the same way that you can write new rules and get a better quality report from the same logged events. >> If you could aggregate logs, programs >> that served as alternate event sources would just naturally feed into >> that feature. > > Sorry, I can't follow. How is that related to aggregating logs?
> I was thinking that in order for the separate program(s) discussed above to produce useful inputs for arbtt-stats, it would have to aggregate the data in those logs together such that all events for a given sample are considered part of the same TimeLogEntry (when being categorized). e.g. for the hypothetical meeting analyzer program, if it produces a TimeLogEntry for each minute of your meeting, you wouldn't want to consider those separately to the other TimeLogEntry objects logged by arbtt-capture (I take my laptop to meetings and the activity logged would disagree with the meeting category in some cases).
---- meeting-analyzer log
2014-07-03 12:32:00 Meeting:$title="Meeting about tortoises"
---- arbtt-capture log
2014-07-03 12:32:05 Program:$title="Web browser : page about snakes"
Rule sets that considered Meeting to take priority over everything else would still put the second log entry in the "Project:Snakes" category rather than the "Project:Tortoises" category, because that entry has no Meeting. If you rolled those two entries together based on their time being close to each other, then Meeting can override the web browser. I'm presuming here that arbtt-stats analyzes each TimeLogEntry separately. From mail at joachim-breitner.de Thu Jul 3 13:55:51 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Thu, 03 Jul 2014 13:55:51 +0200 Subject: Hello and questions.. In-Reply-To: <53B54322.7060009@gmail.com> References: <53B52FF3.5040308@gmail.com> <1404384030.19001.9.camel@kirk> <53B53935.8050605@gmail.com> <1404386812.19001.13.camel@kirk> <53B54322.7060009@gmail.com> Message-ID: <1404388551.19001.21.camel@kirk> Hi, On Thursday, 03.07.2014, 12:48 +0100, Adrian Wilkins wrote: > >> If you could aggregate logs, programs > >> that served as alternate event sources would just naturally feed into > >> that feature. > > > > Sorry, I can't follow. How is that related to aggregating logs?
> > > > I was thinking that in order for the separate program(s) discussed above > to produce useful inputs for arbtt-stats, it would have to aggregate > the data in those logs together such that all events for a given sample > are considered part of the same TimeLogEntry (when being categorized). > > e.g. for the hypothetical meeting analyzer program, if it produces a > TimeLogEntry for each minute of your meeting, you wouldn't want to > consider those separately to the other TimeLogEntry objects logged by > arbtt-capture (I take my laptop to meetings and the activity logged > would disagree with the meeting category in some cases). > > ---- meeting-analyzer log > 2014-07-03 12:32:00 Meeting:$title="Meeting about tortoises" > ---- arbtt-capture log > 2014-07-03 12:32:05 Program:$title="Web browser : page about snakes" > > Rule sets that considered Meeting to take priority over everything else > would still put the second log entry in the "Project:Snakes" category > rather than the "Project:Tortoises" category, because that entry has no > Meeting. well, if the separate programs (which put additional information into their title) run on the same machine as the browser, arbtt will take the samples together, and there is no need to merge them. But it is of course a valid feature request to intelligently merge log files from two machines so that simultaneous samples appear as one sample. But it is rather specialized and non-trivial (what to do with non-matching sampling rates, e.g.), so it's not on my TODO list yet. Greetings, Joachim -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org
From adrian.wilkins at gmail.com Thu Jul 3 15:05:02 2014 From: adrian.wilkins at gmail.com (Adrian Wilkins) Date: Thu, 03 Jul 2014 14:05:02 +0100 Subject: Hello and questions.. In-Reply-To: <1404388551.19001.21.camel@kirk> References: <53B52FF3.5040308@gmail.com> <1404384030.19001.9.camel@kirk> <53B53935.8050605@gmail.com> <1404386812.19001.13.camel@kirk> <53B54322.7060009@gmail.com> <1404388551.19001.21.camel@kirk> Message-ID: <53B554FE.1060309@gmail.com> On 03/07/14 12:55, Joachim Breitner wrote: > well, if the separate programs (which put additional information into > their title) run on the same machine as the browser, arbtt will take the > samples together, and there is no need to merge them. > Yeah, the idea of making hidden windows works nicely (maybe a little systray applet that collects information and holds a collection of objects that arbtt-capture can see), but I still like the idea of having programs that you don't need to run 100% of the time for things like calendars that are not as dynamic as user activity. > > But it is of course a valid feature request to intelligently merge log > files from two machines so that simultaneous samples appear as one > sample. But it is rather specialized and non-trivial (what to do with > non-matching sampling rates, e.g.), so it's not on my TODO list yet. > I'm guessing that a more limited case where sample rates must match would be the best start there ; I don't like the prospect of matching events with different sample rates up either ; clock drift and different process start times are bad enough. In any case...
getting it to work at all on Windows would seem to be a challenge, which makes my multi-machine ambitions a moot point right now - still getting that permission denied problem, even when I run the application elevated, although I'm not sure what's causing it - as mentioned, we have some pretty heinous corporate malware installed. The alternate approach I tried that didn't involve OpenProcess seems to work without issues just from my little hacky C# application, so I shall see about sorting that out when I have a spare minute. Until then, I'll be tracking my other minutes on my Linux boxes with more accuracy :-) From mail at joachim-breitner.de Sat Jul 5 13:42:34 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Sat, 05 Jul 2014 13:42:34 +0200 Subject: Listing samples which are not matched by any tags? In-Reply-To: References: <1383508486.4978.2.camel@kirk> <1383517885.12868.10.camel@kirk> Message-ID: <1404560554.4066.7.camel@kirk> Dear Gwern, (sorry for the late reply) On Sunday, 29.06.2014, 17:35 -0400, Gwern Branwen wrote: > On Sun, Nov 3, 2013 at 6:36 PM, Gwern Branwen wrote: > > On Sun, Nov 3, 2013 at 5:31 PM, Joachim Breitner > > wrote: > >> But now you surely want to know what these selected samples look like, > >> right? That leads us to the discussion we had on the list with Waldir > >> in June: What should the tool look like that combines the dumping of > >> arbtt-dump with the sample selection of arbtt-stats... I'm unsure about > >> the proper design here. > > > > To me, it seems pretty simple. Keeping the same interface, apply a > > categorize.cfg's set of rules to each sample and then print or not > > based on what tags matched or didn't match. > > Has there been any more thought on this issue? Have you had a look at the features in 0.8? I believe they (partly) address the issue: * arbtt-stats can print the actual samples selected, with --dump-samples.
(http://darcs.nomeata.de/arbtt/doc/users_guide/release-notes.html) > After repairing my logs > & working out how to use the CSV, I wondered how much data I was > missing due to a lack of matching tag. This apparently is reported by > the -i flag. Even after adding some more tagging, this is what I get: > > $ arbtt-stats -i -m 0 -f '$sampleage <100:00' > General Information > =================== > FirstRecord | 2014-06-26 01:33:16.291076 UTC > LastRecord | 2014-06-29 21:28:30.625435 UTC > Number of records | 7485 > Total time recorded | 3d19h31m00s > Total time selected | 1d12h41m10s > Fraction of total time recorded | 100% > Fraction of total time selected | 40% > Fraction of recorded time selected | 40% > > Given the existence of the flag '--also-inactive include > samples with the tag "inactive"', I infer all this recorded time > reported is active time. But that means fully *60%* of my activity is > not being classified in any way! That's a heck of a lot of lost data. I believe you understood the flag the wrong way around: Without --also-inactive, inactive times are _not_ counted as selected. So the 40% in your report should go up when you use "--also-inactive". Also, your --filter in the above command will cause everything that is older than 100h (if there is any) to be considered as not selected. > And I don't know what the lost data is: I already classified > everything I could think of. What am I missing? I have no way of > knowing unless arbtt will tell me and give me samples of active time > which don't match so I can go 'aha, I need to classify $X/Y/Z as tag > A! Much better.' What if you have a tag "current-program" that will always be present? With such a tag, the feature you describe is useless. I guess you mean “show me the data from samples that are not categorized into one of these tags:....”. But that is already possible:
$ arbtt-stats --dump-samples --filter '$sampleage < 1:00' -x Web -x Project: -x ...
Greetings, Joachim -- Joachim Breitner e-Mail: mail at joachim-breitner.de Homepage: http://www.joachim-breitner.de Jabber-ID: nomeata at joachim-breitner.de From gwern at gwern.net Sat Jul 5 18:38:59 2014 From: gwern at gwern.net (Gwern Branwen) Date: Sat, 5 Jul 2014 12:38:59 -0400 Subject: Listing samples which are not matched by any tags? In-Reply-To: <1404560554.4066.7.camel@kirk> References: <1383508486.4978.2.camel@kirk> <1383517885.12868.10.camel@kirk> <1404560554.4066.7.camel@kirk> Message-ID: On Sat, Jul 5, 2014 at 7:42 AM, Joachim Breitner wrote > Have you had a look at the features in 0.8? I believe they (partly) > address the issue: > > * arbtt-stats can print the actual samples selected, with > --dump-samples. > (http://darcs.nomeata.de/arbtt/doc/users_guide/release-notes.html) No, I didn't even know you had added that. (I assumed you had dropped the issue back in November.) > What if you have a tag "current-program" that will always be present? > With such a tag, the feature you describe is useless. I guess you mean > “show me the data from samples that are not categorized into one of > these tags:....”. But that is already possible: > > $ arbtt-stats --dump-samples --filter '$sampleage < 1:00' -x Web -x Project: -x ... I expected you to object to that and I was going to point out that it could easily be solved on the UX level by letting the user specify either whitelists or blacklists of tags (either tags to not consider as a match or tags to exclude). But I see that's how you solved the problem anyway. Trying that out now, it seems to work! If I throw in an "| fgrep '(*)'" to look at the active window only, it looks even better.
I can see a lot of programs I've failed to classify (eg when Google Reader shut down, I switched to a local RSS reader, Liferea, but forgot to add it to arbtt), and I've spotted a few instances where my rule didn't actually work. (eg I had been matching the program 'FBReader' for tagging time spent reading ebooks, but now that I look at the samples, it seems the program is actually 'fbreader' and it's the *title* which has 'FBReader' in it. I don't know if FBReader changed its X properties at some point or if I simply confused program with title when I was looking at it in `xprop`, but either way, it's not working.) This seems like a critical tool for debugging one's arbtt rules and expanding them. Has a discussion of this been added to the manual? I only see a mention that the option exists in `docs/arbtt.xml`. -- gwern http://www.gwern.net From mail at joachim-breitner.de Sat Jul 5 18:44:39 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Sat, 05 Jul 2014 18:44:39 +0200 Subject: Listing samples which are not matched by any tags? In-Reply-To: References: <1383508486.4978.2.camel@kirk> <1383517885.12868.10.camel@kirk> <1404560554.4066.7.camel@kirk> Message-ID: <1404578679.5548.2.camel@kirk> Hi, Am Samstag, den 05.07.2014, 12:38 -0400 schrieb Gwern Branwen: > Trying that out now, it seems to work! If I throw in an "| fgrep > '(*)'" to look at the active window only, it looks even better. > > I can see a lot of programs I've failed to classify (eg when Google > Reader shut down, I switched to a local RSS reader, Liferea, but > forgot to add it to arbtt), and I've spotted a few instances where my > rule didn't actually work. (eg I had been matching the program > 'FBReader' for tagging time spent reading ebooks, but now that I look > at the samples, it seems the program is actually 'fbreader' and it's > the *title* which has 'FBReader' in it. 
> I don't know if FBReader
> changed its X properties at some point or if I simply confused program
> with title when I was looking at it in `xprop`, but either way, it's
> not working.)

glad to hear it works (and it clearly demonstrates the usefulness of the
a posteriori approach that we take here).

> This seems like a critical tool for debugging one's arbtt rules and
> expanding them. Has a discussion of this been added to the manual? I
> only see a mention that the option exists in `docs/arbtt.xml`.

No, but the manual is quite short on “how do I do X”. Would you be
interested in contributing here? I think that _not_ being a developer is
an advantage when writing good documentation.

Greetings,
Joachim

-- 
Joachim “nomeata” Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org

From gwern at gwern.net Sat Aug 16 23:02:18 2014
From: gwern at gwern.net (Gwern Branwen)
Date: Sat, 16 Aug 2014 17:02:18 -0400
Subject: Example of time-tracking
Message-ID:

http://lightonphiri.org/blog/quantified-self-the-three-year-long-time-tracking-experiment

> Between June 2011 and June 2013 I diligently tracked the way that I used my 24-hour cycles. I initiated this painstaking task after going through a long spell of productivity draught; I was obsessed with how exactly I spent my “Work -> Eat -> Sleep” cycles. To achieve this, I used Hamster Time Tracker [1]. I recently decided to dig into this wealth of information and have spent the last couple of days analyzing it. While the data I collected is not 100% accurate, visible patterns emerge.
>
> ...
> - I discovered that I slept an average of 5.1 hours per day (see table below)
> - Day-to-day tasks account for an average of 36.14%, implying that I had 63.86% of productivity time at my disposal
> - I work more and often in the morning (segmenting results into day slots -- dawn, wee hours, morning, mid-morning, afternoon, evening -- yielded even more interesting results...)
> ...
> - I was able to figure out when I am most productive and was thus able to plan my waking life accordingly
> - I knew where most productivity leaks were coming from (social networking sites for instance) and was able to cut down on those activities when I needed to reclaim time
> - I was able to identify tasks that I could easily perform when I was in “Zombie” mode (e.g. current affairs)
> - Perhaps the most prized outcome was figuring out when I was most productive -- I wrote more in the morning, I read more in the morning and did most of my coding late at night

He used Project Hamster http://projecthamster.wordpress.com/about/ which is a manual self-tracking program:

> Whenever you change from doing one task to other, you change your current activity in Hamster. After a while you can see how many hours you have spent on what. Maybe print it out, or export to some suitable format, if time reporting is a request of your employee.

So capable of more semantics than arbtt, but also a lot less fine-grained and more work.

-- 
gwern
http://www.gwern.net

From mail at joachim-breitner.de Sat Aug 16 23:10:18 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Sat, 16 Aug 2014 23:10:18 +0200
Subject: Example of time-tracking
In-Reply-To:
References:
Message-ID: <1408223418.1972.2.camel@joachim-breitner.de>

Hi,

thanks for the link!
Greetings,
Joachim

-- 
Joachim “nomeata” Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org

From gwern at gwern.net Sun Aug 17 00:20:51 2014
From: gwern at gwern.net (Gwern Branwen)
Date: Sat, 16 Aug 2014 18:20:51 -0400
Subject: Example of time-tracking
In-Reply-To:
References:
Message-ID:

On Sat, Aug 16, 2014 at 5:02 PM, Gwern Branwen wrote:
>> - I work more and often in the morning (segmenting results into day slots -- dawn, wee hours, morning, mid-morning, afternoon, evening -- yielded even more interesting results...)

I thought I'd try to extract my own 24-h data from arbtt and see how my
factor-analysis fared, but it looks like this can't be done with arbtt
right now: I need categorization of data usage by minute or by hour,
but it seems arbtt only supports extracting data to csv by chunks of
day/month/year

    --for-each=PERIOD   one of: day, month, year

Is there a workaround here? Or does --for-each need to be extended? I
think it would be enough to add 'minute' as a period, since arbtt isn't
generally used more fine-grained than that and I can aggregate by hour
in R if it turns out that there's not enough data for
plotting/regressing by minute over 24h.

-- 
gwern
http://www.gwern.net

From gwern0 at gmail.com Wed Sep 3 21:30:19 2014
From: gwern0 at gmail.com (gwern0 at gmail.com)
Date: Wed, 03 Sep 2014 12:30:19 -0700 (PDT)
Subject: darcs patch: arbtt.xml: fix duplicate ID in release n...
(and 3 more) Message-ID: <54076c4b.4533e00a.0a6b.ffffe9e5@mx.google.com> 4 patches for repository http://darcs.nomeata.de/arbtt: Wed Sep 3 15:15:04 EDT 2014 gwern at gwern.net * arbtt.xml: fix duplicate ID in release notes Wed Sep 3 15:15:35 EDT 2014 gwern at gwern.net * arbtt.xml: delete trailing whitespace Wed Sep 3 15:15:46 EDT 2014 gwern at gwern.net * arbtt.xml: add self to doc authors, add a list of similar projects to intro to describe arbtt better Wed Sep 3 15:16:40 EDT 2014 gwern at gwern.net * example categorize.cfg: add local variable to set emacs to haskell-mode by adding the metadata, emacs users copying the example config get appropriate syntax highlighting without additional work, and other users are reminded that their favorite editor's haskell mode would work well in displaying arbtt configs -------------- next part -------------- A non-text attachment was scrubbed... Name: patch-preview.txt Type: text/x-darcs-patch Size: 10770 bytes Desc: Patch preview URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: arbtt_xml_-fix-duplicate-id-in-release-notes.dpatch Type: application/x-darcs-patch Size: 11376 bytes Desc: A darcs patch for your repository! URL: From mail at joachim-breitner.de Wed Sep 3 21:59:26 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Wed, 03 Sep 2014 21:59:26 +0200 Subject: darcs patch: arbtt.xml: fix duplicate ID in release n... (and 3 more) In-Reply-To: <54076c4b.4533e00a.0a6b.ffffe9e5@mx.google.com> References: <54076c4b.4533e00a.0a6b.ffffe9e5@mx.google.com> Message-ID: <1409774366.1805.4.camel@joachim-breitner.de> Hi, Am Mittwoch, den 03.09.2014, 12:30 -0700 schrieb gwern0 at gmail.com: > 4 patches for repository http://darcs.nomeata.de/arbtt: thanks! Applied. Greetings, Joachim -- Joachim ?nomeata? Breitner mail at joachim-breitner.de ? http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de ? 
GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org

From gwern at gwern.net Wed Sep 3 21:32:09 2014
From: gwern at gwern.net (Gwern Branwen)
Date: Wed, 3 Sep 2014 15:32:09 -0400
Subject: Listing samples which are not matched by any tags?
In-Reply-To: <1404578679.5548.2.camel@kirk>
References: <1383508486.4978.2.camel@kirk> <1383517885.12868.10.camel@kirk> <1404560554.4066.7.camel@kirk> <1404578679.5548.2.camel@kirk>
Message-ID:

On Sat, Jul 5, 2014 at 12:44 PM, Joachim Breitner wrote:
> No, but the manual is quite short on “how do I do X”. Would you be
> interested in contributing here? I think that _not_ being a developer
> is an advantage when writing good documentation.

Maybe. I may not be an arbtt developer, but I'm still not a regular
user. Regardless, I think some of the tricks and observations I made
while working with arbtt are worth including in the manual; the manual
currently gives one little idea how one would actually go about
effectively using arbtt. I wrote up some thoughts in Markdown (below) -
I have never used that XML stuff you are using and would probably muck
it up, so hopefully you can convert the Markdown version to XML.

If other arbtt users could mention various roadblocks and solutions
they came up with, that'd be helpful.

----

The idea is that this would be placed after 'Configuring the arbtt categorizer (arbtt-stats)' http://arbtt.nomeata.de/doc/users_guide/configuration.html#idp20408 -

# Effective Use of Arbtt

Now that the syntax has been described & the toolbox laid out, how does one practically go about using and configuring arbtt?

## Enabling data collection

After installing arbtt, one needs to configure it to run.
There are many ways one can run the `arbtt-capture` daemon, but a standard way on Unix systems would be to add it as a [`cron`](https://en.wikipedia.org/wiki/Cron) job: for example, one could edit one's crontab file (`crontab -e`) and add lines like these:

    DISPLAY=:0
    @reboot arbtt-capture --logfile=/home/username/doc/arbtt/capture.log

At boot, `arbtt-capture` will be run in the background and will capture a snapshot of the X metadata for active windows every 60 seconds (the default). If one wanted more fine-grained time data at the expense of doubling storage use per day, one could increase the sampling rate with an option like `--sample-rate=30`. To be resilient to any errors or segfaults, one could also wrap it in an infinite loop to restart the daemon should it ever crash, with a crontab like

    DISPLAY=:0
    @reboot while true; do arbtt-capture --sample-rate=30; sleep 1m; done

## Checking data availability

arbtt tracks X properties like window title, class, and running program, and one writes regexp rules to classify the strings as one wishes; but this assumes that the necessary data is present in those properties. For some programs, this is the case. For example, web browsers like Firefox typically set the X title to the `<title>` of the web page in the currently-focused tab, which is enough for classification. Some programs do not set titles or class, and all arbtt sees is empty strings like ""; or they may set the title/class to a constant like "Liferea", which may be acceptable if that program is used for only one purpose, but if it is used for many purposes, then one cannot write a rule matching it without producing highly-misleading time analyses. (For example, a web browser may be used for countless purposes, ranging from work to research to music to writing to programming; but if the web browser's title/class were always just "Web browser", how would one classify 5 hours spent using the web browser?
If the 5 hours are classified as any or all of those purposes, then the results will be misleading garbage - one probably didn't spend 5 hours just listening to music, but a mixture of those purposes, which changes from day to day.)

One should check for such problematic programs when starting to use arbtt. It would be unfortunate if one were to log for a few months, go back for a detailed report for some reason, and discover that the necessary data was never actually available for arbtt to log! Such programs can sometimes be customized internally; failing that, one can file a bug report with the maintainers, or set their titles externally with [`wmctrl`](https://en.wikipedia.org/wiki/Wmctrl) or [`xprop`](http://jonisalonen.com/2014/setting-x11-window-properties-with-xprop/).

### `xprop`

One can check the X properties of a running window by running the command [`xprop`](http://www.xfree86.org/current/xprop.1.html) and clicking on the window; `xprop` will print out all the relevant X information. For example, the output for Emacs might look like this:

    $ xprop | tail -5
    WM_CLASS(STRING) = "emacs", "Emacs"
    WM_ICON_NAME(STRING) = "emacs at elan"
    _NET_WM_ICON_NAME(UTF8_STRING) = "emacs at elan"
    WM_NAME(STRING) = "emacs at elan"
    _NET_WM_NAME(UTF8_STRING) = "emacs at elan"

This is not very helpful: it does not tell us the filename being edited, the mode being used, or anything. One could classify time spent in Emacs as "programming" or "writing", but this would be imperfect, especially if one does both activities regularly.
However, Emacs can be customized by editing `~/.emacs`, and after some searching with queries like "setting Emacs window title", the [Emacs wiki](http://www.emacswiki.org/emacs-en/FrameTitle) & [manual](https://www.gnu.org/software/emacs/manual/html_node/efaq/Displaying-the-current-file-name-in-the-titlebar.html) advise us to put something like this Elisp in our `.emacs` file:

    (setq frame-title-format "%f")

Now the output looks different:

    $ xprop | tail -5
    WM_CLASS(STRING) = "emacs", "Emacs"
    WM_ICON_NAME(STRING) = "/home/gwern/arbtt.page"
    _NET_WM_ICON_NAME(UTF8_STRING) = "/home/gwern/arbtt.page"
    WM_NAME(STRING) = "/home/gwern/arbtt.page"
    _NET_WM_NAME(UTF8_STRING) = "/home/gwern/arbtt.page"

With this, we can usefully classify all such time samples as being "writing".

Another common gap is terminals/shells: they often do not include information in the title like the current working directory or last shell command. For example, urxvt/Bash:

    WM_COMMAND(STRING) = { "urxvt" }
    _NET_WM_ICON_NAME(UTF8_STRING) = "urxvt"
    WM_ICON_NAME(STRING) = "urxvt"
    _NET_WM_NAME(UTF8_STRING) = "urxvt"
    WM_NAME(STRING) = "urxvt"

Programmers may spend many hours in the shell doing a variety of things (like Emacs), so this is a problem. Fortunately, this is also solvable by customizing one's `.bashrc` to install a `DEBUG` trap, which runs before each command and emits an escape code interpreted by the terminal (baroque, but it works). The following will include the working directory, the last command, and, if `HISTTIMEFORMAT` is set, a timestamp:

    trap 'echo -ne "\033]2;$(pwd); $(history 1 | sed "s/^[ ]*[0-9]*[ ]*//g")\007"' DEBUG

Now the urxvt samples are useful:

    _NET_WM_NAME(UTF8_STRING) = "/home/gwern/wiki; 2014-09-03 13:39:32 arbtt-stats --help"

A rule could classify based on the directory one is working in, the command one ran, or both. Other shells like zsh can be fixed this way too but the exact command may differ; you will need to research & experiment.

Some programs can be tricky to set.
The [X image viewer feh](http://feh.finalrewind.org/) has a `--title` option but it cannot be set in the configuration file, `.config/feh/themes`, because it needs to be specified dynamically; so one needs to set up a shell alias or script to wrap the command like `feh --title "$(pwd) / %f / %n"`.

### Raw samples

`xprop` can be tedious to use on every running window and one may not think to check rarer programs. A better approach is to use `arbtt-stats`'s `--dump-samples` option: this option will print out the collected data for specified time periods, allowing one to examine the X properties en masse. It can also be combined with the `-x`/`--exclude=` options to print only the *samples not matched by existing rules*, which is indispensable for improving coverage and suggesting ideas for new rules.

A good way to figure out what customizations to make is to run arbtt as a daemon for a day or so, and then begin examining the raw samples for problems. An example: suppose I create a simple category file named `foo` with just the line

    $idle > 30 ==> tag inactive

I can then dump all my arbtt samples for the past day with a command like this:

    arbtt-stats --categorizefile=foo --m=0 --filter='$sampleage <24:00' --dump-samples

Because there are so many open windows, this produces a large amount (26586 lines) of hard-to-read output:

    ...
    ( ) Navigator: /r/Touhou's Favorite Arranges! Part 71: Retribution for the Eternal Night ~ Imperishable Night : touhou - Iceweasel
    ( ) Navigator: Configuring the arbtt categorizer (arbtt-stats) - Iceweasel
    ( ) evince: ATTACHMENT02
    ( ) evince: 2009-geisler.pdf ? Heart rate variability predicts self-control in goal pursuit
    ( ) urxvt: /home/gwern; arbtt-stats --categorizefile=foo --m=0 --filter='$sampleage <24:00' --dump-samples
    ( ) mnemosyne: Mnemosyne
    ( ) urxvt: /home/gwern; 2014-09-03 13:11:45 xprop
    ( ) urxvt: /home/gwern; 2014-09-03 13:42:17 history 1 | cut --delimiter=' ' --fields=5-
    ( ) urxvt: /home/gwern; 2014-09-03 13:12:21 git log -p .emacs
    (*) emacs: emacs at elan
    ( ) urxvt: /home/gwern; 2014-09-01 14:50:30 while true; do cd ~/ && getmail_fetch --ssl pop.gmail.com gwern0 'ugaozoumbhwcijxb' ./mail/; done
    ( ) urxvt: /home/gwern/blackmarket-mirrors/silkroad2-forums; 2014-08-31 23:20:10 mv /home/gwern/cookies.txt ./; http_proxy="localhost:8118" wget...
    ( ) urxvt: /home/gwern/blackmarket-mirrors/agora; 2014-08-31 23:15:50 mv /home/gwern/cookies.txt ./; http_proxy="localhost:8118" wget --mirror ...
    ( ) urxvt: /home/gwern/blackmarket-mirrors/evolution-forums; 2014-08-31 23:04:10 mv ~/cookies.txt ./; http_proxy="localhost:8118" wget --mirror ...
    ( ) puddletag: puddletag: /home/gwern/music

Active windows are denoted by an asterisk, so I can focus & simplify by adding a pipe like `| fgrep '(*)'`, producing more manageable output like

    (*) urxvt: irssi
    (*) urxvt: irssi
    (*) urxvt: irssi
    (*) Navigator: Pyramid of Technology - NextNature.net - Iceweasel
    (*) Navigator: Search results - gwern0 at gmail.com - Gmail - Iceweasel
    (*) Navigator: [New comment] The Wrong Path - gwern0 at gmail.com - Gmail - Iceweasel
    (*) Navigator: Iceweasel
    (*) Navigator: Litecoin Exchange Rate - $4.83 USD - litecoinexchangerate.org - Iceweasel
    (*) Navigator: PredictionBook: LiteCoin will trade at >=10 USD per ltc in 2 years, - Iceweasel
    (*) urxvt: irssi
    (*) Navigator: Bug#691547 closed by Mikhail Gusarov <dottedmag at dottedmag.net> (Re: s3cmd: Man page: --default-mime-type documentation incomplete...)
    (*) Navigator: Bug#691547 closed by Mikhail Gusarov <dottedmag at dottedmag.net> (Re: s3cmd: Man page: --default-mime-type documentation incomplete...)
    (*) Navigator: Bug#691547 closed by Mikhail Gusarov <dottedmag at dottedmag.net> (Re: s3cmd: Man page: --default-mime-type documentation incomplete...)
    (*) urxvt: /home/gwern; 2014-09-02 14:25:17 man s3cmd
    (*) evince: bayesiancausality.pdf
    (*) evince: bayesiancausality.pdf
    (*) puddletag: puddletag: /home/gwern/music
    (*) puddletag: puddletag: /home/gwern/music
    (*) evince: bayesiancausality.pdf
    (*) Navigator: ? Umineko no Naku Koro ni Music Box 4 - ?????? ?2?? ??? - YouTube - Iceweasel
    ...

This is better. We can see a few things: the windows all now produce enough information to be usefully classified (Gmail can be classified under email, irssi can be classified as IRC, the urxvt usage can clearly be classified as programming, the PDF being read is statistics, etc) in part because of customizations to bash/urxvt. The duplication still impedes focus, and we don't know what's most common. We can use another pipeline to sort, count duplicates, and sort by number of duplicates (`| sort | uniq --count | sort --general-numeric-sort`), yielding:

    ...
    14 (*) Navigator: A Bluer Shade of White Chapter 4, a frozen fanfic | FanFiction - Iceweasel
    14 (*) Navigator: Iceweasel
    15 (*) evince: 2009-geisler.pdf ? Heart rate variability predicts self-control in goal pursuit
    15 (*) Navigator: Tool use by animals - Wikipedia, the free encyclopedia - Iceweasel
    16 (*) Navigator: Hacker News | Add Comment - Iceweasel
    17 (*) evince: bayesiancausality.pdf
    17 (*) Navigator: Comments - Less Wrong Discussion - Iceweasel
    17 (*) Navigator: Keith Gessen ? Why not kill them all?: In Donetsk ? LRB 11 September 2014 - Iceweasel
    17 (*) Navigator: Notes on the Celebrity Data Theft | Hacker News - Iceweasel
    18 (*) Navigator: A Bluer Shade of White Chapter 1, a frozen fanfic | FanFiction - Iceweasel
    19 (*) gl: mplayer2
    19 (*) Navigator: Neural networks and deep learning - Iceweasel
    20 (*) Navigator: Harry Potter and the Philosopher's Zombie, a harry potter fanfic | FanFiction - Iceweasel
    20 (*) Navigator: [OBNYC] Time tracking app - gwern0 at gmail.com - Gmail - Iceweasel
    25 (*) evince: ps2007.pdf ? untitled
    35 (*) emacs: /home/gwern/arbtt.page
    43 (*) Navigator: CCC comments on The Octopus, the Dolphin and Us: a Great Filter tale - Less Wrong - Iceweasel
    62 (*) evince: The physics of information processing superobjects - Anders Sandberg - 1999.pdf ? Brains2
    69 (*) liferea: Liferea
    82 (*) evince: BMS_raftery.pdf ? untitled
    84 (*) emacs: emacs at elan
    87 (*) Navigator: overview for gwern - Iceweasel
    109 (*) puddletag: puddletag: /home/gwern/music
    150 (*) urxvt: irssi

Put this way, we can see what rules we should write to categorize: we could sort the activities here into a few categories of "recreational", "statistics", "music", "email", "IRC", "research", & "writing"; and add to `categorize.cfg` some rules like these:

    $idle > 30 ==> tag inactive,
    current window $title =~ [/.*Hacker News.*/, /.*Less Wrong.*/, /.*overview for gwern.*/, /.*[fF]an[fF]ic.*/, /.* LRB .*/]
      || current window $program == "liferea" ==> tag Recreation,
    current window $title =~ [/.*puddletag.*/, /.*mplayer2.*/] ==> tag Music,
    current window $title =~ [/.*[bB]ayesian.*/, /.*[nN]eural [nN]etworks.*/, /.*ps2007.pdf.*/, /.*[Rr]aftery.*/] ==> tag Statistics,
    current window $title =~ [/.*Wikipedia.*/, /.*Heart rate variability.*/, /.*Anders Sandberg.*/] ==> tag Research,
    current window $title =~ [/.*Gmail.*/] ==> tag Email,
    current window $title =~ [/.*arbtt.*/] ==> tag Writing,
    current window $title == "irssi" ==> tag IRC,

If we reran the command, we'd see the same output, so we need
to leverage our new rules and *exclude* any samples matching our current tags, so now we run a command like:

    arbtt-stats --categorizefile=foo --filter='$sampleage <24:00' --dump-samples --exclude=Recreation --exclude=Music --exclude=Statistics --exclude=Research --exclude=Email --exclude=Writing --exclude=IRC | fgrep '(*)' | sort | uniq --count | sort --general-numeric-sort

Now the previous samples disappear, leaving us with a fresh batch of unclassified samples to work with:

    9 (*) Navigator: New Web Order > Nik Cubrilovic - - Notes on the Celebrity Data Theft - Iceweasel
    9 ( ) urxvt: /home/gwern; arbtt-stats --categorizefile=foo --filter='$sampleage <24:00' --dump-samples | fgrep '(*)' | less
    10 (*) evince: ATTACHMENT02
    10 (*) Navigator: These Giant Copper Orbs Show Just How Much Metal Comes From a Mine | Design | WIRED - Iceweasel
    12 (*) evince: [Jon_Elster]_Alchemies_of_the_Mind_Rationality_an(BookFi.org).pdf ? Alchemies of the mind
    12 (*) Navigator: Morality Quiz/Test your Morals, Values & Ethics - YourMorals.Org - Iceweasel
    33 ( ) urxvt: /home/gwern; arbtt-stats --categorizefile=foo --filter='$sampleage <24:00' --dump-samples | fgrep '(*)'...

We can add rules categorizing these as 'Recreational', 'Writing', 'Research', 'Recreational', 'Research', 'Writing', and 'Writing' respectively; and we might decide at this point that 'Writing' is starting to become overloaded, so we'll split it into two tags, 'Writing' and 'Programming'. And then after tossing another `--exclude=Programming` into our command, we can repeat the process.

As we refine our rules, we will quickly spot instances where the title/class/program are insufficient to allow accurate classification, and we will figure out the best collection of tags for our particular purposes. A few iterations are enough for most purposes.

## Categorizing advice

When building up rules, a few rules of thumb should be kept in mind:

1. categorize by purpose, not by program

    Categorizing by program leads to misleading time reports.
    Avoid, for example, lumping all web browser time into a single category named 'Internet'; this is more misleading than helpful. Good categories describe an activity or goal, such as 'Work' or 'Recreation', not a tool, like 'Emacs' or 'Vim'.
2. when in doubt, write narrow rules and generalize later

    Regexps are tricky and it can be easy to write rules far broader than one intended. Because the `--exclude` filters hide every sample that already carries a tag, one will never see samples which are matched accidentally. If one is in doubt, it can be helpful to take a specific sample one wants to match and several similar strings and look at how well one's regexp rule works in Emacs's [regexp-builder](http://www.emacswiki.org/emacs/ReBuilder) or online regexp-testers like [regexpal](http://regexpal.com/).
3. don't try to classify everything

    You will never classify 100% of samples because sometimes programs do not include useful X properties & cannot be fixed, you have samples from before you fixed them, or they are too transient (like popups and dialogues) to be worth fixing. It is not necessary to classify 100% of your time, since as long as the most common programs and, say, [80%](https://en.wikipedia.org/wiki/Pareto_principle) of your time is classified, then you have most of the value. It is easy to waste more time tweaking arbtt than one gains from increased accuracy or more finely-grained tags.

## Long-term storage

Each halving of the sampling interval (the `--sample-rate` value) doubles the number of samples taken and hence the storage requirement; sampling intervals below 20s are probably wasteful. But even the default 60s can accumulate into a nontrivial amount of data over a year. A constantly-changing binary file can interact poorly with backup systems and may make arbtt analyses slower; and if one's system occasionally crashes or experiences other problems, the log can be corrupted, with the nuisance of having to run `arbtt-recover`. Thus it may be a good idea to archive one's `capture.log` on an annual basis.
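One way to do such an annual archive is a plain rename, sketched below. This assumes that `arbtt-capture` has been stopped first and that it begins a fresh log when restarted, neither of which is confirmed here; the function name and the paths are placeholders, not part of arbtt.

```shell
# Move the current log aside under a dated name, refusing to clobber
# an existing archive. (Hypothetical helper; not part of arbtt.)
archive_capture_log() {
    log=$1
    archive=$2
    if [ ! -f "$log" ]; then
        echo "no log file at $log" >&2
        return 1
    fi
    if [ -e "$archive" ]; then
        echo "refusing to overwrite $archive" >&2
        return 1
    fi
    mv -- "$log" "$archive"
}

# Example (placeholder paths):
#   archive_capture_log ~/.arbtt/capture.log ~/doc/arbtt/2013-2014.log
```

The refusal to overwrite is deliberate: an archive is the only copy of that year's samples, so a second run with the same name should fail loudly rather than replace it.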
If one needs to query the historical data, the particular log file can be specified as an option like `--logfile=/home/gwern/doc/arbtt/2013-2014.log`

## Advanced queries

arbtt supports CSV export of time by category in various levels of granularity in a 'long' format (multiple rows for each day, with each of the _n_ rows specifying one category's value for that day). These CSV exports can be imported into statistical programs like R or Excel and manipulated as desired.

R users may prefer to have their time data in a 'wide' format (each row is 1 day, with _n_ columns, one for each possible category); this can be done with the `reshape` function, which ships with R by default in the `stats` package. After reading in the CSV, the time-intervals can be converted to counts of seconds and the data to a wide data-frame with R code like the following:

    arbtt <- read.csv("arbtt.csv")
    # convert a duration string ("N s" or "H:M:S") to a number of seconds
    interval <- function(x) {
        if (is.na(x)) return(NA)
        if (grepl(" s", x)) {
            as.integer(sub(" s", "", x))
        } else {
            y <- unlist(strsplit(x, ":"))
            as.integer(y[[1]])*3600 + as.integer(y[[2]])*60 + as.integer(y[[3]])
        }
    }
    arbtt$Time <- sapply(as.character(arbtt$Time), interval)
    # reshape() comes from the default stats package, so no library() call is needed
    arbtt <- reshape(arbtt, v.names="Time", timevar="Tag", idvar="Day", direction="wide")

-----

-- 
gwern
http://www.gwern.net
-------------- next part --------------
A non-text attachment was scrubbed...
Name: arbtt.page
Type: application/octet-stream
Size: 19514 bytes
Desc: not available
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140903/12431bea/attachment.obj>

From gwern at gwern.net Sat Sep 6 04:43:07 2014
From: gwern at gwern.net (Gwern Branwen)
Date: Fri, 5 Sep 2014 22:43:07 -0400
Subject: Listing samples which are not matched by any tags?
In-Reply-To: <CAMwO0gyVmeRYoT3Xr+APx2wPNC_dZ1hjzMjZE1EP+jF21L6mCw@mail.gmail.com> References: <CAMwO0gzuXhnoZOp-1yQE7mrpMar1HJ5tNBwzzmUZqg8dup_YaQ@mail.gmail.com> <1383508486.4978.2.camel@kirk> <CAMwO0gzGJE-PhVccnO2XYTn7-GN0eQ+Qnqp+9OU3knjCxcZE8w@mail.gmail.com> <1383517885.12868.10.camel@kirk> <CAMwO0gyw7W45WvCWabW8LXLhUf1bCwuukrfbx2B05x9SmzdUwQ@mail.gmail.com> <CAMwO0gwVjUfvKJeJdJuAoa3=OtqxeUpiXvKxGvB=foLJcTry7Q@mail.gmail.com> <1404560554.4066.7.camel@kirk> <CAMwO0gwZ69sybqg3keiRJrwqzMYWiWa36J9wJvC3VhORHxNqjg@mail.gmail.com> <1404578679.5548.2.camel@kirk> <CAMwO0gyVmeRYoT3Xr+APx2wPNC_dZ1hjzMjZE1EP+jF21L6mCw@mail.gmail.com> Message-ID: <CAMwO0gx4vc2jkzWpYp5Ayv7DzYWf5JbcKxRAtMvLiurOatorDg@mail.gmail.com> On Wed, Sep 3, 2014 at 3:32 PM, Gwern Branwen <gwern at gwern.net> wrote: > 3. don't try to classify everything > > You will never classify 100% of samples because sometimes programs > do not include useful X properties & cannot be fixed, you have samples > from before you fixed them, or they are too transient (like popups and > dialogues) to be worth fixing. It is not necessary to classify 100% of > your time, since as long as the most common programs and, say, > [80%](https://en.wikipedia.org/wiki/Pareto_principle) of your time is > classified, then you have most of the value. It is easy to waste more > time tweaking arbtt than one gains from increased accuracy or more > finely-grained tags. A fourth guideline just occurred to me. 4. avoid large and microscopic tags If a tag takes up more than a third or so of your time, it is probably too large, masks variation, and can be broken down into more meaningful tags. Conversely, a tag too narrow to show up regularly in reports (because it is below the default 1% filter) may not be helpful because it is usually tiny, and can be combined with the most similar tag to yield more compact and easily interpreted reports. 
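The two thresholds in guideline 4 are easy to check mechanically once one has per-tag percentages. The sketch below assumes a hypothetical two-column input (tag name, percentage of time) rather than any particular arbtt output format, and takes its cutoffs, one third and 1%, from the text above; the function name `flag_tags` is made up for illustration.

```shell
# Flag tags that look too large (> 1/3 of time) or too small (< 1%).
# Input: lines of "<tag> <percent>"; this format is an assumption.
flag_tags() {
    awk '
        $2 > 100 / 3 { print $1 ": too large, consider splitting" }
        $2 < 1       { print $1 ": too small, consider merging" }
    '
}

# Made-up percentages for illustration.
printf '%s\n' \
    'Recreation 45.0' \
    'Writing 20.5' \
    'Email 0.4' |
    flag_tags
```

With the made-up numbers above, `Recreation` is flagged as too large and `Email` as too small, while `Writing` passes silently.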
--
gwern
http://www.gwern.net

From mail at joachim-breitner.de Sun Sep 14 23:12:41 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Sun, 14 Sep 2014 23:12:41 +0200
Subject: Listing samples which are not matched by any tags?
In-Reply-To: <CAMwO0gymLW37NUJmtp5uCavnVJsK83o1B-BFmOgwAqz9Z4r88w@mail.gmail.com>
References: <CAMwO0gzuXhnoZOp-1yQE7mrpMar1HJ5tNBwzzmUZqg8dup_YaQ@mail.gmail.com> <1383508486.4978.2.camel@kirk> <CAMwO0gzGJE-PhVccnO2XYTn7-GN0eQ+Qnqp+9OU3knjCxcZE8w@mail.gmail.com> <1383517885.12868.10.camel@kirk> <CAMwO0gyw7W45WvCWabW8LXLhUf1bCwuukrfbx2B05x9SmzdUwQ@mail.gmail.com> <CAMwO0gwVjUfvKJeJdJuAoa3=OtqxeUpiXvKxGvB=foLJcTry7Q@mail.gmail.com> <1404560554.4066.7.camel@kirk> <CAMwO0gwZ69sybqg3keiRJrwqzMYWiWa36J9wJvC3VhORHxNqjg@mail.gmail.com> <1404578679.5548.2.camel@kirk> <CAMwO0gyVmeRYoT3Xr+APx2wPNC_dZ1hjzMjZE1EP+jF21L6mCw@mail.gmail.com> <CAMwO0gymLW37NUJmtp5uCavnVJsK83o1B-BFmOgwAqz9Z4r88w@mail.gmail.com>
Message-ID: <1410729161.2862.10.camel@joachim-breitner.de>

Hi Gwern,

On Sunday, 14 Sep 2014, at 17:06 -0400, Gwern Branwen wrote:
> Any thoughts on this? The guide was helpful to at least one new arbtt
> user I gave a link to.

I'm terribly sorry for not replying to your mails in time, and the three unread mails in the arbtt folder keep reminding me of that, but other things (including travel to DebConf and ICFP, and the work that had to wait for that) keep having a higher priority.

I'll set my mind to working through them this week.

Thanks for your patience,
Joachim

--
Joachim “nomeata” Breitner
mail at joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140914/68fa4697/attachment.asc> From gwern at gwern.net Sun Sep 14 23:06:35 2014 From: gwern at gwern.net (Gwern Branwen) Date: Sun, 14 Sep 2014 17:06:35 -0400 Subject: Listing samples which are not matched by any tags? In-Reply-To: <CAMwO0gyVmeRYoT3Xr+APx2wPNC_dZ1hjzMjZE1EP+jF21L6mCw@mail.gmail.com> References: <CAMwO0gzuXhnoZOp-1yQE7mrpMar1HJ5tNBwzzmUZqg8dup_YaQ@mail.gmail.com> <1383508486.4978.2.camel@kirk> <CAMwO0gzGJE-PhVccnO2XYTn7-GN0eQ+Qnqp+9OU3knjCxcZE8w@mail.gmail.com> <1383517885.12868.10.camel@kirk> <CAMwO0gyw7W45WvCWabW8LXLhUf1bCwuukrfbx2B05x9SmzdUwQ@mail.gmail.com> <CAMwO0gwVjUfvKJeJdJuAoa3=OtqxeUpiXvKxGvB=foLJcTry7Q@mail.gmail.com> <1404560554.4066.7.camel@kirk> <CAMwO0gwZ69sybqg3keiRJrwqzMYWiWa36J9wJvC3VhORHxNqjg@mail.gmail.com> <1404578679.5548.2.camel@kirk> <CAMwO0gyVmeRYoT3Xr+APx2wPNC_dZ1hjzMjZE1EP+jF21L6mCw@mail.gmail.com> Message-ID: <CAMwO0gymLW37NUJmtp5uCavnVJsK83o1B-BFmOgwAqz9Z4r88w@mail.gmail.com> On Wed, Sep 3, 2014 at 3:32 PM, Gwern Branwen <gwern at gwern.net> wrote: > Maybe. I may not be an arbtt developer, but I'm still not a regular > user. Regardless, I think some of the tricks and observations I made > while working with arbtt are worth including in the manual; the manual > currently gives one little idea how one would actually go about > effectively using arbtt. I wrote up some thoughts in Markdown (below) > - I have never used that XML stuff you are using and would probably > muck it up, so hopefully you can convert the Markdown version to XML. > > If other arbtt users could mention various roadblocks and solutions > they came up with, that'd be helpful. > > ... Any thoughts on this? The guide was helpful to at least one new arbtt user I gave a link to. 
--
gwern
http://www.gwern.net

From gwern at gwern.net Sun Sep 14 23:26:20 2014
From: gwern at gwern.net (Gwern Branwen)
Date: Sun, 14 Sep 2014 17:26:20 -0400
Subject: Listing samples which are not matched by any tags?
In-Reply-To: <1410729161.2862.10.camel@joachim-breitner.de>
References: <CAMwO0gzuXhnoZOp-1yQE7mrpMar1HJ5tNBwzzmUZqg8dup_YaQ@mail.gmail.com> <1383508486.4978.2.camel@kirk> <CAMwO0gzGJE-PhVccnO2XYTn7-GN0eQ+Qnqp+9OU3knjCxcZE8w@mail.gmail.com> <1383517885.12868.10.camel@kirk> <CAMwO0gyw7W45WvCWabW8LXLhUf1bCwuukrfbx2B05x9SmzdUwQ@mail.gmail.com> <CAMwO0gwVjUfvKJeJdJuAoa3=OtqxeUpiXvKxGvB=foLJcTry7Q@mail.gmail.com> <1404560554.4066.7.camel@kirk> <CAMwO0gwZ69sybqg3keiRJrwqzMYWiWa36J9wJvC3VhORHxNqjg@mail.gmail.com> <1404578679.5548.2.camel@kirk> <CAMwO0gyVmeRYoT3Xr+APx2wPNC_dZ1hjzMjZE1EP+jF21L6mCw@mail.gmail.com> <CAMwO0gymLW37NUJmtp5uCavnVJsK83o1B-BFmOgwAqz9Z4r88w@mail.gmail.com> <1410729161.2862.10.camel@joachim-breitner.de>
Message-ID: <CAMwO0gxzDRdr6JmSZ4TQRu5JGsaV73se20vReGhmFhnvWTFpLg@mail.gmail.com>

On Sun, Sep 14, 2014 at 5:12 PM, Joachim Breitner <mail at joachim-breitner.de> wrote:
> I'm terribly sorry for not replying to your mails in time, and the three
> unread mails in the arbtt folder keep reminding me of that, but other
> things keep (including travel to DebConf and ICFP, and the work that had
> to wait for that) having a higher priority.

Alright. I wasn't sure if you were even getting my emails, since they seemed to be nonexistent on the list when I checked, perhaps eaten by a filter.
--
gwern
http://www.gwern.net

From mail at joachim-breitner.de Wed Sep 17 11:45:06 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Wed, 17 Sep 2014 11:45:06 +0200
Subject: Example of time-tracking
In-Reply-To: <CAMwO0gyB5ra3HSGncDtK-ZpKFbhSi6xTERpBtnp2Kw8DD0zA2g@mail.gmail.com>
References: <CAMwO0gywpXtOkMp1VqA+SKdS3am-NaivADS09_UskpB-CcNuNQ@mail.gmail.com> <CAMwO0gyB5ra3HSGncDtK-ZpKFbhSi6xTERpBtnp2Kw8DD0zA2g@mail.gmail.com>
Message-ID: <1410947105.2613.2.camel@joachim-breitner.de>

Hi Gwern,

on a train ride now, so enough time to work through your (very welcome!) messages.

On Saturday, 16 Aug 2014, at 18:20 -0400, Gwern Branwen wrote:
> On Sat, Aug 16, 2014 at 5:02 PM, Gwern Branwen <gwern at gwern.net> wrote:
> >> - I work more and often in the morning (segmenting results into day slots - dawn, wee hours, morning, mid-morning, afternoon, evening - yielded even more interesting results...)
>
> I thought I'd try to extract my own 24-h data from arbtt and see how
> my factor-analysis fared, but it looks like this can't be done with
> arbtt right now: I need categorization of data usage by minute or by
> hour, but it seems arbtt only supports extracting data to csv by
> chunks of day/month/year
>
> --for-each=PERIOD one of: day, month, year
>
> Is there a workaround here? Or does --for-each need to be extended? I
> think it would be enough to add 'minute' as a period, since arbtt
> isn't generally used more fine-grained than that and I can aggregate
> by hour in R if it turns out that there's not enough data for
> plotting/regressing by minute over 24h.

The easiest fix is to extend --for-each; I just did that, adding minute and hour.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
mail at joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140917/38d7f2f1/attachment.asc>

From mail at joachim-breitner.de Wed Sep 17 12:34:13 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Wed, 17 Sep 2014 12:34:13 +0200
Subject: Listing samples which are not matched by any tags?
In-Reply-To: <CAMwO0gx4vc2jkzWpYp5Ayv7DzYWf5JbcKxRAtMvLiurOatorDg@mail.gmail.com>
References: <CAMwO0gzuXhnoZOp-1yQE7mrpMar1HJ5tNBwzzmUZqg8dup_YaQ@mail.gmail.com> <1383508486.4978.2.camel@kirk> <CAMwO0gzGJE-PhVccnO2XYTn7-GN0eQ+Qnqp+9OU3knjCxcZE8w@mail.gmail.com> <1383517885.12868.10.camel@kirk> <CAMwO0gyw7W45WvCWabW8LXLhUf1bCwuukrfbx2B05x9SmzdUwQ@mail.gmail.com> <CAMwO0gwVjUfvKJeJdJuAoa3=OtqxeUpiXvKxGvB=foLJcTry7Q@mail.gmail.com> <1404560554.4066.7.camel@kirk> <CAMwO0gwZ69sybqg3keiRJrwqzMYWiWa36J9wJvC3VhORHxNqjg@mail.gmail.com> <1404578679.5548.2.camel@kirk> <CAMwO0gyVmeRYoT3Xr+APx2wPNC_dZ1hjzMjZE1EP+jF21L6mCw@mail.gmail.com> <CAMwO0gx4vc2jkzWpYp5Ayv7DzYWf5JbcKxRAtMvLiurOatorDg@mail.gmail.com>
Message-ID: <1410950053.2613.8.camel@joachim-breitner.de>

Hi,

On Friday, 5 Sep 2014, at 22:43 -0400, Gwern Branwen wrote:
> A fourth guideline just occurred to me.

again thanks, added!

(Personally, I don't mind such tags, e.g. I do have a tag for browsing the web. As tags are not mutually exclusive, this does not in any way interfere with more useful tags.)

Greetings,
Joachim

--
Joachim Breitner
e-Mail: mail at joachim-breitner.de
Homepage: http://www.joachim-breitner.de
Jabber-ID: nomeata at joachim-breitner.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140917/bcfc4a51/attachment.asc>

From mail at joachim-breitner.de Wed Sep 17 12:24:06 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Wed, 17 Sep 2014 12:24:06 +0200
Subject: Listing samples which are not matched by any tags?
In-Reply-To: <CAMwO0gyVmeRYoT3Xr+APx2wPNC_dZ1hjzMjZE1EP+jF21L6mCw@mail.gmail.com>
References: <CAMwO0gzuXhnoZOp-1yQE7mrpMar1HJ5tNBwzzmUZqg8dup_YaQ@mail.gmail.com> <1383508486.4978.2.camel@kirk> <CAMwO0gzGJE-PhVccnO2XYTn7-GN0eQ+Qnqp+9OU3knjCxcZE8w@mail.gmail.com> <1383517885.12868.10.camel@kirk> <CAMwO0gyw7W45WvCWabW8LXLhUf1bCwuukrfbx2B05x9SmzdUwQ@mail.gmail.com> <CAMwO0gwVjUfvKJeJdJuAoa3=OtqxeUpiXvKxGvB=foLJcTry7Q@mail.gmail.com> <1404560554.4066.7.camel@kirk> <CAMwO0gwZ69sybqg3keiRJrwqzMYWiWa36J9wJvC3VhORHxNqjg@mail.gmail.com> <1404578679.5548.2.camel@kirk> <CAMwO0gyVmeRYoT3Xr+APx2wPNC_dZ1hjzMjZE1EP+jF21L6mCw@mail.gmail.com>
Message-ID: <1410949446.2613.5.camel@joachim-breitner.de>

Hi,

On Wednesday, 3 Sep 2014, at 15:32 -0400, Gwern Branwen wrote:
> On Sat, Jul 5, 2014 at 12:44 PM, Joachim Breitner
> <mail at joachim-breitner.de> wrote:
> > No, but the manual is quite short on "how do I do X". Would you be
> > interested in contributing here? I think that _not_ being a developer
> > is an advantage when writing good documentation.
>
> Maybe. I may not be an arbtt developer, but I'm still not a regular
> user.

Not sure what you mean. As far as I know, you might actually be the only user :-)

> Regardless, I think some of the tricks and observations I made
> while working with arbtt are worth including in the manual; the manual
> currently gives one little idea how one would actually go about
> effectively using arbtt. I wrote up some thoughts in Markdown (below)

Great, thanks a lot!
I have included it (conversion was very easy, thanks to the amazing "pandoc"), and did one round of copy-editing (as a separate patch, if you want to review the changes). In particular, I found it nicer to address the user directly with "you", instead of "one".

> - I have never used that XML stuff you are using and would probably
> muck it up, so hopefully you can convert the Markdown version to XML.

Don't be afraid of editing the docbook directly, it's rather simple, and running "make" will tell you if you broke something, and immediately give you the HTML output if not. And it has some possibilities that pandoc does not have, such as linking to other sections, special mark-up for examples etc. And if in doubt, you can still use markdown and convert it using pandoc :-)

Anyways, thanks a lot!
Joachim

--
Joachim Breitner
e-Mail: mail at joachim-breitner.de
Homepage: http://www.joachim-breitner.de
Jabber-ID: nomeata at joachim-breitner.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140917/c2e35893/attachment.asc>

From gwern at gwern.net Fri Sep 26 04:57:28 2014
From: gwern at gwern.net (Gwern Branwen)
Date: Thu, 25 Sep 2014 22:57:28 -0400
Subject: Configuring arbtt guide, Re: Listing samples which are not matched by any tags?
Message-ID: <CAMwO0gwCRZuZUM_cn0Jgr9m0BwhUVELFEg8rbzdyD6oHrwknsQ@mail.gmail.com>

On Wed, Sep 17, 2014 at 6:34 AM, Joachim Breitner <mail at joachim-breitner.de> wrote:
> again thanks, added!

I just took a look at http://arbtt.nomeata.de/doc/users_guide/effective-use.html - the guide looks good and I hope people find it helpful. While looking it over, I noticed some typos and other issues, so I've sent in another patch.
-- gwern http://www.gwern.net From gwern0 at gmail.com Fri Sep 26 04:56:11 2014 From: gwern0 at gmail.com (gwern0 at gmail.com) Date: Thu, 25 Sep 2014 19:56:11 -0700 (PDT) Subject: darcs patch: arbtt.xml: cpedit based on reading live version http:/... Message-ID: <5424d5cb.e61b8c0a.2185.ffff96ac@mx.google.com> 1 patch for repository http://darcs.nomeata.de/arbtt: Thu Sep 25 22:55:47 EDT 2014 gwern at gwern.net * arbtt.xml: cpedit based on reading live version http://arbtt.nomeata.de/doc/users_guide/effective-use.html -------------- next part -------------- A non-text attachment was scrubbed... Name: patch-preview.txt Type: text/x-darcs-patch Size: 21102 bytes Desc: Patch preview URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140925/d5835c7a/attachment.bin> -------------- next part -------------- A non-text attachment was scrubbed... Name: arbtt_xml_-cpedit-based-on-reading-live-version-http___arbtt_nomeata_de_doc_users_guide_effective_use_html.dpatch Type: application/x-darcs-patch Size: 23615 bytes Desc: A darcs patch for your repository! URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140925/d5835c7a/attachment-0001.bin> From gwern at gwern.net Sat Sep 27 22:58:18 2014 From: gwern at gwern.net (Gwern Branwen) Date: Sat, 27 Sep 2014 16:58:18 -0400 Subject: Installation documentation: what are the dependencies for compiling arbtt? Message-ID: <CAMwO0gwwuc+9FzZeuqTS05w1b3FuLk1LfQX0YtMUgVoQ+gX5xw@mail.gmail.com> An acquaintance wanted to use --dump-samples but her arbtt was too old (apparently the Ubuntu package is well behind the times?) and so installed a fresh one from Hackage when a compile error hit: cabal: Error: some packages failed to install: X11-1.6.1.2 failed during the configure step. 
The exception was: ExitFailure 1 As many Xmonad users know, this is the common problem of having not installed the headers/-dev package for the X11 library, which then makes the X11 Haskell bindings impossible to compile, which makes arbtt impossible to compile: http://www.haskell.org/haskellwiki/Xmonad/Frequently_asked_questions#Missing_X11_headers The solution is to install libx11-dev on Debian or xorg-dev (which isn't mentioned in http://arbtt.nomeata.de/#install ). But are there any other foreign dependencies which need to be mentioned for people installing from source? -- gwern http://www.gwern.net From gwern0 at gmail.com Sat Sep 27 23:17:54 2014 From: gwern0 at gmail.com (gwern0 at gmail.com) Date: Sat, 27 Sep 2014 14:17:54 -0700 (PDT) Subject: darcs patch: arbtt.xml: cpedit based on reading live ... (and 1 more) Message-ID: <54272982.853ce00a.182f.4fa5@mx.google.com> 2 patches for repository http://darcs.nomeata.de/arbtt: Thu Sep 25 22:55:47 EDT 2014 gwern at gwern.net * arbtt.xml: cpedit based on reading live version http://arbtt.nomeata.de/doc/users_guide/effective-use.html Sat Sep 27 17:11:46 EDT 2014 gwern at gwern.net * arbtt.cabal: specify bug tracker location for people looking at Hackage page for documentation -------------- next part -------------- A non-text attachment was scrubbed... Name: patch-preview.txt Type: text/x-darcs-patch Size: 22690 bytes Desc: Patch preview URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140927/01c6bcae/attachment.bin> -------------- next part -------------- A non-text attachment was scrubbed... Name: arbtt_xml_-cpedit-based-on-reading-live-version-http___arbtt_nomeata_de_doc_users_guide_effective_use_html.dpatch Type: application/x-darcs-patch Size: 25203 bytes Desc: A darcs patch for your repository! 
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140927/01c6bcae/attachment-0001.bin>

From mail at joachim-breitner.de Mon Sep 29 14:51:04 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Mon, 29 Sep 2014 14:51:04 +0200
Subject: darcs patch: arbtt.xml: cpedit based on reading live ... (and 1 more)
In-Reply-To: <54272982.853ce00a.182f.4fa5@mx.google.com>
References: <54272982.853ce00a.182f.4fa5@mx.google.com>
Message-ID: <1411995064.11958.2.camel@joachim-breitner.de>

Hi,

On Saturday, 27 Sep 2014, at 14:17 -0700, gwern0 at gmail.com wrote:
> 2 patches for repository http://darcs.nomeata.de/arbtt:
>
> Thu Sep 25 22:55:47 EDT 2014 gwern at gwern.net
> * arbtt.xml: cpedit based on reading live version http://arbtt.nomeata.de/doc/users_guide/effective-use.html
>
> Sat Sep 27 17:11:46 EDT 2014 gwern at gwern.net
> * arbtt.cabal: specify bug tracker location for people looking at Hackage page for documentation

thanks, both applied.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
mail at joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140929/d8922734/attachment.asc>

From mail at joachim-breitner.de Mon Sep 29 14:53:04 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Mon, 29 Sep 2014 14:53:04 +0200
Subject: Installation documentation: what are the dependencies for compiling arbtt?
In-Reply-To: <CAMwO0gwwuc+9FzZeuqTS05w1b3FuLk1LfQX0YtMUgVoQ+gX5xw@mail.gmail.com>
References: <CAMwO0gwwuc+9FzZeuqTS05w1b3FuLk1LfQX0YtMUgVoQ+gX5xw@mail.gmail.com>
Message-ID: <1411995184.11958.4.camel@joachim-breitner.de>

Hi,

On Saturday, 27 Sep 2014, at 16:58 -0400, Gwern Branwen wrote:
> An acquaintance wanted to use --dump-samples but her arbtt was too old
> (apparently the Ubuntu package is well behind the times?) and so
> installed a fresh one from Hackage when a compile error hit:
>
> cabal: Error: some packages failed to install:
> X11-1.6.1.2 failed during the configure step. The exception was:
> ExitFailure 1
>
> As many Xmonad users know, this is the common problem of having not
> installed the headers/-dev package for the X11 library, which then
> makes the X11 Haskell bindings impossible to compile, which makes
> arbtt impossible to compile:
> http://www.haskell.org/haskellwiki/Xmonad/Frequently_asked_questions#Missing_X11_headers

Well spotted.

> The solution is to install libx11-dev on Debian or xorg-dev (which
> isn't mentioned in http://arbtt.nomeata.de/#install ).

I guess you'll send a patch for that soon? :-)

> But are there any other foreign dependencies which need to be
> mentioned for people installing from source?

Probably libpcre3-dev for pcre-light, and libxss-dev for arbtt itself.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
mail at joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140929/4bc423a0/attachment.asc> From gwern0 at gmail.com Mon Sep 29 21:11:35 2014 From: gwern0 at gmail.com (gwern0 at gmail.com) Date: Mon, 29 Sep 2014 12:11:35 -0700 (PDT) Subject: darcs patch: arbtt.html: split install section into binary vs sourc... Message-ID: <5429aee7.c306e00a.3091.ffffeadf@mx.google.com> 1 patch for repository http://darcs.nomeata.de/arbtt: Mon Sep 29 15:10:57 EDT 2014 gwern at gwern.net * arbtt.html: split install section into binary vs source, and list non-haskell dependencies -------------- next part -------------- A non-text attachment was scrubbed... Name: patch-preview.txt Type: text/x-darcs-patch Size: 19219 bytes Desc: Patch preview URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140929/1d740bea/attachment.bin> -------------- next part -------------- A non-text attachment was scrubbed... Name: arbtt_html_-split-install-section-into-binary-vs-source_-and-list-non_haskell-dependencies.dpatch Type: application/x-darcs-patch Size: 22100 bytes Desc: A darcs patch for your repository! URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20140929/1d740bea/attachment-0001.bin> From rejuvyesh at gmail.com Fri Nov 28 11:19:02 2014 From: rejuvyesh at gmail.com (rejuvyesh) Date: Fri, 28 Nov 2014 15:49:02 +0530 Subject: Visualizing Daily Usage Statistics Message-ID: <CAFqH0N=XevA5RJ4Y8FNoyi-fBM_5Ao=B=V_GqSZ2sTZyjw0-Lw@mail.gmail.com> Greetings! I wrote a `d3.js` based visualization for daily arbtt usage. Currently just works on very simple categorization. Thought could be helpful to the community. Suggestions (especially to make it more generalized) are most welcome: https://github.com/rejuvyesh/dailystats For a demo see: http://rejuvyesh.com/dailystats/ Hope you all find it useful. 
---
rejuvyesh
http://rejuvyesh.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141128/a223d8e3/attachment.htm>

From mail at joachim-breitner.de Sun Nov 30 19:17:00 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Sun, 30 Nov 2014 19:17:00 +0100
Subject: Visualizing Daily Usage Statistics
In-Reply-To: <CAFqH0N=XevA5RJ4Y8FNoyi-fBM_5Ao=B=V_GqSZ2sTZyjw0-Lw@mail.gmail.com>
References: <CAFqH0N=XevA5RJ4Y8FNoyi-fBM_5Ao=B=V_GqSZ2sTZyjw0-Lw@mail.gmail.com>
Message-ID: <1417371420.3101.7.camel@joachim-breitner.de>

Hi,

On Friday, 28 Nov 2014, at 15:49 +0530, rejuvyesh wrote:
> I wrote a `d3.js` based visualization for daily arbtt usage. Currently
> just works on very simple categorization. Thought could be helpful to
> the community. Suggestions (especially to make it more generalized)
> are most welcome:
>
> https://github.com/rejuvyesh/dailystats
>

this looks very cool, thanks for sharing!

I like how this is the beginning of a small ecosystem around arbtt, where you don't have to wait for me to implement your particular feature.

If I run it with my own `categorize.cfg` it seems to miss something about "totaltime". Can you document what special tags your tool demands? Or maybe we can find a way for you to get that information some other way?

Are there other assumptions made, e.g. that tags are assigned exclusively?

The current setup with the gh-pages branch is a bit strange. It would be nice if the user could simply run one command and get usable output in ./out or somewhere.

It seems that opening the html files from the local path does not help, at least not here. A neat trick is to run

    python -m SimpleHTTPServer

in that directory. Maybe worth adding to the README?

Greetings,
Joachim

--
Joachim “nomeata” Breitner
mail at joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de •
GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141130/2a0810b0/attachment.asc> From rejuvyesh at gmail.com Mon Dec 1 15:32:15 2014 From: rejuvyesh at gmail.com (rejuvyesh) Date: Mon, 1 Dec 2014 20:02:15 +0530 Subject: Visualizing Daily Usage Statistics In-Reply-To: <1417371420.3101.7.camel@joachim-breitner.de> References: <CAFqH0N=XevA5RJ4Y8FNoyi-fBM_5Ao=B=V_GqSZ2sTZyjw0-Lw@mail.gmail.com> <1417371420.3101.7.camel@joachim-breitner.de> Message-ID: <CAFqH0N=7CPvsrVprbUqtzwR7GqjeWbhO0cvW49nT3UFPyfHUjQ@mail.gmail.com> Thanks a lot for your kind comments. Currently there are a lot of issues with `categorize.cfg` handling. We can define tags such that they never add up to 100% or add up to more than 100% of the total time arbtt has captured. Handling that, generally enough seems quite complicated (any suggestions are of course welcome). This is the reason, even I am not using my default `categorize.cfg` file here. So you are correct that the tags are assumed to form an exclusive set. You are also right that I should include the rendering directory right in the master branch. I will do it right now. I am still pretty new to haskell. I was planning to write a complete app in haskell handling both parsing and serving it to web. But writing a simple parser was lot more easier in python :( . I hope to do that some day ^TM. Till then I had some issues, like handling things which I haven't been able to track with tags and want to tag as say "others". Right now I would have to exclude every other tag and then find them to add them to the `json` files. It would be really great if there was some way to add to the config file to automatically mark everything untagged as say "others". 
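Until something like that exists, the remainder can be computed outside arbtt. A minimal sketch, assuming the tags form a mutually exclusive set (as dailystats currently assumes) and that the total captured time is known; `with_others` and the numbers are hypothetical and not part of arbtt or dailystats:

```python
def with_others(seconds_by_tag, total_seconds, label="others"):
    """Return a copy of the per-tag totals with the untagged remainder added."""
    tagged = sum(seconds_by_tag.values())
    remainder = total_seconds - tagged
    if remainder < 0:
        # Only possible if tags overlap, which breaks the exclusivity assumption.
        raise ValueError("tagged time exceeds total time; tags overlap")
    result = dict(seconds_by_tag)
    if remainder > 0:
        result[label] = remainder
    return result


# Hypothetical totals in seconds for one day.
print(with_others({"productive": 3000, "unproductive": 600}, total_seconds=4000))
```

The resulting dictionary could then be written to the `json` files in place of the hand-maintained exclusion list.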
For now I'll add the rendering files to a separate folder and add a simple python serving script.

Cheers!

---
rejuvyesh
http://rejuvyesh.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141201/d86fa497/attachment.htm> From mail at joachim-breitner.de Mon Dec 1 15:45:55 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Mon, 01 Dec 2014 15:45:55 +0100 Subject: Visualizing Daily Usage Statistics In-Reply-To: <CAFqH0N=7CPvsrVprbUqtzwR7GqjeWbhO0cvW49nT3UFPyfHUjQ@mail.gmail.com> References: <CAFqH0N=XevA5RJ4Y8FNoyi-fBM_5Ao=B=V_GqSZ2sTZyjw0-Lw@mail.gmail.com> <1417371420.3101.7.camel@joachim-breitner.de> <CAFqH0N=7CPvsrVprbUqtzwR7GqjeWbhO0cvW49nT3UFPyfHUjQ@mail.gmail.com> Message-ID: <1417445155.2364.4.camel@joachim-breitner.de> Hi, Am Montag, den 01.12.2014, 20:02 +0530 schrieb rejuvyesh: > Currently there are a lot of issues with `categorize.cfg` handling. We > can define tags such that they never add up to 100% or add up to more > than 100% of the total time arbtt has captured. Handling that, > generally enough seems quite complicated (any suggestions are of > course welcome). This is the reason, even I am not using my default > `categorize.cfg` file here. So you are correct that the tags are > assumed to form an exclusive set. I think this can be solved by categories. How about this: The user is expected to assign tags to a specific category (e.g. Graph:). Your tool will simply ignore all other tags, and just plots all tags of the form to Graph:*. This should guarantee that they never add up to more than 100%. This may also solve the problem of measuring the unmatched time, at least if you build on the output of "arbtt-stats --category Graph" Bonus points for making the name of the category configurable with a command line argument to your tool, and maybe even supporting multiple pie charts next to each other. > Till then I had some issues, like handling things which I haven't been > able to track with tags and want to tag as say "others". Right now I > would have to exclude every other tag and then find them to add them > to the `json` files. 
It would be really great if there was some way to
> add to the config file to automatically mark everything untagged as
> say "others".

If you use the "arbtt-stats --category" report, it will report unmatched time. In fact, with

    $ arbtt-stats --category Program --for-each=day --output-format csv

you should get the numbers for the pie charts directly, without further processing.

For the other reports, where tags are generally overlapping, an automatic "other" tag might be less useful.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
mail at joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141201/63ea1b12/attachment.asc>

From mail at joachim-breitner.de Mon Dec 1 16:20:18 2014
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Mon, 01 Dec 2014 16:20:18 +0100
Subject: Visualizing Daily Usage Statistics
In-Reply-To: <CAFqH0Nm3no6EPZRPRJxhTeaP_R5M3PueQDpWuyZOcgWtJJaPqA@mail.gmail.com>
References: <CAFqH0N=XevA5RJ4Y8FNoyi-fBM_5Ao=B=V_GqSZ2sTZyjw0-Lw@mail.gmail.com> <1417371420.3101.7.camel@joachim-breitner.de> <CAFqH0N=7CPvsrVprbUqtzwR7GqjeWbhO0cvW49nT3UFPyfHUjQ@mail.gmail.com> <1417445155.2364.4.camel@joachim-breitner.de> <CAFqH0Nm3no6EPZRPRJxhTeaP_R5M3PueQDpWuyZOcgWtJJaPqA@mail.gmail.com>
Message-ID: <1417447218.2364.6.camel@joachim-breitner.de>

Hi,

On Monday, 1 Dec 2014, at 20:36 +0530, rejuvyesh wrote:
> Categories are fantastic for pie charts. Although it seems they
> haven't been implemented for `--for-each=minute`.
It does here:

    $ ./dist/build/arbtt-stats/arbtt-stats --category Program \
        --for-each=minute --filter '$sampleage < 0:10'

    Statistics for category "Program" (Day 2014-12-01 16:05)
    ========================================================
    ___________________Tag_|___Time_|_Percentage_
    Program:gnome-terminal |  1m00s |     100.00

    Statistics for category "Program" (Day 2014-12-01 16:06)
    ========================================================
    ______________Tag_|___Time_|_Percentage_
    Program:Navigator |  1m00s |     100.00

    Statistics for category "Program" (Day 2014-12-01 16:07)
    ========================================================
    ______________Tag_|___Time_|_Percentage_
    Program:evolution |  1m00s |     100.00

    Statistics for category "Program" (Day 2014-12-01 16:09)
    ========================================================
    ___________________Tag_|___Time_|_Percentage_
    Program:gnome-terminal |  1m00s |     100.00

    Statistics for category "Program" (Day 2014-12-01 16:10)
    ========================================================
    ______________Tag_|___Time_|_Percentage_
    Program:evolution |  1m00s |     100.00

    Statistics for category "Program" (Day 2014-12-01 16:11)
    ========================================================
    ______________Tag_|___Time_|_Percentage_
    Program:evolution |  1m00s |     100.00

    Statistics for category "Program" (Day 2014-12-01 16:12)
    ========================================================
    ______________Tag_|___Time_|_Percentage_
    Program:evolution |  1m00s |     100.00

But why would you need "--for-each=minute" for the per-day pie chart?

Greetings,
Joachim

--
Joachim “nomeata” Breitner
mail at joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141201/3b9f12d9/attachment.asc> From rejuvyesh at gmail.com Mon Dec 1 16:29:52 2014 From: rejuvyesh at gmail.com (rejuvyesh) Date: Mon, 1 Dec 2014 20:59:52 +0530 Subject: Visualizing Daily Usage Statistics In-Reply-To: <1417447218.2364.6.camel@joachim-breitner.de> References: <CAFqH0N=XevA5RJ4Y8FNoyi-fBM_5Ao=B=V_GqSZ2sTZyjw0-Lw@mail.gmail.com> <1417371420.3101.7.camel@joachim-breitner.de> <CAFqH0N=7CPvsrVprbUqtzwR7GqjeWbhO0cvW49nT3UFPyfHUjQ@mail.gmail.com> <1417445155.2364.4.camel@joachim-breitner.de> <CAFqH0Nm3no6EPZRPRJxhTeaP_R5M3PueQDpWuyZOcgWtJJaPqA@mail.gmail.com> <1417447218.2364.6.camel@joachim-breitner.de> Message-ID: <CAFqH0NkEZxGE4=yfGLAOxbHUFY7cHdCq-A3CPsqSUyKD54_4Wg@mail.gmail.com> Ah yes, it does. It's needed for the barcode plot just below the bar charts, which visualizes what exactly you were doing at a given time of day. Right now I have separated the activities into "productive" and "unproductive", but that can be defined in `settings.js` :) Thanks! On Mon, Dec 1, 2014 at 8:50 PM, Joachim Breitner <mail at joachim-breitner.de> wrote: > Hi, > > > On Monday, 01.12.2014, at 20:36 +0530, rejuvyesh wrote: > > Categories are fantastic for pie charts. Although it seems they > > haven't been implemented for `--for-each=minute`.
> > It does here: > > $ ./dist/build/arbtt-stats/arbtt-stats --category Program \ > --for-each=minute --filter '$sampleage < 0:10' > > [...] > > But why would you need "--for-each=minute" for the per-day pie chart? -------------- next part -------------- An HTML attachment was scrubbed...
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141201/aaab2850/attachment.htm> From rejuvyesh at gmail.com Mon Dec 1 19:20:16 2014 From: rejuvyesh at gmail.com (rejuvyesh) Date: Mon, 1 Dec 2014 23:50:16 +0530 Subject: Visualizing arbtt-stats Message-ID: <CAFqH0NkHwVEk+byfZmXZpHKHPtahEgyd-C5748LOkF0tvUvG6g@mail.gmail.com> Greetings! With numerous suggestions from Joachim, the graphing tool is ready to use. See https://github.com/rejuvyesh/arbtt-graph Let me know if you have any issues using it. Currently works with only a single category. --- rejuvyesh http://rejuvyesh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141201/e74307f3/attachment.htm> From gwern0 at gmail.com Mon Dec 8 20:12:11 2014 From: gwern0 at gmail.com (gwern0 at gmail.com) Date: Mon, 08 Dec 2014 11:12:11 -0800 (PST) Subject: darcs patch: stats-main.hs: --min-percentage docs: consistent abbre... Message-ID: <5485f80b.e4538c0a.7fac.ffffb358@mx.google.com> 1 patch for repository http://darcs.nomeata.de/arbtt: Mon Dec 8 14:11:28 EST 2014 gwern at gwern.net * stats-main.hs: --min-percentage docs: consistent abbreviation -------------- next part -------------- A non-text attachment was scrubbed... Name: patch-preview.txt Type: text/x-darcs-patch Size: 1078 bytes Desc: Patch preview URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141208/d5a94648/attachment.bin> -------------- next part -------------- A non-text attachment was scrubbed... Name: stats_main_hs_-__min_percentage-docs_-consistent-abbreviation.dpatch Type: application/x-darcs-patch Size: 4275 bytes Desc: A darcs patch for your repository! 
URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141208/d5a94648/attachment-0001.bin> From mail at joachim-breitner.de Mon Dec 8 20:36:52 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Mon, 08 Dec 2014 20:36:52 +0100 Subject: darcs patch: stats-main.hs: --min-percentage docs: consistent abbre... In-Reply-To: <5485f80b.e4538c0a.7fac.ffffb358@mx.google.com> References: <5485f80b.e4538c0a.7fac.ffffb358@mx.google.com> Message-ID: <1418067412.26842.0.camel@joachim-breitner.de> Hi, On Monday, 08.12.2014, at 11:12 -0800, gwern0 at gmail.com wrote: > 1 patch for repository http://darcs.nomeata.de/arbtt: > > Mon Dec 8 14:11:28 EST 2014 gwern at gwern.net > * stats-main.hs: --min-percentage docs: consistent abbreviation thanks, applied. Greetings, Joachim -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141208/f1def5e3/attachment.asc> From mail at joachim-breitner.de Mon Dec 8 21:03:04 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Mon, 08 Dec 2014 21:03:04 +0100 Subject: Visualizing arbtt-stats In-Reply-To: <CAFqH0NkHwVEk+byfZmXZpHKHPtahEgyd-C5748LOkF0tvUvG6g@mail.gmail.com> References: <CAFqH0NkHwVEk+byfZmXZpHKHPtahEgyd-C5748LOkF0tvUvG6g@mail.gmail.com> Message-ID: <1418068984.26842.1.camel@joachim-breitner.de> Hi, On Monday, 01.12.2014, at 23:50 +0530, rejuvyesh wrote: > With numerous suggestions from Joachim, the graphing tool is ready to > use. > > See https://github.com/rejuvyesh/arbtt-graph > nice.
I added a section to the User's guide about contributed tools, and included yours there: http://arbtt.nomeata.de/doc/users_guide/contributed.html Greetings, Joachim -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141208/03dcf1a1/attachment.asc> From gwern at gwern.net Sun Dec 14 21:39:43 2014 From: gwern at gwern.net (Gwern Branwen) Date: Sun, 14 Dec 2014 15:39:43 -0500 Subject: arbtt: use of database like sqlite3? Message-ID: <CAMwO0gzQRNSciXnjSm6_JGbhvZ_SJxW=TcKEYZ7t3gjELKVzow@mail.gmail.com> So I was waiting on arbtt to --dump-samples for the past 100 hours to write a rule classifying a web serial I read as recreational, and I began wondering: what is arbtt doing that it takes so long? Is it because of the log structure that it has to read through, parse, and classify my full 85M arbtt log just to get the last 100 hours of data? I know from working with an 18GB sqlite3 db for Mnemosyne that date range queries in databases can be *extremely* fast, and arbtt-capture dumping into a db would probably be more reliable and durable (ACID rather than arbtt-recover), and sqlite3 has had multiple Haskell bindings for half a decade now. Would switching to sqlite3 be an improvement? -- gwern http://www.gwern.net From mail at joachim-breitner.de Sun Dec 14 23:59:23 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Sun, 14 Dec 2014 23:59:23 +0100 Subject: arbtt: use of database like sqlite3?
In-Reply-To: <CAMwO0gzQRNSciXnjSm6_JGbhvZ_SJxW=TcKEYZ7t3gjELKVzow@mail.gmail.com> References: <CAMwO0gzQRNSciXnjSm6_JGbhvZ_SJxW=TcKEYZ7t3gjELKVzow@mail.gmail.com> Message-ID: <1418597963.3430.7.camel@joachim-breitner.de> Hi Gwern, On Sunday, 14.12.2014, at 15:39 -0500, Gwern Branwen wrote: > So I was waiting on arbtt to --dump-samples for the past 100 hours to > write a rule classifying a web serial I read as recreational, and I > began wondering: what is arbtt doing that it takes so long? wait, it is taking 100 hours for one run of "arbtt-stats --dump-samples"? > Is it because of the log structure that it has to read through, parse, > and classify my full 85M arbtt log just to get the last 100 hours of > data? I know from working with an 18GB sqlite3 db for Mnemosyne that > date range queries in databases can be *extremely* fast, and > arbtt-capture dumping into a db would probably be more reliable and > durable (ACID rather than arbtt-recover), and sqlite3 has had multiple > Haskell bindings for half a decade now. > > Would switching to sqlite3 be an improvement? Possibly. My goal with the current system is to make arbtt-capture as cheap as possible, by doing nothing more than simply appending a few bytes to a file. I might have been over-optimizing here, but it is certainly a good idea to pay attention to something that is going to run constantly, even when on battery. But that does not mean that the benefits of sqlite (such as date range queries) or otherwise improved data formats would not outweigh this. Or maybe an alternative route could be taken: arbtt-capture still writes to a binary append-only log, but regularly (e.g. once a day), this log is imported into a format more amenable to searching and flushed. But note that there is another possibly important feature of the current log format: It shares strings between a sample and its previous sample.
Otherwise, with every sample, every window title would be stored again and again, resulting in considerably larger files and higher arbtt-stats memory consumption. I doubt that this kind of sharing is easily possible with sqlite. Maybe it is also sufficient to create an index file for the log file, with the offsets of, say, the first sample of each day. This would allow arbtt-stats to categorize a specific time span faster. But this is also tricky given the string-sharing format. Maybe it is also sufficient to keep the log format, but split it into smaller files, e.g. one per day. Again, a date-range query would have to read less data, plus it would be backup-friendlier and easier to delete certain dates manually. So all in all quite a few options, and no clear best way to follow. What do you think? Greetings, Joachim -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141214/05997d40/attachment.asc> From gwern at gwern.net Mon Dec 15 01:40:16 2014 From: gwern at gwern.net (Gwern Branwen) Date: Sun, 14 Dec 2014 19:40:16 -0500 Subject: arbtt: use of database like sqlite3? In-Reply-To: <1418597963.3430.7.camel@joachim-breitner.de> References: <CAMwO0gzQRNSciXnjSm6_JGbhvZ_SJxW=TcKEYZ7t3gjELKVzow@mail.gmail.com> <1418597963.3430.7.camel@joachim-breitner.de> Message-ID: <CAMwO0gzBQOcQdpzorLri0pnHp=NZnWiA2HRFnTCEgAS8HiSFiQ@mail.gmail.com> On Sun, Dec 14, 2014 at 5:59 PM, Joachim Breitner <mail at joachim-breitner.de> wrote: > wait, it is taking 100 hours for one run of "arbtt-stats > --dump-samples"? No, I meant the query was restricted to the last 100 hours (which was when I was reading it). > Possibly.
My goal with the current system is to make arbtt-capture as > cheap as possible, by doing nothing more than simply appending a few bytes to > a file. I might have been over-optimizing here, but it is certainly a > good idea to pay attention to something that is going to run constantly, > even when on battery. I don't know how much more expensive an append is, but I suspect that even a single run through a logfile will use up more joules than whatever additional overhead sqlite3 has in INSERTs. (Since sqlite3 is often used in embedded and resource-constrained applications or in applications that write heavily to the database like Firefox, it can't be *that* bad.) > Or maybe an alternative route could be taken: arbtt-capture still writes > to a binary append-only log, but regularly (e.g. once a day), this log > is imported into a format more amenable to searching and flushed. I think that would probably have the worst of both worlds; two architectures means a lot more can go wrong. > But note that there is another possibly important feature of the current > log format: It shares strings between a sample and its previous sample. > Otherwise, with every sample, every window title would be > stored again and again, resulting in considerably larger files and > higher arbtt-stats memory consumption. I doubt that this kind of sharing is easily possible > with sqlite. That's a good point: sqlite3 has pretty minimal support for compression. There's a plugin or two for coding in compression/decompression on specific entries (such as https://github.com/salviati/sqlite3-lz4 ), but that wouldn't save much space since we want between-entry compression. There are 2 official proprietary plugins, but besides being proprietary, they are intended for read-only databases. We could bite the bullet and accept greater filesize. I wouldn't mind doubling space usage if it made queries near-instant.
Another option is that the database could be structured to explicitly exploit duplication; I saw this suggestion in http://stackoverflow.com/a/1829601 : > Compression is all about removing duplication, and in a log file most of the duplication is between entries rather than within each entry so compressing each entry individually is not going to be a huge win. > > This is off the top of my head so feel free to shoot it down in flames, but I would consider breaking the table into a set of smaller tables holding the individual parts of the entry. A log entry would then mostly consist of a timestamp (as DATE type rather than a string) plus a set of indexes into other tables (e.g. requesting IP, request type, requested URL, browser type etc.) That is, in more Haskelly terms, each arbtt-capture sample is a (timestamp, [String]); each string is assigned a unique ID and stored in a hashmap if not already present, and the IDs are stored with the timestamp. So a few seconds of samples would look something like (1418603655,[1,54,20,333]) (1418603678,[1,53,333]) (1418603693,[1,53,333]) (1418603702,[1,53,333]) (1418603712,[53,333,801]) avoiding the worst of between-row redundancy; and another table would store the definition of string #1, #53, #801, etc whenever one needed them. This probably would compress even better than a log format which looks an entry back for redundancy since it extends the lookback to the entire database history, and indexes presumably mean the queries remain as fast (since sqlite3 knows where the indices point to in the other table). -- gwern http://www.gwern.net From mail at joachim-breitner.de Mon Dec 15 09:35:06 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Mon, 15 Dec 2014 09:35:06 +0100 Subject: arbtt: use of database like sqlite3? 
In-Reply-To: <CAMwO0gzBQOcQdpzorLri0pnHp=NZnWiA2HRFnTCEgAS8HiSFiQ@mail.gmail.com> References: <CAMwO0gzQRNSciXnjSm6_JGbhvZ_SJxW=TcKEYZ7t3gjELKVzow@mail.gmail.com> <1418597963.3430.7.camel@joachim-breitner.de> <CAMwO0gzBQOcQdpzorLri0pnHp=NZnWiA2HRFnTCEgAS8HiSFiQ@mail.gmail.com> Message-ID: <1418632506.1489.8.camel@joachim-breitner.de> Hi, On Sunday, 14.12.2014, at 19:40 -0500, Gwern Branwen wrote: > That is, in more Haskelly terms, each arbtt-capture sample is a > (timestamp, [String]); each string is assigned a unique ID and stored > in a hashmap if not already present, and the IDs are stored with the > timestamp. So a few seconds of samples would look something like > > (1418603655,[1,54,20,333]) > (1418603678,[1,53,333]) > (1418603693,[1,53,333]) > (1418603702,[1,53,333]) > (1418603712,[53,333,801]) > > avoiding the worst of between-row redundancy; and another table would > store the definition of string #1, #53, #801, etc whenever one needed > them. This probably would compress even better than a log format which > looks an entry back for redundancy since it extends the lookback to > the entire database history, and indexes presumably mean the queries > remain as fast (since sqlite3 knows where the indices point to in the > other table). yes, an internalized string format would also work quite well, and if used correctly on the Haskell side, could avoid having duplicated strings in memory as well. But now the insertion is even more expensive: Upon every sample, for every open window, sqlite will have to traverse an index of over a million¹ entries to see if this particular window title has occurred before. That's quite an increase both in computation time _and_ memory consumption for the long-running process. I think this variant is also only good if the data is first collected to a log and then occasionally sorted into the database. Greetings, Joachim ¹
$ arbtt-dump |sort -u|wc -l 1116948 # and without deduplication: $ arbtt-dump |wc -l 5767660 -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141215/92e18ab3/attachment.asc> From gwern at gwern.net Mon Dec 15 17:19:23 2014 From: gwern at gwern.net (Gwern Branwen) Date: Mon, 15 Dec 2014 11:19:23 -0500 Subject: arbtt: use of database like sqlite3? In-Reply-To: <1418632506.1489.8.camel@joachim-breitner.de> References: <CAMwO0gzQRNSciXnjSm6_JGbhvZ_SJxW=TcKEYZ7t3gjELKVzow@mail.gmail.com> <1418597963.3430.7.camel@joachim-breitner.de> <CAMwO0gzBQOcQdpzorLri0pnHp=NZnWiA2HRFnTCEgAS8HiSFiQ@mail.gmail.com> <1418632506.1489.8.camel@joachim-breitner.de> Message-ID: <CAMwO0gwtmei=Dmw7dJFzrKiOPUakgegyyZN5N1LXEiHX9G9Tkw@mail.gmail.com> On Mon, Dec 15, 2014 at 3:35 AM, Joachim Breitner <mail at joachim-breitner.de> wrote: > But now the insertion is even more expensive: Upon every sample, for > every open window, sqlite will have to traverse an index of over a > million¹ entries to see if this particular window title has occurred > before. No; come on, these are *databases*, their raison d'etre is looking up stuff. The ideas behind them are now at least 44 years old - give them a little credit, they can do better than a linear scan. Suppose inserts were as dumb as a binary tree - then the traverse only involves ~20 lookups (log_2(1 million) ≈ 19.9 levels of a tree). I would be surprised if sqlite3 couldn't do thousands of linked inserts per second, and I'd expect it to require microbenchmarking to show the difference between a regular insert and an index-linked insert like proposed.
It's really not an issue. The real question is whether anyone wants the faster queries enough to rewrite the backend for it and make everyone convert their old logs over. (The existing append-only logs may be terrible for queries, but all the code exists and presumably is debugged.) -- gwern http://www.gwern.net From mail at joachim-breitner.de Mon Dec 15 19:11:26 2014 From: mail at joachim-breitner.de (Joachim Breitner) Date: Mon, 15 Dec 2014 19:11:26 +0100 Subject: arbtt: use of database like sqlite3? In-Reply-To: <CAMwO0gwtmei=Dmw7dJFzrKiOPUakgegyyZN5N1LXEiHX9G9Tkw@mail.gmail.com> References: <CAMwO0gzQRNSciXnjSm6_JGbhvZ_SJxW=TcKEYZ7t3gjELKVzow@mail.gmail.com> <1418597963.3430.7.camel@joachim-breitner.de> <CAMwO0gzBQOcQdpzorLri0pnHp=NZnWiA2HRFnTCEgAS8HiSFiQ@mail.gmail.com> <1418632506.1489.8.camel@joachim-breitner.de> <CAMwO0gwtmei=Dmw7dJFzrKiOPUakgegyyZN5N1LXEiHX9G9Tkw@mail.gmail.com> Message-ID: <1418667086.27272.5.camel@joachim-breitner.de> Hi, On Monday, 15.12.2014, at 11:19 -0500, Gwern Branwen wrote: > On Mon, Dec 15, 2014 at 3:35 AM, Joachim Breitner > <mail at joachim-breitner.de> wrote: > > But now the insertion is even more expensive: Upon every sample, for > > every open window, sqlite will have to traverse an index of over a > > million¹ entries to see if this particular window title has occurred > > before. > > No; come on, these are *databases*, their raison d'etre is looking up > stuff. The ideas behind them are now at least 44 years old - give them > a little credit, they can do better than a linear scan. that's why I said “an index”! > Suppose inserts were as dumb as a binary tree - then the traverse only > involves ~20 lookups (log_2(1 million) ≈ 19.9 levels of a tree). I > would be surprised if sqlite3 couldn't do thousands of linked inserts > per second, and I'd expect it to require microbenchmarking to show the > difference between a regular insert and an index-linked insert like > proposed. > It's really not an issue. You might be right...
> The real question is whether anyone wants the faster queries enough to > rewrite the backend for it and make everyone convert their old logs over. I don't think this will be too hard. Maybe I should simply give it a try over the holidays. Also, this would maybe make it easier to cache the actual tags. This might require some refactoring and redesign, though. E.g. currently, tagging may easily depend on the command line. Anyway, I also added a ticket for it: https://bitbucket.org/nomeata/arbtt/issue/19/use-sqlite3-as-a-backend > (The existing append-only logs may be terrible for queries, but all > the code exists and presumably is debugged.) presumably.... unfortunately, they are not bulletproof (otherwise there would be no need for arbtt-recover). -- Joachim “nomeata” Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: <https://lists.nomeata.de/pipermail/arbtt/attachments/20141215/29e1c0c8/attachment.asc>
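
For illustration, the interned-string layout discussed in this thread (one table of unique window titles, one table of timestamped references to them, and an index on the timestamp for date-range queries) could be sketched roughly as follows. This is only a sketch in Python using the standard sqlite3 module; it is not arbtt code and not the design tracked in the ticket. The table names, column names and helper functions (open_db, intern_string, add_sample, titles_between) are all hypothetical.

```python
import sqlite3


def open_db(path=":memory:"):
    """Create the hypothetical two-table layout: interned strings plus samples."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS strings (
            id   INTEGER PRIMARY KEY,
            text TEXT NOT NULL UNIQUE   -- UNIQUE gives an index for the lookup
        );
        CREATE TABLE IF NOT EXISTS samples (
            ts        INTEGER NOT NULL, -- Unix timestamp of the sample
            string_id INTEGER NOT NULL REFERENCES strings(id)
        );
        CREATE INDEX IF NOT EXISTS samples_ts ON samples(ts);
    """)
    return db


def intern_string(db, text):
    """Return the ID of text, storing it only if it has not been seen before."""
    db.execute("INSERT OR IGNORE INTO strings(text) VALUES (?)", (text,))
    row = db.execute("SELECT id FROM strings WHERE text = ?", (text,)).fetchone()
    return row[0]


def add_sample(db, ts, titles):
    """Store one sample: a timestamp plus the IDs of all window titles seen."""
    with db:  # commit once per sample
        for title in titles:
            db.execute(
                "INSERT INTO samples(ts, string_id) VALUES (?, ?)",
                (ts, intern_string(db, title)),
            )


def titles_between(db, start, end):
    """Date-range query: (timestamp, title) pairs with start <= ts < end."""
    return db.execute(
        """SELECT s.ts, t.text
           FROM samples s JOIN strings t ON t.id = s.string_id
           WHERE s.ts >= ? AND s.ts < ?
           ORDER BY s.ts, t.id""",
        (start, end),
    ).fetchall()


if __name__ == "__main__":
    db = open_db()
    add_sample(db, 1418603655, ["xterm", "Firefox: arbtt", "Emacs"])
    add_sample(db, 1418603678, ["xterm", "Firefox: arbtt"])
    add_sample(db, 1418603693, ["xterm", "Firefox: arbtt"])
    # Seven sample rows, but each distinct title is stored only once.
    print(db.execute("SELECT COUNT(*) FROM strings").fetchone()[0])
    print(titles_between(db, 1418603670, 1418603700))
```

Whether the per-sample index lookup is cheap enough for a constantly running capture process is exactly the open question in the thread; a sketch like this makes it straightforward to measure rather than guess.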