This repository was archived by the owner on Jul 16, 2020. It is now read-only.
### Weekly Meeting 2016 08 18
Tim Pepper edited this page Aug 25, 2016
- roll call (1 minute)
- opens (5 minutes)
- bugs (10 minutes): pre-seed a list of new or priority up/down candidates into the agenda for meeting focus
- prioritize
- scrub
- update/assign
- prior meeting actions (5 minutes): check ACTION items from prior meetings' minutes for progress and resolution
Meeting started by tcpepper at 16:03:51 UTC. The full logs are available at ciao-project/2016/ciao-project.2016-08-18-16.03.log.html .
- opens (tcpepper, 16:04:23)
- LINK: https://github.com/01org/ciao/pull/451 (mrkz, 16:16:33)
- Bugs (tcpepper, 16:17:47)
- LINK: https://github.com/01org/ciao/blob/master/ciao-launcher/overseer.go#L679 (markusry, 16:22:01)
- LINK: https://blog.golang.org/pipelines (markusry, 16:22:33)
Meeting ended at 16:24:25 UTC.
Action items
- UNASSIGNED: (none)
People present (lines said)
- tcpepper (39)
- markusry (19)
- jvillalo (11)
- kristenc (7)
- mrkz (5)
- ciaomtgbot (2)
- carlosag (1)
- albertom (1)
- mrkz_afk (1)
Generated by MeetBot 0.1.4 (http://wiki.debian.org/MeetBot)
### Full IRC Log
16:03:51 <tcpepper> #startmeeting
16:03:51 <ciaomtgbot> Meeting started Thu Aug 18 16:03:51 2016 UTC. The chair is tcpepper. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:03:51 <ciaomtgbot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:03:57 <markusry> o/
16:03:59 <albertom> o/
16:03:59 <tcpepper> roll call ...
16:04:01 <tcpepper> o/
16:04:05 <carlosag> o/
16:04:05 <markusry> o/
16:04:16 <mrkz_afk> o/
16:04:23 <tcpepper> #topic opens
16:04:33 <jvillalo> o/
16:05:44 <tcpepper> anybody with opens? I suspect short meeting given recent vacations and no specific agenda items for the week
16:05:52 <jvillalo> I have a couple of opens
16:06:02 <tcpepper> jvillalo: yep?
16:06:44 <jvillalo> The first one is regarding an endpoint which is /v2.1/{tenant}/resources , it currently provides usage information on cpu, memory and disk
16:07:39 <jvillalo> my question is, it seemed to be using logs rather than DB, and if we reboot the cluster the previous logs are lost
16:08:08 <jvillalo> I wanted to ask for a feature to prevent that loss
16:08:48 <tcpepper> we do intend resource utilization to be ephemeral today
16:08:49 <jvillalo> And the second open is that the current release of the UI already implements the charts to visualize that information, it's available now
16:09:40 <tcpepper> we've "architected" (ie: minimally talked about and drawn a box on a whiteboard for ;) a monitoring/estimating entity that would have persistent knowledge of usage
16:10:01 <jvillalo> ok, so this is the expected behavior then.
16:10:08 <tcpepper> it would also support billing
16:10:16 <tcpepper> so we do want to have usage data be persisted
16:10:21 <tcpepper> just not implemented yet
16:10:34 <tcpepper> current utilization is all self-rebuilding via the stats frames
16:10:39 <tcpepper> but it's not persisted today
16:11:26 <jvillalo> great, well then, I guess my only comment is that, if anyone would like to check the usage of resources on the UI, it will take about 15min to gather the data for the linecharts to work. I invite you to check them out :)
16:11:30 <tcpepper> jvillalo: can you look through our tickets and see if there's one for the monitoring/estimating/billing persistent historical usage data, and if there is not, can you add one?
16:11:51 <jvillalo> sure
16:11:51 <tcpepper> then you can mark the UI somehow as experimental/incomplete maybe until that issue is resolved?
16:12:21 <jvillalo> that makes sense, yes of course I will
16:13:17 <kristenc> it is persistent, it's just that our default location is in /tmp
16:13:27 <kristenc> there is a database that stores this info.
16:13:36 <kristenc> rebooting the system loses it because of /tmp
16:13:58 <kristenc> if we change the default location for this datastore, then it can survive a reboot.
16:14:04 <kristenc> so perhaps this is a bug against config?
16:14:30 <tcpepper> other opens?
16:16:12 <kristenc> is the 15 minutes to get info a bug, or by design?
16:16:14 <mrkz> I just have a quick comment, about PR 451 It will take a bit more of time to check why it's failing due to deployment task force
16:16:33 <mrkz> #link https://github.com/01org/ciao/pull/451
16:17:13 <tcpepper> mrkz: that's a good transition to bugs...I wanted to talk about that there
16:17:18 <tcpepper> other opens?
16:17:39 <markusry> mrkz: Try rebasing and repushun
16:17:41 <markusry> repusing
16:17:44 <markusry> re-pushing
16:17:47 <tcpepper> #topic Bugs
16:18:01 <tcpepper> our focus bug list has only one change
16:18:09 <tcpepper> I've closed https://github.com/01org/ciao/issues/434
16:18:12 <mrkz> markusry: problem is some networking tests are failing and I'd need to take a look on why and fix it properly
16:18:19 <mrkz> but for sure I can rebase
16:18:26 <tcpepper> this week I and markusry have done some poking at our failures
16:18:31 <markusry> I turned off race detection on the travis unit tests that should make the failures less frequent
16:18:44 <markusry> The underlying launcher issues seen in https://travis-ci.org/01org/ciao/jobs/152456463
16:18:49 <markusry> are fixed in the storage branch
16:18:50 <tcpepper> he found a launcher issue and I've fixed one in scheduler and a couple in testutil that were causing much of these travis failures
16:18:53 <markusry> But not in master
16:19:11 <tcpepper> I'm seeing maybe 1 in 30 travis failures now, where it had gotten as high as 50
16:19:25 <tcpepper> the one I'm chasing I have a question, maybe for markusry (and anybody else)
16:19:25 <markusry> However, race detection is now turned off in master so it should be much more difficult to hit this launcher bug, even though it's still present
16:19:44 <tcpepper> closing a channel: is that a synchronous or asynchronous task in go?
16:20:11 <tcpepper> I ask b/c I'm seeing behavior which makes me wonder if it's async
16:20:16 <markusry> You should only close a channel from the go routine that writes to it
16:20:47 <markusry> If you have multiple go routines writing to the same channel you need to use wait groups to ensure that all those go routines have exited
16:20:51 <markusry> before closing the channel
16:20:53 <tcpepper> ok
16:20:58 <tcpepper> that makes sense relative to what I'm seeing
16:21:33 <tcpepper> so i've got a tricky tricky synchronization issue to sort out in testutil
16:22:01 <markusry> Here's how I do this in launcher
16:22:01 <markusry> https://github.com/01org/ciao/blob/master/ciao-launcher/overseer.go#L679
16:22:32 <markusry> And here's some guidance from Google
16:22:33 <markusry> https://blog.golang.org/pipelines
16:22:53 <markusry> See Fan-out, fan-in
16:22:59 <tcpepper> in the meantime mrkz et al if you see a failure like https://travis-ci.org/tpepper/ciao/jobs/153113007 where it's testutil TestReconnects() and you see traces of testutil agent and controller being blocked forever on SendResultAndDelEventChan() ...
16:23:18 <tcpepper> that's a bug you can ignore...aka click the travis Rerun button...and odds are good your test will pass.
16:23:26 <tcpepper> I'll try to sort that out today/tomorrow
16:23:33 <tcpepper> markusry: thanks for the pointers!!!
16:23:33 <mrkz> gotcha, thanks tcpepper
16:23:49 <tcpepper> so I think that covers bugs
16:23:56 <tcpepper> and we don't have anything else on the agenda...
16:24:00 <tcpepper> want to call it a short meeting?
16:24:04 <markusry> yep
16:24:08 <kristenc> yes
16:24:10 <jvillalo> sure
16:24:20 <tcpepper> o/