00:00:51 | gmaxwell: | K. his account was created not that long before anonymint stopped posting. |
00:01:10 | gmaxwell: | Wondered if he was known to be the same before I wasted any more time responding to him. |
00:14:47 | Luke-Jr: | I remember some "ABIS protocol" thing from like 2011 |
00:15:02 | Luke-Jr: | wasn't it some kind of wallet protocol thing that never went anywhere? |
00:15:43 | sipa: | more a vague idea about wallets and rainbows and unicorns |
00:16:31 | hearn: | yeah calling it a protocol was a bit grand, iirc |
00:18:09 | sipa: | https://github.com/ABISprotocol/ABIS#abis |
00:19:12 | gmaxwell: | yea, he seemed familiar to me. Context was that sybil monitoring network thread. You can see the last few posts there if you care, not really important. |
00:20:53 | Luke-Jr: | [00:15:43] more a vague idea about wallets and rainbows and unicorns <-- lol |
01:17:29 | bramc: | Hey everybody |
01:18:17 | bramc: | there's something in the IBLT stuff I'm not getting. If I have a transaction and split it up and distribute it in the iblt, how many copies of the different parts of it do I put in there, and where do I put them in? |
01:19:05 | bramc: | rusty's post about it gives an example where something is split into two parts, one of which is included twice and one of which is included once, which seems... odd. |
01:21:37 | kanzure: | links please |
01:28:55 | bramc: | kanzure, http://rustyrussell.github.io/pettycoin/2014/11/05/Playing-with-invertible-bloom-lookup-tables-and-bitcoin-transactions.html |
01:31:13 | kanzure: | i would not have guessed that url |
02:47:53 | tromp: | i would think you should only put in single copies. he probably had an extra copy just for educational purposes |
03:10:36 | bramc: | so I'm working through some math on this IBLT thing |
03:12:21 | bramc: | If one assumes that all things in the set are the same size, and there are n diffs between the two sets, m copies per diff, and k slots, then I figure what you want to focus on is the virality of reveals: that is, the expected number of new reveals resulting from a previous reveal. If that value is greater than 1, you're probably golden, hence the thresholding behavior rusty was so confused by |
03:12:49 | bramc: | So I worked through some math and got this formula for it: m(m-1)n(k-m)^(n-1)/k^n |
03:13:29 | bramc: | and, uh, it's been a while since I did much calculus, anybody know how to take the derivative of that with respect to m or point me to a web site which does? (I assume wolfram alpha does, but I don't know its syntax) |
03:16:35 | amiller_: | bramc, http://quickmath.com/webMathematica3/quickmath/calculus/differentiate/advanced.jsp#c=differentiate_advanceddifferentiate&v1=m(m-1)n(k-m)%5E(n-1)%2Fk%5En&v2=m%0A |
03:16:38 | gmaxwell: | -(((k-m)^(-2+n)n(k-2*k*m+m^2-m*n+m^2n))/k^n) |
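A quick way to check gmaxwell's expression against bramc's formula is exact substitution at random integer points. A minimal sketch, assuming sympy is available (not part of the log):

```python
# Exact spot-check that the expression above is the m-derivative of
# bramc's viral-base formula. Random integer substitution keeps the
# arithmetic exact, so the assert is decisive.
import random
import sympy as sp

m, n, k = sp.symbols('m n k', positive=True)
base = m*(m - 1)*n*(k - m)**(n - 1) / k**n
claimed = -((k - m)**(n - 2) * n * (k - 2*k*m + m**2 - m*n + m**2*n)) / k**n

# substitute random integers with k > m so (k-m)**(n-2) stays well-behaved
for _ in range(5):
    vals = {m: random.randint(2, 9), n: random.randint(3, 12), k: random.randint(10, 40)}
    assert sp.simplify((sp.diff(base, m) - claimed).subs(vals)) == 0
print("derivative matches at all sample points")
```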
03:17:41 | gmaxwell: | bramc: the way it's used for transactions is to break the transactions into parts, thus meeting the same size criteria... |
03:18:06 | bramc: | gmaxwell, I don't get the breaking into parts, that seems to make it much harder to reconstitute |
03:18:24 | bramc: | Because you need all the parts to reconstruct a transaction |
03:18:26 | gmaxwell: | bramc: not an issue that more overhead can't solve. |
03:18:37 | bramc: | I'm trying to minimize overhead :-P |
03:19:23 | bramc: | Seriously though, you need all the parts to reconstruct a transaction. Knowing that a transaction part is xored with some other unknown transaction doesn't help you |
03:19:56 | bramc: | Thanks for the math. Can somebody find the zeros of that mess quickly? |
03:21:46 | gmaxwell: | there are several obvious trivial roots. |
03:21:49 | gmaxwell: | uhh |
03:23:10 | bramc: | I'm expanding it now. It's a simple quadratic after getting rid of all the garbage |
03:23:38 | kanzure: | .wa roots of -(((k-m)^(-2+n)n(k-2*k*m+m^2-m*n+m^2n))/k^n) |
03:23:44 | yoleaux: | roots -((k-m)^(-2+n) n (k-2 k m+m²-m n+m² n))/k^n = 0: k = 0 and Re(n)<0; k!=0 and m = k and Re(n)>1; k² !=k m and n = 0; m² !=m and k² !=k m and n = (-2 k m+k+m²)/(m-m²) |
03:24:43 | gmaxwell: | n=(((k*((2*m)-1))-(m*m))/((m*m)-m)) |
03:25:09 | gmaxwell: | wow someone got WA to do something useful! amazing. |
03:25:32 | kanzure: | don't celebrate yet, it misinterpreted m^(2n) |
03:27:21 | gmaxwell: | kanzure: darn it somewhere I have a screenshot where I gave it a simple expression and it simplified it to Sum(expression * n_i,i,-Inf,Inf) where n[i] = 1 for 0 and 0 for all other values. |
03:28:39 | gmaxwell: | "Gee thanks for that help." |
03:28:59 | kanzure: | yeah i just think of wolframalpha as maybe a little sarcastic sometimes :) |
03:31:41 | bramc: | After solving and removing very small terms I get (n+2k+-(n^2+4k^2)^.5)/(2n) |
03:32:39 | bramc: | obviously it isn't the plus, so that's (n+2k-(n^2+4k^2)^.5)/(2n) |
03:37:54 | bramc: | Oh wait, I got it wrong, it's the plus. If we say k=cn, we have (1+2c+(1+4c^2)^.5)/2 |
03:40:57 | bramc: | So what it boils down to is that the optimal number of copies depends on the ratio of how much space you have to the amount of stuff you have stored |
03:41:17 | bramc: | If the ratio is 1, which means you're fucked, it's 2.6, so the actual value should be larger than that |
03:43:03 | bramc: | If the ratio is 1.5, in which case you probably aren't fucked, the optimal value is 3.58, and at 1.2, which is probably on the edge, it's exactly 3.0 |
03:43:13 | bramc: | So that was a lot of math for concluding that the best number of copies is 3 |
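bramc's closed form is easy to evaluate directly. A minimal sketch (mine, not from the log) reproducing the 2.6 / 3.0 / 3.58 figures above:

```python
# Evaluate bramc's optimum for the number of copies m, as a function of
# c = k/n (IBLT slots per diff).
from math import sqrt

def optimal_copies(c: float) -> float:
    return (1 + 2*c + sqrt(1 + 4*c*c)) / 2

for c in (1.0, 1.2, 1.5):
    print(f"c = {c}: optimal m = {optimal_copies(c):.2f}")
# c=1.0 -> 2.62, c=1.2 -> 3.00 exactly, c=1.5 -> 3.58, as stated above
```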
03:43:39 | bramc: | But I still don't understand why to split into parts at all, except to support multisize, and even there I have an issue with it |
03:45:54 | bramc: | Parts should individually self-declare what they are so they can be recovered individually and collated later |
03:46:54 | bramc: | Anyway, all that math resulted in a nice clear answer of 3, and I see why rusty's implementation is so shit at insertions. More work later, time for me to go home. |
05:20:11 | smooth: | gmaxwell: i dont think abisprotocol is anonymint (but anything is possible). mint is using the nick iamback now |
05:23:16 | gmaxwell: | smooth: thanks! |
05:26:23 | bramc: | argh, this multiplier never gets above 1. If you want your diff to work, you basically have to get all the pieces right off the bat. |
05:28:52 | bramc: | Oh wait, it helps if I multiply by the number I need to multiply by |
05:29:43 | phantomcircuit: | bramc, minor details |
05:36:10 | bramc: | Okay, running numbers properly I get that the exponential multiplier gets over 1 when m=5 and c=2.4 |
05:37:57 | bramc: | so basically 58% of your IBLT will be wasted, at least compared to theoretical capacity, and the number of copies stored should be 5 |
05:40:22 | bramc: | There's a marginal advantage to making any given block have some chance of hashing to 6 places, but it's clearly not worth the complexity |
05:43:52 | bramc: | And oh wait, a lot of that overhead is recouped by positions not being used, I'll have to work out how much that saves |
05:48:39 | rusty: | * rusty wakes up.... |
05:49:41 | bramc: | Hey rusty! |
05:50:00 | bramc: | rusty, As you can see in scrollback I've been doing math and running numbers on IBLTs. |
05:50:11 | rusty: | bramc: fixed sizes make the numbers somewhat easier. |
05:50:41 | bramc: | It turns out that the thresholding phenomenon you noticed is very real and can be explained straightforwardly |
05:51:09 | rusty: | bramc: of course, but kalle didn't see the same threshold. |
05:51:39 | rusty: | bramc: the threshold is made more abrupt by my requirement that all parts of a tx be available before removing any of it from the iblt. |
05:52:04 | bramc: | rusty, there are a number of things which could potentially cause the phenomenon to go away, the discussions seem to underspecify behavior a bit |
05:52:07 | rusty: | bramc: (remember, gavin's proposal had us breaking txs into equal size parts). |
05:52:27 | gmaxwell: | In general I think polynomial set reconciliation is more interesting, since you get more or less optimal communications efficiency and get it consistently; and you also get perfect communications parallelism; but perhaps the performance hit is too big. Having never implemented the rational interpolation I don't have a feel for the performance; but it only needs to be used to communicate ID's and |
05:52:33 | gmaxwell: | lengths and then the RS code stuff I described before can take over and that can be very fast (because it can be implemented over GF(2^8) using lookup tables). |
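A sketch of the lookup-table trick gmaxwell is referring to: GF(2^8) multiplication via exp/log tables. The 0x11d polynomial and generator 2 are one common choice (used by many Reed-Solomon implementations), not necessarily what he had in mind:

```python
# Build exp/log tables for GF(2^8) once; multiplication then costs two
# lookups and an integer add, with no polynomial arithmetic at runtime.
GF_EXP = [0] * 512   # doubled so exp lookups never need an explicit mod 255
GF_LOG = [0] * 256

x = 1
for i in range(255):
    GF_EXP[i] = x
    GF_LOG[x] = i
    x <<= 1                  # multiply by the generator alpha = 2
    if x & 0x100:
        x ^= 0x11d           # reduce modulo the field polynomial
for i in range(255, 512):
    GF_EXP[i] = GF_EXP[i - 255]

def gf_mul(a: int, b: int) -> int:
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]

assert gf_mul(2, 2) == 4 and gf_mul(0x80, 2) == 0x1d
```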
05:52:50 | bramc: | rusty, I'd like to assume that transaction parts are self declaring, so they say 'I'm part 1 of transaction Q' and so they can be reconstructed individually and put back together later |
05:53:40 | bramc: | gmaxwell, the nice features of IBLT are that it's (a) simple (b) fast (c) requires no interactivity whatsoever |
05:53:52 | rusty: | bramc: Yes, but that takes space. There's a trick I noted in the final comment on that post, however, and it's on my TODO. |
05:54:48 | rusty: | bramc: each iblt bucket is [txid] [frag number] [data]. (Well, it's not actually the full txid, but you get the idea). |
05:54:56 | bramc: | rusty, well for simplicity I assumed that all txs are equal size, and for analysis purposes I assumed that it's either all insertions or all deletions, there are of course headaches when there's a mix because you get false zeroes |
05:55:04 | gmaxwell: | bramc: yes, what I'd described before required no interactivity whatsoever. (assuming you send enough data) as it started as a proposal to improve network convergence time, rather than bandwidth, by being able to send different data to each peer to maximize goodput. |
05:55:51 | bramc: | gmaxwell, not sure what you mean by that. It looks like IBLT is most applicable to fast relay networks. And yes it's probably more important for convergence time than bandwidth. |
05:56:01 | gmaxwell: | I'm not actually sure how fast the IBLT stuff goes, e.g. does the overhead of it trump the fact that the basic operations are faster? probably not, but I'm unsure. |
05:56:11 | rusty: | bramc: yeah, I simulated. (Though dumb me, I took txs from the testnet.) |
05:56:44 | bramc: | gmaxwell, IBLT stuff is essentially instantaneous |
05:56:58 | rusty: | gmaxwell: I'll add "Understand and implement gmaxwell's scheme" to my TODO. |
05:57:00 | gmaxwell: | bramc: there are far simpler things that work wonders if _all_ you care about is convergence time (e.g. Matt's protocol) |
05:58:13 | bramc: | gmaxwell, this would simplify matt's protocol by making it so the fast relay network doesn't have to maintain states for peers it's talking to or do round trips on synchronization |
05:58:26 | rusty: | bramc: well, you have to generate the for-this-iblt-txids for all txs in your mempool. Then you wander over the iblt; in the good case it's almost a single sweep. |
05:58:31 | bramc: | gmaxwell, Also refer to my earlier comment about 'I shouldn't be working on this but it's fun' |
05:58:47 | rusty: | bramc: matt would still win, and has no failure modes. |
05:59:18 | bramc: | Just to clarify: I like matt's protocol better, and will read up on it later |
06:00:02 | gmaxwell: | bramc: I mean, matt's protocol relays most blocks as three or four packets or something like that. And the reconstruction is simply copying the data, no calculation at all. No hashing no yadda yadda. It's hard to be faster doing anything else. |
06:00:21 | gmaxwell: | rusty: sipa has a better writeup. though he hasn't plugged in all the optional improvements yet. |
06:02:46 | bramc: | rusty, the explanation of why the threshold happens is a virality one. Each tx you correct will reduce counts in a few other blocks by one. Each of those has some chance of resulting in a new recoverable transaction. If the expected number of those is greater than one then you'll almost certainly be able to recover everything. If it's less than one you're screwed |
06:03:20 | rusty: | bramc: sure, but it's worse if you require all N parts to be recoverable to make progress. |
06:03:59 | bramc: | rusty, what? I'm assuming that you repeatedly recover whatever you happen to be able to recover, and that larger transactions are split into smaller individually recoverable pieces |
06:04:08 | bramc: | And I'm ignoring encoding overhead |
06:04:18 | gmaxwell: | Key in my fancier scheme is that the source of a block can send fragments round robin to peers, the fragments are orthogonal, and (ignoring some linear overhead for IDs) each peer can recover as soon as it has the data equal to the missing data (not the difference), and if they don't have enough yet they can get data from the source's peers instead of the source directly. So in the time it |
06:04:24 | gmaxwell: | takes a source to send a single full difference, all N of its peers can be fully recovered (assuming bandwidth between the N isn't sharing the same bandwidth limit as the source); and with no round trips needed (except to say 'send me no more'). |
06:04:31 | rusty: | bramc: yes, my implementation was dumb, and only recovered a tx if all parts were available :) |
06:05:08 | bramc: | rusty, thanks for clarifying that. That's what I assumed given how awfully it performs for insertions |
06:05:43 | gmaxwell: | rusty: oy yea, thats derpy. :) |
06:05:47 | rusty: | bramc: OTOH, there's the trick that if you find part n of a tx in a bucket with count -1, you can now clear all parts of the tx. |
06:07:06 | bramc: | rusty, So I did some math and found that if the number of diffs is n, the number of copies of each transaction is m, and the number of slots is k, then the viral base is m*(m-1)*n*(k-m)^(n-1)/k^n which is almost scale invariant on k and n as long as their ratio remains the same |
06:07:51 | bramc: | and optimizing for m, if we say that k=cn, we get m=(1+2*c+(1+4c^2)^.5)/2 |
06:08:33 | rusty: | bramc: kalle says k=3 wins https://github.com/kallerosenbaum/bitcoin-iblt/wiki/Diff-count-VS-failure-probability |
06:08:48 | bramc: | rusty, kalle is wrong :-P |
06:09:29 | rusty: | bramc: we need a corpus to figure that out. |
06:09:42 | bramc: | rusty, I have some math which should be worth something :-P |
06:10:43 | bramc: | Even in kalle's graphs it says that k=4 and 5 beats 3, but it's 'only' a factor of 2 or something, which looks small on his log scale graphs. |
06:12:17 | bramc: | rusty, so plugging in numbers I get that things go over 1 at c=2.4 and m=5 (that's k=5 in the variable names kalle is using) |
06:12:19 | rusty: | bramc: ? What |
06:12:32 | bramc: | at that point 6 works almost exactly as well |
06:12:48 | rusty: | bramc: his graph shows lowest fail prob with k=3, across entire range. |
06:14:10 | bramc: | one could argue that a viral base marginally over 1.0 isn't likely to work in practice, so if we want a base of, say, 1.25, we get c=2.8 and k=6 |
06:14:21 | bramc: | so, umm, I guess the best number of copies to use is 6 |
06:14:47 | rusty: | bramc: I'll have to dig out the paper, but that's larger than the number I read... was one of the IBLT papers. |
06:15:44 | bramc: | rusty, I'm quite certain that if you do the overall optimization of getting the best ratio of size to amount recoverable the best is 5 or 6. 3 is clearly far worse |
06:16:36 | bramc: | if c=2.8 (that means you have 2.8 times as many entries as diffs) then if you have 3 copies your viral base is .74, meaning you're fucked, and if you use 6 copies your viral base is 1.25, meaning you're clearly recoverable |
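Those two figures are easy to reproduce from bramc's formula in the large-n limit. A sketch (my reading of his math, not from the log):

```python
# bramc's viral base m(m-1)n(k-m)^(n-1)/k^n with k = c*n becomes, for
# large n, (m(m-1)/c) * (1 - m/(c*n))^(n-1) -> (m(m-1)/c) * exp(-m/c).
from math import exp

def viral_base(c: float, m: int) -> float:
    return m * (m - 1) / c * exp(-m / c)

for m in (3, 6):
    print(f"c=2.8, m={m}: viral base ~= {viral_base(2.8, m):.2f}")
# prints ~0.73 and ~1.26, matching bramc's 0.74 and 1.25 up to rounding
```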
06:17:13 | bramc: | If there's a paper which gives values for numbers of copies with a coherent justification I'd love to see it |
06:17:47 | rusty: | bramc: Let me step back a moment, make sure we're in sync? |
06:17:58 | bramc: | rusty, sure |
06:18:21 | rusty: | bramc: c = #number of iblt entries / #things-put-in-iblt ? |
06:18:41 | bramc: | rusty, yes |
06:18:58 | bramc: | and I'm assuming that each 'thing' fits in exactly one entry |
06:19:06 | rusty: | bramc: sure. |
06:20:40 | rusty: | bramc: so, you're saying that if you have 2800 buckets, and 1000 things. If each one gets hashed and added 3 times, you're screwed, but if it gets hashed and added 6 times you're OK? |
06:20:53 | bramc: | rusty, exactly |
06:21:23 | rusty: | bramc: That's easily testable. |
06:22:15 | bramc: | rusty, It's because of exponential blow-up. If you do multiple passes through as subsequent things get revealed, you'll find with 3 copies that the number on each pass is about 3/4 what it was on the last pass, whereas with 5 copies it's about 5/4 |
06:22:40 | rusty: | bramc: sure, as long as there's a singleton somewhere to start with. |
06:22:46 | bramc: | rusty, I started with the math because it clarifies what's going on |
06:23:10 | bramc: | There will be plenty of singletons. The chances of not getting any at that scale is exceedingly small |
06:23:40 | bramc: | Like, effectively nonexistent |
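rusty's "easily testable" is worth taking literally. A Monte-Carlo sketch (my construction, simplified to the all-insertions case with m distinct cells per item); note that bramc concedes below that his threshold estimate was overly pessimistic, and kalle's data favours 3, so the outcome here need not match his prediction:

```python
# k=2800 cells, n=1000 same-size items, m copies each, peel singletons.
import random

def decodes(n_items, n_cells, m, seed):
    rng = random.Random(seed)
    count = [0] * n_cells
    key_sum = [0] * n_cells
    cells_of = {}
    for _ in range(n_items):
        key = rng.getrandbits(48)          # 48-bit key stands in for content
        cells = rng.sample(range(n_cells), m)
        cells_of[key] = cells
        for c in cells:
            count[c] += 1
            key_sum[c] ^= key
    recovered = 0
    queue = [c for c in range(n_cells) if count[c] == 1]
    while queue:
        c = queue.pop()
        if count[c] != 1:          # stale entry; cell changed since queued
            continue
        key = key_sum[c]           # all-insertions, so count==1 is a true singleton
        for c2 in cells_of[key]:   # remove the item from every cell it touches
            count[c2] -= 1
            key_sum[c2] ^= key
            if count[c2] == 1:
                queue.append(c2)
        recovered += 1
    return recovered == n_items

for m in (3, 6):
    ok = sum(decodes(1000, 2800, m, seed) for seed in range(20))
    print(f"m = {m}: fully decoded {ok}/20 trials")
```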
06:24:45 | rusty: | http://conferences.sigcomm.org/sigcomm/2011/papers/sigcomm/p218.pdf |
06:24:47 | bramc: | rusty, I'm also assuming that it's all insertions or all deletions. Things get more complicated if there's a mix |
06:25:19 | rusty: | bramc: yes, I'm assuming the "false singleton" case (n insertions, n-1 deletions, or vice versa) is noise. |
06:26:01 | bramc: | rusty, That's probably close to true, but implementing it properly requires a bunch of interesting backtracking |
06:26:12 | rusty: | bramc: 6.1 of that paper says 3 or 4. |
06:27:24 | bramc: | rusty, Apparently this is the paper which got later papers saying 'n hash functions' instead of 'n copies' which confused someone in this channel so much :-P |
06:27:29 | rusty: | bramc: yeah, I don't think we want to backtrack. But there's a trick where the frag counter is offset by the hash of the txid, which means that you can easily spot more likely candidates. |
06:27:50 | rusty: | bramc: damn, baby feeding time.... |
06:30:55 | bramc: | rusty, them using size 50 is probably throwing things off a lot |
06:31:53 | phantomcircuit: | gmaxwell, bleh validating just the last 2016*4 blocks is going to take > 1 hour |
06:32:17 | phantomcircuit: | (with openssl) |
06:32:23 | bramc: | I suppose it's possible that I'm also being pessimistic in that I'm assuming that the viral base needs to be higher than it does, because some fraction of everything will be singletons right off the bat, and with each thing you remove the viral base goes up |
06:33:00 | bramc: | So my estimates are overly pessimistic that way |
06:33:19 | phantomcircuit: | strike that it took |
06:33:37 | phantomcircuit: | 38 minutes |
06:33:54 | brisque: | phantomcircuit: I hear validation is faster if you return True ;) |
06:34:01 | phantomcircuit: | could be worse |
06:34:13 | phantomcircuit: | brisque, you joke but i've done just that before |
06:34:21 | brisque: | phantomcircuit: I wasn't joking. |
06:34:32 | bramc: | gmaxwell, for what it's worth, they're claiming that c=1.4 is recoverable, and by my calculations at that size 3 copies should come closer to recovery than 4 |
06:34:43 | bramc: | I meant to direct that last statement to rusty, sorry |
06:36:56 | phantomcircuit: | brisque, https://github.com/bitcoin/bitcoin/pull/5842 |
06:37:07 | phantomcircuit: | bleh merge conflicts |
06:38:27 | rusty: | bramc: math is nice. Testing is nicer :) |
06:38:58 | rusty: | bramc: I promise I will re-test once all the other variables are eliminated. |
06:39:31 | bramc: | rusty, running the numbers does give a useful intuitive insight for what's happening. I knew that my numbers were pessimistic but apparently they're way more pessimistic than I thought |
06:39:39 | brisque: | phantomcircuit: aw, I was expecting insanity=1 where it does no ECDSA tests at all |
06:41:14 | bramc: | rusty, my numbers also indicate that giving some chance of 2 or 4 is unlikely to make a measurable difference |
06:41:33 | phantomcircuit: | brisque, without the last commit it can randomly do that! |
06:41:35 | phantomcircuit: | horray! |
06:41:59 | rusty: | bramc: Oh, I like your count optimization, too. My statement about compressing on the wire was a brainfart, because the numbers in the transmitted iblt are large, of course. |
06:42:04 | bramc: | rusty, You also have to factor in how much compression you get from entries being empty, I'm going to run that next, it looks like once you do that you're astoundingly close to optimal efficiency |
06:42:23 | rusty: | bramc: entries being empty? |
06:42:28 | bramc: | rusty, yeah |
06:42:40 | phantomcircuit: | blargh |
06:43:07 | bramc: | If you use c=1.4 and 3 copies, a significant fraction of all entries will have literally nothing in them |
06:43:11 | rusty: | bramc: confused. Entries may be empty after iblt subtraction, but they're not for transmission. |
06:43:34 | bramc: | Oh duh, you're right, never mind, forgot about the subtraction step |
06:43:51 | phantomcircuit: | derp i screwed that up |
06:44:05 | phantomcircuit: | it'll be easier to start fresh than to fix this git history ... |
06:44:09 | bramc: | rusty, One can of course compress the numbers, but yuck |
06:44:48 | rusty: | bramc: yeah :( It's kind of fascinating to send around a totally clogged up (and thus useless) bloom filter, and magic it into something useful :) |
06:47:08 | bramc: | rusty, It is a fun trick, hence my messing with it. I'm less negative on it now that I know the overhead is more like 40% than 150% |
06:47:54 | rusty: | bramc: the bias towards "extra" vs. "missing" txs is also the right way (ie. perfect network => I'll have more txs than you did when you solved block). |
06:49:11 | bramc: | rusty, I'm not convinced that there's a tradeoff to be had there. Forcing pieces to be interrelated gets lousy insertion performance without much benefit to deletion performance |
06:54:07 | rusty: | bramc: if you don't fragment txs, how do you get large txs? Variable size buckets mean you're basically encoding most of it multiple times in the clear. |
06:54:56 | bramc: | rusty, You could make a separate data structure of just ids with no content which was nominally bigger for the purposes of finding deletions, that would likely make performance overall better when there are more deletions than insertions |
06:55:29 | bramc: | rusty, I'm saying fragment transactions, but do it in a way where the fragments can be recovered individually rather than being directly interdependent |
06:56:40 | rusty: | bramc: hmm, in what way does gavin's prefix not do that? |
06:58:38 | bramc: | rusty, I don't see anything about that in your post. Maybe a comment is saying something about it. |
06:59:36 | rusty: | bramc: struct keySum { u8 id[6]; u16 index; u8 frag[8]; } key; |
06:59:51 | rusty: | bramc: which was in the original gist. |
07:00:12 | bramc: | index is the fragment number? |
07:00:20 | rusty: | bramc: yep. |
07:00:21 | rusty: | bramc: turns out that [8] is *way* too low (little surprise). |
07:00:35 | bramc: | I missed that |
07:01:08 | rusty: | bramc: did you read the hashkeySum stuff in the original IBLT paper? |
07:01:29 | bramc: | rusty, No I have a lot of trouble reading papers |
07:02:13 | bramc: | I've never quite learned academicese, so when I read papers I tend to nibble at parts of them and rederive stuff and ask people questions |
07:02:42 | rusty: | bramc: OK, well it's a pretty simple idea. You add a 16-bit (?) field with the hash of the key. That way you get an extra check if a bucket is really a singleton. |
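A minimal sketch of that hashkeySum idea (field widths and hash choice are illustrative, not rusty's exact layout):

```python
# A 16-bit hash of the key rides along in each bucket, XORed in and out
# alongside keySum, giving an extra check that a count-1 bucket really
# holds a single key.
import hashlib

def key_check(key: int) -> int:
    return int.from_bytes(hashlib.sha256(key.to_bytes(8, 'little')).digest()[:2], 'little')

def looks_like_singleton(bucket) -> bool:
    count, key_sum, key_check_sum = bucket
    # if several keys are still XORed together, their check values almost
    # certainly won't match a fresh hash of the mangled keySum
    return count == 1 and key_check(key_sum) == key_check_sum
```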
07:03:59 | bramc: | Oh that reminds me, I spoke to the lightning guys and the answer to how they can get around the apparent hard limit of needing to redo everything when the timeout time happens is that they don't. That requires an op_relative_time_verify using a much simpler but still highly technical protocol which they *ahem* are now going to write up clearly |
07:04:35 | bramc: | rusty, How is that not just making the id longer? |
07:05:02 | rusty: | bramc: you can't tell if an id is valid until after reconstructing the parts. This is a *bucket* hash. |
07:05:09 | bramc: | rusty, I'm pretty sure that a separate 'ids only' iblt is a big performance win when there are significant numbers of deletions |
07:05:32 | rusty: | bramc: I |
07:05:53 | rusty: | bramc: I'm pretty sure it's not, since that's the easy case :) |
07:05:59 | gmaxwell: | bramc: the problem you run into is that the efficiency of iblt goes way down if there is too little data. (e.g. because you don't get enough singletons to decode) |
07:06:33 | rusty: | bramc: deletions are easy to spot: you only need one fragment. additions are hard, since you need all of them. |
07:07:14 | bramc: | rusty, honestly it sounds like a waste of space, I think some careful implementation of guessing which things are singletons and backing up if you notice there's an error will handle that well in practice |
07:08:09 | rusty: | bramc: Oh, I agree with hashkeySum being a waste of space. You can get most of the effect by offsetting that "index" field by the hash of the fragid. |
07:09:06 | bramc: | gmaxwell, But if you have an ids-only one it (a) needs only a single id for a large transaction with multiple fragments (b) needs only id space for each entry anyway |
07:09:38 | rusty: | bramc: and then removing the singletons with the lowest-offset index field first (most likely to be valid). |
07:09:41 | bramc: | so you can make an ids only iblt with a lot more entries in it, and you'll lean on mostly that one for deletions, and the other one for mostly insertions |
07:10:03 | bramc: | rusty, I seem to not be explaining this important point about fragmenting properly |
07:10:03 | rusty: | bramc: backtracking could get very expensive for the "we're going to fail anyway" case, I think. |
07:10:26 | rusty: | bramc: no. How do you propose to fragment? |
07:11:05 | bramc: | My proposal for fragmenting is that the fragments be treated for reconstruction purposes like separate pieces: hashed separately from each other and recovered separately from each other |
07:12:08 | bramc: | so you start out with a bunch of transactions, then fragment them to make separate entries, then store everything in the iblt in an abstraction layer which doesn't really know much about fragmenting, then recover everything in a layer which also doesn't know much about fragmenting, and finally de-frag to get the original transactions back |
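A sketch of that layering (mine, not bramc's code; sizes are illustrative): self-describing fragments carry a txid prefix, index, and total count, so an IBLT can treat them as independent items and a collation step reassembles them afterwards.

```python
import hashlib
from collections import defaultdict

FRAG = 8  # payload bytes per fragment, just for the demo

def fragment(tx: bytes):
    txid = hashlib.sha256(tx).digest()[:6]
    chunks = [tx[i:i + FRAG] for i in range(0, len(tx), FRAG)] or [b'']
    return [(txid, i, len(chunks), c) for i, c in enumerate(chunks)]

def defragment(pieces):
    by_tx, total = defaultdict(dict), {}
    for txid, i, n, chunk in pieces:
        by_tx[txid][i] = chunk
        total[txid] = n
    # only transactions with every fragment recovered are reassembled
    return [b''.join(by_tx[t][i] for i in range(total[t]))
            for t in by_tx if len(by_tx[t]) == total[t]]

txs = [b'spend utxo A', b'a much longer transaction body here']
pieces = [p for tx in txs for p in fragment(tx)]
assert sorted(defragment(pieces)) == sorted(txs)
```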
07:12:42 | rusty: | bramc: Sure, and your implementation will suck. |
07:12:54 | bramc: | rusty, why will it suck? |
07:13:11 | bramc: | it will be much, much better at handling large transactions |
07:13:23 | rusty: | bramc: no, AFAICT that's exactly Gavin's scheme. |
07:13:40 | rusty: | bramc: except without the optimization for removal of known txs. |
07:14:05 | rusty: | bramc: which is *really* winful. |
07:14:18 | bramc: | rusty, so why do you say that large transactions create recovery problems? All that should matter is the number of fragments |
07:15:01 | rusty: | bramc: Yep, but you're missing an opportunity to cheat! |
07:15:32 | bramc: | what is this cheating you speak of? And what is the business about optimization for removal of known transactions? |
07:15:41 | gmaxwell: | bramc: if you know that it doesn't contain fragment A and A is part of txn X, then you magically know it doesn't contain any of X's other fragments. It's a very clever optimization. |
07:15:46 | rusty: | bramc: ie. when you do recovery, and find an addition of some fragment, you *know* the rest of the fragments, and can immediately remove them too. |
07:15:53 | rusty: | gmaxwell: thanks! |
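A sketch of that cancellation trick, reusing the hypothetical fragment() helper from the sketch above; iblt.delete() is likewise a hypothetical stand-in:

```python
def cancel_known_tx(iblt, tx: bytes):
    # seeing any one fragment of tx on the -1 side tells you the whole tx
    # is absent from the other set, so every fragment can be removed at once
    for piece in fragment(tx):
        iblt.delete(piece)   # hypothetical: XOR the piece out of its m cells
```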
07:16:45 | bramc: | Oh I thought that trick was too obvious to mention |
07:16:59 | rusty: | bramc: for my (admittedly flawed) results this means you can handle about 10x as many "I added a tx you didn't" as "you added a tx I didn't". |
07:17:38 | bramc: | Obviously once you've eliminated a fragment you eliminate the rest of the transaction |
07:18:04 | rusty: | bramc: that's exactly the opposite of your "abstraction layer" line above. |
07:18:19 | bramc: | rusty, I'm pretty sure that if you fiddle with the numbers right a separate ids only data structure will still be a win |
07:18:39 | bramc: | rusty, It's a somewhat leaky abstraction layer :-) |
07:18:47 | gmaxwell: | bramc: there is a way to make 'ids only' a win. But it's a bigger change than just that. |
07:19:24 | gmaxwell: | (to get that win you don't even use iblt to encode the rest, you encode the rest with a scheme that needs to send size linear to the missing data only-- an erasure code) |
07:19:25 | rusty: | bramc: as I said, it's optimizing the already-easy case. |
07:19:38 | bramc: | Oh right you need ids only to not be redundant with the existing data structure, so you need 'the other half of this entry' or something like that |
07:19:59 | bramc: | gmaxwell, I don't follow |
07:20:11 | bramc: | rusty, easy schmeazy, overhead is overhead |
07:20:47 | gmaxwell: | That's okay, I've sort of given up explaining it. Next time I come up with something I'll have to call it a hash table or something so all the more engineery types aren't so afraid of diving in. :P |
07:21:54 | bramc: | gmaxwell, It seems like in principle it should be possible to build these things more efficiently, for information theoretic reasons, but ECC doesn't seem to exactly match |
07:23:01 | gmaxwell: | bramc: it matches exactly once you know the IDs in the set and the lengths of the unknown entities. :) |
07:24:17 | bramc: | gmaxwell, true but for things with such sparse holes you need to use tornado codes for it to perform at all |
07:24:53 | bramc: | Maybe you could use iblt for finding the ids and lengths and ecc for filling in the data? |
07:25:17 | rusty: | bramc: feel free to play with my code on github. Since you've reminded me of it, I might do so too:) https://github.com/rustyrussell/bitcoin-iblt-test |
07:25:56 | gmaxwell: | bramc: you can. (or polynomial set reconciliation, which is information theoretically optimal) |
07:26:17 | bramc: | gmaxwell, aren't tornado codes very close to optimal and very efficient? |
07:27:26 | gmaxwell: | bramc: luby's later work (fountain codes) is more interesting. Though sipa crunched numbers and it looks like RS codes actually work fine. My "or" above was an alternative to using iblt for the IDs. |
07:28:10 | gmaxwell: | IBLT is to polynomial set reconciliation as a LT fountain code is to a Reed-Solomon erasure code. |
07:28:22 | bramc: | Oh I see. I didn't quite parse that 'polynomial set reconciliation' is a form of set reconciliation |
07:28:39 | bramc: | Yes that's obvious when I actually read the term |
07:29:53 | bramc: | So in that case you expand your ids to be id+length and do polynomial set reconciliation on those, then do ECC on the whole string |
07:30:17 | bramc: | That seems much more mathematically elegant. Less of a fun hack for the engineers to play with though. |
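For concreteness, a toy of the reconciliation half of that pipeline, CPISync-style: Alice sends evaluations of her set's characteristic polynomial; Bob divides by his own, fits a rational function, and reads both differences out of its roots. This is my construction with toy simplifications gmaxwell doesn't necessarily endorse (tiny prime field, exact difference sizes assumed known where a real protocol bounds and verifies, brute-force root finding):

```python
FP = 10007  # prime field; set elements live in [1, FP-1]

def char_eval(s, z):
    out = 1
    for a in s:
        out = out * (z - a) % FP
    return out

def solve_mod(mat, rhs):
    # Gauss-Jordan over GF(FP); the system is invertible for distinct
    # sample points outside both sets (unique monic P, Q of these degrees)
    n = len(mat)
    M = [row[:] + [rhs[i]] for i, row in enumerate(mat)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col])
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], FP - 2, FP)
        M[col] = [x * inv % FP for x in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(x - f * y) % FP for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def reconcile(evals_A, zs, B, m1, m2):
    # solve for monic P (deg m1) and Q (deg m2) with P(z) = f(z) * Q(z),
    # where f = chi_A / chi_B; then A\B = roots(P) and B\A = roots(Q)
    rows, rhs = [], []
    for z, ea in zip(zs, evals_A):
        f = ea * pow(char_eval(B, z), FP - 2, FP) % FP
        rows.append([pow(z, j, FP) for j in range(m1)] +
                    [(-f) * pow(z, j, FP) % FP for j in range(m2)])
        rhs.append((f * pow(z, m2, FP) - pow(z, m1, FP)) % FP)
    coef = solve_mod(rows, rhs)
    p_coef, q_coef = coef[:m1] + [1], coef[m1:] + [1]
    roots = lambda cs: {x for x in range(1, FP)
                        if sum(c * pow(x, j, FP) for j, c in enumerate(cs)) % FP == 0}
    return roots(p_coef), roots(q_coef)

A = {12, 77, 1003, 4444, 9001}
B = {12, 77, 1003, 500, 8888}
m1, m2 = len(A - B), len(B - A)              # toy: sizes taken as known
zs = [FP - 1 - i for i in range(m1 + m2)]    # sample points outside both sets
assert reconcile([char_eval(A, z) for z in zs], zs, B, m1, m2) == (A - B, B - A)
print("recovered both differences from", m1 + m2, "field elements")
```

Note the communication is m1 + m2 field elements, linear in the size of the difference, which is the "information theoretically optimal" property gmaxwell mentions.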
07:31:30 | bramc: | I think I've now gotten enough familiarity with iblt and related concepts that I can stop worrying about it. |
07:32:04 | gmaxwell: | I've not played with implementing iblt myself, but I had tremendous fun implementing fountain codes a while back, and they're pretty similar. |
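In that spirit, a toy LT ("fountain") encoder/decoder, mine rather than gmaxwell's: each output symbol XORs a random subset of source blocks, and decoding is the same peel-the-singletons process as IBLT recovery. The degree distribution here is naive, not a proper robust soliton:

```python
import random

def lt_encode(blocks, n_symbols, seed=1):
    rng = random.Random(seed)
    out = []
    for _ in range(n_symbols):
        deg = rng.choice([1, 2, 2, 3, 4])        # crude degree distribution
        idxs = set(rng.sample(range(len(blocks)), min(deg, len(blocks))))
        val = 0
        for i in idxs:
            val ^= blocks[i]
        out.append([idxs, val])
    return out

def lt_decode(symbols, n_blocks):
    out = [None] * n_blocks
    progress = True
    while progress and any(b is None for b in out):
        progress = False
        for sym in symbols:
            idxs, val = sym
            for i in [i for i in idxs if out[i] is not None]:
                idxs.discard(i)      # strip blocks we already recovered
                val ^= out[i]
            sym[1] = val
            if len(idxs) == 1:       # singleton: one unknown block left
                out[idxs.pop()] = val
                progress = True
    return out

rng = random.Random(7)
blocks = [rng.getrandbits(32) for _ in range(8)]
decoded = lt_decode(lt_encode(blocks, 20, seed=42), 8)
print(sum(b is not None for b in decoded), "of 8 blocks recovered")
```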
07:32:29 | bramc: | Aren't fountain codes patent encumbered? |
07:33:40 | gmaxwell: | Yup, or at least there are patents in that space. (I actually noted that on the writeup, and mentioned alternatives) |
07:36:18 | gmaxwell: | (there may also be a non-public patent application for iBLT as of yet, papers are new enough that there could be; and the author has filed patents previously) |
07:39:01 | bramc: | On the expansion of the ids before doing polynomial set reconciliation: you wouldn't just tack on bits of length, you overwrite a variable amount using omega encoding |
07:40:22 | bramc: | Also I think the earlier confusion about difficulty of larger transactions was that I thought the claim was being made that larger transactions are harder to insert, because there's a dumb way of screwing that up, but the claim was that larger transactions are easier to delete, which is sort of the opposite claim and also true. |
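One reading of that "omega encoding" remark is Elias omega, a self-delimiting code, so a length can be spliced into the id bits and parsed back out. A sketch of my interpretation, not bramc's spec:

```python
def omega_encode(n: int) -> str:
    assert n >= 1
    code = "0"
    while n > 1:
        b = bin(n)[2:]
        code = b + code       # prepend the binary form of n
        n = len(b) - 1        # then encode its length, recursively
    return code

def omega_decode(bits: str):
    """Return (value, remaining bits), so the code can prefix other data."""
    n, i = 1, 0
    while bits[i] == "1":
        group = bits[i:i + n + 1]    # a leading 1 plus n more bits
        n = int(group, 2)
        i += len(group)
    return n, bits[i + 1:]

for v in (1, 2, 7, 100, 1000):
    assert omega_decode(omega_encode(v) + "101") == (v, "101")
```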
08:30:57 | Guyver2: | Guyver2 has left #bitcoin-wizards |
09:39:37 | johnguest: | spartancoin is the name |
09:43:44 | sipa: | johnguest: not here |
09:46:15 | fanquake_: | fanquake_ is now known as fanquake |
09:48:45 | johnguest: | johnguest has left #bitcoin-wizards |
09:49:01 | crowleyman: | crowleyman is now known as crwlymn |
11:12:03 | Muis: | with bitcoin you are 'allowed' to mine a block when some condition (difficulty) is met |
11:12:28 | Muis: | and other peers are able to check that condition immediately, but also years later |
11:12:38 | Muis: | but suppose there was a second condition |
11:13:08 | Muis: | and other peers can check for that right now, but will not be able to re-check it in the future |
11:13:45 | Muis: | would that make the network insecure by design, or is there some way the network can record that they agreed today? |
11:14:07 | sipa: | recording that they agreed is not enough |
11:14:21 | Muis: | mmhh |
11:14:53 | sipa: | also, what is the point? |
11:14:54 | Muis: | i have an idea |
11:14:58 | sipa: | what is the benefit? |
11:16:11 | Muis: | its kind of complicated, but you can compare it to something like: |
11:17:58 | Muis: | suppose the chain draws random numbers, and only miners that are connected with an IP address that match the number are privileged to mine. then the whole network can just refuse to relay new blocks from other IPs, and it would be pretty secure. But for new joiners to the network, they can never verify that it was actually relayed by that IP |
11:18:25 | Muis: | (my idea has nothing to do with IPs but I try to make it simple) |
11:19:33 | sipa: | refusing to accept an otherwise valid block is a really bad idea, unless you know that others will reject it too |
11:19:48 | Muis: | I assume that the majority is honest |
11:20:33 | Muis: | no |
11:20:38 | Muis: | i must say it differently |
11:21:08 | Muis: | I dont assume the majority is honest, but I dont mind being on a fork as long as it is the fork of the honest ones :) |
11:21:13 | sipa: | we do have something similar in bitcoin already |
11:21:20 | sipa: | namely the maximum timestamp |
11:21:47 | crwlymn: | crwlymn is now known as crowleyman |
11:21:50 | sipa: | but that doesn't have the same problem as it is only temporary blocking |
11:22:26 | Muis: | how do you mean temporary |
11:22:43 | Muis: | that suppose those blocks ended up in the chain |
11:22:43 | sipa: | if time passes, a too high timestamp becomes valid |
11:22:47 | Muis: | the chain would still be valid |
11:22:50 | Muis: | yes ok |
11:23:06 | Muis: | now I think about it |
11:23:13 | sipa: | right, it doesn't affect the ultimately valid chain, only the one we choose to accept right now |
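A sketch of the rule sipa means, roughly as Bitcoin applies it (the future-drift limit is about two hours); rejection is only temporary, so the ultimately-valid chain is unaffected:

```python
MAX_FUTURE_DRIFT = 2 * 60 * 60   # seconds; Bitcoin's limit is ~2 hours

def acceptable_now(block_time: int, adjusted_local_time: int) -> bool:
    # a block too far in the future is rejected now, but becomes
    # acceptable once the local clock catches up
    return block_time <= adjusted_local_time + MAX_FUTURE_DRIFT
```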
11:23:15 | Muis: | thats also no problem in my case |
11:23:22 | Muis: | I just want to make it highly unlikely |
11:23:23 | sipa: | so what is your idea |
11:23:46 | Muis: | and I guess its very unlikely for too high timestamps to end up in the chain? |
11:24:09 | sipa: | yes, because few nodes will (quickly) accept them |
11:24:54 | Muis: | do nodes also actively block other nodes who broadcast invalid timestamps? or does it not go that far? |
11:25:07 | sipa: | no |
11:25:16 | sipa: | not afaik |
11:25:27 | Muis: | okay this is my idea: |
11:25:38 | Muis: | but its a long story, so if you are in a hurry say it |
11:26:08 | sipa: | what are you trying to accomplish? |
11:27:55 | sipa: | if it's long, maybe write it up somewhere, and post a link |
11:28:00 | sipa: | so others can read it later |
11:28:32 | Muis: | I want to mix proof of work with something new: location. Locations are something weird in digital space, because the concept does not exist. But still its very important: your bitcoins can only be in 1 location at the same time. |
11:29:12 | sipa: | ok |
11:29:22 | Muis: | It basicly boils down to this: |
11:35:15 | Muis: | Peers publish their GPS locations, and every 10 minutes a different location is allowed to mine. Miners try to pool with peers from the same country (based on the published GPS location, which may be fake). Now after they joined the pool, they ping all the peers and sort them by latency; after they found the peer with the lowest latency, they can be pretty |
11:35:16 | Muis: | sure that they found someone who is very close to them. They record a vote in the chain saying: I don't know where peer A is, but I claim that he is closer to me than B, C and D, etc. |
11:36:19 | brisque: | what stops the miners from just lying? |
11:36:38 | brisque: | nobody could tell if they are right, wrong, or just mislead. |
11:37:41 | Taek: | Muis: how do you sort by latency? What's to stop a massive mining pool from having nodes in 500 locations that can all respond with low latency, but tap into the higher powered miner? |
11:38:18 | Taek: | Mining takes a long time anyway, nobody is going to recognize an extra 200ms when a block is found |
11:38:51 | Muis: | After that you send POW prefixed with the IP of that peer, to that peer. If he wants to pair with you, he sends POW back based on what you send him, and you continue this process with other peers (this last step I must describe in some document or else it gets too long). But the proof now is that if one of you lied, other pairs will win the block because |
11:38:51 | Muis: | they can do faster POW because the ping between the hosts is 140 ms instead of 11 ms for example. |
11:39:06 | brisque: | also note that as the crow flies is not how the internet works. you can have in some cases people who are 100 meters apart but have to traverse half a country for TCP packets. |
11:39:49 | Muis: | Taek: you cannot have 500 miners in one location |
11:40:10 | Muis: | Taek: Because you have to transfer the results forth/back between the country of the fake node, and the country of the real miner |
11:40:29 | Muis: | and if you do that, you always will be slower than miners who are really located in that country and do not fake |
11:40:52 | Muis: | if your datacenter or mining rack is 500 KM away, you already have 5 millisecond extra latency |
11:40:58 | Muis: | and you will never mine a block |
11:41:04 | Taek: | what determines when a block has been found? |
11:41:11 | Muis: | because those 5 milliseconds are amplified |
11:41:40 | Muis: | the difficulty, but its a different difficulty as with bitcoin because it depends on two conditions |
11:42:48 | Muis: | what it comes down to: |
11:42:52 | brisque: | again physical location and latency are loosely linked. even in the absence of other problems, you can't determine location like that. if I ping my neighbour it's a 150ms+ round trip, even though we are physically in the same place. |
11:43:44 | Muis: | the network selects two random GPS locations: you try to form a chain with pow from country A to country B, and if you can provide a valid chain, then that chain allows you to broadcast the block, and a new block starts |
11:43:53 | Taek: | For the sake of science I'm happy enough assuming a spherical Internet |
11:44:40 | Muis: | brisque: that doesnt matter.. it just means that if you want to team-mine with your neighbour, he is a bad partner |
11:45:15 | Taek: | What determines if a node is allowed to mine? |
11:45:16 | Muis: | it doesnt matter when a peer is slower than he should |
11:45:24 | Muis: | because people fake slowness |
11:45:34 | Muis: | it matters that they cannot fake speed |
11:45:40 | Taek: | so, say we pick Chicago and NY as the two GPS locations |
11:45:42 | brisque: | if I wanted to "mine" with anybody I'm a bad partner. it's 70ms just to get to my ISPs network let alone outside. |
11:46:09 | Taek: | how do you know that the peers forming the mining chain are in Chicago and NY? |
11:46:29 | Taek: | why not have 1 datacenter in SF do all of the mining and just say that it was 2 datacenters? |
11:46:47 | Taek: | it's quite likely that the biggest datacenter in the world will have more power than the two random locations you picked combined |
11:46:59 | Muis: | thats impossible |
11:47:10 | Taek: | which part? |
11:48:19 | crowleyman: | crowleyman is now known as crwlymn |
11:48:23 | Muis: | the peers are all split according to small regions, so if you do what you say, then all peers in region Chicago and NY will vote that they never saw you in their peer-list, and all the peers along the path will say that too. |
11:49:06 | Muis: | its not like treechains that every region mines a different chain |
11:49:33 | Muis: | they mine the same chain as other regions, its just that they prefix their hashes with another region-code |
11:50:09 | Muis: | and they will reject blocks which have their region-code, but contains peers they do not recognize |
11:50:46 | Taek: | but what stops me from bribing peers in chicago to say that they saw me in the peer list? And what happens if I make 10,000 fake nodes that all say they're in Chicago? |
11:51:23 | Muis: | if you bring 10,000 fake nodes that all claim to be in Chicago |
11:51:51 | Muis: | it doesnt work, because the latency to the datacenter will make them lose the race |
11:52:01 | Muis: | but suppose the 10,000 are really in chicago |
11:52:35 | Taek: | they don't have mining power, they're just endorsing the idea that my datacenter is in chicago |
11:53:14 | Muis: | it still wont work, because every block a different region is selected, so even though you represent > 51% of the hashrate, and also > 51% of the nodes, you can never use it to mine a block more than once every X blocks. (X= number of regions) |
11:53:52 | Muis: | if they are just endorsing an idea it will never work |
11:54:30 | Muis: | because they have to form a path from that region to another? if they have high latency in the Chicago area, how will they ever form a chain to new york the fastest to win the race? |
11:55:41 | Taek: | they aren't forming a chain to NY. My 10,000 in Chicago says my datacenter in SF is in Chicago, and my 10,000 in NY say my datacenter is in NY. So my datacenter pretends to be two datacenters, one in each location, and really it's just getting sub-millisecond latency |
11:56:08 | Taek: | while using the full power for each |
11:56:30 | Taek: | wining the race is easy if I can make people believe that I'm in a location that I'm not |
11:58:53 | Muis: | the proof of work is the path from region A to region B, so you need to do the POW not only with your partner-peer in Chicago, but that hash must travel through all region-codes between those two cities, every time using the same partner-hashing. So you dont need datacenters in Chicago and NY, you need datacenters in all towns and villages between those |
11:58:53 | Muis: | cities. |
11:59:19 | Muis: | and every node that is part of the winning path, earns a small reward |
11:59:25 | brisque: | how do you prove to other people that it traversed that route? |
11:59:52 | Muis: | because they can verify that for themselves? |
12:00:04 | brisque: | * brisque blinks |
12:00:06 | brisque: | how. |
12:00:10 | Muis: | they just ask a random peer in country A: do you know something in the path called X? |
12:00:15 | Muis: | *someone |
12:00:43 | brisque: | I don't think you understand the sybil problem. we should take this to #bitcoin. |
12:00:51 | Muis: | I do |
12:01:14 | Muis: | but I dont explain myself well, because they can also verify that from the path alone |
12:01:23 | Muis: | since it contains pow based on the IP's in the chain |
12:01:31 | brisque: | * brisque rolls eyes |
12:02:08 | Taek: | If you are in London, watching all of this, how do you decide which peers are or are not in NY? |
12:02:24 | brisque: | brisque has left #bitcoin-wizards |
12:02:32 | Taek: | lol |
12:03:56 | Muis: | Taek: because all past history in the chain is basically a record of pow-votes for locations, so if you take that data, you can reconstruct a large world-map with estimated positions of nodes with a reasonable accuracy. it's exactly how GPS triangulation works |
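A toy of the map-building step Muis is gesturing at: estimate an unknown node's position from distance proxies to nodes with already-agreed positions. This is my construction, assuming latency is proportional to distance (which brisque disputes above) and a flat 2-D plane, with numpy assumed available:

```python
# Plain Gauss-Newton multilateration: minimize the mismatch between
# predicted and measured distances to known anchors.
import numpy as np

def locate(anchors, dists, guess, iters=20):
    x = np.array(guess, dtype=float)
    for _ in range(iters):
        diffs = x - anchors                    # shape (n_anchors, 2)
        r = np.linalg.norm(diffs, axis=1)      # predicted distances
        J = diffs / r[:, None]                 # Jacobian of |x - a_i| wrt x
        x -= np.linalg.lstsq(J, r - dists, rcond=None)[0]
    return x

anchors = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [80.0, 90.0]])
true_pos = np.array([40.0, 25.0])
dists = np.linalg.norm(anchors - true_pos, axis=1)   # noiseless for the demo
print(locate(anchors, dists, guess=(50.0, 50.0)))    # converges to ~[40. 25.]
```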
12:04:30 | Taek: | that only works if the majority of hashing power has been honest the whole time |
12:04:53 | Taek: | but security is worse than that, because most of your hashing power is idle most of the time |
12:05:35 | Taek: | If one big miner starts from block 1 and constructs an alternate view of reality where his "nodes" are scattered all over the world, and they're all endorsing each other, and providing fake timestamps etc. |
12:05:48 | Taek: | he can get a much larger/more difficult blockchain much faster |
12:05:49 | Muis: | that doesnt work |
12:05:53 | Taek: | even without 51% of global hashing power |
12:05:57 | Muis: | because a client can read that chain |
12:06:08 | Muis: | and calculate where his own location is according to that chain |
12:06:20 | Muis: | and if its own position mismatches, its a bad chain |
12:07:00 | Taek: | That's called subjective consensus |
12:07:45 | Muis: | how do you mean |
12:07:45 | Taek: | if each client is deciding for themselves which chain is valid based on data that only they know (their own location), you're going to have problems |
12:08:11 | Muis: | no its only for when they are in doubt, normally they dont have to do it like that |
12:08:37 | Muis: | and they should never be in doubt if they follow the protocol |
12:08:49 | Taek: | just the fact that you have subjective rules though opens up an entire world of hardfork risks |
12:09:20 | Taek: | it's unreasonable to assume that miners won't try to lie about their location, because there's clearly a large profit incentive for doing so |
12:11:00 | Muis: | I dont assume that |
12:13:28 | Taek: | then you acknowledge that sometimes clients will be using subjective rules? |
12:16:19 | Muis: | no not really, but I must make a better write-up |
12:17:21 | Muis: | first, they will never be able to join the country-pool from that location, because other peers from that country will not relay for them and will disconnect them (based on ping times, IP address, hostname, etc.). and each region is also connected to their neighbouring regions, so if you need to travel your fake work across, nobody on that path will relay for you |
12:18:19 | Muis: | second, even if you try to do it anyway, you will lose the race, because the path with the least latency will win, and you cannot magically transport your asics to that country or city |
12:20:01 | Muis: | so having a sybil group of fake nodes is not impossible, its just that their outside IP address will never geo-locate to that region, and their latency is too high to ever win |
12:20:40 | Muis: | and you cannot change region with every block |
12:23:11 | Muis: | because existing pow for one region stays valid for some time, so if you join the region AFTER it's public which region is start or endpoint of the path, then you will have a disadvantage compared to the people who were already committed to that region in the hour before |
12:25:31 | Muis: | so your point of 'most hashing power is idle most of the time' is not really true, its just that they are already kind of preparing for when their region is picked |
12:28:41 | Muis: | and if the path is from Germany to Spain, and you are from region Belgium, you are still allowed to form a valid pow-path as long as it contains the start- and endpoint. but most likely you will not win, because nodes who happen to live in the right country can form a shorter path |
12:29:17 | Muis: | so you will only win if you are faster, even though your path was longer |
12:31:14 | crwlymn: | crwlymn is now known as crowleyman |
12:37:45 | Muis: | I think the beauty of this scheme is that honest miners will only have to do 50% of the work, because they form couples all the time. And dishonest miners need to do twice the work, because they need to form pairs with themselves, essentially doing twice the effort with the same net result. |
12:38:02 | Muis: | i never saw anything like that before |
16:21:40 | dgenr8: | Muis: what benefits does this scheme deliver, if it worked? |
16:22:15 | dgenr8: | re mining, i've been flashing on a mental picture of enzymes replicating a strand of DNA |
16:22:44 | dgenr8: | the reality is amazing http://www.dnalc.org/resources/3d/04-mechanism-of-replication-advanced.html |
16:23:46 | dgenr8: | "how the sausage is made" is so complex, but the result is so simple |
16:23:49 | Muis: | dgenr8: providing an incentive for geographically decentralized mining |
16:23:58 | dgenr8: | Muis: why? |
16:25:23 | Muis: | dgenr8: so that people dont need asics anymore to win a block, and making the network more secure than Bitcoin: since a 51 percent attack is much harder to pull off |
16:26:46 | Muis: | As it requires not only to have 51 percent of the hashrate |
16:27:07 | Muis: | But to have that amount on each possible geographical location |
16:27:35 | Muis: | So with 255 countries/regions thats 255 times the hashrate |
16:28:04 | Muis: | *127 |
16:30:51 | kanzure: | dgenr8: i have spent a great deal of time thinking about polymerases... http://diyhpl.us/~bryan/papers2/polymerase/ |
16:31:57 | Muis: | I once read that the DNA of one person is 650 MB of data |
16:32:25 | Muis: | So one CD-ROM is enough to describe you |
16:32:52 | Luke-Jr: | Muis: uh, no. DNA is only the biological blueprint, it doesn't describe anything beyond that. |
16:33:16 | kanzure: | and also the lifecycle is not determined by dna anyway |
16:33:25 | kanzure: | (there's an initial state that is maintained outside of dna, sort of) |
16:34:27 | Muis: | Luke-Jr: i can deduce what you look like based on that |
16:35:02 | sipa: | Muis: there is also mitochondrial DNA |
16:35:06 | Muis: | But only if you are african or chinese, nothing specific of course |
16:35:11 | Luke-Jr: | Muis: only very little about how I look. most people wear clothes, some have scars from their experiences, some dye their hair, etc |
16:35:43 | sipa: | Muis: what does race have anything to do with it? |
16:36:10 | Muis: | sipa: you can deduce that from DNA |
16:36:35 | sipa: | oh, you mean you can determine race |
16:36:50 | sipa: | i read it as "you can only determine anything if they are chinese or african" |
16:36:54 | Luke-Jr: | lol |
16:37:41 | Muis: | But i dont mean race because East european is not really a race |
16:37:56 | Muis: | But more where they were born |
16:38:12 | Luke-Jr: | eh, that'd be hard. people travel.. |
16:38:18 | kanzure: | this is boring |
16:38:21 | Muis: | Dont know the word in english for it |
16:38:35 | dgenr8: | anyway, an idea like Muis's (for better or worse) is just an adjustment to the "polymerase" machinery. it's clear that something has tweaked and adjusted it infinitely already, it's just one more tweak |
16:38:36 | sipa: | Muis: ethnicity |
16:39:36 | Muis: | Tnx |
16:40:55 | Muis: | But suppose you digitize your DNA and store those 650 MB in the chain |
16:41:10 | kanzure: | there's no reason to do that |
16:41:24 | kanzure: | most dna is redundant, and you can safely store just the differences (polymorphisms) |
16:41:34 | Muis: | How hard would it be for people to verify its me, or does dna testing take days in a lab? |
16:41:45 | kanzure: | there's nothing in the dna that determines whether it is you |
16:41:54 | Eliel: | are there other reasons to avoid the mini private key format than reduced key length? https://en.bitcoin.it/wiki/Mini_private_key_format (by the way, the wikipage still talks about Mt.Gox) |
16:41:58 | sipa: | Muis: also, what does putting it in a chain gain you? |
16:42:24 | Muis: | A way to link my identity like a web of trust |
16:42:30 | kanzure: | dna is not identity -_- |
16:42:35 | sipa: | you can do that without a blockchain |
16:42:43 | kanzure: | if dna was identity then twins would be impossible. and they exist. |
16:43:07 | Muis: | My twin may open my wallet |
16:43:09 | sipa: | make a gpg key, have an identity with a hash of your serialized dna in it |
16:43:14 | Muis: | But no one else |
16:43:19 | kanzure: | dna is not a private key |
16:43:22 | sipa: | sign it |
16:43:34 | sipa: | if you want to prove to someone you're you, have them do a dna test, and give them the full dna sequence to compare with |
16:43:36 | Muis: | I can make a private key based on my DNA |
16:43:43 | sipa: | that's pointless |
16:43:46 | kanzure: | sipa: dna tests do not give you guarantees like that |
16:44:11 | Muis: | 99.9999% guarantee |
16:44:13 | sipa: | kanzure: i know; but even if we assume dna was 100% constant throughout your body and your life, mutations didn't happen, and twins didn't exist |
16:44:26 | sipa: | even then there is no fricking point to put it in a blockchain |
16:44:39 | Muis: | why not |
16:44:45 | kanzure: | sipa: i think some people are just really upset that identity doesn't work and that they have been lied to :/ |
16:44:49 | Muis: | Maybe i lose my wallet |
16:44:55 | Taek: | it's bad for privacy. (i kid) |
16:45:02 | sipa: | Muis: i just gave you an 100% equivalent scheme |
16:45:36 | Muis: | Yeah i dont really need it to back it up |
16:45:44 | zooko: | zooko has left #bitcoin-wizards |
16:45:56 | sipa: | sipa has left #bitcoin-wizards |
16:46:18 | kanzure: | dna is a bad idea for private key material because most dna is shared between almost all life we have ever known about |
16:46:40 | Muis: | Haha |
16:46:43 | Muis: | Good point |
16:46:45 | dgenr8: | and you leave copies of your private key on everything you touch |
16:46:50 | Taek: | ^ |
16:48:44 | Muis: | Could the process of matching DNA in a databank be potentially POW, or is it hard to verify the match? |
16:49:56 | dgenr8: | Holy crap. The incoming strand spins at 10,000 RPM. Must invent massively parallel organic miner. |
16:50:44 | kanzure: | Muis: i recommend learning some biology stuff first |
16:50:57 | kanzure: | http://diyhpl.us/~bryan/papers2/bio/books/Molecular%20Biology%20of%20the%20Cell%20-%204th%20edition.pdf |
16:51:03 | kanzure: | http://diyhpl.us/~bryan/papers2/bio/books/Primer%20for%20Synthetic%20Biology.pdf |
16:51:13 | kanzure: | http://diyhpl.us/~bryan/papers2/bio/books/Molecular%20biology%20of%20the%20gene%20(Watson,%20Baker,%20Bell,%20Gann,%20Levine,%20Losick%20-%202004%20-%205th%20ed.%20-%20Pearson).pdf |
16:51:30 | MRL-Relay: | [tacotime] I remember those text books |
16:51:43 | Muis: | kanzure: i hate biology so i hoped u might know the answer by chance |
16:51:53 | MRL-Relay: | [tacotime] DNA is generally not super stable |
16:52:01 | MRL-Relay: | [tacotime] because of polymerase errors |
16:52:11 | MRL-Relay: | [tacotime] so you'd have to have a high redundancy |
16:52:21 | kanzure: | tacotime: dna is unstable for other reasons, some polymerases have error checking and much lower error rates |
16:52:40 | kanzure: | tacotime: for example, dna has rarely been known to survive for more than 2 million years |
16:53:07 | MRL-Relay: | [tacotime] well, yeah, but it's not like it's RNA |
16:53:31 | MRL-Relay: | [tacotime] i've pcr'd from template samples stored at -20C for a couple of decades with no issue |
16:54:15 | kanzure: | yes... if you look at rna wrong, your project explodes.... or something. |
16:54:20 | MRL-Relay: | [tacotime] as long as you keep it nuclease free, frozen, and in buffer, it's pretty happy |
16:54:53 | MRL-Relay: | [tacotime] yeah don't even get me started on rna isolation, i spent enough time tortured by reverse transcriptase protocols |
16:55:23 | MRL-Relay: | [fluffypony] ah yes, reverse transcriptase something something |
16:55:24 | kanzure: | tacotime: i would appreciate your commentary regarding https://groups.google.com/group/enzymaticsynthesis |
16:55:31 | MRL-Relay: | [fluffypony] the best kind of tase |
16:56:36 | MRL-Relay: | [tacotime] kanzure: is this some kind of group where computer engineers develop theoretical biochemical protocols? |
16:57:39 | kanzure: | tacotime: more specifically the goal of that group is enzymatic dna synthesis (without phosphoramidite chemistry).. my dna synthesizer suuuucks: https://www.takeitapart.com/guide/94 |
16:58:11 | MRL-Relay: | [tacotime] ehm |
16:58:43 | MRL-Relay: | [tacotime] phosphoramidite chem is fine... just make short sequences with small overlaps and use enzymes to fill in/stitch until you get the length you require |
16:59:03 | MRL-Relay: | [tacotime] we used to just straight up synth plasmids with thousands of base pairs for not too much money |
16:59:07 | MRL-Relay: | [tacotime] and that was only a year ago |
16:59:09 | kanzure: | yes but i don't want to spend $40M/genome or whatever... i want $1/genome. |
16:59:31 | MRL-Relay: | [tacotime] oh |
16:59:42 | MRL-Relay: | [tacotime] you want something that prints millions of base pairs? |
17:00:08 | kanzure: | yep... so i have been looking into something more like http://diyhpl.us/~bryan/papers2/microfluidics/synthesis/Synthesis%20-%20Microfluidic%20PicoArray%20synthesis%20of%20oligodeoxynucleotides%20and%20simultaneous%20assembling%20of%20multiple%20DNA%20sequences%20(10%20kb).pdf |
17:00:18 | kanzure: | (using DMD projectors, spatial light modulation, etc) |
17:00:44 | MRL-Relay: | [tacotime] is this your day job? |
17:01:06 | kanzure: | my day job is ledgerx |
17:01:34 | MRL-Relay: | [tacotime] artificial chromosome production is usually via yeast |
17:01:49 | kanzure: | sure |
17:02:49 | dgenr8: | dunno why it took so long to figure out how this stuff works. you can see it all right there in the video ;) |
17:03:38 | MRL-Relay: | [tacotime] ...i wouldn't recommend stepping too far outside of the biological systems for something like this. to stabilize massive amounts of dna you usually use histones/etc too. |
17:04:22 | MRL-Relay: | [tacotime] i was very good at shearing whole genomes after chloroform/phenol extracting them. keeping them in one piece when they're non-compacted is a nightmare. |
17:06:05 | MRL-Relay: | [tacotime] anyway, this is pretty OT... you can get in touch with me at dev.mc2@gmail.com if you want to discuss this in length. all my experience will probably be practical lab stuff, though. |
17:38:09 | btcdrak: | btcdrak has left #bitcoin-wizards |
18:13:11 | belcher_: | belcher_ is now known as belcher |
18:25:16 | oaavi: | oaavi has left #bitcoin-wizards |
18:40:44 | fluffypony: | altcoin whitepapers in a nutshell: http://i.imgur.com/2WuM3pW.jpg |
18:43:46 | belcher: | would read again |
18:46:26 | realcr: | fluffypony: It's great :) |
19:21:46 | Muis: | fluffypony: nice! |
19:21:58 | Muis: | btw: I listened to your podcast |
19:22:07 | Muis: | it was 6 hours long or so it felt |
19:22:14 | Muis: | but very interesting things about Monero |
19:45:42 | amiller_: | i like the leakage one better fluffypony |
19:45:54 | amiller_: | there's a whole journal for this junk :p http://www.anagram.com/jcrap/ |
19:46:11 | fluffypony: | Muis: yeah 3.5 hours, was crazy long:) |
19:46:28 | fluffypony: | amiller_: LOL - the Journal of Craptology! |
19:54:53 | MRL-Relay: | [tacotime] http://www.anagram.com/jcrap/Volume_8/heatherweight.pdf |
19:55:01 | MRL-Relay: | [tacotime] oh dear |
20:15:11 | fluffypony: | "The author is grateful to Antonio Banderas for inspiration. It was while watching his performance as Zorro that the zero function first surfaced as an idea for an encryption algorithm. Previous meditations on his co-star had led to consideration of the zeta function ζ(s) = Σ_{n=1}^∞ 1/n^s, which did not work nearly so well." |
20:35:43 | kanzure: | link to the six hour podcast? |
20:47:06 | Muis: | https://www.mixcloud.com/dogedradio/monero-coin-interview/ |
20:47:55 | Muis: | that fluffy needs so many hours just to explain Monero's basics, says enough :D |
20:49:58 | gmaxwell: | fluffypony: you've seen the youtube video related to that chicken-chicken thing, right? |
20:51:04 | gmaxwell: | also if that were an altcoin whitepaper it would end with "chicken chicken, chicken. Chicken Profit!" |
20:55:19 | fluffypony: | lol |
20:56:01 | Muis: | gmaxwell: link? |
21:01:26 | kanzure: | above. |
21:19:30 | gmaxwell: | Muis: https://www.youtube.com/watch?v=yL_-1d9OSdk |
22:04:35 | fluffypony: | https://bitcointalk.org/index.php?topic=990285.new#new |
22:04:43 | kanzure: | .title |
22:04:44 | yoleaux: | [ANN] [CHK] Chicken | chicken. chickens. chicken. | RC Chicken |
22:05:13 | phantomcircuit: | fluffypony, please tell me they have a working client |
22:05:32 | fluffypony: | phantomcircuit: I haven't even started on the source code |
22:05:39 | fluffypony: | I figured I'd wait until someone creates DarkChicken |
22:05:44 | fluffypony: | and then I'd retro-fork that back |
22:08:56 | phantomcircuit: | fluffypony, lol the only reply |
22:11:07 | gabridome_: | gabridome_ is now known as gabridome |
22:41:56 | amiller_: | Muis, fluffypony, this is the craptology equivalent of the chicken talk. https://www.youtube.com/watch?v=89K3j_Rsbco its a panel discussion on leakage, featuring DJB and moti yung and ian goldberg |
22:42:44 | fluffypony: | *clicks* |
22:47:55 | fluffypony: | lol this is awesome |
23:10:27 | Sub|afk: | Sub|afk is now known as SubCreative |
23:59:37 | kanzure: | win 2 |
23:59:40 | kanzure: | whoops :( |