01:25:28 | adam3us1: | letstalkbitcoin tech interview :) (never like sound of own voice, cringe) |
01:27:15 | adam3us1: | (committed tx, fungibility, coinjoin, homomorphic values, centralization, 1-way peg... its long and tech heavy) |
02:03:18 | nsh: | let stalk = bitcoin; |
02:09:11 | andytoshi: | am i correct that the site requires flash to listen? |
02:11:55 | andytoshi: | nope, youtube-dl handles the soundcloud URL correctly: https://w.soundcloud.com/player/?url=http://api.soundcloud.com/tracks/130711534 |
02:22:37 | gmaxwell: | I dunno if y'all have been paying attention, but the gridseed ltc asics are claiming that they'll do 60KH/s for a power consumption of 0.44 watts. This is an improvement relative to gpus very similar to what bitcoin asics had relative to gpus. |
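The claimed efficiency jump can be sanity-checked with quick arithmetic. The GPU baseline below is a hypothetical Radeon-class figure for scrypt mining, not a number from the log:

```python
# Sanity check of the claimed gridseed numbers against a hypothetical
# Radeon-class GPU mining scrypt (GPU figures are assumptions).
gridseed_khs, gridseed_w = 60.0, 0.44   # claimed: 60 kH/s at 0.44 W
gpu_khs, gpu_w = 600.0, 300.0           # hypothetical GPU: 600 kH/s at 300 W

gridseed_eff = gridseed_khs / gridseed_w   # ~136 kH/s per watt
gpu_eff = gpu_khs / gpu_w                  # 2 kH/s per watt
print(f"per-watt advantage: ~{gridseed_eff / gpu_eff:.0f}x")  # → ~68x
```

A roughly two-orders-of-magnitude per-watt gap is indeed comparable to what early bitcoin ASICs had over GPUs.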
02:23:48 | brisque: | gmaxwell: we'll see, I ordered one just out of curiosity. |
02:25:11 | brisque: | gmaxwell: they're apparently very unstable, from what I've read. |
02:25:48 | brisque: | I still don't get why they paired an scrypt core with a very inefficient sha256d one though. |
02:25:53 | gmaxwell: | What I'm hearing is that the dual sha256/scrypt mode is flaky. |
02:26:09 | brisque: | mm, same. |
02:26:37 | gmaxwell: | brisque: why do you think it's inefficient? it's ~2W/GH for sha256 which is about as good as it gets on 55nm. |
02:27:36 | gmaxwell: | What I think their design is doing is taking advantage of the fact that the scrypt engine is area limited while the bitcoin work is thermally limited... so they get a part that basically does both for the cost of one. (well, their prices are high, but that's markup)
02:28:01 | brisque: | gmaxwell: oh I totally misread the email from the seller, I thought it was over 10W/gh for the sha256 side. |
02:28:16 | andytoshi: | nice interview adam3us1, i wasn't familiar at all with hashcash |
02:28:30 | andytoshi: | ..except there was a discover article which mentioned it in passing in 2004 or so |
02:38:23 | gmaxwell: | hm... this actually suggests a flaw in the scrypt paper. The argument for scrypt is based on chip area. But it really should be based on total costs including energy.
02:39:48 | gmaxwell: | since a cracking chip ends up being thermally limited, increasing the area required may not actually increase costs much at all. |
02:42:24 | brisque: | rather than being thermally limited, couldn't it be that they just couldn't fit a second scrypt scratch pad in and just put a sha256d core there to fill the space? |
02:42:37 | andytoshi: | increasing the die size should increase the cost/unit proportionally, no? the wafers are a fixed size |
02:51:20 | gmaxwell: | andytoshi: no, not when you need to waste area just to act as heat spreading, and not when your total costs for your cracking infrastructure are dominated by energy.
02:51:55 | gmaxwell: | In fact, it may even be counterproductive (e.g. reducing the energy ratio between attacker and defender enough that the attacker's advantage increases) |
02:53:34 | gmaxwell: | it's currently the case that for any piece of high performance computing hardware, energy costs surpass manufacturing costs if it's operated for more than a few months.
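The "few months" break-even can be sketched with invented numbers; the hardware price, power draw, and electricity rate below are all illustrative, not figures from the log:

```python
# Break-even sketch with hypothetical numbers: a $500 piece of
# number-crunching hardware drawing 1 kW at $0.10/kWh. The cumulative
# energy spend overtakes the hardware cost within months.
def energy_cost_usd(watts, months, usd_per_kwh=0.10):
    hours = months * 30 * 24
    return watts / 1000.0 * hours * usd_per_kwh

hardware_cost = 500.0  # hypothetical purchase/fabrication cost
for months in range(1, 37):
    if energy_cost_usd(1000, months) > hardware_cost:
        print(f"energy surpasses hardware cost after ~{months} months")
        break
# → energy surpasses hardware cost after ~7 months
```

Cheaper electricity or pricier silicon stretches the horizon, but for continuously-run cracking or mining gear the energy term dominates quickly.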
02:54:15 | brisque: | is a piece of bitcoin mining gear worthwhile after a few months? |
02:54:21 | brisque: | currently it's not. |
02:54:46 | gmaxwell: | brisque: sure. lol. careful with those exponential extrapolations. |
02:55:16 | brisque: | gmaxwell: oh I'm not predicting based on them, just observing that it's currently fairly vertical. |
02:55:31 | gmaxwell: | my b1 avalons still mine 3x their power cost, ... and keep in mind that decrease in returns is exclusively driven by competition from more power efficient devices. |
02:55:53 | gmaxwell: | I'm not talking about mining now in any case, I'm talking about KDFs. |
02:56:54 | brisque: | any sensible KDF wouldn't have used the settings Litecoin picked though |
02:57:21 | gmaxwell: | yes, but the 'sensible' settings would make this discrepancy worse, not better.
02:57:45 | gmaxwell: | e.g. you can choose between two KDFs that take 500ms (user tolerance threshold). It's possible that the memory hard one is actually cheaper to attack once you've factored in power costs because it performed far fewer operations in that time because it was spending time waiting on memory. |
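That argument can be turned into a toy per-guess cost model. Every constant below (energy price, die cost, amortization horizon, joules per guess) is invented purely for illustration; the point is only that energy per guess can dominate amortized silicon:

```python
# Toy attacker cost per password guess for two KDFs that each take
# 500 ms on the defender's machine. All constants are hypothetical.
def cost_per_guess(joules, die_mm2, usd_per_joule=3e-8,
                   usd_per_mm2=0.10, guesses_amortized=1e9):
    # energy per guess plus silicon cost amortized over the chip's lifetime
    return joules * usd_per_joule + die_mm2 * usd_per_mm2 / guesses_amortized

# compute-bound KDF: tiny die, busy for the whole 500 ms
compute_bound = cost_per_guess(joules=0.5, die_mm2=1.0)
# memory-hard KDF: big die, but much of the 500 ms spent waiting on memory
memory_hard = cost_per_guess(joules=0.1, die_mm2=50.0)
print(memory_hard < compute_bound)  # True: memory-hard is cheaper to attack here
```

With these (made-up) numbers the memory-hard KDF's 50x larger die still costs the attacker less per guess than the compute-bound KDF's energy bill, which is exactly the scenario being described.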
02:58:42 | gmaxwell: | the scrypt paper computed costs purely based on area, not power. This is clearly incorrect thinking because on any fixed computing infrastructure the power costs are greater. Though I don't know if it happens to break their conclusions. |
02:58:58 | gmaxwell: | The gridseed parts suggest it's a wash at the parameters ltc used.
02:59:36 | brisque: | if not memory hard, what is the ideal KDF? |
02:59:58 | gmaxwell: | But I'd have expected that more memory usage would not increase power usage, but would make it slower on desktops (e.g. fewer operations within the user tolerance window). But that would be interesting to crunch through and see how the numbers work out.
03:01:44 | gmaxwell: | brisque: well the correct question is given the commodity hardware the users have, the user delay budget, and the most optimal possible attacker hardware, what parameters minimize the attacker's advantage. |
03:03:23 | brisque: | 500ms is probably on the low side of what a user could tolerate. it's amazing what spinning indicators and progress bars can do to alter the perception a user has of a slow operation. |
03:03:26 | gmaxwell: | The Scrypt paper argues that very memory hard things minimize the attackers advantage because it forces the attacker to spend more mm of silicon. I now think this is suspect because mm of silicon is a minority of a large scale attacker's costs... though that doesn't mean that there isn't some particular non-zero memory hardness level that produces the smallest ratio. |
03:04:05 | gmaxwell: | It was a random number, it doesn't actually matter. |
03:04:42 | gmaxwell: | and fwiw, 500ms w/ bitcoind to authorize a transaction is actually irritating when it's in the foreground.
03:05:18 | brisque: | I know, but I've always found it fascinating how users perceive different delays. in a shell a few millisecond delay is horrible, yet people wait 20 seconds for Microsoft Word to start.
03:05:30 | gmaxwell: | (our kdf is 100ms by default which is pretty much imperceptible... there seems to be a somewhat sharp wall on delay between imperceptible and annoying somewhere around 0.5s.)
03:05:47 | brisque: | if the signing happened in the background and took half a minute it wouldn't matter in the slightest. |
03:06:16 | gmaxwell: | brisque: well except that it can't even tell you if you typed the key wrong until after the delay. |
03:06:23 | brisque: | well it would if the password was typed incorrectly, but it's the fact that the interface shows the latency rather than hiding it. |
03:06:54 | gmaxwell: | the fact that you need to be sure you can get the user's attention again and that you can't report success until after it's done makes it harder to hide.
03:07:42 | gmaxwell: | in any case, as I said, it's irrelevant. There is some budget, wherever it is. The question is how do you best use it to increase the attacker's total cost.
03:09:40 | brisque: | probably by avoiding both cases. a very complex algorithm would be a hindrance to hardware implementations, wouldn't it? you avoid the energy saved by waiting around for memory, and you avoid making very simple hashing cores like for sha256d. |
03:10:55 | brisque: | that is, you have the best of both worlds. high power cost for the attacker and massive die space. |
03:15:49 | gmaxwell: | brisque: no. a very complex algorithm just increases the engineering work, but that's probably small compared to other costs for a large scale attacker.
03:15:59 | gmaxwell: | After all, your own computer runs the complicated algorithm.
03:19:38 | brisque: | gmaxwell: right. |
05:50:50 | petertodd: | gmaxwell: nifty chips - vitalik claims they're going to do a PoW (+PoS) competition - I predict it's going to be a horrible failure because they don't even have the skills to properly vet candidate judges...
05:52:13 | petertodd: | gmaxwell: incidentally, I was talking about PoW with an EE unfamiliar with the field, and he independently thought of the area-power re-use thing immediately, which I think indicates how utterly out to lunch 95% of the people here are (scrypt authors included)
05:52:23 | gmaxwell: | petertodd: well and everyone participating has an incentive to play up their advantages. It's also predicated on a goal which is not proven to be objectively worthwhile. |
05:53:00 | gmaxwell: | yea, this wasn't obvious to me before. Now it really would be interesting to go analyze scrypt power usage and compute up the total costs.
05:53:01 | petertodd: | gmaxwell: meh, the other thing the EE immediately saw was how important the goal was - he understood damn well how easily niche technology gets regulated out of existence |
05:53:34 | petertodd: | gmaxwell: it *is* an existential threat and figuring out how best to solve it is very important, even if only to make sure the threat doesn't actually happen
05:54:44 | petertodd: | I really suspect there's some interesting games you can play with power gating memory and scrypt - for instance you could probably make a low-power dram implementation that doesn't refresh ram and accepts errors in exchange for low power (another thing that EE immediately thought of) |
05:55:30 | gmaxwell: | actually the lifetime of the required memory is so low it probably doesn't need refresh. |
05:56:37 | petertodd: | that's the *problem*! DRAM controllers already take that into account, but on top of that optimization you can probably push voltages even lower than standard, and maybe even use some simple, and custom, prediction stuff to shave it even further |
05:56:50 | gmaxwell: | scrypt access patterns are somewhat unpredictable so it would be hard to just size the capacitors so that it never failed, but you could still get failure rates as low as you want. |
05:57:20 | petertodd: | yeah, and economically optimal is going to be very high failure rates by conventional standards |
05:57:41 | petertodd: | probably orders of magnitude higher - so much so that the design will be 100% custom |
05:58:01 | gmaxwell: | yea, existing mining hardware runs fine at failure rates around 1%. e.g. stuff ships out of the factory with ~1% of returned nonces being wrong.
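The economics of tolerating bad nonces reduce to simple arithmetic: effective hashrate is raw rate times the fraction of correct results, so an aggressive setting wins whenever its throughput gain exceeds the added error rate. The numbers below are illustrative:

```python
# Sketch of why miners tolerate ~1% bad nonces: effective hashrate is
# raw_rate * (1 - error_rate), so any overclock that gains more
# throughput than it loses to errors is a net win. Figures hypothetical.
def effective_hashrate(raw_ghs, error_rate):
    return raw_ghs * (1.0 - error_rate)

stock  = effective_hashrate(100.0, 0.0001)  # conservative settings
pushed = effective_hashrate(110.0, 0.01)    # +10% clock, 1% bad nonces
print(pushed > stock)  # True: the overclock still pays off
```

General-purpose CPUs cannot make this trade because a single glitched result can corrupt state or crash the machine, which is the point made in the next line.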
05:58:30 | petertodd: | existing computers have failure rates probably... I dunno, twelve orders of magnitude less than that? |
05:58:39 | gmaxwell: | you can't run commodity silicon at those error rates because something important will glitch out and it'll wedge. |
05:59:30 | petertodd: | well... that's changing though, because designers are being forced into that kind of error territory - we're also lucky that GPU's can tolerate higher error rates than other computing stuff, kinda |
05:59:37 | gmaxwell: | (this was actually one of the reasons gpu mining headlessly worked better: most cards could be pushed a lot further when they weren't displaying anything)
06:00:44 | petertodd: | in any case, said EE thought my ideas about FPGA "cottage industry" PoW algorithms were feasible, because FPGA hardware these days can have a surprising amount of power gating and similar tech
06:01:28 | petertodd: | similarly things like DRAM often have a lot of control over how the internals work if you're willing to attach it to a custom controller, and those controllers are FPGA-implementable with good performance |
10:34:14 | adam3us: | yeah I was wondering as a trend if FPGAs can get closer to ASIC in density, and reduce the ASIC/FPGA performance gap, and that as seemingly moore's law may top out with current fab around 5nm, then the next stage is more cores, more CISC designs, and reconfigurable - eg if you have some GPU units on the die, why not a slab of FPGA; we already have microcode, why not lower (hw) level reconfigurability as an on-die FPGA co-processor
11:03:11 | wumpus: | adam3us: so you're counting on the overhead for (low-level) programmability to go down; any specific reason for that? |
11:03:47 | wumpus: | it would be great, agreed though |
11:06:20 | adam3us: | wumpus: they're running out of other options, and the intel & amd & arm chips are getting more and more cisc. gpu, mmu, power regulator, level 4 cache, more simd instructions, special crypto instructions, codec instructions. seems like the next step. (I am not a hw person tho.) so if there is room, and fpgas are maybe not so widely used vs cpus, then maybe with more r&d focus that asic/fpga gap could be closed somewhat
11:08:13 | wumpus: | there certainly seems to be a trend toward lower-level many-core parallelism programmability in newer architectures (Parallella, XMOS), but not entirely at the gate level, it's more GPU-like from what I understood
11:10:30 | wumpus: | one of the (sw) problems with FPGAs in general-purpose computers is sharing them between applications, it's a limited resource users may not easily understand. GPU vendors spend a lot of work on context switching / multitasking, but on a FPGA that may be harder. |
11:15:19 | wumpus: | of course, if you have a fast programmable FPGA or one that supports partial reprogramming you could maybe dynamically allocate gates, but from what I've seen up to now reprogramming a FPGA isn't quite as granular/fast |
23:23:57 | gmaxwell: | Interesting: I emailed Colin Percival and expressed my concern that the scrypt cost assumptions may be inaccurate due to a failure to account for energy consumption and asked if he'd performed or was aware of anyone else performing an analysis which included energy consumption. |
23:24:30 | gmaxwell: | He responded and said "I'm not aware of any analysis which includes energy consumption. I don't know anyone who has looked at this who has the necessary expertise in microfabrication technologies to accurately predict how energy-efficient a *custom* circuit could be."
23:26:12 | phantomcircuit: | gmaxwell, hmm? |
23:31:37 | gmaxwell: | phantomcircuit: New theory: Scrypt may be less effective as a KDF than the conclusions in the scrypt paper suggest, because the analysis there did not include operating costs, just chip making: for number crunching chips the power cost outpaces the fabrication cost quite rapidly... and given a specific commodity-hardware time budget a scrypt cracker may actually use less power (than, say, sha256-pbkdf2).
23:33:19 | phantomcircuit: | gmaxwell, that is certainly correct |