--- Log opened Sat Oct 12 00:00:03 2013
11:53 < nanotube> so my bitcoind node is averaging total connections in the 120s (out of 128 total), with 40-50 through tor. only one data point, but seems to suggest that open network slots are relatively few.
11:53 < jgarzik> indeed :/
11:53 < jgarzik> it gets ever more expensive to set up a full node
11:53 < jgarzik> and all of them are unpaid
12:01 < jgarzik> Satoshi predicted bitcoin would eventually devolve into miners running the only full nodes… that would be disappointing
12:13 < nanotube> bitcoind is also using roughly 500MB of ram. I've got 2G here, so i could up the connection count to like 512 and see how it likes it.
12:26 < nanotube> anyone checked out http://academiccommons.columbia.edu/catalog/ac:110756 ? i just found it, seems like if it would work for tornodes, it might also work for bitcoin nodes?
13:07 < jgarzik> nanotube, the no-wallet mode should help… saves 40-200MB
13:10 < nanotube> personally i'm not hurting for ram on this vps, but yea the less ram it uses, the lower the barrier to running a node.
17:57 < gmaxwell> 09:01 < jgarzik> Satoshi predicted bitcoin would eventually devolve into miners running the only full nodes… that would be disappointing
17:58 < gmaxwell> especially since there are only like 5 miners. :(
18:00 < gmaxwell> nanotube: I've had some ideas about a lottery to pay people that run nodes... but I'm somewhat concerned that once you've gone down that path it's not hard for someone to outbid you with "My lottery pays 10% more, but you have to run this special node software which is detectable as special only to me that does <???>" (e.g. sends logs of all transactions back to a mothership, imposes new network rules, etc)
18:00 < warren> fully verifying or archival with all blocks?
18:00 < warren> miners don't even need full blocks...
18:01 < gmaxwell> warren: that assumption wasn't that it would be out of necessity. Once you've got a business that has you supporting the bitcoin network, .. having a few hundred gigs of diskspace for it isn't that big a deal.
18:02 < gmaxwell> so at least as the network rules are now, I don't think access to the historic blocks is the greater problem.
18:03 < gmaxwell> (ha, well I say that, but currently none of my nodes could spare more than 100gb for bitcoin... my standalone nodes are on 120gb SSDs, and I only have 270 gb free on my laptop.)
18:06 < warren> I suppose it's more important to have more listening fully verifying nodes than to have archival nodes.
18:06 < sipa> archival nodes can just be gttp servers
18:06 < sipa> or dropbox
18:07 < sipa> *http
18:07 < sipa> there is nothing hard about them, except storage and bandwidth
18:07 < gmaxwell> I really really don't like this bimodal thinking that some people are developing wrt a bright line between full vs archival. I think it's a receipy for disaster, because it provides no way to contribute partially: just a binary "enormous amounts of bandwidth and storage" or "not enormous".
18:07 < warren> would the future client automatically enforce integrity of that bootstrap.dat by keeping the checkpoints?
18:07 < sipa> right
18:07 < maaku> gmaxwell: an altchain using your utxo-pow, plus cross-chain trade exchanging altcoins for bitcoins?
18:08 < gmaxwell> recipe*
18:08 < sipa> heh, i want to get rid of checkpoints altogether
18:08 < warren> I know
18:08 < warren> hence I asked if there's any safe way to automatically enforce bootstrap.dat integrity without it
18:09 < gmaxwell> yea, checkpoints need to go, they're a huge cognitive landmine. :(
18:09 < warren> how so?
18:09 < gmaxwell> warren: they enforce it by having verified the chain.
18:09 < warren> aside from excuses from the broadcast checkpoint people
18:09 < sipa> why would bootstrap.dat need to be integral?
18:09 < warren> isn't that what you meant by http or dropbox?
18:10 < sipa> no, i meant that there is nothing hard about archival
18:10 < sipa> it doesn't need special software
18:10 < sipa> it doesn't need low latency
18:10 < sipa> it doesn't need trustable nodes
18:11 < warren> so it doesn't matter if it is corrupted or provided by a hostile entity, because it won't verify and come in sync
18:11 < sipa> yeah
18:11 < gmaxwell> warren: Because the notion of a decentralized consensus is really alien to people, and so they flail around looking for a traditional trust model inside bitcoin. Then they find checkpoints and they say "aaahhh.. Now I finally understand how bitcoin really works" but really they don't understand it at all. Bitcoin has failed if the production network's consensus is ever set by checkpoints.  The result is people constantly making lame ...
18:12 < gmaxwell> ... insecure proposals and then excusing them with "sprinkle more checkpoints on them!" which doesn't really solve anything because .. what? are we going to add another blockchain to choose these checkpoints-that-would-actually-matter? and what would secure that one?
18:12 < gmaxwell> warren: plus you can do a light validation of it that just checks its hashes... and then you compare the best block hash to your own chain on your own node, and then you are 100% sure that the bootstrap.dat is correct.
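The light validation gmaxwell describes can be sketched as follows. This is a minimal illustration, not bitcoind's code: it assumes the 80-byte headers have already been extracted from bootstrap.dat, and the function names and the `own_best_hash` parameter are hypothetical. It checks only that each header commits to the hash of the one before it and that the last hash matches one your own node already trusts. It does not verify proof of work, merkle roots, or transactions.

```python
import hashlib

HEADER_SIZE = 80  # bytes in a serialized Bitcoin block header


def block_hash(header: bytes) -> bytes:
    """Double SHA-256 of a serialized header (internal byte order)."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def links_up(headers: list) -> bool:
    """True if every header's prev-hash field (bytes 4..36) matches its parent."""
    if any(len(h) != HEADER_SIZE for h in headers):
        return False
    for prev, cur in zip(headers, headers[1:]):
        if cur[4:36] != block_hash(prev):
            return False
    return True


def matches_own_chain(headers: list, own_best_hash: bytes) -> bool:
    """Hypothetical final step: compare the file's tip with a hash from your own node."""
    return bool(headers) and links_up(headers) and block_hash(headers[-1]) == own_best_hash
```

Because the hashes chain backwards from the tip, a match on the final hash covers every earlier header in the file.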
18:13 < sipa> people have somehow accepted that you don't need signatures before the checkpoints
18:13 < sipa> which is true, once you trust the checkpoints
18:14 < sipa> but it really is just a shortcut to avoid a trivial mislead-a-syncing-client attack, if we'd just disable sig checking for old blocks
18:14 < gmaxwell> And of course that stuff closes off thinking optimizations which are not so hostile to a trust free model: things like randomly verifying and alerting people on any violation.
18:14 < gmaxwell> s/thinking optimization/thinking about optimization/
18:15 < sipa> they're an evil necessity once you accept the compromise of not checking all sigs
18:16 < sipa> with headersfirst syncing, you can safely disable sigchecking without checkpoints
18:16 < sipa> well, safely... not less safe than what we have now
18:16 < sipa> it's still a compromise
18:17 < gmaxwell> and there are more degrees available.
18:17 < gmaxwell> e.g. checking 1:1000 signatures in the historic chain is virtually as fast as checking none at all. But with many nodes, you are virtually assured that someone will notice any cheating ... drastically reducing the incentive to create a long fork that would be needed to attempt it.
18:17 < warren> "someone will notice" assumes others are not asleep to hear the warning
18:18 < warren> we have thousands of clients still running old versions that have perma-alerts ... asleep
18:18 < gmaxwell> warren: keep in mind you're already talking about that being predicated on an attacker replacing months of the chain.
18:19 < warren> true
18:19 < warren> ok
18:20 < gmaxwell> warren: I don't think it's worth the risk/code complexity at least in the short term but the response there could be automated ultimately.
18:20 < warren> are you thinking to do random sig validation, and also PoW validation?
18:21 < sipa> PoW + utxo everywhere, sigchecks after last checkpoint: that's what we have now
18:22 < gmaxwell> warren: e.g. each node checks all the sigs for the blocks within the last two months of POW at current difficulty. And before that they check only 1:1000. (and if you have automatic response) if they find an invalid signature they could announce it, and the network could relay that announcement, and blacklist the block in question. (this last bit I don't think is worth doing in the short term)
18:22 < gmaxwell> (but I think it would be worth doing someday after utxo in blocks, with SPV nodes doing some randomized validation of their own)
18:23 < sipa> we could have something like pow + utxo everywhere, between 1 year and 1 month of PoW worth of burying and increasing % of sigchexks, and in the last month worth of PoW check everything
18:23 < sipa> my phone typing skillz are weak
18:24 < gmaxwell> This could be made stronger if it didn't just check the signatures in the last N POW-months of blocks, but also always checked all of them after a reorg.
18:24 < sipa> hmm?
18:25 < gmaxwell> sipa: e.g. say you check the last month of blocks. Then someone does a 1.25 month deep reorg. You'd still check all of those. So then a reorg could never insert invalid signatures. You could only get invalid signatures on startup... so an attacker could only trick new nodes, and his trickery would end as soon as everyone else got ahead.
18:26 < gmaxwell> basically it reduces an attacker issuing invalid signatures to isolation attacks instead of actually getting the network to accept an invalid signature as valid.
18:30 < gmaxwell> making that gate stateful kinda sucks. It could be better stated.  "You will check all blocks higher than X, if you are aware of a header valid fork at X or prior which has at least Y work more than X", where Y could be something like a day's worth.
18:31 < gmaxwell> so normally a new node would check only the last (say) POW-month's worth of signatures. BUT if that node is not isolated and sees a long fork at 1.25 months, it will check since 1.25 months ago.
18:32 < gmaxwell> I am very happy with this. I think the result is that it is only a bootstrapping time compromise. E.g. there could be a conspiracy of bitcoin users to have broken the rules in the past, but nothing worse than that. And that can be substantially closed with random checks before the cutoff. (the conspiracy would only work if it could be kept secret).
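The policy discussed above (full signature checks for recent blocks, a 1:1000 random sample before that, and full checks back to the fork point when a long header-valid fork is known) might look roughly like this. It is a sketch under stated assumptions, not an existing bitcoind feature: the window is counted in blocks rather than in proof-of-work months, `fork_height` stands in for the "fork at X with at least Y more work" condition, and all names are hypothetical.

```python
import random

BLOCKS_PER_MONTH = 30 * 144  # assumption: ~10-minute blocks stand in for "a POW-month"


def full_check_start(tip_height, window_blocks=BLOCKS_PER_MONTH, fork_height=None):
    """Lowest height from which every signature is verified."""
    start = max(0, tip_height - window_blocks + 1)
    if fork_height is not None:
        # A known long fork pulls the full-check boundary back to the fork point.
        start = min(start, fork_height)
    return start


def should_check_sigs(height, start, rng, sample_rate=1000):
    """Always check at or above `start`; below it, check a random 1-in-sample_rate."""
    if height >= start:
        return True
    return rng.randrange(sample_rate) == 0


def detection_probability(nodes, sample_rate=1000):
    """Chance that at least one of `nodes` independent samplers hits a given bad signature."""
    return 1.0 - (1.0 - 1.0 / sample_rate) ** nodes
```

The sample is drawn per node rather than derived from the block, so an attacker cannot predict which signatures a given node will skip.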
18:42 < jgarzik> http://www.wired.co.uk/news/archive/2013-10/12/us-internet-control
18:42 < jgarzik> sad side effect will be greater localization of data inside more repressive regimes
18:43 < sipa> perhaps more on-topic in dev?
18:43 < sipa> or rather, less offtopic
18:50 < nanotube> gmaxwell: as to your concern for "i pay you more to run my special node" <- how does that become /more/ of a concern than now? currently, someone can say "i'll pay you to run my special nodes" and users will be comparing "run bitcoin nodes for no compensation" to that.
18:50 < nanotube> i don't see how compensating our nodes could make that problem any /worse/
18:50 < gmaxwell> nanotube: because presumably we don't have 99% of nodes being run by people who are out to make a profit doing it.  Offering some money to run spy nodes (or whatever) would only switch a small percentage of the total nodes.
18:51 < gmaxwell> nanotube: vs if running a node were widely seen as a money making endeavor, perhaps it would switch most of them.
18:51 < gmaxwell> It's a concern, I'm not sure it's a good one.
18:52 < gmaxwell> but I've seen with mining that introducing money into things creates a lot of weird effects. Pirate's hashrate buying service got a LOT of hashrate...
18:52 < nanotube> well, i see what you are saying. but i'm not sure if we model it with real variables, it's actually a concern. let's say currently we have N people running nodes for no compensation.
18:52 < nanotube> if we introduce compensation, we'll have those N people, plus P other people who would only run because of compensation
18:53 < gmaxwell> but will the N continue if there are M people running for pay where M >> N?  Certainly my motivation to run nodes would be reduced if there were already plenty of them.
18:53 < gmaxwell> (My M is your P)
18:54 < gmaxwell> and in terms of network risks,  the ratio of good to bad nodes can matter more than the absolute number of good nodes. E.g. if 99% of nodes are bad it doesn't matter if there are a million good nodes— you'll only infrequently connect to one.
18:54 < nanotube> ok, let's introduce that factor also. :)
18:54 < nanotube> irc sucks for this, i'm going to write some text.
18:55 < gmaxwell> Sweet our model now needs an ordinary differential equation. :P
18:56 < nanotube> heh
18:57 < gmaxwell> I haven't tried to model it in detail because I expect that I can pick parameters that go either way and won't be able to decide between them. :(
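The point about the ratio of good to bad nodes mattering more than the absolute count can be made concrete with a little arithmetic. This sketch assumes peers are picked uniformly at random from the whole network and that a node makes 8 outbound connections; both are simplifications, and the function name is hypothetical.

```python
def p_at_least_one_good(good, bad, connections=8):
    """Probability that at least one of `connections` random peers is good."""
    p_bad = bad / (good + bad)
    return 1.0 - p_bad ** connections


# 99% bad nodes: a million good nodes help no more than a thousand would.
large = p_at_least_one_good(good=1_000_000, bad=99_000_000)
small = p_at_least_one_good(good=1_000, bad=99_000)
```

Under these assumptions both cases give a chance of roughly 7.7% of reaching any good node at all, which is why only the ratio shows up in the result.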
19:10 < nanotube> http://pastebin.com/CfNMB85D <- really naive model... basically since marginal benefit to running a 'good node' is larger if we offer compensation, it seems we'd be no worse off.
19:10 < nanotube> the only catch is, if our offering compensation increases probability that evil will use that technique to do evil.
19:16 < gmaxwell> yea, that's something that I specifically argued when I talked to the tor folks about doing this in tor... that there may be a kind of initial hump in getting people to think of running nodes as a viable enterprise that currently keeps an attacker from doing it.
19:16 < gmaxwell> I'm not sure.
19:23 < nanotube> in addition to the hump, the big hurdle of developing the technology would be taken care of.
19:23 < nanotube> cf, how easy it is to create $fakecoin now that bitcoin is out there.
19:25 < nanotube> but for tor it's somewhat different, because it doesn't get /more/ expensive to run a node over time. for bitcoin it does, so the end game is dramatic shrinkage in node count.
19:26 < nanotube> that said, dunno if you're aware, tor has started some compensation scheme, where some nonprofit in the netherlands is going to pay 3500/month (total) to however many nodes register with the program, or some such.
19:26 < nanotube> so we get to learn from their experience on that front, a little bit.
19:28 < gmaxwell> I know, I'd passed on these concerns to them. (particularly pointing out that if they built the infrastructure so that any anonymous party could pay any tor node, that it might create some weird outcomes like pay-to-spy)
19:28 < gmaxwell> seems like they avoided setting things up like that, at least for now.
19:29 < nanotube> heh well, the government TLAs don't need to pay any third parties to spy. if nsa really wanted to take over tor, it'd only take them a trivial fraction of their budget to spin up like 10k tornodes, and make up significantly more than half of the tornet.
19:30 < nanotube> in fact... maybe 2k out of the 4k-some tor nodes already are government. ...
19:38 < gmaxwell> nanotube: perhaps, but paying third parties might be a more cost effective way to do it. ... and if you're some cybercrime group it might be an interesting thing to play with.
19:39 < nanotube> mm maybe...
19:40 < nanotube> i'm surprised the tor router project doesn't seem to have taken off. beyond a wiki page on setting it up https://trac.torproject.org/projects/tor/wiki/doc/OpenWRT
19:40 < nanotube> they could be selling pre-torified buffalos
20:05 < warren> someone we know here expert in embedded systems is thinking about selling bitcoind low power appliances
20:10 < nanotube> aka, netbook with bitcoind on it? :P
20:13 < warren> headless
20:14 < warren> probably ARM with 2GB RAM
20:17 < nanotube> mm
20:18 < warren> businesses often don't use their bandwidth at night when the office is empty, so if it costs them very little in power, they could run high capacity listening nodes at least all night and throttle back or stop listening during the day.
20:27 < gmaxwell> nanotube: one interesting point is that evil vs good pay is probably not mutually exclusive.
20:28 < gmaxwell> nanotube: e.g. you get paid X to run a good node, and if it also spies on users, you get Y too.
20:29 < warren> hmm, headless bitcoind appliances would need some kind of autoupdate mechanism ...
20:29 < warren> the maker could sell subscriptions to good/evil parties
20:30 < warren> It's amoral, it's just business!
20:36 < nanotube> warren: and while you are at it, put a tor node on the appliance also. that way bitcoin network will become less blockable, and if you turn on relaying by default (with some small transfer cap) you benefit the tor net also.
20:37 < nanotube> gmaxwell: hmm good point.
20:37 < warren> they probably won't like exit node by default
20:37 < nanotube> warren: sure not exit, just relay
20:37 < warren> nanotube: I'm not the one designing this thing
20:37 < warren> he just mentioned he might do it
20:37 < nanotube> warren: well, yea, i mean, pass it along :)
20:38 < nanotube> gmaxwell: but that just means it's cheaper to be evil. :)
20:39 < gmaxwell> nanotube: well it means that if someone else is paying the activation cost to make a pure profit motivated person run a node, an evil party can redirect most of that effort at far lower cost.
20:40 < nanotube> yes. so s/cheaper/much cheaper/ :P
20:40 < gmaxwell> evil only has to pay enough to move people from good to evil, not to run a node.
20:40 < gmaxwell> yea.
20:41 < nanotube> but many forms of evil can be tested for and not paid. e.g., transaction validation and relay variances, etc.
20:41 < nanotube> spying, not really. but... everything that goes through the bitcoin network is public anyway. so i'm not sure how much use there is in evil-spying for pay.
20:42 < gmaxwell> yea, spying can't be tested except by the evil master, and rule changes that trigger in the future can't be tested for. (well evil master could kinda test for them, but not anyone else)
20:42 < gmaxwell> nanotube: ::shrugs:: bc.i has monetized their own spying pretty well: they post people's IPs and then charge people to use their mixer service. I believe it's their only revenue source now.
20:43 < warren> s/mixer/shared send/
20:44 < nanotube> do you think they'd make less money on the mixer if they didn't post people's ip addresses? :P
20:48 < gmaxwell> I do. Though I only have the informal evidence of people showing up in IRC angry that the bitcoin blockchain recorded their IP addresses, from time to time. (::facepalm::)
20:49 < gmaxwell> (and then seeing people direct them to the mixer thing)
20:49 < warren> perhaps delisting for  a fee could be another revenue source =P
20:50 < gmaxwell> cheaper to just block them.
20:56 < jgarzik> gmaxwell, definitely not their only revenue source
20:56 < jgarzik> gmaxwell, hint: advertisements on the front page float by unpredictably
20:57 < gmaxwell> oh hey, I just came up with an almost secure way to selectively hang up on nodes which connect to lots of other nodes.
20:57 < warren> oh?
20:59 < gmaxwell> using cryptographically private bloom filters: http://www.reddit.com/r/programming/comments/1ixoov/cryptographically_private_bloom_filters/cb91uj9
21:00 < gmaxwell> the idea is that your peers give you an encrypted list of their peers. You can then encrypt your list of peers, send them to the peer and have the peer reencrypt them, and then you can decrypt the result and tell what peers you have in common.
21:01 < gmaxwell> I say almost secure because if some node was hated by lots and lots of nodes, those nodes could lie and say he was connected to them, in order to encourage other people to drop connections to that node.
21:01 < gmaxwell> but ignoring that attack, this would let you be able to do something like hang up on peers that are already connected to half your other peers.
21:01 < gmaxwell> without disclosing who is connected to who.
21:02 < gmaxwell> (your peers would limit the number of queries you could perform, so you couldn't just test all nodes against their lists)
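The encrypt, re-encrypt, decrypt exchange gmaxwell outlines resembles private set intersection with a commutative cipher. The sketch below illustrates that general idea only. It is not the construction from the linked reddit comment, it uses plain sets rather than bloom filters, it leaves out the query limit, and the toy modulus is far too small to be secure. All names are hypothetical.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # a Mersenne prime; toy-sized, NOT a secure parameter choice


def new_key():
    """Random exponent that is invertible modulo P - 1."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def to_group(addr):
    """Hash a peer address to an integer in 1..P-1."""
    h = int.from_bytes(hashlib.sha256(addr.encode()).digest(), "big")
    return h % (P - 1) + 1


def encrypt(x, key):
    return pow(x, key, P)  # commutative: encrypting with a then b equals b then a


def decrypt(c, key):
    return pow(c, pow(key, -1, P - 1), P)  # needs Python 3.8+ for the modular inverse


def common_peers(my_peers, their_peers):
    """Simulate both sides of the exchange; returns the addresses held in common."""
    their_key, my_key = new_key(), new_key()
    # 1. The peer hands over its own peer list, encrypted under its key.
    published = {encrypt(to_group(a), their_key) for a in their_peers}
    # 2. I encrypt my list under my key and send it.
    blinded = [encrypt(to_group(a), my_key) for a in my_peers]
    # 3. The peer re-encrypts each item under its key, preserving order.
    doubled = [encrypt(c, their_key) for c in blinded]
    # 4. I strip my own layer and compare against the published set.
    return {a for a, c in zip(my_peers, doubled) if decrypt(c, my_key) in published}
```

In a real deployment the two halves would run on different machines, and neither side would see the other's key or plaintext list.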
21:03 < warren> "you" being a connection or an IP?
21:03 < warren> and does that fail if you change your IPv4, or ipv6?
21:04 < gmaxwell> nah, I don't think so, since they could just limit the queries globally. E.g. I won't answer more than X queries per day or whatever.
21:05 < warren> so you can make the entire system just stop working
21:06 < warren> gmaxwell: this could be defeated by simply randomizing the from addresses, combining all the data into a surveillance net
21:07 < gmaxwell> I'm not talking as much about surveillance as I am about connection saturation.
21:08 < gmaxwell> Today you can fill up all connection slots on the bitcoin network with 1 IP. With some easy fixes we could raise that so you needed 124 IPs.
21:08 < gmaxwell> But making it take more than 124 IPs seemed mostly unsolvable to me, perhaps it's not.
21:09 < gmaxwell> making surveillance a little harder would be a nice side effect.
21:09 < warren> ooh
21:09 < warren> there's more low hanging fruit to raise the cost of filling all listening slots
21:10 < warren> https://togami.com/~warren/archive/2013/example-bitcoind-dos-mitigation-via-iptables.txt  (with a limit that is not quite this small)
21:10 < jgarzik> network attacks against bitcoin have best ROI today </standard refrain>
21:10 < nanotube> <gmaxwell> Today you can fill up all connection slots on the bitcoin network with 1 IP. <- i thought current code prevented multiple connections from same subnet ?
21:11 < gmaxwell> nanotube: no, we won't make outbound connections to the same netgroup (/16 for ipv4) but inbound is unrestricted. And it should be— since otherwise it would be somewhat hard to connect from some universities and countries.
21:12 < nanotube> hmm
21:12 < gmaxwell> (instead, when we fill up instead of turning away new connections we should see if there is a less attractive old one to punt, e.g. punt the duplicate IPs preferentially)
21:12 < gmaxwell> But we don't right now.
21:12 < nanotube> huh, so we don't even block the same ip from connecting twice?
21:12 < warren> nope
21:12 < jgarzik> code it up and PR it ;p
21:12 < nanotube> at the very least, /that/ seems like a low-cost thing.
21:13 < gmaxwell> nope. And if we did, as I said, that would cause some problems.
21:13 < nanotube> no country/university has only one ip :)
21:13 < warren> nanotube: and that isn't a good defense if you think about ipv6
21:13 < gmaxwell> nanotube: actually several countries connect entirely from a single IP.
21:13  * nanotube avoids thinking about ipv6 >_>
21:13 < gmaxwell> E.g. Qatar IIRC.
21:13 < nanotube> but heh the private bloom filters bit is pretty cool.
21:13 < nanotube> heh really? wow.
21:14 < nanotube> so qatar just has one giant country-wide NAT ?
21:14 < gmaxwell> yea.
21:14 < nanotube> lol >_<
21:14 < gmaxwell> Things you learn being a Wikipedia admin. "oops you just blocked Qatar. Again" "Oops you just blocked university of foo. Again."
21:14 < nanotube> well, all of qatar probably has 2 bitcoin users. they'll manage.
21:14 < nanotube> hehe
21:15 < gmaxwell> I accidentaly the whole qatar.
21:15 < nanotube> is it deliberate, or were they just not allocated any ips?
21:15 < gmaxwell> In any case, it would be pretty easy to make the node-full behavior turn into kick out some old peer based on some priority thing. I would have done it already but there really is no end to the amount of thinking you can do behind the prioritization scheme.
21:15 < gmaxwell> speaking of that.. I should probably just PR my dont-use-get-my-ip patch, since it seems no one is going to review the idea without a PR...
21:16 < gmaxwell> :P
21:16 < gmaxwell> but first, dinner.
21:16 < gmaxwell> nanotube: I assume it's more or less deliberate.
21:17 < warren> is there one state owned ISP?
21:17 < nanotube> probably
21:18 < gmaxwell> I would assume, I never looked into it. Thats the case in a lot of those places.
21:19 < gmaxwell> not exactly the most important use cases, but I'd rather not make the system gratuitously hostile. there are a bunch of reasons why you generally want to allow multiple connects from the same IP. E.g. my local nodes addnode each other.. and if we were limited to 1 they'd get rejected... even from nodes that don't listen on the public internet.
21:20 < warren> local nodes would have RFC1918 addresses?
21:20 < gmaxwell> mine don't. Not everyone is behind n-layers of nat, esp on ipv6.
21:21 < warren> especially with ipv6, limiting per IP probably isn't going to work
21:22 < gmaxwell> In any case, go look in the logs here, I described my thinking on this. I think there should be a set of prioritization rules which protects some nodes from being dropped and then randomly drops based on a score for the rest, the score could include things like being in the same ipv6 /48 as other peers.
21:22 < gmaxwell> (or even the same /32)
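A rough sketch of that eviction idea: protect some peers outright, then drop one of the rest at random, weighted by a score. Here the score is simply the number of other connected peers in the same netgroup (/16 for IPv4, /48 for IPv6). The scoring rule, the function names, and the choice of protected set are all assumptions made for illustration and are not bitcoind behavior.

```python
import ipaddress
import random
from collections import Counter


def netgroup(addr):
    """Coarse network bucket: /16 for IPv4, /48 for IPv6."""
    ip = ipaddress.ip_address(addr)
    prefix = 16 if ip.version == 4 else 48
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def pick_peer_to_drop(peers, protected, rng=random):
    """Choose a peer to disconnect when all slots are full, or None if all are protected."""
    group_sizes = Counter(netgroup(p) for p in peers)
    candidates = [p for p in peers if p not in protected]
    if not candidates:
        return None
    # Score: how many other connected peers share this peer's netgroup.
    scores = [group_sizes[netgroup(p)] - 1 for p in candidates]
    if sum(scores) == 0:
        return rng.choice(candidates)  # nothing stands out, so drop uniformly
    return rng.choices(candidates, weights=scores, k=1)[0]
```

Weighting instead of hard-blocking keeps a second connection from a shared NAT or a local addnode possible while slots are free, and only makes it the likelier one to lose a slot under pressure.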
21:28 < warren> hm, "BitcoinJ always bootstraps from DNS seeds."
21:30 < jgarzik> indeed
21:31 < jgarzik> bitcoinj-based Bitcoin Wallet does not rotate keys for each transaction
21:31 < jgarzik> bitcoinj-based Bitcoin Wallet does not support P2Sh
21:35 < warren> multibit also appears to not tell you how many peers you have
21:35 < warren> seems rather insecure for the default client on bitcoin.or
21:35 < warren> org
21:40 < gmaxwell> warren: I think multibit only connects to 4 too, but I also thought that about android wallet and sipa demonstrated otherwise.
21:41 < gmaxwell> IIRC bitcoinj also only queries a single dns seed at random. e.g. instead of doing something like taking one peer from each round robin. (though not like it's hard for a network attacker to intercept DNS)
21:42 < gmaxwell> I dunno if you saw the last round of snowden papers but it looks like the NSA has a DNS race interception infrastructure.. e.g. use passive taps to see dns queries and then respond faster.
21:42 < warren> wouldn't you see two responses if you were the victim of that?
21:42 < gmaxwell> sure, but you take the first one.
21:42 < warren> and nobody is watching for the second
21:43 < gmaxwell> (I have a friend that runs a really big DNS GSLB infrastructure that works that way too: you query for their domain, they forward the query to all their clusters, and then when the NTP clock strikes the next 100ms interval they all respond at the same time)
21:43 < jgarzik> interesting
21:44 < jgarzik> I know ISC does a lot of anycast
21:44 < jgarzik> anycast works much better for UDP than TCP ;p
21:44 < gmaxwell> hehe indeed.
21:45 < jgarzik> For at least a decade, F root was the most distributed DNS setup by 10x, IIRC
21:46 < jgarzik> At least one other root went distributed years ago, hopefully the others have followed by now
21:46 < jgarzik> Google's new database consensus/sync stuff relies on accurate clocks
21:47 < jgarzik> as 'time' is fundamentally distributed and (in theory) always synchronized
21:47 < jgarzik> relying on that become then an expensive hardware problem of "getting the right time, always"
21:47 < jgarzik> *becomes
--- Log closed Sun Oct 13 00:00:05 2013