--- Log opened Thu Aug 29 00:00:50 2013
20:15 < gmaxwell> petertodd: so, generalizing the sighash flags.  Imagine a tree structured transaction serialization. There are N leaves matching up to the N data values being encoded.
20:16 < petertodd> Yup
20:16 < gmaxwell> petertodd: you form an N bit vector, setting 1s for all the items you want to sign for, and then you can encode that vector by encoding run-length values.
20:16 < petertodd> Exactly what I was thinking too
20:17 < gmaxwell> e.g. if N=100 then you might code <100> to indicate all 1s.. or if you code 101111..<end> 1,98 or whatever.
20:17 < petertodd> You can further simplify it too by making the interpretation of that vector be centered on the input, so simple concatenation works.
20:18 < gmaxwell> and then you can stick on the checksig operator this runlength sequence as an input, you gather up the leaves that are matched by the mask and sort them by value.. and that's what you sign.
20:18 < gmaxwell> petertodd: you don't need to though because to support any changes you'd leave the runlength token outside of the signature.
20:18 < gmaxwell> so someone adding to the transaction would just compute another runlength token.
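[Editor's sketch] The mask idea above can be illustrated with a small run-length codec. This is only a minimal sketch under stated assumptions: the log does not fix a run-length convention, so the choice here (alternating runs, starting with a run of 1s, with a leading 0 meaning "the vector starts with a 0") is hypothetical, as are the function names `rle_encode`, `rle_decode`, and `select_leaves`.

```python
from typing import List


def rle_encode(mask: List[int]) -> List[int]:
    """Encode a bit vector as alternating run lengths.

    Assumed convention (not from the log): the first run counts 1s,
    the next counts 0s, and so on. A vector starting with 0 therefore
    begins with a run of length 0.
    """
    runs: List[int] = []
    current, count = 1, 0
    for bit in mask:
        if bit == current:
            count += 1
        else:
            runs.append(count)
            current ^= 1
            count = 1
    runs.append(count)
    return runs


def rle_decode(runs: List[int]) -> List[int]:
    """Inverse of rle_encode: expand alternating runs back into bits."""
    mask: List[int] = []
    bit = 1
    for length in runs:
        mask.extend([bit] * length)
        bit ^= 1
    return mask


def select_leaves(leaves: List[bytes], runs: List[int]) -> List[bytes]:
    """Gather the leaves matched by the mask and sort them by value."""
    mask = rle_decode(runs)
    if len(mask) != len(leaves):
        raise ValueError("mask length must equal the number of leaves")
    return sorted(leaf for leaf, bit in zip(leaves, mask) if bit)
```

Under this convention an all-ones vector of length 100 codes as a single run, and someone who appends leaves to the transaction only has to recompute the run list, since the run list itself is not part of what gets signed.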
20:19 < petertodd> gmaxwell: Aw heck, I was thinking to simplify that compute code, but yeah, it'd probably just be easier to index from zero anyway.
20:19 < gmaxwell> But ... the downside of this is that it leaves malleability. And I'm annoyed that I see no way to preserve the flexibility I want without creating free malleability.
20:19 < petertodd> Yeah, I think that's impossible. Better to make a new system where you can sign a scriptPubKey:valout output instead.
20:19 < gmaxwell> (if you want to be complicated there are all sorts of fancy things you can do to make coding the runlength value efficient... but since you never hash it.. it's not really protocol normative)
20:20 < petertodd> *scriptPubKey:value
20:20 < gmaxwell> yea, I don't see how the malleability can ever really be completely removed unless you really heavily restrict scriptsig form.
20:20 < petertodd> Hmm... true you could actually not hash it at all, although that'd be a lot of complex changes in the scripting system.
20:21 < gmaxwell> e.g. OP_NOP <push> checksig is still valid.. so you'd have to have a rule saying you couldn't do that.  But I'm suggesting never hashing that value anywhere in the protocol.
20:21 < gmaxwell> basically I'm saying the scriptsigs for a txn would be a separate hashtree. You'd still commit it in the blockchain but it would be a separate fork.
20:22 < petertodd> Yeah, see I'm thinking s/OP_NOPn/OP_CHECKSIG2/ basically, and continuing to get the signature from the scriptSig, and continuing to hash that.
20:23 < gmaxwell> well I'm pondering how I'd completely change the transaction format to make some of the things that are clearly broken better.
20:23 < gmaxwell> e.g. the fact that fidelity bond proofs are unreasonably big.
20:23 < petertodd> Yeah, problem is you do want to preserve the backwards compatibility I think. The main thing we're missing is input values; got anything else in mind?
20:24 < petertodd> re: fidelity bonds, I just wrote a OP_CHECKLOCKTIMEVERIFY patch actually.
20:24 < gmaxwell> proof size and prunability of scriptsigs while keeping everything else (same problem) is what concerns me most w/ the current format.
20:24 < gmaxwell> even with OP_CHECKLOCKTIMEVERIFY I can't check a @#$@ single output without hashing the whole txn.
20:25 < gmaxwell> (okay, with the midstate compression perhaps you can get the last one, but that's a kludgy hack)
20:25 < petertodd> Right, and to solve that I think all you actually need is just to extend the merkle tree into the tx, plus making that merkle tree include input CTxOut's
20:25 < gmaxwell> right that's what I'm thinking about. How do you lay out the transaction so the data elements form an efficient tree... and then express the data you want to include in your hash efficiently as some masking over that tree.
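[Editor's sketch] To show why extending the merkle tree into the transaction helps with checking a single output, here is a minimal sketch of a merkle tree over transaction fields with a per-leaf branch proof. Everything concrete is an assumption for illustration: SHA-256, the `\x00`/`\x01` leaf/node prefixes, padding to a power of two with a fixed `EMPTY` hash, and the names `build_levels`, `branch`, and `verify_branch`. None of this layout is specified in the log.

```python
import hashlib
from typing import List


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


# Hypothetical filler for padding the leaf level to a power of two.
EMPTY = H(b"empty-leaf")


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, leaf hashes first, root level last."""
    level = [H(b"\x00" + leaf) for leaf in leaves]
    size = 1
    while size < len(level):
        size *= 2
    level += [EMPTY] * (size - len(level))
    levels = [level]
    while len(level) > 1:
        level = [H(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def branch(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify_branch(leaf: bytes, index: int, path: List[bytes], root: bytes) -> bool:
    """Recompute the root from one leaf and its sibling path."""
    node = H(b"\x00" + leaf)
    for sibling in path:
        if index & 1:
            node = H(b"\x01" + sibling + node)
        else:
            node = H(b"\x01" + node + sibling)
        index //= 2
    return node == root


fields = [b"version=2", b"nlocktime=0", b"txout0", b"txout1", b"txout2"]
levels = build_levels(fields)
root = levels[-1][0]
proof = branch(levels, 3)
```

With this layout, a verifier who only cares about `txout1` needs the leaf plus a logarithmic number of sibling hashes instead of the whole serialized transaction.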
20:25 < petertodd> I can't think of any other fields that are needed; maybe a per-transaction checkpoint.
20:26 < petertodd> Ah I see, yes, that's a good approach.
20:27 < petertodd> I guess the easiest would be to just number the roots of that tree, and make your RLL-encoded bitfield spit out indexes.
20:27 < gmaxwell> I think the txn global data is a version, a nlocktime, a checkpoint, and the counts and sums for the subtrees.
20:27 < petertodd> Right, sums are important.
20:27 < petertodd> Do you want a single checkpoint for the whole tx?
20:28 < gmaxwell> And the inputs have a sum tree of input data, the scriptsigs have a sumtree of sigsize bytes, the outputs have a sum tree of output value. the two sums give you the fees.
20:28 < petertodd> That's good
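[Editor's sketch] A sum tree commits to a running total alongside each hash, so that the root carries the total for the whole subtree. The following is a minimal sketch assuming SHA-256, 8-byte big-endian amounts, and the hypothetical names `leaf`, `parent`, and `sum_root`; the log does not define an encoding. The fee is taken as the input-value root sum minus the output-value root sum.

```python
import hashlib
from typing import List, Tuple

Node = Tuple[int, bytes]  # (sum of values below this node, commitment hash)


def leaf(value: int, payload: bytes) -> Node:
    """A leaf commits to its own value and payload."""
    digest = hashlib.sha256(b"\x00" + value.to_bytes(8, "big") + payload).digest()
    return value, digest


def parent(left: Node, right: Node) -> Node:
    """An inner node commits to both child sums and both child hashes."""
    total = left[0] + right[0]
    digest = hashlib.sha256(
        b"\x01"
        + left[0].to_bytes(8, "big") + left[1]
        + right[0].to_bytes(8, "big") + right[1]
    ).digest()
    return total, digest


def sum_root(nodes: List[Node]) -> Node:
    """Fold a list of leaves into a single (sum, hash) root."""
    if not nodes:
        return leaf(0, b"")
    while len(nodes) > 1:
        paired = [parent(nodes[i], nodes[i + 1])
                  for i in range(0, len(nodes) - 1, 2)]
        if len(nodes) % 2:
            paired.append(nodes[-1])  # an unpaired node is promoted unchanged
        nodes = paired
    return nodes[0]


inputs = [leaf(50_000, b"in0"), leaf(70_000, b"in1")]
outputs = [leaf(100_000, b"out0"), leaf(15_000, b"out1"), leaf(4_000, b"out2")]
fee = sum_root(inputs)[0] - sum_root(outputs)[0]
```

Because each inner hash covers the child sums, a claimed total cannot be changed without changing the root hash.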
20:29 < gmaxwell> petertodd: I _think_ so, as they're redundant if they aren't identical, but it might make some merging complicated as you'd have to agree on the checkpoints when you include them.. otherwise the checkpoint should just become a scriptsig operator that pushes the checkpoint onto the stack of data that gets signed.
20:29 < gmaxwell> (and checks that it matches the chain)
20:30 < petertodd> Yeah, the checkpoint operator might make more sense, although it is a bit tricky as that means someone else could make the fees of your whole tx not apply unless you're careful. Maybe a non-issue though.
20:30 < gmaxwell> (which may really be the best way to go)
20:30 < petertodd> Though remember we want to encourage people to use checkpoints, so make them mandatory.
20:30 < gmaxwell> I mean, if they added it to your txn without you being able to know, a miner could take it out again.
20:31 < gmaxwell> petertodd: right putting them in the header makes that easier.. it's just part of the structure. Though pushing things onto the signature stack is useful.
20:31 < petertodd> Heh, which actually is ok in the case of someone taking your tx, and adding some inputs to it.
20:32 < petertodd> Heh, one crazy thing about all this, is it suggests maybe the entire block should be nothing more than a single transaction, with signatures signing that they want part of the block to exist basically.
20:32 < gmaxwell> also, I think it's moderately important that checkpoints be not prunable or at least separately prunable from the scriptsigs.
20:32 < gmaxwell> because I am imagining a future where the scriptsigs are eventually completely pruned and forgotten by everyone.
20:33 < petertodd> Hmm... though what's special then about the checkpoints vs other parts of the tx?
20:33 < gmaxwell> actually no nevermind, the checkpoint isn't actually useful anymore without the scriptsig. it really could go into it.
20:33 < petertodd> Cool
20:34 < gmaxwell> just as a special operator which checks the checkpoint and if it's valid pushes it onto the sigstack (otherwise pushes 0 or something)
20:34 < petertodd> In this world, the checkpoint should be just <block id>, and at the same time we should add a merkle-mountain-range'd version of the block hash index to make proofs small.
20:34 < gmaxwell> if you want to make it mandatory, do so with an isstandard sort of rule.
20:35 < petertodd> Well, but why not just make a CTxIn include it as a hashed field?
20:35 < gmaxwell> petertodd: I actually think there should be a real (partial) block hash there so that you can validate the transaction statelessly.
20:35 < gmaxwell> E.g. "assuming the checkpoint is good,  is this txn valid?"
20:35 < petertodd> Oh, sorry, by <block id> I mean <block hash>
20:35 < gmaxwell> ah okay.
20:36 < petertodd> And yeah, I'd say just put the whole hash in there and be done with it.
20:36 < gmaxwell> well if it's useless if you've pruned the signatures, then it should always be pruned with the signature.
20:37 < gmaxwell> likewise, that's how nlocktime should work.
20:37 < petertodd> Yup, hence put it in CTxIn(2)
20:37 < petertodd> Oh, that's an interesting point
20:37 < gmaxwell> 12345 PUSH_CHECKTIME. also some care is required to prevent emulation.
20:37 < petertodd> emulation?
20:38 < gmaxwell> e.g. say I sign a list of only outputs  0xDEADBEEF,0xBEEFBEEF ... and then some wiseass removes the deadbeef output and replaces my signature with 0xDEADBEEF VERIFYPUSH CHECKSIG
20:38 < gmaxwell> e.g. every kind of insertion into the verify list needs a unique prefix that can't be emulated.
20:39 < gmaxwell> TXOUT|0xDEADBEEF,TXOUT|0xBEEFBEEF   vs PUSH|0xDEADBEEF,TXOUT|0xBEEFBEEF
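[Editor's sketch] The emulation problem and the prefix fix can be shown with two ways of hashing the list of items that gets signed. This is a minimal sketch: the tag strings come from the log, but the length-prefixed encoding, SHA-256, and the names `digest_untagged` and `digest_tagged` are assumptions made for illustration.

```python
import hashlib
from typing import List, Tuple

Item = Tuple[str, bytes]  # (kind of insertion, value)


def digest_untagged(items: List[Item]) -> bytes:
    """Hash only the values: the kind of each item is lost."""
    h = hashlib.sha256()
    for _kind, value in items:
        h.update(len(value).to_bytes(4, "big") + value)
    return h.digest()


def digest_tagged(items: List[Item]) -> bytes:
    """Hash each value together with a prefix naming how it got on the list."""
    h = hashlib.sha256()
    for kind, value in items:
        tag = kind.encode("ascii")
        h.update(len(tag).to_bytes(1, "big") + tag)
        h.update(len(value).to_bytes(4, "big") + value)
    return h.digest()


# What the signer meant: two real transaction outputs.
honest = [("TXOUT", bytes.fromhex("deadbeef")), ("TXOUT", bytes.fromhex("beefbeef"))]
# The attack: drop the first output and push the same bytes from the scriptsig.
forged = [("PUSH", bytes.fromhex("deadbeef")), ("TXOUT", bytes.fromhex("beefbeef"))]
```

Without the prefix both lists hash to the same message, so the original signature would still verify after the output was removed; with the prefix the digests differ.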
20:39 < petertodd> Ah right, yeah, I was gonna say you need to do HMAC(subtree-digest, magic) at various points in this tree.
20:40 < gmaxwell> I don't actually think there is any tree on the signature parts.
20:40 < petertodd> IE the scriptSig is still just a bunch of bytes?
20:40 < petertodd> Makes sense
20:40 < gmaxwell> Well I mean that the data it's signing is just a list of leaf hashes, not trees. If you make it a tree it makes the neighboring parts of the tree (outside of the masking) non-malleable.
20:41 < petertodd> Oh, right, I see what you mean.
20:41 < petertodd> The magics I was referring to were more to make sure proofs of merkle paths in the tree can't be faked.
20:41 < gmaxwell> so the scriptsig  should be  nlocktime PUSH_LOCKTIME blockcheckpoint PUSH_BLOCKCHECKPOINT txoutrlecode PUSH_TXOUT CHECKSIG
20:42 < petertodd> Ah ok, so we're pushing a bunch of validation values to a stack, and then a tree is made of that stack, and the signature is on the digest.
20:42 < gmaxwell> and the data signed is NLOCKTIME|nlocktime,CHECKPOINT|blockcheckpoint,TXOUT|0xDEADBEEF,TXOUT|0xBEEFBEEF
20:42 < petertodd> Right
20:43 < gmaxwell> yea, I don't even think you need to make a tree. I don't think it has any particular value to do anything but hash the stack. But maybe there is a reason.
20:43 < gmaxwell> and in particular if you don't want to hash say, the value of a txout you could choose to separate that stuff out.
20:44 < petertodd> Hmm... could come in handy to make fraud proofs smaller.
20:44 < petertodd> IE find the one part of the tx that was wrong, and prove just that.
20:44 < petertodd> Although I guess that doesn't actually work...
20:45 < gmaxwell> E.g.   <1 btc> <tx_index> PUSH_CAPACITED_TXOUT   which pushes  <TXOUT_MAXBTC|H(scriptpubkey),1,max(1,value)>
20:46 < petertodd> makes sense
20:46 < gmaxwell> (or really, instead of txindex, it would be an RLE code that could match multiple ones)
20:46 < petertodd> Yup
20:46 < gmaxwell> (RLE meaning run-length encoding)
20:46 < gmaxwell> though I don't know how useful value masking is.. not sure what your goal was there.
20:47 < petertodd> One issue is it might be nicer from the point of view of merging tx's if what selects what part of the tx is "visible" to the scriptSig was not actually in the script, and not actually specific to a particular form of script.
20:47 < gmaxwell> well that's why I'm talking about making the entirety of the scriptsig largely separate.
20:48 < gmaxwell> I'd even suggest using as txid the transaction without the scriptsigs. The only problem I have there is that people could reorder the damn outputs still and then fixup the scripts to still validate. Which is something I want but not if it's used maliciously. :)
20:49 < petertodd> I guess my point is if I'm spending "weird ass txout", that means the scriptSig that satisfies that txout is also strange, and anyone who wants to merge their tx with my tx now has to understand what my tx is doing.
20:49 < gmaxwell> do they care so long as it passes validation?
20:50 < petertodd> Point is though all these indexes need to be changed in the merge process.
20:50 < petertodd> But what is index, and what is some other data, is specific to the scriptPubKey.
20:54 < petertodd> Oh, and a thought on backwards compatibility, re soft-fork: for every txin:txout, take the hash of the relevant part of the v2 transaction, and put it into the corresponding scriptSig or scriptPubKey. That'll always be spendable from the viewpoint of non-upgraded nodes.
20:55 < petertodd> You should be able to define a 1:1 transformation from new-style blocks to old-style blocks that way.
20:55 < petertodd> (obviously if it's spending a v1 tx, put an actual scriptSig in the right place)
20:56 < petertodd> Though from the point of view of not changing too much code in one go, it may be better to try to keep everything such that it fits in the existing transaction serialization.
21:05 < amiller> so, fuck it, we're going to have arbitrary recursive snarks
21:06 < amiller> the crypto theory for this stuff is so weird but it's plausible enough that no one might care
21:06 < amiller> the approach to theory seems to be like, we wanted a unicorn but unicorns don't exist, so instead we'll ask for a time machine
21:07 < petertodd> amiller: Why not a movie set?
21:07 < petertodd> amiller: Or CGI...
21:08 < amiller> i'm going to add snarks/pinocchio/tinyram to my ads language so that you can compress functions with snarks, in addition to compressing data with hashes
21:08 < amiller> everyone will like it and 'maybe' it's secure who knows/cares
21:10 < amiller> probably even will be practicalish, just would require implementing all the elliptic curve operations from scratch in c
21:32 < gmaxwell> amiller: well and the pairing operations too.
21:32 < gmaxwell> this is using tate pairing right?
21:32 < amiller> yeah
--- Log closed Fri Aug 30 00:00:28 2013