Breaking Monero Episode 07: Remote Nodes
https://youtu.be/n6Bxp0k7Uqg
Remote nodes are convenient, and many wallets are configured to use them by default. Unfortunately, there are privacy and security limitations when users rely on others to provide copies of blockchain data. We discuss some of these limitations, some remote node attacks, and how to mitigate against them.

Episode Transcription

Justin: Welcome back to Breaking Monero. Today Sarang and I are talking about remote nodes and some of the considerations that come with using a remote node. Now we all know that remote nodes are really convenient. Most wallet clients for any cryptocurrency, including Monero, Bitcoin and many others, will try to encourage you to use a remote node because it’s easy to get started and easy to use, and I believe even by default the Monero GUI client provides an easy option for you to connect to remote nodes so it’s much easier to get started. You don’t need to sit there waiting for days on a potentially slow connection to get started. But despite these being really convenient, you delegate some of the responsibilities and some trust to these nodes as you’re using them. So we want to make sure people understand the trade-offs that are involved in using these remote nodes. So Sarang is going to start off talking about the ways you can use Monero with different types of connections and really what these nodes do for you.

Sarang: Right, so in some sense, regardless of the wallet software that you choose to use, at some point you need to communicate with some type of software that’s been maintaining a copy of the blockchain. The reason you do that is that when you generate a new transaction, your wallet software needs both to know the particular outputs you are trying to spend and to be able to pull decoys for ring signatures, which we’ve talked about extensively, from the history of the chain. So at some point some software needs to have a copy of the chain. Now there are a few different ways to do that, one of which you will often hear people describe as “run your own node”. Well, running your own full node basically means that you have a machine somewhere, some kind of device, typically a computer. It actually contains a full copy of the Monero blockchain, and as it receives new blocks it tacks them onto the chain, verifies that they are all correct and then makes that chain data available when you need to send a transaction. Then it can scan new blocks that come in and update the balances of wallets that you’ve told it to watch for. However, there is an alternative, typically used with devices that are small and lower powered or don’t have the storage capabilities, for example a phone, where I typically don’t want to have to download and maintain a copy of the chain. That would involve a lot of time, a lot of space and probably a lot of battery that I might not want to spend. So in that case I can tell the wallet software to connect to a node, a remote node we call it, that’s maintaining its own copy of the chain somewhere else. That could be a computer that I control somewhere or it could be a computer that someone has chosen to make available as a service to the community and the network.
There’s also another option where you can use a sort of semi-custodial service like MyMonero or an open monero server. Those work a little bit differently in that you delegate a little bit of trust to them so that they can watch for transactions on your behalf, but it’s not really quite the same thing. So perhaps Justin, now you can talk a little about what any of these types of nodes actually do and why we need to interact with them at all.

Justin: Excellent! So as you sort of hinted at, you need some way to communicate with the other participants on the network. It’s all fine for you to sign a transaction locally, but you ultimately need to tell everyone that this transaction took place, to get it to other places. The first thing you really need a node for is getting information out to the rest of the network. One way to do that is to connect to someone else who already has these connections and tell them to forward it along. Likewise, if someone is trying to send a transaction to me, for instance, instead of running direct connections to other people in the network I might say, I’ll just communicate with this one person that’s running this node infrastructure and they’ll tell me when I have transactions that come through. They will send me blocks, I’ll take those blocks, I’ll look at them locally, I’ll determine whether or not any of those blocks have any transactions that are for me, and I learn about network status updates, so to speak, through this node and how the chain is proceeding. Using a remote node is really useful because it facilitates the initial sync process when you are putting your wallet back online. When you put your wallet back online and it hasn’t communicated with the network for the past few days, it needs to look at all the recent data that’s been generated and determine if any of it is for you. If you use a remote node, that means you can get those blocks directly. If you try to run your own full node, then you would need to sync all the block data that way. But one thing we really want to emphasize is that when you’re using a remote node you do not give away your private keys.
You don’t give the right to spend money to these nodes; instead you’re simply requesting information from them or giving them information, but none of the information you give them gives the remote node the right to spend your funds. Similarly, we’re not referring to the private view key either, like you would give to a MyMonero or open monero type service. There’s a layer removed in terms of interactivity; it’s less interactive with these remote nodes compared to that sort of setup. Ultimately, when you send transactions, remote nodes give you the data that’s needed for you to actually build new rings in your transactions. You have your one output that you’re trying to send to someone else. You need to find all the other outputs to build this transaction, so you request these other outputs from the remote node, they give them to you, you sign off on the transaction, give the transaction to the remote node and they broadcast it to the rest of the network. Sarang, can you talk about the actual process of sending that transaction out in a little more detail?
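
The local scan Justin describes (the node hands over blocks, the wallet decides on its own which outputs belong to it) can be sketched in Python. This is a toy model and not Monero's actual cryptography: real wallets recognize their outputs through one-time addresses derived from the view key, whereas here a hypothetical `tag_for` helper uses an HMAC as a stand-in. The names `Output`, `tag_for` and `scan_blocks` are assumptions made for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    global_index: int  # position of the output on the chain
    tx_pubkey: bytes   # per-transaction public data (toy stand-in)
    tag: bytes         # toy stand-in for the one-time output key


def tag_for(view_secret: bytes, tx_pubkey: bytes) -> bytes:
    """Toy ownership tag; real Monero derives a one-time key instead."""
    return hmac.new(view_secret, tx_pubkey, hashlib.sha256).digest()


def scan_blocks(blocks, view_secret):
    """Return the outputs that belong to this wallet.

    The check runs locally: the node only supplies block data and
    never sees view_secret.
    """
    mine = []
    for block in blocks:
        for out in block:
            expected = tag_for(view_secret, out.tx_pubkey)
            if hmac.compare_digest(out.tag, expected):
                mine.append(out)
    return mine
```

The point of the sketch is only the direction of the data flow: blocks come in from the node, and nothing about which outputs matched goes back out.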

Sarang: Right, suppose that my phone, for example, wants to be able to interact with the Monero network but I don’t want to run a full node on my phone, since that requires a lot of space and a lot of time to keep it up, so instead I choose to connect to a remote node. Again, that can be a node that I have personally set up and know has correct data, or it can be a publicly available remote node that someone else has set up on behalf of the network. First I have to tell my wallet software to go and actually sync with the full chain. Now that doesn’t mean that it downloads and keeps the whole blockchain; instead it requests block information, which in particular contains transaction data. Now again, that node does not know which transactions are destined for me, because I’m not going to tell it that; that would be leaking information. So instead I basically tell it to just start sending me transaction data, and my phone will locally go through and identify the transactions that were destined for me and use those to store some balance information for sending transactions later on. Of course, as new blocks come in that the remote node passes on to my wallet, my wallet can update itself. It’s keeping a minimum amount of information and relying on the fact that the remote node is keeping its own copy of the blockchain in sync. Then, when I’m ready to send a transaction, my wallet already knows what outputs I want to spend. What I don’t know, though, is all the other decoy output data. Remember, my phone is not keeping that information because that would imply that I’m keeping the entire chain locally. Instead my phone goes and asks the remote node, please send me the details on blockchain outputs 1 through 10. Of course these are actually going to be outputs selected according to a distribution that we talked about earlier.
Though crucially, when I say that I would like outputs 1 through 10, or 1 through 11, or however many I want depending on the ring size, I’m also going to be asking for the output that I already know the details of. Because remember, for every ring that I am going to be generating I already know the true sender output. I know the output that I want to spend. Of course I don’t want to just ask for the other decoys from the remote node, because later when I actually send the transaction the node could be like, wait, you requested only 10 outputs but this ring carries 11 members, so I know which one the true spend is. That’s bad. Instead my phone will request all 11 ring members that it’s going to want and then from there build the transaction with them. Again, what’s the benefit? The benefit is that the remote node does not learn from the request which output is the true spend. So it requests all the ring member information, and the node itself does not necessarily know which ones are the decoys. Then my phone will use its own crypto libraries to go and do the cryptography, to do the signatures and all the transaction generation to build that transaction, and then it will go and actually send the transaction off. In theory there’s very little communication that needs to happen with the remote node to build the transaction. It’s basically just pulling information about the decoys and the actual ring member needed to build a ring.
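
A minimal Python sketch of the request Sarang describes, in which the wallet asks for its own output together with the decoys so that the request has as many entries as the ring. The function name `build_output_request` and the parameter `num_outputs` are hypothetical, and decoys are drawn uniformly here only for brevity; as noted in the discussion, real wallets select them according to a distribution.

```python
import random


def build_output_request(real_index, num_outputs, ring_size=11, rng=None):
    """Pick the global output indices to request from a remote node.

    Assumption: uniform decoy sampling, which is a simplification of
    the distribution a real wallet would use.
    """
    rng = rng or random.Random()
    if not 0 <= real_index < num_outputs:
        raise ValueError("real_index must refer to an existing output")
    if ring_size > num_outputs:
        raise ValueError("not enough outputs on chain for this ring size")
    candidates = [i for i in range(num_outputs) if i != real_index]
    decoys = rng.sample(candidates, ring_size - 1)
    # Include the real output and sort, so that neither the size nor the
    # ordering of the request singles out the true spend.
    return sorted(decoys + [real_index])
```

Requesting only `ring_size - 1` indices would let the node compare the request with the published ring and identify the one member that was never asked for, which is exactly the leak described above.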

Justin: Yes, I think it’s important to note that a lot of effort was put forth to limit the amount of data that is leaked to these remote nodes, so it’s not super obvious that remote nodes can do a ton of really, really bad things. But this is Breaking Monero. We’re supposed to go through and find all the limitations. We talked about the status quo, of course, sort of how these remote nodes work. Now let’s talk about what happens if the remote nodes you’re using are evil, they’re attackers, they’re trying to mess with you in some specific way to some end. What could the remote nodes do if they’re evil? Well, they could know, well not they could, they do know for certain what IP address you are using to connect to them. If you are on your home network and, without any other protections, you just directly connect to your remote node and start sending transactions, they know your home IP because you opened a connection to them from that IP. If you take some precaution where you use a VPN in the middle or something, you distribute your trust to the VPN service or to the Tor server or whatever service you put in the middle, and that would help prevent the remote node from clearly tying information to your IP. That would potentially help, but there is a bigger risk in leaking your IP when you use remote nodes compared to connecting to the network at random, because if you connect to the network normally from your real IP people still don’t necessarily know which transactions you sent from your IP, but if you use a remote node they know exactly what transactions you’re trying to send. Similarly, they could try to provide you with bogus ring data. Sarang is going to talk a little more about this exact sort of attack and the ways that Monero has tried to mitigate it.
But you request information from the node and you have no ability to verify any of it other than for the money you control locally, because you don’t have a local copy of the chain, so the node can try to mess with you under certain circumstances and in certain ways to learn some information about your transaction. And then finally there could be a sybil-type attack where they refuse to propagate your transaction. Suppose that you are sending a transaction from, I don’t know, a specific internet service provider connection and they don’t like that ISP for some reason. They might just block transaction broadcasts from those sorts of ISPs. Those are some of the risks you are taking when you connect to these remote nodes: they might not do what you are hoping they will do, they might act in some other way. In that case they’re more so being annoying, but they’re being annoying to your (?) you’re going to have to find another way to get your transaction out there. Sarang, can you speak a little bit more about what happens if your wallet receives bogus data and what your concerns are there?

Sarang: This is something that is kind of specific to Monero. Like I said, when you need to send a transaction, if you have a ring size of 11 like we do today and I am spending a particular output, I already know the details of that output locally because my phone stores my own output data. That’s how it keeps balance information, for example. But it’s going to request 11 total outputs from the remote node, 10 of which I know are decoys and one of which I secretly know is my own. Of course the remote node could just decide, well OK, here’s what I’m going to do. I’m going to replace some of those outputs that I’m sending back to the phone with either corrupted data or with information about outputs that the remote node itself might control. Because again, locally I don’t have any information about those. I can’t tell if they’re corrupted.

Justin: Sorry, to interject really briefly. Sarang, to give an example, suppose I am connecting to you and you are the remote node. I have my output that I’m trying to spend and I request 11 outputs from you. I request my own output that I already control and 10 others that I choose, but since I don’t have the means of verifying those 10 other outputs you might mess with those. For any number of the outputs that are requested, you might substitute information about outputs you already control or completely bogus data. Is that correct?

Sarang: Yes, exactly. Now on one extreme, I as the remote node could decide to send Justin, for example, either 11 outputs that I control or 11 totally random, corrupted outputs. Now Justin’s software knows his own output, and since I sent all corrupted output information, that means that the one that his phone knows about is corrupted. In that case his wallet, if it is set up correctly to do this, and the default wallets are, will just raise holy hell and tell him that something is going on and that there was some kind of corrupted data. If he’s smart he will immediately disconnect from my node and never trust it again. What he could try to do if he was just, I don’t know, really really bad at this, is say, fine, I’ll just try generating the transaction again. Well, his wallet is not smart. That means it’s going to try to spend the same output again but it might request an entirely different set of decoys. In that case I, who am again receiving this information, would be able to see the outputs that are in common between those requests that he made and try to pull some kind of ring intersection shenanigans and figure out what his true spend is. But there is a more insidious way to do it. I’m not going to corrupt everything, because his wallet is going to freak out, but what I could do instead is selectively pick just a few outputs. Again, I don’t know which one is his, so I’m just going to randomly pick a few of them, corrupt those and send the data back. If I’m unlucky I will have corrupted his true input and then his wallet will freak out and raise holy hell. But if I’m lucky and I happen to corrupt only ones that are used as decoys, his wallet software can’t detect that. That means if he sends the transaction anyway, I can look at my data and say, OK, statistically I now have a greater probability of determining what his true output is, because I know it’s not one of the ones that I corrupted.
The remote node can basically try to pull some statistical attacks to try to gain information about what the true spend is. It’s a subtle attack that only works some of the time, which makes it kind of insidious. It’s worth noting though that in all of these cases this does not mean that the remote node can do anything like spend funds that it doesn’t control; that’s not possible. All it means is that the remote node can try to pull some shenanigans to either definitively show what the true spend was, in the case of Justin requesting totally new outputs, or it can selectively corrupt some of the outputs in an attempt to gain a greater chance of guessing the true one statistically. It’s very subtle and it relies on the fact that you have to request information from the remote node, some of which you can’t verify as correct.
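
Both attacks can be put into numbers with a short Python sketch. It assumes the node has no other information and corrupts ring members chosen uniformly at random; the function names `intersect_requests` and `selective_corruption_odds` are hypothetical.

```python
from fractions import Fraction


def intersect_requests(first_request, second_request):
    """Outputs common to two requests for the same spend (retry case)."""
    return set(first_request) & set(second_request)


def selective_corruption_odds(ring_size, corrupted):
    """Odds for a node that corrupts `corrupted` random ring members.

    Returns a pair:
      - probability that the wallet's own output is untouched, so the
        wallet raises no warning
      - the node's chance of guessing the true spend in that case
    """
    if not 0 <= corrupted < ring_size:
        raise ValueError("corrupted must be in [0, ring_size)")
    undetected = Fraction(ring_size - corrupted, ring_size)
    guess = Fraction(1, ring_size - corrupted)
    return undetected, guess
```

For a ring of 11 with 5 corrupted members, the node goes unnoticed with probability 6/11 and, if it does, its guess improves from 1 in 11 to 1 in 6. In the retry case, two requests that share only one output point directly at the true spend.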

Justin: If you sent a transaction or sent outputs to me that were corrupt and I signed it and gave it back to you, would that transaction be able to be relayed to the rest of the network or would the network reject it?

Sarang: It depends if the outputs that I corrupted are still valid outputs somewhere. If I just start replacing the outputs I send to you with random bs data for example then you will basically be signing something that has bs data on it and then the network will just reject it. But if instead I happen to control a selection of outputs and those are the ones that I feed to you, those are all valid outputs on the chain somewhere and I happen to control them, then you would sign a transaction that does to the rest of the network appear valid and it would actually be accepted. So it all depends on how I do the corruption.
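
The distinction Sarang draws can be modelled with a toy check in Python. It assumes the ring is a list of `(index, key)` pairs as returned by the node and the chain is a plain dictionary; the name `network_accepts` is hypothetical, and real verification involves far more than this lookup, signatures included.

```python
def network_accepts(ring, chain_outputs):
    """Toy validity check: every ring member must exist on chain with
    matching data. This models only the 'is it a real output' question."""
    return all(chain_outputs.get(index) == key for index, key in ring)


chain = {1: "k1", 2: "k2", 3: "k3"}

# Random garbage in place of an output: the ring no longer matches the chain.
garbage_ring = [(1, "k1"), (2, "garbage")]

# Outputs the node controls are still real chain outputs, so they pass.
substituted_ring = [(1, "k1"), (3, "k3")]
```

In this model `network_accepts(garbage_ring, chain)` is false while `network_accepts(substituted_ring, chain)` is true, matching the two cases in the answer above.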

Justin: Interesting, thank you so much. We’re now going to end with, what are ways that people can mitigate these sorts of attacks? I mean the really obvious one here is don’t use a remote node, right? If you can help it, you can avoid all of this by using your own node. There are still considerations with using your own node but ultimately that’s what’s best for your security and privacy. Even if you want to combine the convenience of a remote node with the benefits of running your own full node, what you could do is just run your own full node at your house and have it running 24/7, and then you can have your computer run a sort of software library, like open monero or fast sync for example, where you’re able to essentially connect to it as your own remote node or as your own MyMonero type server. That way you’re able to get all the benefits that you would get by connecting to a remote node, but instead of trusting someone else with this data or trusting them not to pull shenanigans, you’re trusting yourself. There’s no incremental level of trust there; you’re getting all those benefits by using your own infrastructure that you have set up, not trusting someone else to set up infrastructure. There are a few other reasonably safe options, and this really depends on your use model. If you are willing to trust some people, say I think Sarang is a really awesome dude and Sarang is like, hey, I’m running a remote node at this IP address, if you want to send transactions through it by all means go at it. If I trust Sarang then I can use his remote node, and of course I still need to trust him, but if I’m willing to do that then that’s OK for my specific threat model in that case. What’s potentially a little less safe, and it depends on the exact set of circumstances of course, is that you can just connect to a remote node from a list at random.
Connect to some remote node where you have no idea who runs it or under what circumstances. As an example, there are several services, and I’m going to share my screen here, that show you or at least pull up a list of all the remote nodes that are available. One example is node.pwned..systems and you can see that they just scrape the Monero network for open nodes. There are quite a few of them on this list. I’m obviously not going to cover all of them or any of them specifically, but just as an example these are all nodes I can connect to, right? They’re available, they let me do it, so I theoretically could just connect to one at random. Then of course I don’t know who’s behind it. They could be, you know, pulling some shenanigans or something. And of course the best thing you really can do is, if your wallet starts complaining to you that something is wrong with the remote node, something is wrong with the remote node! So stop, don’t use it anymore. Connect to another one, because if you’re getting some of these warnings in your wallet, it’s really, really likely that the remote node is acting up and trying to be malicious against you. It’s not like any reasonable remote node will be configured to feed you bogus data to begin with, like by default. That does not make any sense.
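
The advice to stop using a node once the wallet complains can be sketched in Python as follows. This is an illustration under assumptions, not how any particular wallet implements it: each node is modelled as a callable returning `{index: key}`, and `NodeDataMismatch`, `check_own_output` and `fetch_from_first_honest` are hypothetical names.

```python
class NodeDataMismatch(Exception):
    """A node's answer contradicts data the wallet already holds."""


def check_own_output(response, real_index, known_key):
    """Compare the node's answer for our own output with the local copy."""
    if response.get(real_index) != known_key:
        raise NodeDataMismatch(f"bad data for output {real_index}")


def fetch_from_first_honest(nodes, indices, real_index, known_key):
    """Try nodes in order and drop any whose answer fails the local check.

    `nodes` maps a label to a callable returning {index: key}.
    Only the wallet's own output can be verified this way; corrupted
    decoys would go unnoticed.
    """
    rejected = []
    for label, fetch in nodes.items():
        response = fetch(indices)
        try:
            check_own_output(response, real_index, known_key)
        except NodeDataMismatch:
            rejected.append(label)  # never use this node again
            continue
        return label, response, rejected
    raise NodeDataMismatch("every node failed the local check")
```

The sketch reuses the same `indices` when it moves on to the next node. That choice is deliberate: as discussed earlier, asking again with a fresh set of decoys for the same spend is what makes the intersection of requests informative.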

Sarang: In that case the best possible scenario is that you are connecting to an unreliable node which, you know, is not useful. And the worst case is that it is being malicious and trying to pull some shenanigans in order to gain information. There are plenty of nodes that are either run by trusted community members or trusted by trusted community members. But again, the best solution is just to run your own remote node. You can still use a lower powered device like a phone to connect to it, but you know that the chain data that’s being provided is controlled by you, and you are the most trusted person to you, I hope.

Justin: Exactly, and then one last thing that we talked about a little bit earlier: if you want to help hide some of the network layer metadata from the remote node, so that it potentially doesn’t know your location for instance, you can still route your connection through Tor or something. Maybe not if you’re syncing your data, since that would be really slow, but especially if you’re just doing transaction broadcasts and things with remote nodes that could be pretty reasonable. If you want to use a VPN that you trust, for instance, that might be beneficial for many users with those specific threat models too. Alright, so I think we covered a lot about what the status quo is, what remote nodes are in Bitcoin and Monero, what’s unique to the Monero part, and what some of the concerns are with using remote nodes in regards to security and privacy. Are there any last thoughts you want to leave the listener with, Sarang?

Sarang: Just that this is something that really depends on your personal threat model and your personal use case and you know it’s worth sitting down for ten minutes and deciding what level of trust you’re willing to offload onto other entities, if any, and then use that to make an informed decision about your interaction with nodes at all.

Justin: Alright, thank you again Sarang, thanks again everyone for watching and catch us on the next one. Take care.

Sarang: See ya.