
Rodi - anonymous P2P


  1. #1
    tm
    tm is offline
    Registered User tm is on a distinguished road
    Join Date
    Jul 2004
    Posts
    2,096
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    86

    Rodi - anonymous P2P

    I noticed that this new experimental P2P application called Rodi was being promoted here, so I thought I would start a thread about it.

    One thing that I think the website badly lacks is a short summary of what Rodi does; just to figure out what Rodi is requires slogging through pages of a combination of intricate details, theoretical rationalizing, and generalized arguments about what's wrong with other P2Ps. A short summary at the beginning in as few words as possible would be very helpful.

    Rodi is a command-line client, which is going to make it a hard sell to the bulk of the P2P community, since nearly everyone these days expects a GUI in any Windows application.

    I tested it, but runRodi.bat does not work for me - the command prompt windows will only flash when I run it. I looked inside and the java directory needed to be changed, as mine was not "C:\j2sdk1.4.2_05\bin\javaw" But even with the right directory set, it still did not run. Linux users will probably find the instructions a lot easier to understand than Windows users.

    However, everything in the 'tools' folder will run - or at least it does not crash. But that file named 'bigRedButton.bat' is a bit scary. Will it really delete all files in C: like it implies?

    Rodi appears to use IP spoofing; the uploader sends UDP packets with a fake originating IP address, and the receiving client then sends the ACK packets back to a 3rd-party proxy, which then reroutes the ACKs back to the uploader. So the downloader's IP address is known to the uploader, but the uploader's actual IP is unknown to the downloader. Therefore, anonymous file transfers are achieved without any significant bandwidth overhead because only the ACK packets get routed through a proxy; the file's data transfer itself is direct.
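    A toy model of the flow described above may make it easier to follow. It is only a sketch of the description in this post, not Rodi's actual code; the `Packet` class, the function names, and all addresses are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    src: str      # source address written in the IP header (can be forged)
    dst: str      # destination address (must be real for delivery)
    payload: str

# All addresses below are hypothetical documentation-range examples.
UPLOADER = "198.51.100.7"    # real address of the uploader
FAKE_SRC = "203.0.113.99"    # forged source address put into data packets
PROXY = "192.0.2.10"         # third-party ACK proxy
DOWNLOADER = "192.0.2.200"   # downloader

def uploader_send(chunk: str) -> Packet:
    # Data goes straight to the downloader, but with a forged source.
    return Packet(src=FAKE_SRC, dst=DOWNLOADER, payload=chunk)

def downloader_ack(data: Packet) -> Packet:
    # The downloader cannot reply to the forged source; it ACKs via the proxy.
    return Packet(src=DOWNLOADER, dst=PROXY, payload="ACK " + data.payload)

def proxy_forward(ack: Packet) -> Packet:
    # Only the proxy knows where ACKs really have to go.
    return Packet(src=PROXY, dst=UPLOADER, payload=ack.payload)

data = uploader_send("chunk-0")
ack = downloader_ack(data)
relayed = proxy_forward(ack)

# Every address the downloader has seen or used during the exchange.
seen_by_downloader = {data.src, data.dst, ack.dst}
print(UPLOADER in seen_by_downloader)   # False: uploader address never appears
print(relayed.dst == UPLOADER)          # True: the ACK still reaches the uploader
```

    Only the small ACK takes the detour through the proxy, which is why the overhead stays low.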

    IP spoofing is a very promising technique, achieving 100% anonymity while at the same time near 100% speed. SUMI is a now-dormant open-source project that was trying to achieve that same goal of IP spoofing - however, SUMI was not a network, only a 3-party anonymous transfer mechanism that used an IRC server as ACK proxy.

    I understand that many people, maybe a majority, cannot IP spoof because their ISP will drop all packets with a non-matching source IP. The former ES5 users might be the best source of real-world information about IP-spoofing and who might be eligible to employ it, since ES5 used a version of spoofing with its AXP protocol.

  2. #2
    Registered User larytet is on a distinguished road
    Join Date
    Jun 2006
    Posts
    0
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    31
    I tested it, but runRodi.bat does not work for me - the command prompt windows will only flash when I run it. I looked inside and the java directory needed to be changed, as mine was not "C:\j2sdk1.4.2_05\bin\javaw"
    Try clicking the JAR file instead of using runRodi.bat.
    Try replacing javaw with java,
    something like
    Code:
    C:\j2sdk1.4.2_05\bin\java -Xdebug -jar rodi.jar LocalIP 0.0.0.0
    
    and post here or send me (mailto:larytet@yahoo.com) the log and i will try to figure out what is going on.
    for alternative ways to run the client see http://larytet.sourceforge.net/tryRodi.shtml

    Rodi is a command-line client, which is going to make it a hard sell to the bulk of the P2P community, since nearly everyone these days expects a GUI in any Windows application.
    GUI is under development. My mental milestone - end of Apr.

    CLI is not totally useless. For example, one can put an applet on a web server behind a "Download file X" link - the applet is about 250K (like a torrent for a large video) and can be made ~30% smaller by removing debug info and debug CLI commands.
    When a user clicks the link she actually runs the applet and connects to the Rodi network. The Rodi CLI client can run initialization scripts, so a webmaster can easily configure the Rodi client to download the correct file.

    One thing that I think the website badly lacks is a short summary of what Rodi does
    and this is after two or three major cleanups. The main page http://larytet.sourceforge.net/btRat.shtml is reasonably understandable considering my English and very limited writing skills.

    i invite everybody to help the project (see the Join Us page at http://larytet.sourceforge.net/joinUs.shtml). The project is desperately looking for a person with good writing skills. The project is open source and i am not seeking money/donations/etc.



    I understand that many people, maybe a majority, cannot IP spoof because their ISP will drop all packets with a non-matching source IP.
    That's true. But i think it's enough that 1% have the ability to spoof. Courts will eventually stop accepting logs of IP addresses collected at the edge of the network, because the IP source of a packet is not reliable.

    Anonymity is not problem number one though. Protecting the publisher against DDoS is a no less important task. One of the ways to do it is to spoof the IP port (see User Manual Lesson 3 http://larytet.sourceforge.net/userManu ... sson%203.0)

    But that file named 'bigRedButton.bat' is a bit scary. Will it really delete all files in C: like it implies?
    Code:
    @set EXTENSIONS=*.mov *.avi *.rar *.vob *.mpeg *.mp3 *.mp4
    @set DEL_PATH=C:\
    @set PATTERN=The rabbit-hole went straight on like a tunnel for some way, and then dipped suddenly down, so suddenly that Alice had not a moment to think about stopping herself before she found herself falling down a very deep well
    @set FILE_NAME_PREFIX=bigRedButton
    
    @echo Delete %EXTENSIONS% from subdirs of %DEL_PATH%
    :: @echo press Enter if you are agree, press Ctrl-C to exit
    :: @pause
    
    @cd %DEL_PATH%
    @del %EXTENSIONS% /S /Q
    
    @echo Fill disk
    ....................................
    
    This code removes all files with extensions *.mov *.avi *.rar *.vob *.mpeg *.mp3 *.mp4, etc. After that it attempts to fill the disk with some pattern. I think that this makes the data of the removed files unrecoverable, though i cannot guarantee it. The batch file takes time - ~20 minutes for an 80G disk.

    I suggest - i am deadly serious - to use an external hard disk and keep a bottle of gasoline around. It should take 3-4 minutes to heat the hard disk up to the point where the stored data is destroyed.


    finally i am happy to hear that somebody at least tried this thing and i appreciate your attempt to bring attention to my project.


    P.S. "anonymous" is not the exact word and it is only part of the project, and not even a significant part. probably it is important from a "marketing" point of view but i do not want to leave any misunderstanding here. The main purpose of Rodi is to be an open source decentralized alternative to the existing commercial Internet search engines. I want to find text by content and I want to download it immediately without going to Amazon or somewhere else. I want to be able to find anything stored on the Internet and not only linked HTML files. I want to protect publishers of sensitive content. I want to build a network of trust (see public/private keys and identification server in the User Manual) which does not need any index servers like SuprNova. I truly believe that I found the answer to all this.

    P.S.1 The IP address of the downloader can be spoofed too, btw. Only the publisher knows which IP to send data to; the downloader can send packets with an arbitrary IP source in the IP header. A man in the middle will never know for sure where a packet comes from, only its destination.


    P.S.2 Must read - http://craphound.com/complexecosystems.txt
    It explains what i am trying to do - i am trying to open Internet search to anybody by providing immediate access to the content and the ability to run one's own web page index engine with one's own word-ranking criteria

  3. #3
    tm
    Thanks for your help, larytet. That should help me get set up.

    I was not familiar with jar files - I assumed that it was just another archive format like zip or rar, so I just unpacked it. I have had problems with running several Java applications; Ants does not run at all. I might set up another partition with clean os installation as a test platform.

    The GUI will be a very welcome feature. Would it be possible to create an extremely basic GUI - basically a demo that still requires some command-line functions - just so the average (non-programmer) computer user can have something to run and demonstrate Rodi's abilities? Rodi seems to be (at least so far) aimed more at people who might be at ease recompiling a kernel, but the vast majority of computer users will probably not want to even deal with anything that requires more than point-and-click functions. If there is still much work to be done on the command-line client, a fully functioning GUI might just be a waste of time at this point, but I think that a very simplified (demonstration) GUI might be good to have even now.

    I think that the main problem with Rodi's website is that people visiting the website are likely to wrongly conclude that Rodi is probably the same type of anonymous self-proxying P2P as Freenet, Winny, or Mute - P2P applications that many of us have already tried and judged to be much too slow.

    In order to attract users, any new P2P network must have a reason to exist - something that makes it substantially better than other available P2P networks. Anonymity combined with speed is a very desirable feature, and would be a very strong selling point.

    I would like to volunteer my help in writing a synopsis for Rodi, but I do not yet have a full understanding of exactly how the Rodi network functions. A brief outline - and a simplified diagram showing a 3 member cell - would be very helpful.

    Rodi is the best idea for a P2P network that I have yet seen. Being opensource, I hope Rodi can attract other coders and develop into the next major P2P network.

  4. #4
    Registered User larytet is on a distinguished road
    first of all i want to thank you for the support and warm words. it's worth more than any amount of money (or almost any) when somebody appreciates your efforts.

    Ants does not run at all. I might set up another partition with clean os installation as a test platform.
    try to reinstall java from java.sun.com; you need only the java runtime environment. This one will do i think http://java.sun.com/j2se/1.4.2/download.html

    If there is still much work to be done on the command-line client, a fully functioning GUI might just be a waste of time at this point, but I think that a very simplified (demonstration) GUI might be good to have even now.
    mainstream functionality is already supported by the Rodi core - find file(s) by file name, publish file, download file. I am working on the GUI and a beta will be released in 3-5 weeks from now. as usual i prefer functionality, speed and low memory and CPU consumption over pretty interfaces (a preview is available on the Try Rodi page).
    The good news is that the interface between the Rodi Core and the GUI is documented and socket based. Anybody can write a user interface controlling the Rodi Core.

    A brief outline - and a simplified diagram showing a 3 member cell - would be very helpful.
    i thought that the diagram on this page http://larytet.sourceforge.net/rodiAnonymity.shtml explains it.
    See also the text immediately under the diagram. If you think that the diagram is not clear enough and contains too many details i will try to split it into two figures - one simplified and one more detailed.

    William - beta testing and documentation, has some personal stuff to take care of and cannot devote significant time to the project
    princejer - project management and design, bought a nice piece of RE in California (i guess you are aware of the prices in the golden state) and is looking for a new job now.
    It means that i am essentially the only person behind this project. My Internet connection is guarded by a corporate bidirectional firewall which i have yet to figure out how to penetrate. The Rodi network exists on my single-CPU PC when i run tests with multiple Rodi clients. yes, that's right, you can run multiple clients on the same machine. I tried up to 9 applications. Every additional client costs ~25MB of RAM/~0.1% of CPU.

  5. #5
    tm
    Many times in the past I have posted a long, detailed explanation to someone's question, only to have the same person re-ask the same question farther down the thread, obviously ignoring my post. If there is one thing that I learned it is that many, if not most P2P users simply do not wish to take the time to read a long detailed explanation - they are too impatient and want only a quick answer. That was why I recommended 2 parts to the website - one for detail-oriented technical people, and another section - a short summary - for non-technical people that describes it in the simplest possible terms (even if not entirely accurately)

    Because Rodi is a truly innovative idea and is different from all other P2Ps, it is going to be more difficult for the average P2P user to understand the concepts behind how it works. That is why I think a simplified explanation targeting the casual user is important.

    Another thing that would be very helpful would be a detailed help page that carefully explains every step needed to set up and run Rodi, as well as common problems likely to be encountered. Otherwise, a lot of people are going to have a hard time and may quickly abandon Rodi. Like I mentioned earlier, I could help with this once I understand Rodi better.

    One question that I have is how are files hashed? Is there a separate checksum of each 32KB part of the file, for instance? I think that for IP spoofing to work effectively, the file should be sub-hashed using the smallest possible unit (perhaps even measured in bytes, not KB). That way a file could be virtually "streamed" anonymously and an ACK would only need to be sent back for the file segments which failed to complete. Theoretically then, if every segment completed and hash-checked without an error, then no ACK would even need to be sent back to the uploader.
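    The idea in this paragraph (hash every small segment, and report back only the segments that failed) can be sketched roughly as follows. This illustrates the proposal made here, not how Rodi works; the 1024-byte segment size and the function names are assumptions.

```python
import hashlib

SEGMENT_SIZE = 1024  # bytes; assumed value for the sketch

def split(data, size=SEGMENT_SIZE):
    """Cut data into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def segment_hashes(data, size=SEGMENT_SIZE):
    """SHA-1 digest of every segment, as the sender would publish them."""
    return [hashlib.sha1(seg).digest() for seg in split(data, size)]

def failed_segments(received, expected_hashes, size=SEGMENT_SIZE):
    """Indices of segments that are missing or do not match their digest."""
    segs = split(received, size)
    return [i for i, expected in enumerate(expected_hashes)
            if i >= len(segs) or hashlib.sha1(segs[i]).digest() != expected]

original = bytes(range(256)) * 16      # 4096 bytes -> 4 segments
hashes = segment_hashes(original)

damaged = bytearray(original)
damaged[2048] ^= 0xFF                  # flip one byte inside segment 2

print(failed_segments(bytes(damaged), hashes))   # [2]
print(failed_segments(original, hashes))         # []
```

    An empty list means nothing has to be sent back at all, which is the "no ACK needed" case described above.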

    From http://www.open-content.net/specs/draft ... ex-02.html:
    Thus the authors recommend a segment size of 1,024 bytes for most applications, as a sort of "smallest common denominator", even for applications involving multi-gigabyte or terabyte files. This segment size is 40-50 times larger than common secure hash digest lengths (20-24 bytes), and thus adds no more than 5-10% in running time as compared to the "infinite segment" size case -- the traditional full-file hash.
    Although a 1KB segment size is recommended, most P2Ps use much larger hash segments, only to discover later that the large segment is vulnerable to a corruption attack, an increasingly common practice.
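    To put numbers on the trade-off, here is a small calculation of how big a flat list of digests gets for a given segment size. It assumes 20-byte digests (the lower figure in the quote above) and a hypothetical 4 GiB file; it does not model a hash tree.

```python
def hash_list_size(file_size, segment_size, digest_size=20):
    """Bytes needed to store one digest per segment (flat list, no tree)."""
    segments = -(-file_size // segment_size)   # integer ceiling division
    return segments * digest_size

GIB = 1024 ** 3
FILE_SIZE = 4 * GIB                            # hypothetical example file

for segment_size in (1024, 4 * 1024 ** 2):
    size = hash_list_size(FILE_SIZE, segment_size)
    print(segment_size, size, f"{100 * size / FILE_SIZE:.4f}%")
```

    With 1024-byte segments the list is about 2% of the file (80 MiB for this example), while with 4 MiB blocks it is only 20 KiB.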

    What size are the smallest file hash segments in Rodi?

  6. #6
    Registered User larytet is on a distinguished road
    Another thing that would be very helpful would be a detailed help page that carefully explains every step needed to set up and run Rodi, as well as common problems likely to be encountered.
    i tried to do that in the User Manual. Lesson 5 should more or less answer most of the questions. Rodi does not require any installation if Java is already on your PC. You simply run the Rodi applet or click the JAR file (see Lesson 1 and so on)

    I am aware of this problem of not reading documentation - the GUI will help. Meanwhile Rodi is for web professionals, server admins, etc.

    What size are the smallest file hash segments in Rodi?
    hmmm ... actually it is 4M blocks, though the streaming itself is in chunks of 1024 bytes. This is the simple answer to the question.

    There is another answer, part of which can be found here http://larytet.sourceforge.net/btRatDes ... l#download (Rodi Design Download&Upload)

    The publisher can create something like a torrent file (see User Manual lesson 4.1 http://larytet.sourceforge.net/userManu ... sson%204.1). It is an XML file (torrent files are bencoded) containing hashes of 4M blocks or blocks of any size - the block size is one of the XML file parameters
    Code:
    <?xml version='1.0' encoding='utf-8'?>
    <rodiHash name="http://larytet.sourceforge.net/scripts/example7.data" size="12411" blockSize="4194304" blocksNumber="1" hash="25f37edd97d83d3ef13fa7b1eba5511e">
    
    <file  name="http://larytet.sourceforge.net/scripts/example7.data" size="12411" blocksNumber="1">
    <blocks>
      <MD5>9a2ec33de0d18190b58838bbfb53dfcb</MD5>
    </blocks>
    </file>
    </rodiHash>
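
    For readers who want to experiment, here is a rough sketch of how a file in this shape could be produced and checked. It only mirrors the element and attribute names visible in the example above; the top-level `hash` attribute is left out because the post does not say how it is computed, and the function names are made up.

```python
import hashlib
import xml.etree.ElementTree as ET

def build_manifest(name, data, block_size=4194304):
    """Build a rodiHash-like element with one MD5 per block of `data`."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    common = {"name": name, "size": str(len(data)),
              "blocksNumber": str(len(blocks))}
    root = ET.Element("rodiHash", dict(common, blockSize=str(block_size)))
    file_el = ET.SubElement(root, "file", common)
    blocks_el = ET.SubElement(file_el, "blocks")
    for block in blocks:
        ET.SubElement(blocks_el, "MD5").text = hashlib.md5(block).hexdigest()
    return root

def verify_block(manifest, index, block):
    """Check a downloaded block against the MD5 listed at position `index`."""
    md5s = [el.text for el in manifest.iter("MD5")]
    return hashlib.md5(block).hexdigest() == md5s[index]

data = b"x" * 12411                      # same size as the example file
manifest = build_manifest("example7.data", data)
print(ET.tostring(manifest, encoding="unicode"))
print(verify_block(manifest, 0, data))          # True
print(verify_block(manifest, 0, b"corrupt"))    # False
```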
    
    There is another layer of protection which hopefully will make torrent files optional. It is called the identification server. I am attempting to create a network of trust based on the nicknames and public keys of the publishers.
    I think that this is the first attempt in the history of the Internet to create an Anonymous Certification Authority, and for sure this is the first such attempt in the history of P2P networks.

    A regular downloader will accept the table of hashes for the data only from a trusted publisher.
    The next question you will ask is whom do i trust. Did you trust the message board at SuprNova? This works in the same way. Once you have downloaded a file from the publisher with nickname TM you will trust this publisher. The only problem is how you distinguish between packets sent by TM and packets sent by an adversary. Here public and private keys enter the game.
    See User Manual Lesson 4.2 http://larytet.sourceforge.net/userManu ... sson%204.2
    Publisher TM signs all packets using his own private key. The downloader takes the public key of TM from the identification server (like this one http://larytet.sourceforge.net/ipRange/ipRanges.php) and verifies the signature. The default signature algorithm is SHA1 with RSA and the typical size of a packet is under 512 bytes.
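
    The sign-then-verify step can be illustrated with a toy example. This is textbook RSA over a SHA-1 digest with a deliberately tiny key and no padding, written only to show the principle; it is not Rodi's packet format and must not be used for real security. The key values and function names are assumptions for the sketch.

```python
import hashlib

# Toy key built from two known Mersenne primes; far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)

def digest_int(packet: bytes) -> int:
    # SHA-1 gives 160 bits, more than this toy modulus holds, so reduce mod N.
    return int.from_bytes(hashlib.sha1(packet).digest(), "big") % N

def sign(packet: bytes) -> int:
    """Publisher side: uses the private exponent D."""
    return pow(digest_int(packet), D, N)

def verify(packet: bytes, signature: int) -> bool:
    """Downloader side: needs only the public values E and N."""
    return pow(signature, E, N) == digest_int(packet)

packet = b"data packet from publisher TM"
signature = sign(packet)
print(verify(packet, signature))             # genuine packet
print(verify(packet + b"!", signature))      # modified packet
```

    The downloader never needs the private key, so the public key can be handed out by the identification server or by any other channel.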

    Besides, there is a 64-bit random request ID generated by the downloader for every request and a 64-bit random session ID generated by the publisher, to make sure that all packets are different and cannot be used more than once.

    All that said, i am not a security specialist, but i think that the network is protected at least against the most obvious attacks, like pretending to be a trusted publisher or taking down this or that server.
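
    A minimal sketch of the downloader half of this (the publisher's session ID is left out) might look like the following. It assumes one reply per request, which is a simplification; the class and method names are made up.

```python
import secrets

class Downloader:
    def __init__(self):
        self.pending = set()      # request IDs still waiting for a reply

    def new_request(self) -> int:
        """Create a fresh 64-bit random request ID and remember it."""
        request_id = secrets.randbits(64)
        self.pending.add(request_id)
        return request_id

    def accept_reply(self, request_id: int) -> bool:
        """Accept a reply once; an unknown or replayed ID is rejected."""
        if request_id in self.pending:
            self.pending.remove(request_id)
            return True
        return False

downloader = Downloader()
rid = downloader.new_request()
print(downloader.accept_reply(rid))   # True: first delivery
print(downloader.accept_reply(rid))   # False: replayed copy
```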

    The network is completely decentralized and even the identification server is optional. You can receive the public key of the publisher by email or from IRC/ICQ chat, or you can ignore signatures and trust anybody, or you can receive the public key as part of the packet the first time you download something from the content publisher, keep it in the database of trusted public keys, and in the future check that packets carry the correct signature.

    Also one can build a hub - a small network of trust where everybody knows the public keys of everybody else. Such a network can serve content to the whole world without being compromised as long as there are nodes which are able to spoof the IP source. Or such a network can operate like a closed Direct Connect group - nothing in and nothing out. Possibilities are infinite.

    Identification server does not keep anything besides the information publisher provides - nickname and public key. Optionally publisher can also provide range(s) of IP addresses of his/her ISP and port Rodi uses, for example 31211.

    The downloader is expected to check all IP addresses in the specified range (like an IP scan) until one of them replies.

    The IP address(es) the publisher posts do not necessarily belong to the publisher, but can belong to a proxy or bouncer.

    Downloader D sends a "get data" request to IP 192.168.18.1 port 4111. 192.168.18.1 belongs to the bouncer B.
    B is configured to forward all packets arriving at port 4111 to two IP addresses, 178.2.31.1 and 57.175.1.49.
    IP address 178.2.31.1 is a dummy and IP address 57.175.1.49 is the IP address of the publisher. B has no idea which IP is the real one and does not care about it. Overall traffic in B is very low, because B forwards only "get data" requests, not the data itself.
    Publisher P sitting behind 57.175.1.49 receives the "get data" request and checks that the downloader is authorized, etc. The publisher builds a UDP packet holding the requested data and sends the packet directly to the downloader. The IP source of the packet is arbitrary.
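
    The scan itself is simple; a sketch with a pluggable probe function is shown below. No real network traffic is sent here: `fake_probe` stands in for whatever request the client would actually send, and the addresses are hypothetical.

```python
import ipaddress

def scan_range(first, last, port, probe):
    """Try each address from `first` to `last`; return the first that replies."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    for value in range(start, end + 1):
        address = str(ipaddress.IPv4Address(value))
        if probe(address, port):
            return address
    return None   # nobody in the range answered

def fake_probe(address, port):
    # Stand-in for a real probe: pretend only one host answers on this port.
    return (address, port) == ("192.0.2.5", 31211)

print(scan_range("192.0.2.1", "192.0.2.10", 31211, fake_probe))   # 192.0.2.5
print(scan_range("192.0.2.6", "192.0.2.10", 31211, fake_probe))   # None
```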
    Code:
    D -> B -> P -> D
    
    The downloader only knows the IP address of the bouncer. The bouncer knows two or more IP addresses, one of which supposedly belongs to the publisher, but it never sees any data and cannot be sure whether the forwarded packets reach their destination or are total junk.
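
    The D -> B -> P -> D path can be mimicked with a few functions. This is a sketch of the description above only: the `reply_to` field (how P learns where to send the data), the downloader address, and the forged source address are assumptions, not part of the post.

```python
BOUNCER_IP = "192.168.18.1"
DUMMY_IP = "178.2.31.1"
PUBLISHER_IP = "57.175.1.49"
DOWNLOADER_IP = "10.0.0.5"      # hypothetical; not given in the post

def bouncer_forward(request, targets=(DUMMY_IP, PUBLISHER_IP)):
    # B copies the request to every configured address without knowing
    # which one is real.
    return [(ip, request) for ip in targets]

def publisher_reply(request, forged_src="203.0.113.77"):
    # P answers the downloader directly; the source field is arbitrary.
    return {"src": forged_src, "dst": request["reply_to"],
            "request_id": request["request_id"], "data": b"block 0"}

request = {"type": "get data", "reply_to": DOWNLOADER_IP, "request_id": 42}
forwarded = bouncer_forward(request)
# Only the copy that lands on the publisher's address is answered.
replies = [publisher_reply(req) for ip, req in forwarded if ip == PUBLISHER_IP]

print(len(forwarded))                       # 2 copies sent by the bouncer
print(replies[0]["src"] == PUBLISHER_IP)    # False: real address not exposed
print(replies[0]["dst"] == DOWNLOADER_IP)   # True: data arrives directly
```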


    only to discover later that the large segment is vulnerable to a corruption attack, an increasingly common practice.
    MD5/SHA-1 is still considered reliable enough. it takes some time to create two blocks with the same MD5 or SHA-1 even on a modern machine. The required time is measured in days, and on mainframes in hours.

    I understand (i am not sure) that corruption attacks are possible in networks like eDonkey where you know only the hash of the whole file. The downloader in these networks cannot check the blocks downloaded from multiple sources. After the download is completed the downloader finds out that the hash of the file is wrong, but it cannot know which one of the multiple data sources sent corrupted data.

    In the Rodi network (not in the current version) the downloader gets the list of hashes from the trusted publisher. The size of the block matters only because the time spent downloading a corrupted block is wasted. Because a typical connection today is 1M+ and download speeds are 100Kbytes/s, a 4M block size looks like a reasonable compromise.
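
    As a quick check of that trade-off, the time lost per corrupted block at the quoted speed can be worked out directly. The sketch takes 4M as 4194304 bytes and 100Kbytes/s as 100 * 1024 bytes per second; the 1024-byte row is added only for comparison.

```python
def wasted_seconds(block_size, bytes_per_second):
    """Download time thrown away when one block fails its hash check."""
    return block_size / bytes_per_second

SPEED = 100 * 1024                       # 100 Kbytes/s

for block_size in (4 * 1024 * 1024, 1024):
    print(block_size, round(wasted_seconds(block_size, SPEED), 2))
```

    So each bad 4M block costs roughly 41 seconds of download time, against about a hundredth of a second for a 1024-byte segment.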

    Small blocks create other problems, like traffic overhead, for example. When one peer (A) sends another peer (B) the map of the blocks A has, the packet containing the map should not be too large.
    Another problem is disk fragmentation. When you download small chunks belonging to different parts of the file, it creates fragmentation on your hard disk to the degree that it starts to influence the overall performance of the PC. Different schemes were developed to fight this issue, but everything has its cost both in terms of memory and CPU.

    A torrent file can be large and calculation of the torrent file can take some time. For larger blocks this time tends to be shorter.


    Merkle Hash Trees
    I am aware of the existence of hash trees. Merkle Hash Trees are not the only example. The major problem with these algorithms is calculation time and file size. It is easy to say that the hash tree size is only 1% of the original file size, but in reality you are not going to post an 80M torrent file on your web server, am I right ? what size of avatar does P2P permit ? 4K ? 8K ?
    Not to say that i am against it. As i mentioned above, the torrent file in the Rodi network is an XML file with an open, self-explanatory format. Use any hashes you want, send me a plugin for the Rodi Core and trust me - next day it is in and part of the official release.

    P.S. i hope the post is still comprehensible despite its overall length and my lousy English. feel free to divide it into multiple posts.
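
    For reference, a hash tree of the kind being discussed can be computed in a few lines. This is a generic binary tree over SHA-1 with distinct leaf and node prefixes (a common precaution); it is not a Rodi plugin and does not claim to follow any particular published tree format.

```python
import hashlib

def merkle_root(segments):
    """Root digest of a binary hash tree; an odd node is promoted unchanged."""
    if not segments:
        raise ValueError("need at least one segment")
    level = [hashlib.sha1(b"\x00" + seg).digest() for seg in segments]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level) - 1, 2):
            pair = level[i] + level[i + 1]
            parents.append(hashlib.sha1(b"\x01" + pair).digest())
        if len(level) % 2:
            parents.append(level[-1])     # no sibling on this level
        level = parents
    return level[0]

segments = [b"segment-0", b"segment-1", b"segment-2"]
root = merkle_root(segments)
tampered = merkle_root([b"segment-0", b"XXXXXXXXX", b"segment-2"])
print(root.hex())
print(root != tampered)    # any changed segment changes the root
```

    Only the 20-byte root has to come from a trusted place; the rest of the tree can be fetched from any peer and checked against it.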

  7. #7
    tm
    larytet, your post is very good and understandable.

    In the Rodi network (not in the current version) downloader gets list of hashes from the trusted publisher. Size of the block matters only because time of the download of the corrupted block is wasted. Because typical connection today is 1M+ and download speeds ar 100Kbytes/s 4M block size looks like a reasonable compromise.
    If the smallest data unit that is individually checksum-verified (using SHA-1) is 4MB, then prior experience has shown that this is much too big for any kind of multi-source download, because a hostile source can upload only a few bytes of corrupt data and force a re-download of the entire 4MB segment. I had discussed this in THIS THREAD

    The reason I mention this is that several other P2P networks discovered too late that their chosen segment size (ED2K's is 9500KB) was much too large after corruption attacks became common. eDonkey's original 9500KB size also seemed like a reasonable compromise at the time it was established - it was only through actual real-world experience that it later became obvious that this size was much too big, the revised segment size being reset to about 2% of the original size. And this was with broadband users considered as the primary users. If it had been designed for dialup users, we can assume that the corresponding segment size to thwart corruption attacks would have been proportionally smaller.

    The way these corruption attacks work is that a hostile user connects and uploads just a few bytes of corrupt data for every segment (which is enough to corrupt entire segment) and then disconnects and repeats this with other users. As a result, no one can ever complete a single segment of the file.

    The worst case of corruption attack I ever encountered was for a very popular 1MB file (with hundreds of sources). After being online for about 10-12 hours, I still did not get the file, even though my computer actually downloaded it maybe a hundred times or more. I was able to identify who all the corrupters were because they only uploaded a tiny amount then they would disconnect.

    Corruption attack is a big problem with multi-source P2P because only a tiny amount of corrupt data can ruin the entire segment. This does not require breaking the SHA-1 algorithm or anything like that, but just simply mis-identifying upload data for any given file hash. (the attackers usually uploaded 0-filled parts)

    Rodi takes an innovative approach to trusted 3rd parties.
    That's interesting about an Anonymous Certification Authority; this is what MUTE's developer had to say:
    What about trusted third parties?

    At this point, you might be wondering how encryption works on the rest of the Internet. For example, millions of credit card transactions are passing from web browsers to web servers each day through SSL connections, and these connections are built on top of Internet routes through untrusted routers. Am I claiming that SSL and secure HTTP are a sham? No. But these connections use a key exchange mechanism that is not practical in the MUTE network: a trusted third party. When you make a secure connection to Amazon.com, you do not blindly obtain Amazon's key directly from Amazon's web server, since your transaction would then be subject to a person-in-the-middle attack. Instead, you rely on a third party, called a "certificate authority," to verify that Amazon's key is really from Amazon. The assumption is that if you obtain Amazon's key through a different channel than the channel through which you obtained the authority's key (authority keys are shipped with your web browser), a single person-in-the-middle attack will be thwarted.

    So why not use trusted third parties in MUTE? The core problem is that it is difficult or impossible to be trusted (in any secure sense) when you are anonymous. As soon as you connect directly to the certificate authority, your anonymity is compromised. We might try routing messages to the authority through MUTE, but then we are back to our original problem: we cannot communicate securely with the authority unless we have a secure end-to-end channel. Of course, we might forgo the secure end-to-end channel and just route unencrypted messages to the authority. In this case, we would be assuming that our sender-receiver route is different from our sender-authority route so that a single person-in-the-middle could not interfere with both routes. But what about coordinated person-in-the-middle attacks, where several nodes work in concert on different routes? Such an attack could both fake the receiver key and fake the certificate.

    Even if we can connect securely to the certificate authority in some way, we have no way of talking about the receiver. In other words, how are we going to describe the receiver to the authority to obtain a certificate? Using the receiver's virtual MUTE address? But where did we obtain that virtual address from? The address was probably sent along with search results through a route from the receiver to the sender. In other words, the person-in-the-middle might already have interfered with our communications, replacing the receiver's address with his or her own address, before we even contact the certificate authority. In that case, we would be asking the authority about the wrong address and then routing all future messages through the person-in-the-middle. So, because of the anonymity in the MUTE network, even a trusted third party would not enable secure end-to-end communications.

  8. #8
    Registered User larytet is on a distinguished road
    The way these corruption attacks work is that a hostile user connects and uploads just a few bytes of corrupt data for every segment (which is enough to corrupt entire segment) and then disconnects and repeats this with other users. As a result, no one can ever complete a single segment of the file.
    Rodi drops whole 4MB block if it is not completed and starts download of the block from the very beginning. Rodi keeps statistics for every peer. If peer is unreliable no packets/requests will be accepted from this peer for long period of time or even ever. Such adversary will have to change IP addresses until all available for the adversary IP address space is eaten up.

    Block (4M) is considered ready for uploading/sharing only when downloaded competely and hash is checked. The whole block (4M) will be downloaded from the same source/peer. It is part of the spec.
    Multisource download starts to work only if file is larger than 1 block - from 8M. Rodi reaches max performance after 10-12 blocks or 40M files or larger.

    There is one last touch (artistical :) in the Rodi spec - downloader can ask trusted publisher to calculate hash for any block. downloader specifies in the request offset of the block and block size. Publisher can the request, can serve the request (requires computation power) or can send hash for the block which includes the one specified by the downloader.

    Rodi keeps list of peers between download sessions. Rodi never resets the list. Rodi discard packets from peers with low reliability or not signed packets. Rodi discards packets silently. Adversary can not know for sure why it's requests are not answered - because the peer is down or because it's IP on the "black list"
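
    A rough sketch of such per-peer bookkeeping is given below. The post gives no formula, so the reliability ratio, both thresholds, and all names here are assumptions made for illustration; saving the table between sessions is left out.

```python
class PeerStats:
    """Per-peer block counters; a False from accept() means 'drop silently'."""

    def __init__(self, min_reliability=0.5, min_samples=4):
        # Both thresholds are made-up values for illustration.
        self.min_reliability = min_reliability
        self.min_samples = min_samples
        self.counts = {}   # ip -> [good_blocks, bad_blocks]

    def record(self, ip, block_ok):
        """Count one finished block as good or bad for this peer."""
        good_bad = self.counts.setdefault(ip, [0, 0])
        good_bad[0 if block_ok else 1] += 1

    def accept(self, ip, signed):
        """Decide whether a packet is processed; no error is ever sent back."""
        if not signed:
            return False
        good, bad = self.counts.get(ip, [0, 0])
        if good + bad < self.min_samples:
            return True                     # not enough history yet
        return good / (good + bad) >= self.min_reliability

stats = PeerStats()
for _ in range(4):
    stats.record("203.0.113.5", block_ok=False)   # peer keeps sending bad blocks
stats.record("198.51.100.9", block_ok=True)

print(stats.accept("203.0.113.5", signed=True))    # False: blacklisted
print(stats.accept("198.51.100.9", signed=True))   # True
print(stats.accept("198.51.100.9", signed=False))  # False: unsigned packet
```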

    Download is always initiated by the downloader and never by the uploader. The uploader serves the incoming requests, discards them silently or replies with busy. The downloader keeps track of the IP destination each request was sent to. Indeed the downloader sees the source IP of the packet, but the downloader does not need the source IP.

    It probably requires more explanation. From the User manual (General discussion):

    Imagine that you knock on a door and from the window of a house across the street you get a glass of milk. You never know who stands behind the door and you never know how many phone calls are made to serve you the milk. This is more or less how the Rodi network operates. You send an IP packet to a range of IP addresses (you knock on many doors on the street using the correct knock pattern - the Rodi protocol) until you find the IP destination (the right door). You send an IP packet like a GET DATA request to this IP address and you receive an IP packet containing the requested data from some other IP. You do not need to know what IP address the data arrives from (and it's useless actually) as long as it contains the request ID you initially sent (the right knocking pattern), authentication of the publisher (see Post IP page) and data with the correct MD5.
    There is no way, or let's say it is not trivial, for the adversary to deliver corrupted blocks en masse without rotating IP addresses.
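
    The door-and-window idea can be sketched as a table of pending requests keyed by request ID. This is a rough illustration under assumptions: the 8-byte random ID, the class name and the method names are invented here, the expected MD5 is assumed to be known from the table of hashes, and the publisher signature check is left out.

```python
import hashlib
import os


class PendingRequests:
    """Matches replies to requests by request ID, not by source IP."""

    def __init__(self):
        self._pending = {}  # request_id -> destination IP the request was sent to

    def send(self, destination_ip: str) -> bytes:
        request_id = os.urandom(8)  # assumed: random 8-byte request ID
        self._pending[request_id] = destination_ip
        return request_id

    def on_reply(self, source_ip: str, request_id: bytes, data: bytes,
                 expected_md5_hex: str):
        """Return the destination to credit, or None to drop silently.

        source_ip is deliberately ignored: the data may arrive from any
        address. Signature verification is omitted in this sketch.
        """
        destination_ip = self._pending.get(request_id)
        if destination_ip is None:
            return None  # unknown request ID (wrong knocking pattern)
        if hashlib.md5(data).hexdigest() != expected_md5_hex:
            return None  # corrupt data
        del self._pending[request_id]
        return destination_ip
```

    Reliability statistics would then be recorded against the returned destination address, which is why spoofing or rotating the source address of the reply gains the adversary nothing.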

    MUTE
    hehehe. all said in the quote is true. but as i said i use an innovative approach to the problem. i do not trust the Certificate server, but i trust the publisher with nickname TM and a 48 bytes public key, because once in the past i downloaded data from this publisher's server. i do not care where the nickname/public key pair comes from - there is no doubt, or let's say a fair chance, that the identification server is compromised.

    i do not care. i am an optimist (life is full of reasons to be optimistic, right ?). i feel lucky today - i give it a shot. i send a LOOK request and get the table of hashes. I start the download and get the file. Let's put aside executable files for a moment. Let's say that this is an HTML file - the last letter of a leecher dying in RIAA's prison. I read the file and it looks all right. From now on i know this guy - TM, that is - and this guy is all right. next time i see his nickname at the end of a properly signed packet i trust him.
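
    What is described here is essentially trust on first use. A minimal sketch, with invented names (Rodi's real key handling is not shown in this thread):

```python
class KnownPublishers:
    """Remembers the public key first seen with each nickname (hypothetical)."""

    def __init__(self):
        self._keys = {}  # nickname -> public key bytes

    def check(self, nickname: str, public_key: bytes) -> str:
        known = self._keys.get(nickname)
        if known is None:
            return "unknown"   # never dealt with this publisher: take a chance or not
        if known == public_key:
            return "trusted"   # same key as the download that went well
        return "mismatch"      # same nickname, different key: be suspicious

    def remember(self, nickname: str, public_key: bytes) -> None:
        """Call after a download from this publisher turned out fine."""
        self._keys.setdefault(nickname, public_key)
```

    Note that the sketch never asks where the nickname/key pair came from. It only checks that the key is the same one that signed the earlier good download.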

    do you trust SF when you download binary files from their servers ? i guess the answer is generally yes. Even though they use multiple mirror servers and each and every one of them can be compromised. why do you trust them when you download Ants application ? and Ants application writes/reads to/from disk, sends packets to the Internet and does many things which are typical for the spyware/virus. Then why do we trust this s**t ? Because we tried it once and it worked. i downloaded Firefox first time and i have seen that this is good. I decided that their upgrade is going to be even better.

    this is very important point and this is why i put so many lines in this post.

    Rodi is different from Amazon. Rodi does not make an attempt to establish an authorized certificate server. not even close. everybody can run a Rodi key server - everybody and everywhere. If you tried the key server of the Rodi Hunters (there is no such server in reality, but i hope there will be in the future) and found that it's good, you will probably try it once again and you will probably even trust all publishers that belong to this house.

    let's simplify how Mute works. a peer connects to a small number of peers (3-5) and exchanges encryption keys with them. from now on all transactions between these peers are going to be encrypted. the Mute client "knows" what key should be used to decrypt a packet by looking at the IP source. There are as many distinct pairs of keys as there are different peers in the Mute network.

    The same is in Rodi. Number of key pairs is equal or larger than number of peers.

    Where is the difference ? I declare that encryption of the payload is optional. For vast majority of the applications signature is enough and only for the most sensitive data publisher can decide to encrypt the whole payload.

    The other difference is that in Mute you can exchange data only with the hosts you connected to - immediate neighbors. Not the most reliable, not the closest, not the fastest, but just the first 4 or 5 peers you were lucky to connect to.

    Mute is good. Probably this is one of the best anonymous networks created so far. If you share really sensitive content and i don't talk here about popular video title - Mute is the way to go.

    Privacy comes at a performance cost. Search is slow and not reliable. Data is routed through multiple last mile connections. Packets are encrypted and cannot be cached by the ISP. Connections are random in the geographical sense (and the goal is to make them as random as possible), but ISPs prefer to use their own network as much as possible and we want to work together and help each other, right ? this is part of democracy. the ISP wants to help us, let's help the ISP. encrypted payload and intercontinental peer 2 peer data transfers are not very ISP friendly.

    Rodi can be ISP friendly, but it has its own ugly face and strong teeth if the situation requires - payload encryption, faked RTP and DNS packets, etc.

  9. #9
    tm
    tm is offline
    Registered User tm is on a distinguished road
    Join Date
    Jul 2004
    Posts
    2,096
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    86
    Rodi drops the whole 4MB block if it is not completed and starts the download of the block from the very beginning
    Doing that would prevent the type of corruption attack as I described above, but this would make Rodi totally unsuitable for dialup users. This reminds me of using Napster years ago with a dialup connection: a 4MB MP3 would take about 20 minutes to download, and then a lot of times it would disconnect at 90% complete, and Napster would not support resuming downloads. That made a lot of wasted time. Napster was terrible for anyone with a dialup connection because it would not resume interrupted downloads. If Rodi's blocks are not resumable, why are they set so big? Another advantage of using a small block size is that a popular file can spread faster.

    All P2Ps since Napster supported resuming. Is there a reason why Rodi will not resume downloads within 4MB blocks? Is this something that is still in development?

    Such an adversary will have to change IP addresses until all the IP address space available to the adversary is eaten up.
    In the corruption attacks I have experienced, dozens of different IP addresses were used. Use of proxies could potentially give virtually unlimited number of IP's.

    In the example of ed2k hashes, they were originally 9500KB, then eDonkey/Overnet developers decided to lower the size to 512KB, then eMule developers decided to lower the size even more, to 180KB. These decisions were not made casually. The 180KB block size came as a result of years of experience and numerous discussions of many developers and users.

    9500KB blocks originally seemed like a good size, but experience soon proved that it was a mistake to set it this big. The latest block size, 180KB, turned out to be a much better size.

    In the 4 years since eDonkey was released, internet speeds have increased, but the hashed block size keeps getting smaller, not larger, as would be expected based on average download speed.

    Larytet, If you really believe that 4MB is the ideal block size, I would ask you to investigate the reasons why both eDonkey and eMule developers independently decided to change their standard to a much smaller unit, even though it created additional complications to do this after the original 9500KB was already in such widespread use as posted hashlinks.

    Don't accept my opinion on this, but instead consider the decisions of many other experienced P2P developers who eventually all came to the same conclusion: that the block size needs to be very small.

  10. #10
    Registered User larytet is on a distinguished road
    Join Date
    Jun 2006
    Posts
    0
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    31
    Use of proxies could potentially give virtually unlimited number of IP's.
    Rodi remembers the original "destination" address and ignores where the response comes from. The destination can be a proxy or the real IP of the adversary - it simply does not matter. Once a Rodi client has tried and dropped an IP, it will never again send a new request to the same IP, and the adversary will not even know what happened. The adversary cannot use faked IP addresses to reach the goal. And let's not forget that the adversary has to upload 4MB too before dropping the session.

    I do not understand how proxies could be used to create an unlimited number of IPs. there are 4G IP addresses available. among them ~5% is reserved. the rest is in use, mostly by ISPs. there is a huge deficit of available IP addresses. another 500 million broadband users in India and in China will use up all the available address space. theoretically you are right - hundreds of millions of IPs are available to the adversary, but in reality nobody will provide access to these IP addresses. we are probably talking about hundreds or thousands of addresses.

    Unrelated note. When IPv6 (16 bytes IP address) enters the mainstream, a mighty adversary can indeed allocate effectively unlimited IP address space, and PeerGuardian and eMule will be useless. The problem, you see, is that the table of IP addresses to ignore/block is going to be too large. In the most trivial implementation (i am not sure that it's interesting or the right place to post, and i will go on anyway) an application can use what they call a hash table. Typical size of a hash table is 3-10 times the number of entries stored in it - in Rodi at this phase i use a hash table. An alternative to the hash table is a patricia tree. Size of the patricia tree is equal to the number of nodes. One could store IP ranges in the hash table, but in this way one will block users who are not adversaries. Interesting problem.

    Don't accept my opinion on this, but instead consider the decisions of many other experienced P2P developers who eventually all came to the same conclusion: that the block size needs to be very small.
    when the size of the vast majority of the files in the network is well under 500MB (and in reality is under 4MB - one MP3 file) large blocks make multiple source download impossible and reduce performance severely without any visible gains. btw are you satisfied with current download rates in eDonkey networks ? (i personally do not use filesharing, because i am way too easy a target for The Guys)

    A 4MB block on a 100KBytes/s connection takes ~40s, and there are 1000 such blocks in a 4GB file and about 12,500 blocks in 50GB.
    24 frames/s * 1024 * 1280 * 60 * 60 = ~110G/hour. with reasonable compression it means ~40GB for a 2 hour video clip. that's the reasoning behind the 4MB block and nothing else.
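
    These back-of-envelope figures can be checked with a few lines. The sketch assumes binary units (1 MB = 1024 * 1024 bytes) and treats the video term as a raw count of pixels per hour. The helper names are made up, and the exact results differ a little from the rounded numbers in the post.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def block_seconds(block_size: int, rate_bytes_per_s: float) -> float:
    """Time to move one block at a steady rate."""
    return block_size / rate_bytes_per_s


def block_count(file_size: int, block_size: int) -> int:
    """Number of blocks needed to cover the file (ceiling division)."""
    return -(-file_size // block_size)


def raw_pixels_per_hour(fps: int, width: int, height: int) -> int:
    """Uncompressed pixel count for one hour of video."""
    return fps * width * height * 60 * 60


if __name__ == "__main__":
    print(block_seconds(4 * MB, 100 * KB))      # seconds per 4 MB block at 100 KB/s
    print(block_count(4 * GB, 4 * MB))          # blocks in a 4 GB file
    print(block_count(50 * GB, 4 * MB))         # blocks in a 50 GB file
    print(raw_pixels_per_hour(24, 1280, 1024))  # the "G/hour" figure
```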

    the moment the majority of the files are 40MB+, the average performance of Rodi is going to be close to the performance of eMule.

    BT is used not for a single MP3, but for albums and whole directories. I have seen 10G+ torrents containing libraries of SW, audio books, etc. If a block is bad - drop it. this is only one block out of 2,000 blocks to download. if the adversary is ready to upload 4MB blocks infinitely, you are ready to download them. after all they pay for their upstream alone and you are MANY, you are WE.

    Not everything is bad though - there is a backup. In the LOOK ACK the publisher specifies what the block size is going to be (see Rodi design - Block Information Element). Basically the publisher controls how finely the file is chopped.
    Block information element
    Code:
    Describes block to transfer in GET DATA request
    
        * Offset (8 bytes)
        * size (4 bytes)
        * File hash (32 bytes)
        * Block hash (32 bytes) 
    
    Rules to build unique URL - TBD
    Block hash is mandatory field in case of GET DATA ACK messages
    
    File Hash information element contains the following fields
    Code:
    # file size (8 bytes)
    # block size (8 bytes)
    # number of blocks (4 bytes)
    # hash size (2 bytes)
    # hash of the file
    # array of hashes for blocks
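
    For illustration, the Block information element above could be packed and parsed like this. The field sizes come from the listing. The big-endian byte order, the absence of padding and all the names are assumptions, since the wire format is not spelled out in this thread.

```python
import struct
from typing import NamedTuple

# offset (8 bytes), size (4 bytes), file hash (32 bytes), block hash (32 bytes)
# ">" = big-endian, no padding (assumed, not confirmed by the spec excerpt)
_BLOCK_IE = struct.Struct(">QI32s32s")


class BlockInfo(NamedTuple):
    offset: int
    size: int
    file_hash: bytes
    block_hash: bytes


def pack_block_info(info: BlockInfo) -> bytes:
    # struct would silently pad or truncate the hashes, so check them first
    if len(info.file_hash) != 32 or len(info.block_hash) != 32:
        raise ValueError("hashes must be exactly 32 bytes")
    return _BLOCK_IE.pack(*info)


def unpack_block_info(raw: bytes) -> BlockInfo:
    return BlockInfo(*_BLOCK_IE.unpack(raw))
```

    With these sizes the element takes 76 bytes, so it fits easily next to the request header in one small UDP packet.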
    

    P.S. returning to the Proxy. your regular HTTP proxy is no good for Rodi, because Rodi is UDP based. The adversary will have to run something like a Rodi bouncer. i would let them upload. my wild guess is they will quickly give up. the more popular the data is, the harder the work for them. let them pay for a fiber connection. the ISP would be glad to get their money.

  11. #11
    tm
    tm is offline
    Registered User tm is on a distinguished road
    Join Date
    Jul 2004
    Posts
    2,096
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    86
    A 4MB block on a 100KBytes/s connection takes ~40s, and there are 1000 such blocks in a 4GB file and about 12,500 blocks in 50GB. 24 frames/s * 1024 * 1280 * 60 * 60 = ~110G/hour. with reasonable compression it means ~40GB for a 2 hour video clip. that's the reasoning behind the 4MB block and nothing else.
    But what would the situation be for a dialup user? I am on dialup right now, but my download speed of only 2KB/s-3KB/s (very poor line conditions!) would make a download of a 4MB block take close to 30 minutes. Since dialup users often experience frequent disconnects, completing a 4MB block might even take an hour. That's punishing!

    An example: On the 'old' eDonkey/eMule, downloading a file from 10 sources - if the computer crashed, then each currently downloading block, 10 in all, would be corrupt from the crash. That is a total of (avg.) 46MB (9.28MB x 10 x 0.5) that would be corrupt from a single crash. For a dialup user, this might take all evening to recover the lost fileparts from that crash. But on the 'new' eMule, that same crash with 10 uploaders would only lose 0.9MB (180KB x 10 x 0.5) of the file due to corruption. So that is 46MB lost vs. 0.9MB lost. A small block size makes a big difference in data lost due to a crash.
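
    The estimate above is just block size x number of active sources x 0.5, on the assumption that each in-progress block is on average half done when the crash happens. A small sketch (the function name is invented) to play with other block sizes:

```python
def expected_loss_kb(block_size_kb: float, active_sources: int) -> float:
    """Average data lost in a crash, assuming every in-progress block is
    half complete on average and all of its data is thrown away."""
    return block_size_kb * active_sources * 0.5


if __name__ == "__main__":
    for name, size_kb in [("old eDonkey", 9500), ("new eMule", 180), ("Rodi 4MB", 4096)]:
        lost_mb = expected_loss_kb(size_kb, 10) / 1024
        print(f"{name}: {lost_mb:.1f} MB lost with 10 sources")
```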

    Now of course there are ways to recover data blocks corrupted during a crash; that's true, though not always successfully. This is just one more benefit of using a smaller block size. Since Rodi's blocks are not resumable, a crash during the downloading of multiple blocks - either from a multisource download or multiple simultaneous file downloads - can cause a lot of lost time for a dialup user. Same with loss of internet connection, which is very common with dialup users. For these reasons, Rodi's 4MB non-resumable blocks are a totally incompatible size for dialup users.

    Since close to 50% of US internet users have dialup service, then that would omit a lot of people from utilizing the full benefit of using Rodi. Most other P2P networks (except for original Napster) work well on both broadband and dialup.

    But it seems that although the spec sets 4MB as a block size, is the block size changeable by the file publisher such that any downloader could download a file of any block size, not just 4MB? Does the hashlink identify if the file is being published with a 4MB block size or (for instance) a 4KB block size? If the same file can be hashed more than one way, then would it be possible for a single file to be seen as many separate files because the file was hashed differently each time it was published?

    Of course, using a tree hashing structure could eliminate this false-duplicate-file possibility because a file hashed (for instance) as a 6 layer tree could be compared and matched to the same file hashed as a 5 layer tree, since 5 of the 6 layers of the hash trees in both examples would be identical.

  12. #12
    Registered User nms04 is on a distinguished road nms04's Avatar
    Join Date
    Sep 2003
    Location
    Brixen
    Posts
    836
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    67
    does someone here use it?

  13. #13
    Registered User larytet is on a distinguished road
    Join Date
    Jun 2006
    Posts
    0
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    31
    Does the hashlink identify if the file is being published with a 4MB block size or (for instance) a 4KB block size?
    The answer is yes
    http://www.p2pforums.com/viewtopic.p...08887c5e#85814
    see the Code part with an example of XML describing a file; pay attention to the end of the second line.
    i do not use hash trees in the current version. There is only the hash of the block (the publisher decides what the block size is going to be) and the hash of the file, which is the identification of the file. The hash of the whole file remains the same no matter into how many blocks the publisher decided to chop it.

    As you mention in the previous post, resuming a block using different sources appears to be a nice feature only at first glance. An adversary can exploit this hole easily, preventing download for multiple clients.

    There are two types of filesharing networks.

    In one type (eDonkey) you find many small files and start to download all of them simultaneously from multiple sources. In one or two weeks' time you will get some of them or maybe even most of them on your disk.

    In the second type (BT) you look for a 600MB video clip which over a 5Kbyte/s connection will take 30 hours. You have no illusions regarding your ability to download something else at the same time. You leave it alone to download the file. You dedicate your line to one file only. In this specific kind of network, where large files are exchanged, the overhead that small blocks create can be significant and comparable to all possible drops in connection.

    But again, the protocol is flexible enough: the block size can be changed and you can ask a trusted publisher to send you the hash for an arbitrary block. I thought that the flexible block size would be used to increase the block size in case of really huge files, but there is no limit on the downside either.

    There is just one thing to remember - Rodi sends the block map in a single UDP packet. The map of blocks is a bit mask, where 1 means that this block is present and 0 that there is no such block. For example, if we assume an 8G file and 4M blocks, it gives us 250 bytes of block map.
    (see calculator http://larytet.sourceforge.net/btRat...orblockmapsize)
    A UDP packet can carry up to 64K of data. It means that you can chop such a file into blocks as small as 16KB.

    Usually it is a good idea to keep UDP packets under 1,500 bytes (size of an Ethernet frame). I do not see any reason to use small blocks for large files.
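
    The block map arithmetic can be reproduced with a short sketch. The helper names and the bit order inside each byte are assumptions (the real layout is in the Rodi design pages). With binary units the 8G/4M case gives 256 bytes rather than the rounded 250.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def block_map_bytes(file_size: int, block_size: int) -> int:
    """Size of a one-bit-per-block map, rounded up to whole bytes."""
    blocks = -(-file_size // block_size)
    return -(-blocks // 8)


def min_block_size(file_size: int, max_map_bytes: int) -> int:
    """Smallest block size whose map still fits in max_map_bytes."""
    max_blocks = max_map_bytes * 8
    return -(-file_size // max_blocks)


def build_block_map(have, total_blocks: int) -> bytes:
    """Bit mask: 1 = block present, 0 = block missing.
    Block i goes to bit (i % 8) of byte (i // 8), an assumed ordering."""
    bitmap = bytearray((total_blocks + 7) // 8)
    for i in have:
        bitmap[i // 8] |= 1 << (i % 8)
    return bytes(bitmap)


if __name__ == "__main__":
    print(block_map_bytes(8 * GB, 4 * MB))   # map size for 8G file, 4M blocks
    print(min_block_size(8 * GB, 64 * KB))   # smallest block if the map may use 64K
    print(min_block_size(8 * GB, 1400))      # same, with an assumed 1400-byte budget
```

    The last line shows why staying under one Ethernet frame pushes towards large blocks for large files: with a budget of roughly 1400 bytes for the map, an 8G file cannot be chopped much finer than about 750KB.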

    You mention that most of the modern P2P networks work well for dialup. did you try to download an 8G file from BT using dialup ? by my estimation the overhead a large swarm creates alone should shut down your connection. And one of the major reasons behind it is the small block size.

    Peers attempt to send each other information about what blocks they have. The tracker feeds the swarm with lists of IP addresses. And the tracker has no idea that your connection is dialup. Your modem handles a lot of SYN requests from other peers. Surely your BT client discards them because you are already connected to 4 or 5 peers and that's enough, but SYNs continue to come in.

    Even if you bring the BT client down for a day or two you will see TCP connection requests in your firewall log, because some peers (and the tracker) still remember your IP.

    Indeed, Rodi is not the best choice for dialup connections. From functional requirements (http://larytet.sourceforge.net/funct...irements.shtml):
    Protocol is developed for reliable (low packet loss rates) and fat (1/4M) links with significant round trip delays (300ms and more).
    It can be used by dialup users too, but publisher should be aware of their existence.


    dialup users often experience frequent disconnects
    Rodi is UDP based and failure of the network connection does not mean cancellation of all running download sessions. There is a timeout on the publisher side, which is currently ~5s. If the uploader sees no packets from you for 5s AND there is nothing more to send/stream to you, the uploader decides that you are satisfied and cancels the session. It means that your connection (your dialup) should be back in 5s in the worst case, or in 5s + the time required to upload the rest of the block, which is 30min according to your calculation. The timeout is going to be configurable.
    This is an interesting thing about the Naive Data Transfer plugin i use currently. see http://cvs.sourceforge.net/viewcvs.p....3&view=markup
    read comment at the top of the file.

    P.S. i read the comment in the source file and there are a couple of things to correct. First, GET is issued by the downloader for a 4MB block, not for a 1024 chunk. The uploader streams the whole 4MB without any additional requests from the downloader. Retransmission requests from the downloader contain one or more chunks.
    In the network with 300 ms round trip delay
    retransmissions will add up 4 minutes to the data transmission time.
    The calculation is correct if the downloader sends retransmission requests for one chunk at a time. In the existing code the downloader sends retransmission requests containing a list of up to 32 chunks x 1024, which is coincidentally the typical size of a TCP retransmission window.

    I found that Data Transfer Naive is extremely scalable and efficient even in conditions of high packet loss ratios.


    From
    http://www.utsc.utoronto.ca/~rossele...2p-measure.pdf


    the percentage of Napster users connected with modems (of 64Kbps or less) is
    about 25%, while the percentage of Gnutella users with similar connectivity is as low as 8%.

    At the same time, 50% of the users in Napster and 60% of the users in Gnutella use broadband connections
    (Cable, DSL, T1 or T3). Furthermore, only about 20% of the users in Napster and 30% of the users in Gnutella
    have very high bandwidth connections (at least 3Mbps).


  14. #14
    Registered User larytet is on a distinguished road
    Join Date
    Jun 2006
    Posts
    0
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    31
    P.S. this thread is quickly getting more and more technical. i am not sure how useful it is for the readers of P2PF. also i am not sure how effective a message board is for communicating technical aspects of the project.

    I have no problem continuing this thread because it's just another way to improve the online documentation, i am simply not sure how useful it is for this particular forum.

  15. #15
    Registered User carpefile has a spectacular aura about carpefile has a spectacular aura about carpefile's Avatar
    Join Date
    Jan 2004
    Location
    Omnipresent
    Posts
    1,215
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    73
    At the very least it builds interest and gets the Rodi name out there on people's lips. Which is a good thing.

    With all due respect to tm, dial-up is quickly on its way out; a number of the larger telecoms are even now installing the "last mile" of fiber.
    In less than 5 years the avg d/l speeds will be around 10Mbps, so it would be a waste of time to develop for dial-up.

    But get the GUI going, or no one will ever use it.
    "Sorry, I'm not willing to even try to contend with this gibberish." kluelos
    "Download speeds here are directly proportional to your overall helpfulness and attitude in the forums, hence your shitty speeds... ," Freakin Weasel (my hero)
    Once in a lifetime statement "I apologize. :-\" Pyronic
