Saturday, January 31, 2015

Conspicuous changes in the encrypted TrueCrypt container [on hold]



I keep a backup of my important data (several gigabytes) in the cloud. To prevent snooping, the data sits inside a TrueCrypt container, and the only thing ever transmitted is that container (no hidden volume).


Because uploading several gigabytes is slow even over a fast Internet connection, I split the container file into pieces of several megabytes each. I wrote a script that compares the current container with the previous one (the previous container is kept on the local disk to speed up the comparison), writes out only the changed pieces, and finally transmits those to the cloud.
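
The core of the script is essentially the following idea (a minimal Python sketch; the file names, output layout, and the 64 MiB piece size are placeholders of mine, not the script's actual values):

    import os

    PIECE_SIZE = 64 * 1024 * 1024  # assumed piece size; the real value is "several megabytes"

    def changed_pieces(old_path, new_path, out_dir):
        """Compare two container snapshots piece by piece and write out
        only the pieces that differ; those are then uploaded."""
        os.makedirs(out_dir, exist_ok=True)
        with open(old_path, 'rb') as old, open(new_path, 'rb') as new:
            index = 0
            while True:
                old_piece = old.read(PIECE_SIZE)
                new_piece = new.read(PIECE_SIZE)
                if not new_piece:           # end of the current container
                    break
                if new_piece != old_piece:  # this piece changed since the last backup
                    with open(os.path.join(out_dir, 'piece_%04d' % index), 'wb') as out:
                        out.write(new_piece)
                index += 1

    changed_pieces('container.old', 'container.tc', 'upload_queue')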


This has worked great so far. The changes occurring from one weekly backup to the next are in the 5% range of the original file, so updating is no problem.


A short time ago I prepared the backup and wondered why it was taking so long. I was puzzled to see that the changes had increased from the usual 5% to 50%! Alarmed, I mounted both volumes (the old one and the current one), compared the files, and could not see any reason for the increase.
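
(For reference, the compare of the two mounted volumes can be done with something as simple as the following Python snippet; the mount points are placeholders for wherever the volumes are actually mounted:)

    import filecmp

    # Recursively report files that differ or exist only on one side.
    filecmp.dircmp('/mnt/tc_old', '/mnt/tc_new').report_full_closure()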


I don't know exactly how TrueCrypt lays out its data internally, but viewed as a file system, the simple explanation could be that some data is heavily fragmented, and deleting or appending to such data could cause this effect.


But I am still uneasy, so


a) Is the simple explanation correct? Is it normal that changing only a small portion of the data can cause changes to a much bigger portion of the TrueCrypt container?


b) If you think a) is not a sufficient explanation and find this behavior strange, how should I proceed?


ADDITION: There are several misconceptions here, so I will now describe exactly what I am doing, because it does not seem to be as "obvious" as I thought.



  1. I have several gigabytes of data. And yes, if you have pictures, documents, your source code repository and other important things you really want back if, e.g., your computer is destroyed, then you easily accumulate several gigabytes.

  2. I want to store the data on a cloud server over WebDAV. Because I don't want anyone snooping in my data, I use encryption, and because I do not distinguish between critical confidentiality and... what exactly is the other option??... I use TrueCrypt 7.1a.

  3. TrueCrypt offers the option to mount a single big file as a virtual drive. The encryption is the Serpent/Twofish cascade (no AES). In the meantime I have verified the hash values of the binaries, so TrueCrypt itself seems to be okay and one possible problem is eliminated (not strictly, but there is hope).

  4. The TrueCrypt volume is mounted; let's say it has the name TC. The first step of the script compares the content of specific directories (e.g. /foo) with their equivalents inside the TrueCrypt volume (TC/foo) and updates the TrueCrypt content until it matches the original directories. The update is a simple delete/copy, and this is the only step where data is changed. There is no data corruption!

  5. The TrueCrypt volume is unmounted. We are now looking at the file backing the virtual drive. It is large and looks random, but because we changed the content, its binary content has also changed. This is reproducible: if you commit only small changes inside the virtual TrueCrypt drive (at least under Serpent/Twofish), the binary blob normally shows only small changes, too.

  6. Now I want to transmit the data as fast as possible. Because the Internet connection is the bottleneck, I always keep a copy of the old TrueCrypt file. Since I know that normally only small parts change, the big binary TrueCrypt blob is stored in the cloud split into smaller segments; when changes occur, only the changed segments are updated. Naturally, because the segments have a constant size, I transmit more data than strictly necessary, but that is acceptable. The cloud storage is also handled as a virtual drive.

  7. So which file operations did I use? Essentially delete/copy/remove/binary compare/split. It is not a programming question. In my humble opinion it is neither a "home-grown solution hacking together things that weren't designed to integrate with each other in the first place", nor is it a "business class" or "enterprise" problem. If anyone is not able to write such a script in less than an hour him-/herself...


  8. Gilles has asked a very good question. The percentage of changed segments is 50%. I ran a counter and found that 20.25% of the binary content has changed, with (approximate) run sizes:

    less than 10 bytes: 590 000 times

    less than 100 bytes: 5 000 000 times

    less than 1000 bytes: 11 000 000 times

    less than 10000 bytes: 350 000 times.

    No bigger changes occur.


    The number of segments is 320, by the way. My next step will be to take long changed byte chains from the new container and see whether I can locate them at another place in the old one, to confirm my suspicion that the data was not changed, only relocated; a sketch of the counter and this search follows below.
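
Roughly, the counter and the planned relocation search look like this (a Python sketch with hypothetical file names; the bucketing by decimal magnitude reproduces the size classes above):

    from collections import Counter

    def diff_run_buckets(old_path, new_path, chunk=1024 * 1024):
        """Walk both snapshots in parallel and bucket every contiguous
        run of differing bytes by its decimal magnitude (<10, <100, ...)."""
        buckets = Counter()
        run = 0
        with open(old_path, 'rb') as old, open(new_path, 'rb') as new:
            while True:
                a = old.read(chunk)
                b = new.read(chunk)
                if not b:
                    break
                for x, y in zip(a, b):
                    if x != y:
                        run += 1           # extend the current run of differing bytes
                    elif run:
                        buckets[10 ** len(str(run))] += 1  # e.g. run=7 -> bucket "<10"
                        run = 0
        if run:
            buckets[10 ** len(str(run))] += 1
        return buckets

    def relocated(old_path, new_path, offset, length):
        """Check whether a changed byte chain from the new snapshot also
        occurs somewhere in the old one (moved rather than rewritten).
        Reads the old file fully, so run it on a single segment only."""
        with open(new_path, 'rb') as new:
            new.seek(offset)
            needle = new.read(length)
        with open(old_path, 'rb') as old:
            return old.read().find(needle) != -1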




I do not know the exact security implications, but if some operations on TrueCrypt files can trigger a large change in the contained data where normally only small changes are observed, it could point to a possible weakness.




