In the following recipe, we tamper with a binary file. We then compare it with the original and see that ssdeep rates the two files as highly similar but not identical:
First, I download the latest version of Python at the time of writing, python-3.7.2-amd64.exe. I then create a copy named python-3.7.2-amd64-fake.exe and append a single null byte to the end of it:
cp python-3.7.2-amd64.exe python-3.7.2-amd64-fake.exe
truncate -s +1 python-3.7.2-amd64-fake.exe
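The copy-and-append step can be tried without downloading the installer. The following is a minimal sketch that assumes GNU coreutils (for truncate) and uses a small random stand-in file; the names original.bin and fake.bin are hypothetical and not part of the recipe:

```shell
# Work in a throwaway directory so nothing real is modified.
workdir=$(mktemp -d)
orig="$workdir/original.bin"   # hypothetical stand-in for the installer
fake="$workdir/fake.bin"       # hypothetical stand-in for the tampered copy

# Create 4096 random bytes to play the role of the original binary.
head -c 4096 /dev/urandom > "$orig"

# Copy the file, then grow the copy by one byte; truncate pads with a null byte.
cp "$orig" "$fake"
truncate -s +1 "$fake"

# Record both sizes and the hex value of the last byte of the copy.
orig_size=$(wc -c < "$orig")
fake_size=$(wc -c < "$fake")
last_byte=$(tail -c 1 "$fake" | od -An -tx1 | tr -d ' \n')

echo "original: $orig_size bytes, copy: $fake_size bytes, last byte: $last_byte"
```

The copy should report exactly one byte more than the original, and its last byte should be 00.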
Using hexdump, I can verify that the operation was successful by looking at the file before and after:
hexdump -C python-3.7.2-amd64.exe | tail -n 5
This results in the following output:
018ee0f0  e3 af d6 e9 05 3f b7 15  a1 c7 2a 5f b6 ae 71 1f  |.....?....*_..q.|
018ee100  6f 46 62 1c 4f 74 f5 f5  a1 e6 91 b7 fe 90 06 3e  |oFb.Ot.........>|
018ee110  de 57 a6 e1 83 4c 13 0d  b1 4a 3d e5 04 82 5e 35  |.W...L...J=...^5|
018ee120  ff b2 e8 60 2d e0 db 24  c1 3d 8b 47 b3 00 00 00  |...`-..$.=.G....|
The same check can be run on the modified copy using the following command:
hexdump -C python-3.7.2-amd64-fake.exe | tail -n 5
This results in the following output:
018ee100  6f 46 62 1c 4f 74 f5 f5  a1 e6 91 b7 fe 90 06 3e  |oFb.Ot.........>|
018ee110  de 57 a6 e1 83 4c 13 0d  b1 4a 3d e5 04 82 5e 35  |.W...L...J=...^5|
018ee120  ff b2 e8 60 2d e0 db 24  c1 3d 8b 47 b3 00 00 00  |...`-..$.=.G....|
018ee130  00                                                |.|
018ee131
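Reading hex dumps by eye works for a one-byte change, but the same conclusion can be checked mechanically. The sketch below swaps in cmp (not used in the recipe itself) to confirm that the copy differs from the original only by its extra trailing byte. It assumes GNU coreutils and again uses hypothetical stand-in files rather than the real installer:

```shell
# Recreate a small original and a copy with one appended null byte.
workdir=$(mktemp -d)
orig="$workdir/original.bin"   # hypothetical stand-in for the installer
fake="$workdir/fake.bin"       # hypothetical stand-in for the tampered copy
head -c 4096 /dev/urandom > "$orig"
cp "$orig" "$fake"
truncate -s +1 "$fake"

orig_size=$(wc -c < "$orig")

# Compare only the first orig_size bytes of the copy with the original.
if head -c "$orig_size" "$fake" | cmp -s - "$orig"; then
    prefix_matches=yes
else
    prefix_matches=no
fi

# Compare the two files in full; cmp -s is silent and signals via exit status.
if cmp -s "$orig" "$fake"; then
    identical=yes
else
    identical=no
fi

echo "prefix matches: $prefix_matches, identical: $identical"
```

A matching prefix together with a failed full comparison means the only difference is the data appended after the original content, which is what the two hexdump listings show.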
Now, I will hash the two files using ssdeep and compare the results: