Just last night I found this educational mini game written in R and decided to have a go at it:
This was a fairly fun #Rstats game: https://t.co/0zGCQqCxvw (Only after brute-forcing it, I realised I didn't have to.)
— Dave Tang (@davetang31) December 5, 2015
I completed it but, as I alluded to in my tweet, not in a very elegant manner. This post uses the dplyr package in R to solve some of the problems. If you want to give the game a go first, stop reading now.
To get started, first install the game/package, load the library, and use the proton() function:
install.packages("proton")
# load library
library(proton)
# start game
proton()
Install the dplyr package if you don't already have it:
install.packages("dplyr")
# load library
library(dplyr)
The first problem
Pietraszko uses a password which is very difficult to guess.
At first, try to hack an account of a person which is not as cautious as Pietraszko.
But who is the weakest point? Initial investigation suggests that John Insecure doesn't care about security and has an account on the Proton server. He may use a password which is easy to crack.
Let's attack his account first!
Problem 1: Find the login of John Insecure.
Bit has scrapped 'employees' data (names and logins) from the www web page of Technical University of Warsaw. The data is in the data.frame employees.
Now, your task is to find John Insecure's login.
When you finally find out what John's login is, use proton(action = "login", login="XYZ") command, where XYZ is Insecure's login.
# first check out the data frame
head(employees)
    name  surname       login
1  Jorge  Patrick   j.patrick
2 Gerald     Long gerald.long
3 Javier  Mendoza   j.mendoza
4    Roy Johnston        rjoh
5  Annie    Keith annie.keith
6   Nora   Castro        ncas

# our task is to find John Insecure's login
# let's use the filter() function in dplyr
filter(employees, name == "John", surname == "Insecure")
  name  surname   login
1 John Insecure johnins

# our solution
proton(action = "login", login="johnins")
The second problem
Congratulations! You have found out what John Insecure's login is!
It is highly likely that he uses some typical password.
Bit downloaded from the Internet a database with 1000 most commonly used passwords.
You can find this database in the top1000passwords vector.
Problem 2: Find John Insecure's password.
Use proton(action = "login", login="XYZ", password="ABC") command in order to log into the Proton server with the given credentials.
If the password is correct, you will get the following message: Success! User is logged in!
Otherwise you will get: Password or login is incorrect!
# check out the vector
head(top1000passwords)
[1] "123456"    "password"  "12345678"  "qwerty"    "123456789" "12345"

# loop through and try every typical password
# and add an if block to find out the password
for (i in top1000passwords){
  x <- proton(action = "login", login="johnins", password=i)
  if(length(grep(pattern = "Success", x = x)) == 1){
    print(i)
  }
}
[1] "q1w2e3r4t5"
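The loop above keeps trying passwords even after one has worked. A minimal sketch of stopping at the first hit follows. Note that fake_login() and the four-element password vector are hypothetical stand-ins, there only so the snippet runs without the proton package; in the game the call would be proton(action = "login", ...) as above.

```r
# Hypothetical stand-in for proton(action = "login", ...), for illustration only.
# It returns the same two messages that the game describes.
fake_login <- function(login, password) {
  if (login == "johnins" && password == "q1w2e3r4t5") {
    "Success! User is logged in!"
  } else {
    "Password or login is incorrect!"
  }
}

# Toy password list (the real top1000passwords vector has 1000 entries)
candidates <- c("123456", "password", "q1w2e3r4t5", "qwerty")

found <- NA_character_
for (p in candidates) {
  msg <- fake_login(login = "johnins", password = p)
  if (grepl("Success", msg, fixed = TRUE)) {
    found <- p
    break  # stop as soon as a password works
  }
}
found
```

With break in place, "qwerty" is never tried because the loop ends on the third candidate.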
The third problem
Well done! This is the right password!
Bit used John Insecure's account in order to log into the Proton server.
It turns out that John has access to server logs.
Now, Bit wants to check from which workstation Pietraszko is frequently logging into the Proton server. Bit hopes that there will be some useful data.
Logs are in the logs dataset.
Consecutive columns contain information such as: who, when and from which computer logged into Proton.
Problem 3: Check from which server Pietraszko logs into the Proton server most often.
Use proton(action = "server", host="XYZ") command in order to learn more about what can be found on the XYZ server.
The biggest chance to find something interesting is to find a server from which Pietraszko logs in the most often.
# This was the problem I brute-forced when it wasn't necessary
# I forgot that I could get Pietraszko's login from the employee data frame

# check out the logs
head(logs)
             login           host                data
1        r.spencer 193.0.96.13.15 2014-09-01 09:01:12
2     isaac.arnold  193.0.96.13.9 2014-09-01 09:01:51
3 warren.dickerson  194.29.178.32 2014-09-01 09:08:08
4          c.lopez   194.29.178.4 2014-09-01 09:09:02
5        l.russell 194.29.178.162 2014-09-01 09:22:22
6          t.silva 194.29.178.102 2014-09-01 09:25:40

# find the login of Pietraszko
filter(employees, surname == "Pietraszko")
      name    surname login
1 Slawomir Pietraszko  slap

head(filter(logs, login == "slap"))
  login           host                data
1  slap  194.29.178.16 2014-09-02 02:32:48
2  slap  194.29.178.16 2014-09-03 15:10:41
3  slap 194.29.178.108 2014-09-04 22:52:26
4  slap 194.29.178.108 2014-09-05 11:22:20
5  slap 194.29.178.108 2014-09-06 13:35:41
6  slap  194.29.178.16 2014-09-06 17:33:06

# use group and summarise to find the
# host most commonly logged in
slap_log_group <- group_by(filter(logs, login == "slap"), host)
summarise(slap_log_group, count = n())
Source: local data frame [5 x 2]

            host count
          (fctr) (int)
1  194.29.178.16   112
2 193.0.96.13.20    33
3 194.29.178.155     6
4 193.0.96.13.38     1
5 194.29.178.108    74

# solution
proton(action = "server", host="194.29.178.16")
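The summarise() output above is not sorted, so the busiest host has to be picked out by eye. As a rough sketch, dplyr's count() with sort = TRUE does the grouping, counting and ordering in one step. The toy_logs data frame below is made up for illustration and only mimics the login and host columns of the game's logs data.

```r
library(dplyr)

# Toy stand-in for the game's logs data frame (made-up rows)
toy_logs <- data.frame(
  login = c("slap", "slap", "slap", "r.spencer", "slap"),
  host  = c("194.29.178.16", "194.29.178.108", "194.29.178.16",
            "193.0.96.13.15", "194.29.178.16"),
  stringsAsFactors = FALSE
)

host_counts <- toy_logs %>%
  filter(login == "slap") %>%   # keep only Pietraszko's logins
  count(host, sort = TRUE)      # one row per host, most frequent first

# the first row is the host used most often; the counts are in column n
host_counts$host[1]
```

In the game, the same pipeline applied to logs instead of toy_logs should put the most frequently used host in the first row.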
The fourth and last problem
It turns out that Pietraszko often uses the public workstation 194.29.178.16.
What a carelessness.
Bit infiltrated this workstation easily. He downloaded bash_history file which contains a list of all commands that were entered into the server's console.
The chances are that some time ago Pietraszko typed a password into the console by mistake thinking that he was logging into the Proton server.
Problem 4: Find the Pietraszko's password.
In the bash_history dataset you will find all commands and parameters which have ever been entered.
Try to extract from this dataset only commands (only strings before space) and check whether one of them looks like a password.
# check out the history
head(bash_history)
[1] "mcedit /var/log/lighttpd/*" "pwd"                        "vim /var/log/mysql.*"
[4] "rm /bin"                    "cat ~/.Xauthority"          "ls /srv"

# how many unique commands?
length(unique(bash_history))
[1] 489

# the password should only be a string without spaces
unique(grep(pattern = " ", x = bash_history, value = TRUE, invert = TRUE))
[1] "pwd"              "ps"               "whoiam"           "top"
[5] "mc"               "DHbb7QXppuHnaXGN"

# now we can login as Slawomir Pietraszko to complete the game
# remember from problem two how we can log on to the server?
# proton(action = "login", login="XYZ", password="ABC")
proton(action = "login", login="slap", password="DHbb7QXppuHnaXGN")
Congratulations!

You have cracked Pietraszko's password!
Secret plans of his lab are now in your hands.
What is in this mysterious lab?
You may read about it in the `Pietraszko's cave` story which is available at http://biecek.pl/BetaBit/Warsaw

Next adventure of Beta and Bit will be available soon.

            proton.login.pass
"Success! User is logged in!"
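The game's hint suggests keeping only the text before the first space, whereas the code above keeps the entries that contain no space at all. A small sketch of the hint taken literally is below, using base R's sub(). The toy_history vector is made up for illustration, and its last element is an invented placeholder, not the real password.

```r
# Toy stand-in for bash_history (made-up entries; the last one plays the password)
toy_history <- c("mcedit /var/log/lighttpd/*", "pwd", "vim /var/log/mysql.*",
                 "rm /bin", "pwd", "NotARealPassword1")

# Drop everything from the first space onwards, leaving only the command name
commands <- sub(" .*", "", toy_history)

# Tabulate the command names; a mistyped password should stand out as a
# string that is not a known command
table(commands)
unique(commands)
```

Both approaches end with a short list of strings to inspect by eye. This one also keeps the names of commands that were always run with arguments.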
Summary
I thought that was rather fun. I finally tested out dplyr and it is definitely much easier than typing:
employees[employees$name=="John" & employees$surname=="Insecure",]
    name  surname   login
217 John Insecure johnins
The first time I went through the game I wanted to finish it as quickly as possible, so for problem three I just tried all the unique hosts. Only later, during problem four, did I realise that I needed Pietraszko's login and that I could get it from the employees data frame.
For my regular readers, I have a bigger post on DNA sequencing that is still in the works. I'll try to finish that post in the coming week.

This work is licensed under a Creative Commons
Attribution 4.0 International License.