Looking for a job? Check out our open positions. You can also take a look at our engineering blog to learn more about the way we work.
- Clone this repo (do not fork it)
- Solve the levels in ascending order
- Only do one commit per level and include the `.git` directory when submitting your test
Please do the simplest thing that could work for the level you're currently solving.
For higher levels we are interested in seeing code that is:
- Clean
- Extensible
- Reliable
The challenge needs to be solved in Python.
Each level depends on one Python 3.7 executable and one or more libraries that you'll have to use. You can't modify them.
Your solution to each level needs to live in the `level_{N}` directory.
Launch the `level_file` program. It will write log messages into `./logs/#{id}.txt`.
Each file will contain one log message. The log looks like this:
id=0060cd38-9dd5-4eff-a72f-9705f3dd25d9 service_name=api process=api.233 sample#load_avg_1m=0.849 sample#load_avg_5m=0.561 sample#load_avg_15m=0.202
You need to write a program that will parse the messages, write the result to a JSON file in `./parsed/#{id}.json`, and delete the original message.
You need to write the JSON in the following format:

```json
{
  "id": "2acc4f33-1f80-43d0-a4a6-b2d8c1dbbe47",
  "service_name": "web",
  "process": "web.1089",
  "load_avg_1m": "0.04",
  "load_avg_5m": "0.10",
  "load_avg_15m": "0.31"
}
```
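A minimal sketch of one possible approach, using only the standard library (the parsing rules, like stripping the `sample#` prefix so that `sample#load_avg_1m` becomes `load_avg_1m`, are inferred from the example above):

```python
import json
import os

LOGS_DIR = "./logs"
PARSED_DIR = "./parsed"

def parse_message(line):
    """Split a 'key=value ...' log line into a dict, dropping the 'sample#' prefix."""
    result = {}
    for token in line.strip().split():
        key, _, value = token.partition("=")
        if key.startswith("sample#"):
            key = key[len("sample#"):]
        result[key] = value
    return result

os.makedirs(PARSED_DIR, exist_ok=True)
for name in os.listdir(LOGS_DIR):
    path = os.path.join(LOGS_DIR, name)
    with open(path) as f:
        parsed = parse_message(f.read())
    with open(os.path.join(PARSED_DIR, parsed["id"] + ".json"), "w") as f:
        json.dump(parsed, f)
    os.remove(path)  # delete the original message once it is parsed
```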
When you launch the `levels_http` program, it will send the same log messages to a local HTTP server at http://localhost:3000/. The HTTP server listens for POST requests on port 3000. The POST requests will time out after 100ms.
You need to write a simple HTTP server that will listen for these requests, parse the logs, and write the result to a JSON file in `./parsed/#{id}.json`, in the same format as Level 1.
To write a simple HTTP server, look at Flask or Bottle.
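For example, with Flask (a sketch; it assumes `levels_http` POSTs the raw log line as the request body, which you should verify):

```python
import json
import os

from flask import Flask, request

app = Flask(__name__)
PARSED_DIR = "./parsed"
os.makedirs(PARSED_DIR, exist_ok=True)

def parse_message(line):
    # Same parsing as Level 1: split key=value pairs, drop the 'sample#' prefix.
    result = {}
    for token in line.strip().split():
        key, _, value = token.partition("=")
        result[key.replace("sample#", "", 1)] = value
    return result

@app.route("/", methods=["POST"])
def handle_log():
    parsed = parse_message(request.get_data(as_text=True))
    with open(os.path.join(PARSED_DIR, parsed["id"] + ".json"), "w") as f:
        json.dump(parsed, f)
    return "", 204  # answer quickly: the client times out after 100ms

if __name__ == "__main__":
    app.run(port=3000)
```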
Launch the `levels_http` program. This time your HTTP server needs to parse the logs and push them to a Redis LIST on a local Redis instance (redis://localhost:6379).
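With the redis-py client this can stay small; the change plugs into the Level 2 handler (the list name `logs` is an arbitrary choice here, not something the level mandates):

```python
import json

import redis

r = redis.Redis.from_url("redis://localhost:6379")

def store(parsed):
    """Push a parsed log dict onto a Redis LIST ("logs" is an assumed key name)."""
    r.rpush("logs", json.dumps(parsed))
```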
Launch the `levels_http` program. Your HTTP server, after parsing the logs, needs to enrich them with a library called `slow_computation`. To use this library:
```python
import slow_computation

new_dict = slow_computation.compute(new_dict)
print(new_dict)
# {
#     "id": "2acc4f33-1f80-43d0-a4a6-b2d8c1dbbe47",
#     "service_name": "web",
#     "process": "web.1089",
#     "load_avg_1m": "0.04",
#     "load_avg_5m": "0.10",
#     "load_avg_15m": "0.31",
#     "slow_computation": "0.0009878"
# }
```
As in Level 3, you'll push the resulting JSON onto a Redis LIST. Again, the HTTP calls will time out after 100ms.
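Since `slow_computation` presumably cannot finish inside the 100ms request window, one way to stay reliable (an approach chosen for this sketch, not something the level prescribes) is to acknowledge each request immediately and enrich in a background worker:

```python
import json
import queue
import threading

import redis
import slow_computation

r = redis.Redis.from_url("redis://localhost:6379")
work = queue.Queue()

def worker():
    while True:
        parsed = work.get()
        enriched = slow_computation.compute(parsed)  # slow: keep it off the request path
        r.rpush("logs", json.dumps(enriched))
        work.task_done()

threading.Thread(target=worker, daemon=True).start()

# In the HTTP handler: parse the request body, call work.put(parsed),
# and return 204 right away, well within the 100ms timeout.
```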
We provide some (fake) data to play with. You will work with cars and rentals. You can download the CSV files here:
- cars: https://cl.ly/eDaw
- rentals: https://cl.ly/eDUn
Cars:
- `id`: the car ID
- `city`: the city where the car is available
- `created_at`: date when the car was made available on the platform

Rentals:
- `id`: the rental ID
- `car_id`: the ID of the car used for this rental
- `starts_at`: the datetime when the rental starts
- `ends_at`: the datetime when the rental ends
Remarks regarding data quality:
- rentals: the `starts_at` and `ends_at` columns only contain `00:00:00` or `12:00:00` time components, providing a half-day (AM/PM) level of detail. Rentals are clean: there is no overlap between rentals. `car_id` and dates are also clean: no NULL or erroneous values.
- cars: all fields are clean except `created_at`, which can be NULL.
You need to solve this in SQL. You are free to choose any kind of database engine.
We first want to fix the NULL `created_at` values for cars. For each car with a NULL `created_at`, we will consider that it was created on the same date as the previous car (i.e. the car with the closest lower `id` that has a non-NULL `created_at`). Assume that cars can be more than 1 ID apart.
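Sketched with SQLite through Python's `sqlite3` (any engine works; this assumes the CSVs were already loaded into `cars` and `rentals` tables in a `challenge.db` file):

```python
import sqlite3

conn = sqlite3.connect("challenge.db")

# Backfill each NULL created_at from the closest lower id with a known date.
conn.execute("""
    UPDATE cars
    SET created_at = (
        SELECT c2.created_at
        FROM cars c2
        WHERE c2.id < cars.id AND c2.created_at IS NOT NULL
        ORDER BY c2.id DESC
        LIMIT 1
    )
    WHERE created_at IS NULL
""")
conn.commit()
```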
Then, for each month, find how many cars reach their 3rd rental since their registration. Use the rental's `starts_at` to determine which month to attribute it to.
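One possible query, continuing the `sqlite3` sketch above (window functions require SQLite 3.25+):

```python
import sqlite3

conn = sqlite3.connect("challenge.db")

# Rank each car's rentals by start date; a car "reaches its 3rd rental"
# in the month of the rental ranked 3.
for month, count in conn.execute("""
    WITH ranked AS (
        SELECT car_id, starts_at,
               ROW_NUMBER() OVER (PARTITION BY car_id ORDER BY starts_at) AS rn
        FROM rentals
    )
    SELECT strftime('%Y-%m', starts_at) AS month, COUNT(*) AS cars
    FROM ranked
    WHERE rn = 3
    GROUP BY month
    ORDER BY month
"""):
    print(month, count)
```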
We hope you'll have fun doing this challenge. It shouldn't take more than a few hours. Enjoy and be reliable <3