Looking for a flat the modern way
Posted on August 06, 2017 in Dev • 4 min read
I recently moved home and had to go through the painful process of looking for a new flat to rent. You usually want to optimize various criteria, such as commuting time, but housing websites and their search forms are not designed for this. You also end up doing a lot of manual deduplication between multiple housing websites. On top of that, you want to check the websites as often as possible, since in large cities such as Paris interesting posts only stay online for a couple of hours.
Most people just browse the housing websites regularly, by hand, to look for the perfect flat. These include Leboncoin (the French equivalent of Craigslist), SeLoger (a website centralizing posts from real-estate agencies), Pap (similar to Craigslist, but focused on housing) and many, many real-estate agency websites. This is a really painful process, as there are a lot of duplicate posts to filter by hand and the websites tend to return broader results than you asked for (typically, results located near the requested locations, or flats with slightly smaller areas).
I was going to do this manual crawling process, when I realized Weboob actually has modules for a bunch of these websites!
Weboob (Web Outside Of Browsers) is a Python tool which can be used to scrape websites and fetch information from them, from weather forecasts to housing posts. I could then just use it to fetch all the available posts from these websites and export them as JSON. From this building block, I could build a tool to help me (and you!) find a new place: let me introduce Flatisfy!
Flatisfy is a Python tool made of:
- a backend which fetches housing posts, filters them according to some criteria and “augments” the results with new machine-readable metadata found by analyzing the post, such as the postal code, nearby public transport stations and the travel time to a specific place.
- a frontend, which is just a web app to explore the results, usable with any regular browser.
Disclaimer: This tool is focused on France, and in particular Paris and Lyon (it has been successfully used there). It may not work directly with other places, but should be customizable enough to do the job.
Note: Throughout this post, I use the word “flat”. It should be considered as a synonym to housing, as Flatisfy should also work if you look for a house.
Too long, didn’t read, show me pictures!
Ok, here are some screenshots. Note that the UI might change a lot in the near future, as this is basically just a proof of concept.
Home page
The home page shows a map of the flats found and of the places you are interested in, along with a table listing all the flats grouped by postal code.
Details page
For each flat, the details page shows all the available details in a uniform way. No more searching through ugly websites to find a piece of information you already have on another website: everything is rendered the same way, for easy comparison!
Tracking and annotating flats
You can annotate flats of interest, to easily sort them and manage a “to contact first” list :)
Sounds nice, how do I try it?
You can have a look at the getting started doc to get started quickly with a manual installation. A docker image is also available to test Flatisfy.
How does it work?
Flatisfy is basically a super-wrapper on top of Weboob. It uses Weboob to fetch the posts and post-processes them afterwards. There are three filtering passes implemented for now.
The first pass removes obvious duplicates (same URL or same ID) from the list of fetched flats. Then, it tries to match a postal code and, where possible, some public transport stations from the text fields (location and stations) provided by Weboob. Finally, it coarsely refines the list of flats based on your criteria (removing flats which are obviously outside your search zone, too expensive or too small, for instance), as housing websites tend to return broader results than you asked for.
This first pass aims at doing a basic filtering using only the “results” pages of the housing websites, in order to limit the number of queries needed to get detailed information about each flat that might match.
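To make the first pass more concrete, here is a minimal sketch of what it could look like. This is not Flatisfy's actual code: posts are assumed to be plain dicts, the keys (`id`, `url`, `location`, `cost`, `area`) and the function names are hypothetical, and the postal code is guessed with a naive five-digit regular expression, which is a strong simplification.

```python
import re

# Assumption: French postal codes appear as a standalone five-digit number.
POSTAL_CODE_RE = re.compile(r"\b(\d{5})\b")


def remove_obvious_duplicates(posts):
    """Keep only the first post seen for each ID and each URL."""
    seen_ids, seen_urls = set(), set()
    unique = []
    for post in posts:
        if post["id"] in seen_ids or post["url"] in seen_urls:
            continue
        seen_ids.add(post["id"])
        seen_urls.add(post["url"])
        unique.append(post)
    return unique


def guess_postal_code(post):
    """Return the first five-digit number in the location text, or None."""
    match = POSTAL_CODE_RE.search(post.get("location") or "")
    return match.group(1) if match else None


def coarse_filter(posts, allowed_postal_codes, max_cost, min_area):
    """Drop posts that are obviously out of zone, too expensive or too small."""
    kept = []
    for post in posts:
        postal_code = guess_postal_code(post)
        # Assumption: a post without a recognizable postal code is kept,
        # so that a later pass can decide with more details.
        if postal_code is not None and postal_code not in allowed_postal_codes:
            continue
        if post["cost"] > max_cost or post["area"] < min_area:
            continue
        kept.append({**post, "postal_code": postal_code})
    return kept


def first_pass(posts, allowed_postal_codes, max_cost, min_area):
    """Chain the deduplication and the coarse filtering."""
    unique = remove_obvious_duplicates(posts)
    return coarse_filter(unique, allowed_postal_codes, max_cost, min_area)
```

Keeping posts whose postal code could not be guessed is a deliberate choice in this sketch: at this stage only the short text from the results page is available, so it seems safer to reject only what is obviously wrong.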
Using the additional details loaded for each flat, the second pass tries to confirm the previously guessed information, such as the postal code and stations. Then, for flats with matched public transport stations, it computes the travel time to the locations you defined in your config, and further filters the flats according to your criteria.
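The travel-time filtering could be sketched as follows. Again, this is only an illustration under assumptions: the `matched_stations` key, the function name and the `travel_time_fn` callable (standing in for whatever routing service is actually queried) are all hypothetical.

```python
def filter_by_travel_time(flats, destination, max_seconds, travel_time_fn):
    """Keep flats whose best station is within max_seconds of destination.

    travel_time_fn(station, destination) is a hypothetical callable
    returning a travel time in seconds.
    """
    kept = []
    for flat in flats:
        stations = flat.get("matched_stations") or []
        if not stations:
            # Assumption: without any matched station, no travel time can
            # be computed, so the flat is kept rather than rejected.
            kept.append(flat)
            continue
        best = min(travel_time_fn(station, destination) for station in stations)
        if best <= max_seconds:
            kept.append({**flat, "travel_time": best})
    return kept
```

Passing the travel-time computation as a parameter keeps the filtering logic independent from the routing backend, and makes it easy to test with a fake lookup table.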
Finally, the third pass is a deep duplicate detection pass. It tries to detect duplicates across websites, automating what you used to do manually. It compares the price, area and photos of the posts to merge duplicates together.
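As an illustration of this kind of cross-website duplicate detection, here is a small scoring-based sketch. The weights, tolerances and threshold are arbitrary assumptions, the dict keys are hypothetical, and photos are assumed to have been reduced beforehand to comparable hashes (`photo_hashes`); the actual heuristics in Flatisfy may differ.

```python
def similarity_score(a, b):
    """Score how likely two posts describe the same flat (higher is closer)."""
    score = 0
    # Assumption: prices within 1% of each other count as equal.
    if abs(a["cost"] - b["cost"]) <= 0.01 * max(a["cost"], b["cost"]):
        score += 1
    # Assumption: areas within 1 square metre of each other count as equal.
    if abs(a["area"] - b["area"]) <= 1:
        score += 1
    # A shared photo is a much stronger hint, hence the higher weight.
    if set(a["photo_hashes"]) & set(b["photo_hashes"]):
        score += 2
    return score


def group_duplicates(flats, threshold=3):
    """Greedily group flats whose score with a group member reaches threshold."""
    groups = []
    for flat in flats:
        for group in groups:
            if any(similarity_score(flat, other) >= threshold for other in group):
                group.append(flat)
                break
        else:
            # No existing group matched: start a new one.
            groups.append([flat])
    return groups
```

With a threshold of 3, a matching price and area alone are not enough to merge two posts: they must also share at least one photo. This is conservative on purpose, since two distinct flats in the same building can easily have the same price and area.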
What about the future?
Flatisfy is currently in a working state and can already be used. In fact, it has already been used successfully a couple of times, to find either a place to rent or a place to buy.
If you don’t live in Paris or Lyon, you might have to import extra data to benefit from all the advanced features. This is quite easy to do if your city provides open data. Please open a merge request with the extra data, so that everyone can benefit from it! :)
I already have some ideas for the future, mainly comparing prices with nearby prices, providing an ICS feed automatically generated from visit schedules, and a better and more responsive web UI. :)