This session was recorded in NYC on October 22nd, 2019 and can be viewed here: https://youtu.be/4YjN8G6uEko
Machine Learning Apps at PropertyGuru
PropertyGuru is the largest prop tech company in South-east Asia. We enable our customers to find their dream homes and add value to the agents who trust our platform to match them to the right property seekers. In this session, I will talk about how we are using machine learning to build products and experiences that help people make confident property decisions. I will cover how we guide property seekers and agents with innovative ways to search listings and personalised recommendations, and how we build models to maintain the quality of the listings that they interact with.
Bio: Gautam Borgohain is a Data Scientist and Software Engineer with over 7 years of experience building and leading data science products across industries, on projects such as recommendation systems, image classification and object detection services, NLP, property valuation and credit risk evaluation. He obtained his Master's degree in Analytics from Nanyang Technological University in Singapore. Before joining PropertyGuru, Gautam gained cross-industry experience with stints at a fintech start-up, a university and a software company. He loves spending hours analysing data and developing smarter applications with machine learning.
Gautam Borgohain, PropertyGuru - Machine Learning Apps at PropertyGuru - H2O World 2019 NYC
1. Machine Learning at PropertyGuru
Gautam Borgohain
Data Scientist
PropertyGuru
2. PropertyGuru Group Portal Asia Real Estate Summit Asia Property Awards
>2.4 Million
Property Listings
>23 Million
Property Seekers Monthly
60%* mkt. share
*SimilarWeb: Consumer engagement market share
45,000
Agent Customers Monthly
5
Leading Marketplaces
SINGAPORE
INDONESIA
MALAYSIA
THAILAND
VIETNAM
CAMBODIA
PHILIPPINES
MYANMAR
HONGKONG
MACAU
SRI LANKA
CHINA
MONGOLIA
JAPAN
Sources: Internal
Singapore
Malaysia
Thailand
Indonesia
Vietnam
Entry in 2007 Entry in 2011
Entry in 2011 Entry in 2016
Entry in 2011
PropertyGuru Group - Overview
Southeast Asia’s Leading Property Technology Company
4. Our Mission
Medium-term measurable inspiration linked to personal goals
To help people make confident property decisions through relevant content, actionable insights and world-class service.
5. ML at PG
• Team of 5 Data Scientists and ML Engineers with different backgrounds
• Mechanical, Chemical engineering, Computer Science, Bio-informatics
• ML to enable better discovery of listings and uphold a standard of quality of the listings
6. PropertyGuru Lens
2MB Model Size
Extensively optimized deep
learning object detection model.
6ms Latency
CoreML optimizations to increase
the speed of responses
8000 Condos
Real-time on-device location based
identification of condos
7. Personalized Listings Recommendations
>2 Million Active property listings
across countries
Listings
>20 Million Active users across
countries
Users
95 ms API response time
Latency
11. Text Moderation
bahasa allow
beijing allowed
china choice
chinaman choices
chindian denied
chinese deny
curry except
expat no
expatriates non
expatriats not
expats omit
filipines omitted
filipino only
filipinos reject
indian sorry
india rejected
indians uninvited
malay unwelcome
prc no
prcs non
skin omitted
locals against
vietnam only
locals welcome
singaporeans required
indians included
indian ideal
expat choice
pinoys preferred
expats favoured
china advantageous
Bias Detection
10x smaller model
400x faster inference
12. Model Deployment (Example)
10,000 Images / second
With AWS Lambda concurrency, to
moderate the images.
10X Smaller models
We get to ~150MB of model & code,
lambda constraints.
500ms Evaluation
Smaller models with high accuracy and low latencies.
97% Lower cost
As compared to deploying the models on the low end GPUs.
We are an online property marketplace: we match property seekers with property agents. We have been around for a while. We started back in 2007 in Singapore and now operate in 5 countries. We are the leading prop tech company in Southeast Asia, with more than 2.4 million property listings across the regions we serve and a user base of over 23 million.
We enable our customers to find their dream homes and add value to the agents who trust our platform to match them to the right property seekers.
As a company, our vision is to be a trusted advisor to property seekers, and we think that means helping people make confident property decisions.
So, as you can imagine, machine learning forms a big part of our strategy to achieve those goals.
We are currently a team of 5 data scientists and ML engineers, and we all have different backgrounds, ranging from {}
Of course, we rely on various tools to help with our work and to find the best possible models, including Driverless AI.
In today's session, I am going to speak briefly about some of the areas where we are using machine learning to build better products that help people make confident property decisions.
More specifically, I am going to focus on how we enable better discovery of listings and how we maintain the quality of those listings.
First up, let's look at how we enable the discovery process...
Property seekers use search with filters on our site and mobile apps for properties they already know about. But to look for available units in a building they just saw and liked, they used to have to depend on their detective skills: remembering where they saw the building, what color it was, and so on.
So we recently launched PropertyGuru Lens. It is the first app in Southeast Asia powered by augmented reality that allows property seekers to discover properties just by pointing their phones at them. The app then presents a list of all units in that property that are listed for rent or sale. It is like having a personal real estate expert on your phone. This is particularly valuable in high-density cities like Singapore, where there are a lot of buildings in close proximity and new projects come up regularly.
All the processing happens on the device, so it is private, and it has been optimized to run on minimal power. The model is just 2MB in size and the latency of the scoring engine over 8000 condos is just 6ms.
Condo names work even without mobile connectivity
On-device: privacy protection, low power / heat processing
15deg field of view error re-adjustment
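The talk does not describe how the location-based identification works internally. As a rough illustration only, the sketch below assumes the app knows the phone's GPS position and compass heading, and shortlists condos that are nearby and whose bearing lies within a 15 degree tolerance of the heading. The `Condo` type, the `candidates` function, the 500 m radius and the reading of "15deg" as an angular tolerance are all assumptions made for this example, not details from the talk.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0


@dataclass(frozen=True)
class Condo:
    # Hypothetical record: a condo name and the coordinates of the building.
    name: str
    lat: float
    lon: float


def distance_m(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in metres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from point 1 to point 2, in [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def angle_diff_deg(a, b):
    """Smallest absolute difference between two compass angles."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def candidates(condos, lat, lon, heading_deg, max_dist_m=500.0, tolerance_deg=15.0):
    """Names of condos within range and roughly in front of the camera, nearest first."""
    hits = []
    for condo in condos:
        dist = distance_m(lat, lon, condo.lat, condo.lon)
        if dist > max_dist_m:
            continue  # too far away to be the building in view
        offset = angle_diff_deg(bearing_deg(lat, lon, condo.lat, condo.lon), heading_deg)
        if offset <= tolerance_deg:
            hits.append((dist, condo.name))
    return [name for _, name in sorted(hits)]
```

A shortlist like this could then be combined with the on-device object detection model, so that the detector only has to decide between a handful of nearby buildings.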
To further help the discovery process, we use a personalized recommendation engine that matches over 2 million listings to over 20 million users. These recommendations are personalized in real time.
A key factor for us while building the recommendations is interpretability, and relevance to the user with respect to their property search journey.
Therefore, we customize the recommendations based on what stage of the property search the user is in: whether they are in the initial discovery phase or later in their search, hunting for the best deals.
Also, to help discovery, we give property agents insights on how their listings are performing and tools to promote high-quality listings.
For example, here we are recommending that the agent promote a listing (with a feature we call "Boost") to get a higher return.
We know this from estimating the listing's performance given the current demand and competition for that listing, and then estimating the propensity of that performance to increase if the listing is promoted.
We also take the quality of the listings we present seriously, so next we look at how we maintain the quality of the listings.
Since we are an online business, agents creating listings are basically uploading content: images and text (the descriptions).
Property seekers expect a certain level of content quality on our site. This means relevant, high-quality images and informative, useful text.
Up until April 2019, we allowed images that did not meet some of our guidelines to be uploaded regardless, so a lot of our listings used to look like this, with a lot of overlays on the uploaded images. For property seekers, this is a bad experience because they cannot see the actual image of the unit through all the overlays and distractions.
So we started to moderate the images that are uploaded and, given the volume, we have to use multiple ML models to run the various checks before listings go live.
Our moderation engine runs a variety of models, from object detection of faces, text, banners and watermarks to room-type classification, for example whether an image shows an indoor kitchen or an outdoor dining area. We do this to ensure the images cover the unit.
We even evaluate the quality of the images with deep learning models that assign an aesthetic score. We also use deep learning to prevent any NSFW images from being uploaded.
Moderating the descriptions that get uploaded to the site is also very important. We run various text analyses on the descriptions, such as topic extraction and bias detection.
Bias detection in particular involves detecting phrases in the text that hint at or indicate a preference for ownership or rental based on race. For example, specifying in the description that only a particular race is welcome to be considered for rental or sale is not allowed. So we have a model to detect any inherent bias against races in the descriptions. We initially started with a Bi-LSTM model with word2vec embeddings. But we soon realized that newer compound words were starting to be used (for example, chindian, a person of Chinese and Indian descent). This led us to BERT. Due to the size of the model, however, we applied knowledge distillation to train a 10x smaller model that does inference with the same accuracy.
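The talk does not give the exact training objective, so the following is only a minimal sketch of a commonly used knowledge distillation loss, under the assumption that the small student model is trained to match the temperature-softened outputs of the large teacher (here, BERT) while also fitting the true label. The function names and the `temperature` and `alpha` values are illustrative choices, not PropertyGuru's settings.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by the given temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """Weighted sum of a soft (teacher-matching) term and a hard (label) term.

    soft: KL(teacher || student) on temperature-softened distributions,
          scaled by temperature squared.
    hard: cross-entropy of the student against the true label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student) if pt > 0)
    soft = (temperature ** 2) * kl
    hard = -math.log(softmax(student_logits)[label])
    return alpha * soft + (1 - alpha) * hard


# Example: two classes (index 0 = "biased", index 1 = "not biased"; an assumed encoding).
matching = distillation_loss([2.0, 0.0], [2.0, 0.0], label=0)
disagreeing = distillation_loss([0.0, 2.0], [2.0, 0.0], label=0)
```

In this example `matching` is lower than `disagreeing`, since a student that reproduces the teacher's logits pays no soft penalty and only the label term remains.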
As you can see, we have a lot of use cases for ML at PG, and there are many more that we didn't have time to cover.
Deploying these models in a manner that is scalable and performant is very important to us.
Due to the small size of our team, we prefer serverless deployments. In the example above, we used AWS Lambda to serve our deep learning models, e.g. face detection. Making our deployments serverless allows us not only to scale on demand, but also to cut costs substantially compared to hosting dedicated instances. Even Driverless AI's one-click deployments run on AWS Lambda, so it fits into our stack perfectly.
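To make the serverless pattern concrete, here is a minimal sketch of a Python handler for AWS Lambda. It assumes the function sits behind an API Gateway proxy integration (so the request arrives as a JSON string in `event["body"]`), and it uses a trivial stand-in in place of a real model. `load_model`, the `image_key` field and the `approved` flag are hypothetical names for this example; the talk does not show PropertyGuru's actual handler.

```python
import json


def load_model():
    """Stand-in for loading a real model (for example, a face detector).

    A real deployment would deserialize model weights here. This placeholder
    only checks that an image key was supplied.
    """
    def predict(payload):
        return {"approved": len(payload.get("image_key", "")) > 0}

    return predict


# Loaded once at import time, so warm invocations reuse the model
# instead of reloading it on every request.
MODEL = load_model()


def handler(event, context):
    """Lambda entry point: parse the request, run the model, return JSON."""
    body = json.loads(event.get("body") or "{}")
    result = MODEL(body)
    return {"statusCode": 200, "body": json.dumps(result)}
```

Keeping the model load outside the handler matters most for larger models, because the load cost is then paid only on cold starts rather than on every request.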