Mehrab Mashrafi


Real Estate Market Analysis of UAE

Topics Covered: Data Scraping, Data Preprocessing, Data Cleaning, Data Segmentation, GIS, Tableau, Market Analysis, Data Analysis

This was a comprehensive data analysis project carried out in several stages. We began by scraping data from the web, specifically from PropertyFinder.ae. After gathering the data, we cleaned and organized it using Python, then segmented it to support analysis across multiple categories. To further enrich our insights, we incorporated GIS (Geographic Information System) data for more in-depth geographical analysis. Finally, we built interactive dashboards to visualize our findings, making the data easily accessible and comprehensible for presentation purposes.

Project Presentation Video

Data Scraping

A step-by-step description of the key tasks and accomplishments follows:

Challenges

Solutions

To address these challenges, we developed a tailored algorithm that handled the data irregularities effectively, iterating and testing repeatedly to ensure the maximum number of rows could be scraped in each run.

Additionally, we incorporated a new tab-opening and tab-closing system within the same script. This innovation enabled us to access information on property pages, facilitating the extraction of property and agent details.
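The tab-opening and tab-closing pattern described above can be sketched with Selenium. This is a minimal illustration, not the project's actual script: the CSS selectors and the search URL pattern are hypothetical placeholders, since PropertyFinder.ae's real markup is not shown in this write-up.

```python
BASE = "https://www.propertyfinder.ae/en/search?page={}"  # illustrative URL pattern


def listing_page_url(page: int) -> str:
    """Build the URL for one page of search results."""
    return BASE.format(page)


def scrape_listing(driver, listing_url: str) -> dict:
    """Open a listing in a new tab, extract details, close the tab,
    and return focus to the search-results tab."""
    # Imports kept local so the sketch can be read without Selenium installed.
    from selenium.webdriver.common.by import By

    original = driver.current_window_handle
    driver.switch_to.new_window("tab")   # open a fresh tab (Selenium 4+)
    driver.get(listing_url)
    details = {
        # Selectors below are placeholders, not PropertyFinder's real markup.
        "title": driver.find_element(By.CSS_SELECTOR, "h1").text,
        "agent": driver.find_element(By.CSS_SELECTOR, ".agent-name").text,
    }
    driver.close()                       # close the listing tab
    driver.switch_to.window(original)    # back to the results page
    return details
```

Keeping everything in one driver session preserves cookies and login state across the results page and each listing page.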

Data Preprocessing

A step-by-step description of the key tasks and accomplishments follows:

Challenges

Solutions

We created a blueprint of how the data needed to be structured and decided to convert the dataset into third normal form, so that each table describes a single entity identified by a primary key and every non-key attribute depends only on that key. We identified these unique elements in the dataset:

1. Data Normalization Process

We organized the raw data into the five unique entities that constitute the core of our dataset. This involved identifying and separating these fundamental elements.

Further, we examined the relationships and affiliations of other data with these core entities, aiming to establish clear and structured connections.

The outcome of this process was the normalization of a single, complex table into seven distinct tables. Five of these tables represent the original entities, and the remaining two are relational tables that capture the relationships between the core elements.
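The entity-splitting step above can be illustrated with pandas. This is a toy sketch, not the project's code: the column names and the single "agent" entity are hypothetical stand-ins for the five entities and two relational tables described.

```python
import pandas as pd

# Toy flat table; columns are illustrative placeholders.
flat = pd.DataFrame({
    "listing_id": [1, 2, 3],
    "price": [1_200_000, 950_000, 2_500_000],
    "agent_name": ["Aisha", "Omar", "Aisha"],
    "agency": ["Alpha Realty", "Beta Homes", "Alpha Realty"],
})

# One table per entity: drop duplicates so each row is one unique agent.
agents = flat[["agent_name", "agency"]].drop_duplicates().reset_index(drop=True)
agents["agent_id"] = agents.index + 1

# Listings table keeps only its own attributes plus a foreign key to agents.
listings = flat.merge(agents, on=["agent_name", "agency"])
listings = listings[["listing_id", "price", "agent_id"]]
```

Repeating this for each entity turns the single complex table into the seven-table normalized schema described above.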

2. Ensuring Data Integrity

With the dataset's structure clarified, we focused on ensuring data integrity and establishing unique primary keys for each table. We systematically examined each table and looked for columns that could serve as primary keys. If none were found, we sought candidate keys to guarantee uniqueness for each row.

To facilitate this process, we used Python's built-in uuid module to generate unique primary keys, ensuring data consistency and referential integrity within the database.

By having unique primary keys in place for each table, we gained the flexibility and agility required to advance to subsequent stages of our analysis.
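Generating surrogate keys with the uuid module looks like the following. The column names here are illustrative, not the project's actual schema.

```python
import uuid

import pandas as pd

# A table with no natural primary key (column names are illustrative).
df = pd.DataFrame({"price": [1_200_000, 950_000, 2_500_000]})

# uuid4() gives a random 128-bit identifier; collisions are
# astronomically unlikely, so each row gets a unique surrogate key.
df["property_id"] = [str(uuid.uuid4()) for _ in range(len(df))]
```

The resulting IDs are 36-character strings that can safely serve as primary keys and as foreign keys in the relational tables.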

3. GIS Data Incorporation

Initially, our dataset lacked clear latitude and longitude data for each property. However, we possessed detailed street addresses for each property listing. To leverage this information, we harnessed the power of Google Maps' API for automated geocoding.

Through this process, we were able to automatically generate latitude and longitude data for every property listing in our dataset. This transformation opened up a new dimension of analysis that was previously unavailable to us, enabling more comprehensive geographical insights and visualization.
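A minimal sketch of the geocoding step, assuming the Google Maps Geocoding API's JSON endpoint. The parsing helper works on the documented response shape; the fetch helper requires a real API key and network access, so it is only defined, not run, here.

```python
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def extract_lat_lng(response_json: dict):
    """Pull (lat, lng) out of a Geocoding API response, or None on failure."""
    if response_json.get("status") != "OK" or not response_json.get("results"):
        return None
    loc = response_json["results"][0]["geometry"]["location"]
    return loc["lat"], loc["lng"]


def geocode(address: str, api_key: str):
    """Resolve a street address to coordinates (needs a valid API key)."""
    # Import kept local so the sketch loads without the requests package.
    import requests

    resp = requests.get(
        GEOCODE_URL,
        params={"address": address, "key": api_key},
        timeout=10,
    )
    return extract_lat_lng(resp.json())
```

Looping geocode over every street address in the listings table yields the latitude/longitude columns used for the GIS analysis.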

4. Data Segmentation

Following the initial data collection and cleansing phases, we moved on to data segmentation, a crucial step in organizing and categorizing our dataset. Our segmentation strategy revolved around key attributes to enhance analysis:

By applying these segmentation criteria, we gained a deeper understanding of our dataset, which facilitated more targeted analysis and visualization.
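One common way to segment listings is binning by price, for example with pandas' cut function. The AED price bands and labels below are illustrative assumptions, not the project's actual cut-offs.

```python
import pandas as pd

prices = pd.DataFrame({"price": [450_000, 1_200_000, 3_800_000, 9_000_000]})

# Illustrative AED bands; the project's real segmentation criteria may differ.
bins = [0, 1_000_000, 3_000_000, float("inf")]
labels = ["budget", "mid-range", "luxury"]
prices["segment"] = pd.cut(prices["price"], bins=bins, labels=labels)
```

The same pattern applies to other attributes, such as bedroom count or property type, yielding the categorical columns used in the dashboards.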

Entity-Relationship Diagram (ERD)

ERD Diagram
Prepared ERD Model for our dataset

Why did we do this?

Interesting Findings:

Please note that the dashboards are interactive, and there are many ways to derive insights beyond the points mentioned above.