This project scrapes real estate data from melketabriz.ir using unique property codes and compiles the extracted information into a structured CSV file. This tool can be useful for users who want to collect detailed data on multiple properties from the website for further analysis or reference.
- Data Extraction: Extracts key property details like code, region, neighborhood, area, total price, floor, room count, and additional attributes directly from the website.
- CSV Storage: Consolidates all extracted data into a CSV file, combined_property_info.csv, for easy access and further processing.
- Error Handling: Skips invalid codes and handles missing data by filling in blank fields as needed.
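The blank-field behaviour described in the error-handling item can be sketched roughly as follows. The column names and the normalize_record helper are assumptions made for illustration, based on the fields listed above; the actual script may name and order them differently, and the sample values are made up.

```python
# Hypothetical column names, derived from the fields listed in the feature list.
EXPECTED_FIELDS = [
    "code", "region", "neighborhood", "area",
    "total_price", "floor", "room_count",
]

def normalize_record(raw):
    """Return a dict containing every expected field, using "" where data is missing."""
    record = {}
    for field in EXPECTED_FIELDS:
        value = raw.get(field)
        record[field] = "" if value is None else value
    return record

# A partially scraped listing (illustrative values only).
partial = {"code": 25923, "region": "1", "area": 120}
row = normalize_record(partial)
```

Keeping every row on the same set of columns means rows appended later still line up with the CSV header.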
Ensure you have the following installed:
- Python 3.x
- Requests: pip install requests
- BeautifulSoup4: pip install beautifulsoup4
- Pandas: pip install pandas
- Clone this repository or download the code files.
- Make sure combined_property_info.csv exists in the same directory as the script.
To fetch data for a specific property code, use the extract_data_of_given_code() function. For example:
df = extract_data_of_given_code(25923)
add_to_excel(df)
Each time you run the add_to_excel() function with a DataFrame, the extracted data is appended to combined_property_info.csv.
- extract_data_of_given_code(given_code): Generates a URL using the provided given_code, then fetches and parses the data. If the code is invalid, it returns None.
- add_to_excel(df): Appends the given DataFrame df to combined_property_info.csv.
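For readers curious how appending a DataFrame to a CSV file can work, here is a minimal sketch using pandas. It is not the project's actual add_to_excel() implementation; the append_to_csv name and the header-handling rule are assumptions chosen for illustration.

```python
import os

import pandas as pd

CSV_PATH = "combined_property_info.csv"

def append_to_csv(df, path=CSV_PATH):
    """Append df to the CSV at path, writing the header only for a new or empty file."""
    # Assumption: an existing, non-empty file already has a header row.
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    df.to_csv(path, mode="a", header=write_header, index=False, encoding="utf-8")
```

Writing the header only when the file is new or empty avoids repeated header rows in the middle of the data, and the explicit encoding keeps Persian text intact regardless of the platform default.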
# Specify a valid property code
property_code = 25923
# Extract data and append to CSV
df = extract_data_of_given_code(property_code)
if df is not None:
    add_to_excel(df)
- The scraper is designed to work only with melketabriz.ir and may not function correctly if the website structure changes.
- The code assumes UTF-8 encoding for Persian (Farsi) text. Ensure your environment supports UTF-8 to avoid encoding issues.
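A quick way to check that an environment round-trips Persian text correctly is to write and read a small CSV with an explicit encoding. The sketch below uses only the standard library; the sample values and the temporary file name are illustrative and not part of the project.

```python
import csv
import os
import tempfile

sample_row = ["25923", "تبریز"]  # illustrative values only

path = os.path.join(tempfile.mkdtemp(), "utf8_check.csv")

# Passing encoding="utf-8" explicitly avoids relying on the platform default.
# newline="" is what the csv module documentation recommends for file objects.
with open(path, "w", encoding="utf-8", newline="") as f:
    csv.writer(f).writerow(sample_row)

with open(path, "r", encoding="utf-8", newline="") as f:
    read_back = next(csv.reader(f))
```

If read_back differs from sample_row, or opening the file raises a UnicodeError, the encoding is being lost somewhere along the way.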
- Support for extracting data from multiple property codes at once.
- Enhanced error handling and logging.
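One possible shape for the two planned improvements, given purely as a sketch: a loop that fetches several codes, logs failures, and keeps track of the codes it skipped. The collect_codes and fake_fetch names are hypothetical; in real use, the fetch argument would presumably be extract_data_of_given_code.

```python
import logging

logger = logging.getLogger(__name__)

def collect_codes(codes, fetch):
    """Call fetch(code) for each code and return (results, skipped_codes)."""
    results, skipped = [], []
    for code in codes:
        try:
            result = fetch(code)
        except Exception:
            # Log the traceback and move on to the next code.
            logger.exception("Fetching code %s failed", code)
            skipped.append(code)
            continue
        if result is None:
            logger.warning("Skipping invalid code %s", code)
            skipped.append(code)
            continue
        results.append(result)
    return results, skipped

# Stand-in fetcher for demonstration only: treats even codes as invalid.
def fake_fetch(code):
    return {"code": code} if code % 2 else None

results, skipped = collect_codes([25921, 25922, 25923], fake_fetch)
```

Each entry in results could then be passed to add_to_excel() in turn, while skipped gives a record of the codes that need a second look.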