Project Summary


Extracting data from a property website

Requested: 2/6/2010  |  Version: 2.30.11

Demonstrates how to find and select elements in the tree view.

Target URL: http://www.trulia.com/

Download demo project and sample data extract Trulia.rar

Request

I'm having trouble creating elements for a real estate website. I enter a city, and I tried to follow some of your video examples: creating a page template, then creating an additional template off the original page template in order to add content elements, but with no success.

I would like to capture the address, price, and square footage displayed on the first page listing all the properties, then follow the "view details" link and capture the full description, plus the zip code and bed/baths listed in the table below the description.

It seems the site is written as an RIA web app, which makes extraction very difficult. I would love to see a breakdown of each step and the reasoning behind it in order to capture this data; this would obviously help me get better at using Visual Web Ripper.

Thank you for any help in using your software to its fullest potential!

Solution

I assume your difficulties are related to the search result on the target website. Visual Web Ripper depends on web page events to mark elements in the browser, but this website prevents these events from firing on the search result. This means you'll need to use the tree view to select elements in the browser.

To use the tree view, first select the closest element you can mark (in this case the whole search result). Then use the tree view toolbar button to open the tree view.

The tree view will automatically open the node corresponding to the selected element (the whole search result). Now examine the tree view to find appropriate elements to select. Right-click a tree node to select the corresponding element in the browser. Once you have selected an appropriate node, you can use the list options tab to create an element list.
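If it helps to picture what the tree view is doing, the idea can be sketched outside of Visual Web Ripper. The following is only an illustration in Python, not Visual Web Ripper's API: the markup, the class names (`listing`, `address`, `price`, `ad`) and the helper functions are all made up for the example. It starts from a container node, prints the tree below it, and then picks the repeating child nodes as a list, which is roughly what you do by hand in the tree view.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for the whole search result.
# The real page structure will differ; inspect it in the tree view.
SAMPLE = """
<div id="results">
  <div class="listing"><span class="address">1 Example St</span><span class="price">$100</span></div>
  <div class="listing"><span class="address">2 Example Ave</span><span class="price">$200</span></div>
  <div class="ad">Sponsored</div>
</div>
"""

def outline(node, depth=0):
    """Return an indented outline of the tree, one line per node."""
    css_class = node.get("class")
    label = node.tag + (f".{css_class}" if css_class else "")
    lines = ["  " * depth + label]
    for child in node:
        lines.extend(outline(child, depth + 1))
    return lines

def select_list(container, tag, css_class):
    """Return the direct children of container that share a tag and class."""
    return [
        child for child in container
        if child.tag == tag and child.get("class") == css_class
    ]

def extract(listing):
    """Map each span's class name to its text for one listing."""
    return {span.get("class"): span.text for span in listing.iter("span")}

# Start at the closest node that can be marked: the container itself.
container = ET.fromstring(SAMPLE.strip())

# Examine the tree below the container to spot the repeating nodes.
print("\n".join(outline(container)))

# Select the repeating nodes as a list, then read the content of each one.
listings = select_list(container, "div", "listing")
rows = [extract(item) for item in listings]
print(rows)
```

Note that the `ad` node is a sibling of the listings but is left out of the list, because it does not share their class. Choosing the right node in the tree view matters for the same reason.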

The attached demo project is optimized to work in combined WebBrowser and WebCrawler mode. This is much faster than pure WebBrowser mode. The switch to WebCrawler mode occurs on the property details page, so when designing the property details template, you'll need to switch to WebCrawler mode to make sure you see what the WebCrawler sees. You use the WebCrawler toolbar button to switch to WebCrawler mode.

The demo project only submits a single location (New York, NY) to the search form, but you can define an input data source to provide a list of locations, or you can simply enter a list of locations for the location form field.
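As a rough picture of what an input data source amounts to, here is a small Python sketch. It is not how Visual Web Ripper is configured. The CSV layout, the `location` column name, the two extra cities and the `run_search` placeholder are assumptions made for illustration. The point is only that each row of the input produces one submission of the search form.

```python
import csv
import io

# Hypothetical input data: one location per row. "New York, NY" is the
# value used in the demo project; the other rows are made-up samples.
LOCATIONS_CSV = 'location\n"New York, NY"\n"Boston, MA"\n"Chicago, IL"\n'

def read_locations(text):
    """Read the non-empty values of the 'location' column."""
    reader = csv.DictReader(io.StringIO(text))
    return [row["location"].strip() for row in reader if row["location"].strip()]

def run_search(location):
    """Placeholder for one form submission; returns the input it would use."""
    return {"location": location}

# One search submission per input row.
submissions = [run_search(loc) for loc in read_locations(LOCATIONS_CSV)]
for submission in submissions:
    print(submission)
```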


© 2009-2010 Sequentum