An Easy Way to Extract Website Content
Visual Web Ripper makes it easy to extract website content. You can extract most content without any programming or custom scripting, but you still need to learn how to use our web scraper software.
How To Extract Content From a Website
Follow these three easy steps to extract data from a website with Visual Web Ripper.
1. Design a project in the Visual Editor
   - Navigate to a website and design a template for each type of page you want to extract content from.
   - A template defines how content should be extracted from a particular webpage and from all other web pages with a similar content structure.
   - You design a template by clicking on the page content you want to extract, and then selecting the links and forms you want to activate to open new pages.
   - Powerful tools are available to help you design the templates. You can repeat content selections through a whole list, follow all links in an area, or repeatedly submit a form for all possible input values.
2. Run the project directly from the designer, or set up a schedule to run the project.
3. Data will be saved to your chosen data store (databases, spreadsheets, XML or CSV files).
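The idea of a template that is designed once and then reused on every page with the same structure can be sketched in a few lines of Python. This is only an illustration of the concept, not how Visual Web Ripper works internally: the `TEMPLATE` mapping, the sample pages, and the function names are all made up for this example, and the pages are kept well-formed so the standard-library XML parser can read them.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical template: output column name -> path to the element holding the value.
TEMPLATE = {
    "title": ".//h1",
    "price": ".//span[@class='price']",
}

# Two made-up pages that share one content structure.
PAGES = [
    "<html><body><h1>Red Kettle</h1><span class='price'>19.99</span></body></html>",
    "<html><body><h1>Blue Teapot</h1><span class='price'>24.50</span></body></html>",
]


def apply_template(page_source, template):
    """Extract one row from a page by evaluating every path in the template."""
    root = ET.fromstring(page_source)
    row = {}
    for column, path in template.items():
        element = root.find(path)
        row[column] = element.text.strip() if element is not None and element.text else None
    return row


def extract_to_csv(pages, template):
    """Apply the same template to every page and return the rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(template), lineterminator="\n")
    writer.writeheader()
    for page in pages:
        writer.writerow(apply_template(page, template))
    return buffer.getvalue()


print(extract_to_csv(PAGES, TEMPLATE))
```

The template is defined against one page only; the second page yields a row simply because its structure matches.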
The visual designer
The Visual Designer is used to create Visual Web Ripper projects. A project defines the rules Visual Web Ripper uses when extracting content from a website.
The Visual Designer contains a built-in Internet browser where you can use your mouse to select the content elements you want to extract and save. You only need to select the content elements on one webpage and Visual Web Ripper will automatically use this information to extract content from all other web pages with a similar content structure.
Once you have created a project, you can run it directly from the Visual Designer to extract data from the web, or you can set up a schedule to run it at regular intervals. If you have programming skills, you can use the Visual Web Ripper API to run and modify projects directly from your own applications.
Important: Professional web scraping still requires certain skills. While you can point and click to define and extract most content, you may still need basic XPath and scripting skills to extract some difficult content.
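As a rough illustration of where XPath helps, consider a value that can only be identified by the label next to it, and whose row may move around between pages. The sketch below uses Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath; the sample table and the `value_for_label` helper are hypothetical and are not part of Visual Web Ripper.

```python
import xml.etree.ElementTree as ET

# Made-up product table: the price is identified only by its label cell.
PAGE = """
<table>
  <tr><th>Weight</th><td>1.2 kg</td></tr>
  <tr><th>Price</th><td>19.99</td></tr>
  <tr><th>Colour</th><td>Red</td></tr>
</table>
"""


def value_for_label(page_source, label):
    """Return the text of the cell that sits in the same row as the given label."""
    root = ET.fromstring(page_source.strip())
    # Select the row whose <th> text equals the label, then take its <td>.
    cell = root.find(f".//tr[th='{label}']/td")
    return cell.text if cell is not None else None


print(value_for_label(PAGE, "Price"))
```

Because the expression selects by label rather than by position, it keeps working if the rows are reordered.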
The scheduler
The scheduler is an important part of Visual Web Ripper. If you want to extract data from a website and keep that data up to date, you need to use the scheduler.
You simply tell the scheduler when to start extracting data and how often to update it. For example, you could set up the scheduler to extract data from the web every night.
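A schedule of this kind boils down to two values, a start time and a frequency. The following sketch shows one way the next run time could be derived from them; the `next_run` function and its parameters are assumptions made for this example and do not describe the scheduler's actual implementation.

```python
from datetime import datetime, timedelta


def next_run(now, start_at, every):
    """Return the first scheduled run strictly after `now`.

    start_at is the time of the first run; every is the interval between runs.
    """
    if now < start_at:
        return start_at
    # Count the whole intervals already elapsed, then step to the next one.
    elapsed_intervals = (now - start_at) // every
    return start_at + (elapsed_intervals + 1) * every


# A nightly schedule that started on 1 January 2024 at 02:00.
first_run = datetime(2024, 1, 1, 2, 0)
nightly = timedelta(days=1)
print(next_run(datetime(2024, 3, 5, 14, 30), first_run, nightly))
```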
Once you have set up the scheduler to collect content from a website, you can leave the rest to Visual Web Ripper, and you will automatically have up-to-date data in your database. Visual Web Ripper uses advanced selection techniques so that it can continue to extract data from a webpage even if the webpage changes over time. However, months or years later you might find that the website has changed completely and Visual Web Ripper can no longer collect data from it. In that event, Visual Web Ripper sends an email notification to let you know that you need to modify your project in order to continue collecting data from the website.
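The notification step could look roughly like the sketch below. It assumes, purely for illustration, that a failed run is one that returns no non-empty values; how Visual Web Ripper actually decides that a website has changed is not described here. The `build_failure_notice` helper is hypothetical, and the message is only built, not sent.

```python
from email.message import EmailMessage


def build_failure_notice(project_name, rows, recipient):
    """Return an email notice when a run produced no usable data, else None."""
    # Assumption: a run counts as successful if any row holds a non-empty value.
    has_data = any(any(value for value in row.values()) for row in rows)
    if has_data:
        return None
    notice = EmailMessage()
    notice["To"] = recipient
    notice["Subject"] = f"Project '{project_name}' extracted no data"
    notice.set_content(
        "The last run returned no content. The website may have changed; "
        "please review the project."
    )
    return notice


notice = build_failure_notice("shop", [{"title": None}], "owner@example.com")
print(notice["Subject"])
```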
The programming interface
If you have .NET programming skills you can bring the full power of Visual Web Ripper straight into your own applications. The Visual Web Ripper API provides full access to every single feature in Visual Web Ripper.
The API lets you create, modify and run Visual Web Ripper projects. For example, you could modify the start URL just before running a project and thereby use the same project for many different start URLs. You could also set form field values at runtime to change how content is extracted from a website.
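The pattern of reusing one project for many start URLs can be sketched as follows. The real API is a .NET library and is not reproduced here: the `Project` class, the `run` placeholder, and `run_for_each` are hypothetical stand-ins written in Python to show the shape of the idea only.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Project:
    """Hypothetical stand-in for a saved project; not a Visual Web Ripper type."""
    name: str
    start_url: str
    form_values: dict = field(default_factory=dict)


def run(project):
    """Placeholder runner: a real run would fetch pages and extract content."""
    return {
        "project": project.name,
        "start_url": project.start_url,
        "form_values": dict(project.form_values),
    }


def run_for_each(project, start_urls, form_values=None):
    """Run the same project once per start URL, optionally overriding form values."""
    results = []
    for url in start_urls:
        # Copy the project with a new start URL instead of editing the saved one.
        variant = replace(
            project,
            start_url=url,
            form_values=form_values or project.form_values,
        )
        results.append(run(variant))
    return results


base = Project("catalog", "https://example.com/a")
results = run_for_each(base, ["https://example.com/b", "https://example.com/c"], {"q": "tea"})
print([result["start_url"] for result in results])
```

Copying the project for each run keeps the saved definition unchanged, so the same template rules apply to every start URL.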
The API gives you direct access to extracted content, so you could build a post-processing module that processes the collected data after a project has run, but before it is saved to an external data store. You could also tell Visual Web Ripper not to save data to an external data store at all, and then use a post-processing module to write the data to your own custom data store.
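A post-processing step that cleans the extracted rows and then writes them to a custom data store might look like the sketch below. It is written in Python with an in-memory SQLite database standing in for the custom store; the row layout, the `post_process` and `save_to_custom_store` functions, and the `products` table are assumptions for this example, not part of the Visual Web Ripper API.

```python
import sqlite3


def post_process(rows):
    """Clean extracted rows: trim titles, parse prices, drop rows without a title."""
    cleaned = []
    for row in rows:
        title = (row.get("title") or "").strip()
        if not title:
            continue
        price_text = (row.get("price") or "").replace("$", "").strip()
        price = float(price_text) if price_text else None
        cleaned.append((title, price))
    return cleaned


def save_to_custom_store(connection, rows):
    """Write cleaned rows to a table in the custom data store."""
    connection.execute("CREATE TABLE IF NOT EXISTS products (title TEXT, price REAL)")
    connection.executemany("INSERT INTO products VALUES (?, ?)", rows)
    connection.commit()


# Made-up extraction output, as it might look before any cleaning.
extracted = [
    {"title": "  Red Kettle ", "price": "$19.99"},
    {"title": "", "price": "$5.00"},
    {"title": "Blue Teapot", "price": None},
]

connection = sqlite3.connect(":memory:")
save_to_custom_store(connection, post_process(extracted))
stored = connection.execute("SELECT title, price FROM products ORDER BY title").fetchall()
print(stored)
```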